Python: Address many review comments

Still need to move the concept tests.
2025-12-21 03:06:31 +01:00 · 2020-10-13 12:03:23 +02:00
parent 433a36225b
commit 4685f2d5f2
16 changed files with 285 additions and 238 deletions
--- a/python/ql/src/experimental/Security-new-dataflow/CWE-502/JsonGood.py
+++ b/python/ql/src/experimental/Security-new-dataflow/CWE-502/JsonGood.py
@@ -1,10 +0,0 @@
-
-from django.conf.urls import url
-import json
-
-def safe(pickled):
-    return json.loads(pickled)
-
-urlpatterns = [
-    url(r'^(?P<object>.*)$', safe)
-]
--- a/python/ql/src/experimental/Security-new-dataflow/CWE-502/UnpicklingBad.py
+++ b/python/ql/src/experimental/Security-new-dataflow/CWE-502/UnpicklingBad.py
@@ -1,10 +0,0 @@
-
-from django.conf.urls import url
-import pickle
-
-def unsafe(pickled):
-    return pickle.loads(pickled)
-
-urlpatterns = [
-    url(r'^(?P<object>.*)$', unsafe)
-]
--- a/python/ql/src/experimental/Security-new-dataflow/CWE-502/UnsafeDeserialization.qhelp
+++ b/python/ql/src/experimental/Security-new-dataflow/CWE-502/UnsafeDeserialization.qhelp
@@ -1,61 +0,0 @@
-<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
-<qhelp>
-
-<overview>
-<p>
-Deserializing untrusted data using any deserialization framework that
-allows the construction of arbitrary serializable objects is easily exploitable
-and in many cases allows an attacker to execute arbitrary code.  Even before a
-deserialized object is returned to the caller of a deserialization method a lot
-of code may have been executed, including static initializers, constructors,
-and finalizers.  Automatic deserialization of fields means that an attacker may
-craft a nested combination of objects on which the executed initialization code
-may have unforeseen effects, such as the execution of arbitrary code.
-</p>
-<p>
-There are many different serialization frameworks.  This query currently
-supports Pickle, Marshal and Yaml.
-</p>
-</overview>
-
-<recommendation>
-<p>
-Avoid deserialization of untrusted data if at all possible.  If the
-architecture permits it then use other formats instead of serialized objects,
-for example JSON.
-</p>
-</recommendation>
-
-<example>
-<p>
-The following example calls <code>pickle.loads</code> directly on a
-value provided by an incoming HTTP request. Pickle then creates a new value from untrusted data, and is
-therefore inherently unsafe.
-</p>
-<sample src="UnpicklingBad.py" />
-
-<p>
-Changing the code to use <code>json.loads</code> instead of <code>pickle.loads</code> removes the vulnerability.
-</p>
-<sample src="JsonGood.py" />
-
-</example>
-
-<references>
-
-<li>
-OWASP vulnerability description:
-<a href="https://www.owasp.org/index.php/Deserialization_of_untrusted_data">Deserialization of untrusted data</a>.
-</li>
-<li>
-OWASP guidance on deserializing objects:
-<a href="https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html">Deserialization Cheat Sheet</a>.
-</li>
-<li>
-Talks by Chris Frohoff &amp; Gabriel Lawrence:
-<a href="http://frohoff.github.io/appseccali-marshalling-pickles/">
-AppSecCali 2015: Marshalling Pickles - how deserializing objects will ruin your day</a>
-</li>
-</references>
-
-</qhelp>
--- a/python/ql/src/experimental/Security-new-dataflow/CWE-502/UnsafeDeserialization.ql
+++ b/python/ql/src/experimental/Security-new-dataflow/CWE-502/UnsafeDeserialization.ql
@@ -23,7 +23,12 @@ class UnsafeDeserializationConfiguration extends TaintTracking::Configuration {

  override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

-  override predicate isSink(DataFlow::Node sink) { sink = any(DeserializationSink d).getData() }
+  override predicate isSink(DataFlow::Node sink) {
+    exists(UnmarshalingFunction d |
+      d.unsafe() and
+      sink = d.getAnInput()
+    )
+  }
 }

 from UnsafeDeserializationConfiguration config, DataFlow::PathNode source, DataFlow::PathNode sink