Commit Graph

1743 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
0cc018fec0 Python: Taint tracking setup alá Go
\## TaintFlow sources

The class `RemoteFlowSource` is very similarly defined as the other languages [C++](ac22e7950c/cpp/ql/src/semmle/code/cpp/security/FlowSources.qll), [Java](6de612a566/java/ql/src/semmle/code/java/dataflow/FlowSources.qll), [C#](fddbce0b7b/csharp/ql/src/semmle/code/csharp/security/dataflow/flowsources/Remote.qll), [JS](78334af354/javascript/ql/src/semmle/javascript/security/dataflow/RemoteFlowSources.qll), and [Go](24b3133e0c/ql/src/semmle/go/security/FlowSources.qll). There are some minor differences:

- Java/C++ defines the class in `FlowSources.qll`
- C# uses `csharp/ql/src/semmle/code/csharp/security/dataflow/flowsources/Remote.qll`, and provide `StoredFlowSource` and `LocalFlowSource` in separate classes.
- JS uses `RemoteFlowSources.qll`.
- JS defines additional predicate `RemoteFlowSource.isUserControlledObject`
- Go uses the class name `UntrustedFlowSource`, but still defined in `ql/src/semmle/go/security/FlowSources.qll`
- Go uses the `::Range` pattern to allow both extensibility and refinement

The big difference is how a RemoteFlowSource is specified:

- Java and C# have all subclasses of `RemoteFlowSource` defined in the same file
- Go and JS defines subclasses for frameworks in the actual framework `.qll` file, and all frameworks are transitively imported by `import go` or `import javascript` (so subclasses are always in scope).
- C++ uses class `RemoteFlowFunction` to do all the heavy lifting (and its subclasses are transitively imported).

\### What we will do

Use file `RemoteFlowSource.qll`, define subclasses in framework library classes.

_Why? Personally I really like it, Go/JS is already doing it, and Tom expressed a preference for doing the same for C# (although that is not what they are doing today)._

Jonas gave this advice:
> Whether you split the definitions between multiple files or keep them all in one file, the property you want is that all definitions are included when the abstract class is included. Otherwise you can get unexpected results via transitive includes.

We will make imports of all frameworks in the same file that defines `RemoteFlowSource`, as it seems to be the least intrusive change. If that turns out to be a problem, we can also move them to `python.qll` (the other way is not so easy).

\## TaintFlow sinks

[JS](473787a426/javascript/ql/src/semmle/javascript/Concepts.qll) and [Go](ecff1e6a16/ql/src/semmle/go/Concepts.qll) defines abstract base classes for interesting sinks in `Concepts.qll` (and all uses the `::Range` pattern in Go).

I really like this idea, since it allows multiple queries to reuse the same sink definitions, and it makes it _easy_ to discover what default sinks are available.

Personally I'm not 100% on board with the naming, but I don't have any good reason to change the naming convention.

\## Framework modeling

Following the model from Go ([example](https://github.com/github/codeql-go/blob/main/ql/src/semmle/go/frameworks/Gin.qll)), I propose that we make every definition in a framework modeling `private`. This allows some greater flexibility in changing our modeling, since we don't need to think about keeping deprecated versions around for a whole year.

It _does_ have the downside that someone writing a query can't reuse the classes/predicates for a framework, but it didn't seem to be too big of a concern. If we need to provide access, we can always make the definitions non-private (the other way is not so easy).

\## Customizations

Also introduced `Customizations.qll` like in JS/Java/Go (to replace `site.qll`)
2020-09-01 14:37:11 +02:00
Rasmus Lerchedahl Petersen
6b8d9f2a77 Merge branch 'main' of github.com:github/codeql into SharedDataflow_PostUpdateNodes 2020-08-28 13:01:14 +02:00
Rasmus Lerchedahl Petersen
9503c5d8bb Python: Add post-update nodes 2020-08-28 12:59:11 +02:00
Rasmus Wriedt Larsen
496d856c48 Python: Reformualte explanation of experience from JS 2020-08-28 10:49:33 +02:00
Taus
1206ff5889 Merge pull request #4150 from RasmusWL/python-dataflow-private-import
Python: Make import of python private in shared dataflow
2020-08-27 18:05:55 +02:00
Taus Brock-Nannestad
797e290a67 Python+CPP: Change values to value 2020-08-27 14:12:40 +02:00
Taus Brock-Nannestad
dccbcc15b3 Python: Sync InlineExpectationsTest.qll between Python and C++
Also changes `valuesasas` to `values` in the test example.
2020-08-27 13:37:26 +02:00
Rasmus Wriedt Larsen
9da6da6106 Python: Fix imports in shraed dataflow tests 2020-08-27 13:29:41 +02:00
Taus
e7322d114f Merge pull request #4077 from yoff/MagicMethods
Python: Add support for magic methods
2020-08-27 13:20:56 +02:00
Taus
d3175a7899 Merge pull request #4110 from yoff/SharedDataflow_ParsimoniousFlowNodes
Python: Shared dataflow, parsimonious flow nodes
2020-08-27 13:19:23 +02:00
CodeQL CI
30ac2f9c84 Merge pull request #4143 from tausbn/python-add-inline-test-expectations-library
Approved by RasmusWL
2020-08-27 12:18:41 +01:00
Rasmus Wriedt Larsen
909bff2313 Python: Make import of python private in shared dataflow 2020-08-27 11:48:56 +02:00
Rasmus Wriedt Larsen
569e54e7bb Python: Remove symlink from experimental test 2020-08-27 11:19:55 +02:00
Rasmus Lerchedahl Petersen
09025c2198 Python: Fix test, update results and annotations 2020-08-27 08:40:13 +02:00
Esben Sparre Andreasen
67278d9c93 Merge pull request #4141 from esbena/js/clarify-sanitization
JS: make sanitization a "common" technique rather than "important"
2020-08-27 08:08:17 +02:00
Rasmus Lerchedahl Petersen
dcabd37974 Python: Update test expectations 2020-08-26 17:58:35 +02:00
Rasmus Lerchedahl Petersen
bf6211f639 Merge branch 'main' of github.com:github/codeql into SharedDataflow_ParsimoniousFlowNodes 2020-08-26 17:50:17 +02:00
Rasmus Lerchedahl Petersen
6c173047e6 Merge branch 'MagicMethods' of github.com:yoff/codeql into MagicMethods 2020-08-26 17:43:27 +02:00
Rasmus Lerchedahl Petersen
47e35c530d Merge branch 'main' of github.com:github/codeql into MagicMethods 2020-08-26 17:42:44 +02:00
Taus Brock-Nannestad
e193e12b3f Python: Add support for inline test expectations library 2020-08-26 16:10:04 +02:00
Taus
b1946c60dd Merge pull request #4127 from RasmusWL/python-tainttracking-fstring
Python: Handle f-strings in (current) taint tracking
2020-08-26 16:06:01 +02:00
Esben Sparre Andreasen
89305865d0 JS: make sanitization a "common" technique rather than "important" 2020-08-26 15:41:54 +02:00
Taus
000fa33d54 Merge pull request #4013 from yoff/SharedDataflow_SequenceFlow
Python: Shared dataflow: Content flow
2020-08-25 15:38:14 +02:00
Rasmus Wriedt Larsen
2dbf83b579 Python: TaintTracking: Move tests of py3 string methods 2020-08-25 13:06:27 +02:00
Rasmus Wriedt Larsen
cf121cc4d0 Python: TaintTracking: stringMethods => stringManipualtion 2020-08-25 13:05:27 +02:00
Rasmus Wriedt Larsen
238e0845aa Python: Minor refactoring 2020-08-25 12:50:41 +02:00
Rasmus Wriedt Larsen
0439b83c60 Python: Taint when using unicode 2020-08-25 12:50:32 +02:00
Rasmus Wriedt Larsen
2a29e26687 Python: Fix grammar
Co-authored-by: yoff <lerchedahl@gmail.com>
2020-08-25 12:41:53 +02:00
Rasmus Wriedt Larsen
483bd0e863 Python: Fix shared taint tracking tests
Since there was a .ql file, qltest tried to run a test in
test/experimental/dataflow/taintracking/ which failed since there was no code.
2020-08-25 11:15:11 +02:00
yoff
3140b43db2 Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2020-08-25 10:48:01 +02:00
Rasmus Wriedt Larsen
13148b42d3 Python: Handle taint of f-strings 2020-08-24 17:23:10 +02:00
Rasmus Wriedt Larsen
2f090df6d3 Python: Transform comments to QLDoc for security.strings.Basic 2020-08-24 17:20:04 +02:00
Rasmus Lerchedahl Petersen
2608509fa7 Merge branch 'main' of github.com:github/codeql into SharedDataflow_SequenceFlow 2020-08-24 17:16:33 +02:00
Rasmus Wriedt Larsen
be2acc00db Python: Add test for tainted f-string 2020-08-24 17:14:51 +02:00
Rasmus Wriedt Larsen
d96ef73033 Python: Handle taint for f-strings
Which we seem to not handle in the current taint tracking :O

f-strings needs to be Python 3 only, so enabled that test setup. I really liked
the idea for having the version specific tests right next to the normal tests,
so you don't have to look in
test/experimental/3/dataflow/i/will/forget/to/look/here.
2020-08-24 16:46:00 +02:00
Rasmus Wriedt Larsen
cb4b4e91ab Python: Taint for string multiplication 2020-08-24 14:54:06 +02:00
Rasmus Wriedt Larsen
b688fe68d6 Python: Add options file to shared dataflow tests
Since there isn't one in top-level of experimental, making a single import made
tests go really slow :|
2020-08-24 14:54:05 +02:00
Rasmus Wriedt Larsen
5125c7a55c Python: Add taint tests for encode/decode functions 2020-08-24 14:54:04 +02:00
Rasmus Wriedt Larsen
31b398937a Python: Handle taint from bytes(obj) 2020-08-24 14:17:59 +02:00
Rasmus Wriedt Larsen
1e447c5ca2 Python: Handle taint for % formatting 2020-08-24 14:15:27 +02:00
Rasmus Wriedt Larsen
80745e8881 Python: Model string methods in shared taint tracking library 2020-08-24 13:58:42 +02:00
Rasmus Wriedt Larsen
a77f118b62 Python: Shared taint tracking: Handle string concat + subcript 2020-08-24 13:58:41 +02:00
Rasmus Wriedt Larsen
61f89ca3c3 Python: Add tests for shared taint tracking for strings
I adopted the TestTaint testing setup that I made for the "old" taint tracking
tests. This time around we should figure out if we can use .qlref or similar so
it doesn't end up in multiple copies that are not kept up to date :|

The `repr` predicate could probably be placed somewhere better. For now I just
wanted something that could help me. I considered just expanding the `repr`
predicate in `ql/src/semmle/python/strings.qll`, but since it's currently used
by queries, I didn't want to do anything about it.

Anyway, the output it gives is much more useful than seeing this ;)

```
| test.py:20 | ok   | str_operations | test.py:20:9:20:10 | ts |
| test.py:21 | fail | str_operations | test.py:21:9:21:18 | BinaryExpr |
| test.py:22 | fail | str_operations | test.py:22:9:22:18 | BinaryExpr |
| test.py:23 | fail | str_operations | test.py:23:9:23:21 | Subscript |
| test.py:24 | fail | str_operations | test.py:24:9:24:13 | Subscript |
| test.py:25 | fail | str_operations | test.py:25:9:25:18 | Subscript |
| test.py:26 | fail | str_operations | test.py:26:9:26:13 | Subscript |
| test.py:27 | fail | str_operations | test.py:27:9:27:15 | str() |
| test.py:35 | fail | str_methods | test.py:35:9:35:23 | Attribute() |
| test.py:36 | fail | str_methods | test.py:36:9:36:21 | Attribute() |
| test.py:37 | fail | str_methods | test.py:37:9:37:22 | Attribute() |
| test.py:38 | fail | str_methods | test.py:38:9:38:23 | Attribute() |
| test.py:40 | fail | str_methods | test.py:40:9:40:19 | Attribute() |
| test.py:41 | fail | str_methods | test.py:41:9:41:23 | Attribute() |
| test.py:42 | fail | str_methods | test.py:42:9:42:36 | Attribute() |
| test.py:44 | fail | str_methods | test.py:44:9:44:25 | Attribute() |
| test.py:45 | fail | str_methods | test.py:45:9:45:45 | Attribute() |
| test.py:47 | fail | str_methods | test.py:47:9:47:21 | Attribute() |
| test.py:48 | fail | str_methods | test.py:48:9:48:19 | Attribute() |
| test.py:49 | fail | str_methods | test.py:49:9:49:18 | Attribute() |
| test.py:51 | fail | str_methods | test.py:51:9:51:32 | Attribute() |
| test.py:52 | fail | str_methods | test.py:52:9:52:34 | Attribute() |
| test.py:54 | fail | str_methods | test.py:54:9:54:21 | Attribute() |
| test.py:55 | fail | str_methods | test.py:55:9:55:19 | Attribute() |
| test.py:56 | fail | str_methods | test.py:56:9:56:18 | Attribute() |
| test.py:57 | fail | str_methods | test.py:57:9:57:21 | Attribute() |
| test.py:58 | fail | str_methods | test.py:58:9:58:18 | Attribute() |
| test.py:59 | fail | str_methods | test.py:59:9:59:18 | Attribute() |
| test.py:60 | fail | str_methods | test.py:60:9:60:21 | Attribute() |
| test.py:62 | fail | str_methods | test.py:62:9:62:26 | Attribute() |
| test.py:63 | fail | str_methods | test.py:63:9:63:42 | Attribute() |
| test.py:65 | fail | str_methods | test.py:65:9:65:26 | Attribute() |
| test.py:66 | fail | str_methods | test.py:66:9:66:42 | Attribute() |
| test.py:69 | fail | str_methods | test.py:69:9:69:25 | Attribute() |
| test.py:70 | fail | str_methods | test.py:70:9:70:26 | Attribute() |
| test.py:71 | fail | str_methods | test.py:71:9:71:22 | Attribute() |
| test.py:72 | fail | str_methods | test.py:72:9:72:21 | Attribute() |
| test.py:73 | fail | str_methods | test.py:73:9:73:23 | Attribute() |
| test.py:78 | ok   | str_methods | test.py:78:9:78:39 | Attribute() |
```
2020-08-24 13:58:39 +02:00
Taus
b8d6f76749 Merge pull request #4056 from yoff/SharedDataflow_ParameterTests
Python: Shared dataflow, parameter routing tests
2020-08-24 11:36:30 +02:00
Rasmus Lerchedahl Petersen
e1343c7f1e Python: Support set literals. 2020-08-21 11:15:04 +02:00
Rasmus Lerchedahl Petersen
ccff84d546 Python: Test flow into conprehension 2020-08-21 10:40:22 +02:00
Rasmus Lerchedahl Petersen
f9b1c5e4bd Python: Fix bug pointed out by reviewer 2020-08-21 10:04:27 +02:00
yoff
bfd9c0860f Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2020-08-21 09:43:29 +02:00
yoff
8e2b2540fa Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2020-08-21 09:39:00 +02:00
Rasmus Lerchedahl Petersen
94e6fd9199 Python: Convenience methods
asVar, asCfgNode, and asExpr
2020-08-20 15:16:23 +02:00