Commit Graph

41 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
6e9d9fcbbd Python: Improve taint steps in for & iterable unpacking
These were written way before the ones in DataFlowPrivate, but
apparently didn't cover quite as much :|
2021-07-22 14:16:17 +02:00
Rasmus Wriedt Larsen
e2d3fa7093 Python: Add list-comprehension taint test 2021-07-22 11:59:46 +02:00
Rasmus Wriedt Larsen
cc311ac4cd Python: Re-introduce syntactic handling of str/bytes/unicode (again)
This reverts commit 870389addb.
2021-06-14 14:23:12 +02:00
Rasmus Wriedt Larsen
870389addb Revert "Python: Re-introduce syntactic handling of str/bytes/unicode"
This reverts commit c4987e94e0.

Hoping that our new handling of builtins would solve this problem... but
it did not :|
2021-06-14 14:22:40 +02:00
Rasmus Wriedt Larsen
af13064f6a Merge branch 'main' into pr/RasmusWL/5926 2021-06-14 14:17:33 +02:00
Rasmus Wriedt Larsen
9dbb364cca Python: Move json tests to be part of stdlib
This is better, since the modeling is also part of Stdlib.qll
2021-05-19 17:10:33 +02:00
Rasmus Wriedt Larsen
c4987e94e0 Python: Re-introduce syntactic handling of str/bytes/unicode
I don't want to loose results on this, so until type-tracking/API graphs
can handle this, I want to keep our syntactic handling.
2021-05-19 13:00:11 +02:00
Rasmus Wriedt Larsen
aa8b7306a3 Python: Use more API graphs in TaintTrackingPrivate
But now we suddenly don't handle the call to `unicode` :O -- at least
not when I run the test locally (using Python 3).
2021-05-19 12:59:58 +02:00
Rasmus Wriedt Larsen
72d08f4d6e Python: Model json load/dump 2021-05-10 15:10:30 +02:00
Rasmus Wriedt Larsen
63f28d7d9b Python: Model keyword args to json loads/dumps 2021-05-10 15:10:29 +02:00
Rasmus Wriedt Larsen
784e0cdb96 Python: Improve tests of json module
Inspired by the work on previous commit
2021-05-10 15:10:28 +02:00
Rasmus Wriedt Larsen
3e7dc12246 Python: Port taint tests to use inline expectations
The meat of this PR is described in the new python/ql/test/experimental/meta/InlineTaintTest.qll file:

> Defines a InlineExpectationsTest for checking whether any arguments in
> `ensure_tainted` and `ensure_not_tainted` calls are tainted.
>
> Also defines query predicates to ensure that:
> - if any arguments to `ensure_not_tainted` are tainted, their annotation is marked with `SPURIOUS`.
> - if any arguments to `ensure_tainted` are not tainted, their annotation is marked with `MISSING`.
>
> The functionality of this module is tested in `ql/test/experimental/meta/inline-taint-test-demo`.
2021-04-15 18:00:33 +02:00
Rasmus Lerchedahl Petersen
f6fa1276a6 Python: Add consistency checks
to all data-flow test floders
2021-01-29 21:28:43 +01:00
Rasmus Wriedt Larsen
a0c7365ae6 Python: Proper models of json.loads and json.dumps 2020-11-27 15:57:56 +01:00
Rasmus Wriedt Larsen
247fd4f5f3 Python: Make encoding/decoding preserve taint automatically
With the way we have set things up, there is no way to opt out of this behavior.
2020-11-02 14:53:30 +01:00
Rasmus Wriedt Larsen
76c9b8c49f Python: Expose importNode instead of importModule/importMember
Since predicate name `import` is not allowed, I adopted `importNode` as it sort
of matches what `exprNode` does.

---

Due to only using `importMember` in `os_attr` we previously didn't handle
`import os.path as alias` :|

I did creat a hotfix for this (https://github.com/github/codeql/pull/4446), but
in doing so I realized the core of the problem: We're exposing ourselves to
making these kinds of mistakes by having BOTH importModule and importMember, and
we don't really gain anything from doing this!

We do loose the ability to easily only modeling `from mod import val` and not
`import mod.val`, but I don't think that will ever be relevant.

This change will also make us to recognize some invalid code, for example in

    import os.system as runtime_error

we would now model that `runtime_error` is a reference to the `os.system`
function (although the actual import would result in a runtime error).

Overall these are tradeoffs I'm willing to make, as it does makes things simpler
from a QL modeling point of view, and THAT sounds nice 👍
2020-10-13 15:03:22 +02:00
Rasmus Wriedt Larsen
4bfd55f1af Python: Show problem with os.path modeling
This is not a very good test for showing that we don't handle direct imports,
but it was the best I had available without inventing something new. It's very
fragile, since any of these would propagate taint (due to handling all `join`
calls as if the qualifier was a string):

    ospath_alias.join(ts)
    ospath_alias.join(ts, "foo", "bar")

But this test DOES serve the purpose of illustrating that my fix works :D
2020-10-13 14:50:00 +02:00
Rasmus Wriedt Larsen
0542c3b91e Python: Model os.path.join and add taint-step 2020-09-30 11:42:36 +02:00
Rasmus Wriedt Larsen
efa2484718 Python: Add taint test for os.path.join
Surprisingly the first two just worked, due to our very general handling of any
`join` methods :D
2020-09-30 11:35:21 +02:00
Rasmus Wriedt Larsen
aa6fad558c Python: Minor cleanup in taint-step tests 2020-09-30 11:15:53 +02:00
Rasmus Lerchedahl Petersen
b1567827a0 Python: Repair flow out of post-update nodes 2020-09-09 15:52:07 +02:00
Rasmus Lerchedahl Petersen
9e59d79a72 Python: Repair flow from pre-update nodes 2020-09-09 13:51:24 +02:00
Rasmus Lerchedahl Petersen
c661f43316 Python: Port use-use implementation from Java 2020-09-09 12:19:40 +02:00
Rasmus Wriedt Larsen
c5e3333d10 Python: Update expected tests after last commit
I'm pushing too fast it seems
2020-09-01 12:01:34 +02:00
Rasmus Wriedt Larsen
e0cfe8123e Python: Update comments for new taint tests
I see I didn't keep them up to date as I implemented things
2020-09-01 11:58:26 +02:00
Rasmus Wriedt Larsen
e5a361c230 Python: Better taint tests for copy.deepcopy 2020-09-01 11:50:33 +02:00
Rasmus Wriedt Larsen
2d2b036b8c Python: Fix expected output for moved taint tests 2020-08-28 11:25:46 +02:00
Rasmus Wriedt Larsen
7213da195c Python: Use standard naming scheme for taint flow tests
We got into problems since using `string.py` would shadow the string module from
the standard library. By some reason I adopted a pattern of `_` as suffix, but
let us just use the standard pattern of `test_` prefix like a normal testing
framework like pytest does.
2020-08-28 11:22:42 +02:00
Rasmus Wriedt Larsen
f12d29de07 Python: Add taint test of more colleciton methods 2020-08-27 17:36:10 +02:00
Rasmus Wriedt Larsen
627363d6ea Python: Test taint step for string augmented assignment
Apprently it just works 😕 :magic:
2020-08-27 11:37:56 +02:00
Rasmus Wriedt Larsen
d0081dfbfa Python: Attempt at taint step for list.append/set.add 2020-08-27 10:57:07 +02:00
Rasmus Wriedt Larsen
af20c3e082 Python: Make new taint tracking tests runnable again
since the files was called `collection`, that conflicted with import system :|
2020-08-27 10:44:14 +02:00
Rasmus Wriedt Larsen
c24e3452f5 Python: Add more expected collection taint steps 2020-08-26 20:28:33 +02:00
Rasmus Wriedt Larsen
423139bc22 Python: Add additional taint steps for iterable-unpacking 2020-08-26 20:21:15 +02:00
Rasmus Wriedt Larsen
afb160fbbb Python: Add additional taint steps for for-iteration 2020-08-26 20:18:31 +02:00
Rasmus Wriedt Larsen
e2a89aa296 Python: Add additional taint steps for copy
deepcopy was already handled somehow, don't really know how :D
2020-08-26 19:39:38 +02:00
Rasmus Wriedt Larsen
b974dadca1 Python: Add additional taint steps for containers 2020-08-26 19:39:37 +02:00
Rasmus Wriedt Larsen
b6049765a8 Python: Add a few more collection taint tests 2020-08-26 19:39:36 +02:00
Rasmus Wriedt Larsen
32f9d30136 Python: Add syntactic taint steps for json methods 2020-08-26 19:39:36 +02:00
Rasmus Wriedt Larsen
41e24ae93f Python: Add non-syntactical test for taint of json methods 2020-08-26 19:39:35 +02:00
Rasmus Wriedt Larsen
5f9aa4c3b9 Python: Restructure defaultAdditionalTaintStep tests
This makes it easier to add a new test-case, and makes it easier to work with
the existing files. It does have a downside on making it a bit more annoying
looking at TestTaint.expected, and possible longer runtime, but I think it's
still worth it.
2020-08-26 19:39:33 +02:00