Commit Graph

847 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
9615e2ded9 Python: Remove deprecated stubs for points-to tests
I grep'ed through all our options files, and couldn't find any tests
that relies on these anymore 👍
2024-04-10 13:12:36 +02:00
Taus
f1392712ee Python: Add .copy() as a copy step 2024-02-22 13:09:27 +00:00
Taus
5125973f9b Python: Add test case for .copy() as a copy step 2024-02-22 13:01:03 +00:00
Anders Schack-Mulligen
088a0a54ba Python: Add empty provenance column to expected files. 2024-02-09 11:32:08 +01:00
Sid Shankar
b1d7a635f5 Renames diagnostic query files and tests
This commit renames the files relating to the diagnostic query that produces information on the number of files extracted. The files have been renamed from "SuccessfullExtractedFiles.*" to "ExtractedFiles.*". All related tests and test files have been renamed too.

The `@tags` and `@id` attributes of the queries have been left untouched, consistent with the `@tags` and `@id` for similar queries in other languages.
2024-01-29 20:19:20 +00:00
yoff
930f1b50b9 Merge pull request #15397 from github/tausbn/python-fix-deepcopy-mutable-default-fp
Python: Fix `deepcopy` mutable default FP
2024-01-25 10:32:58 +01:00
Taus
96b1b8e402 Python: Remove empty lines from test file 2024-01-24 12:31:23 +00:00
Taus
d6d59377d3 Python: Fix flow through deepcopy
Or, more generally, any copy step, as these presumably do not preserve
object identity.

(Arguably, `copy` could still be susceptible to interior mutability, but
I think that's outside the scope of this query anyway.)
2024-01-22 15:40:30 +00:00
Taus
14c958ac4d Python: Remove mutable default sources from inside stdlib 2024-01-22 15:23:52 +00:00
Taus
411c107660 Python: Add tests for deepcopy FPs
There are two issues with `deepcopy` here. Firstly, the `deepcopy` function itself
has a mutable default value in its parameter `_nil` (set to the empty list by default).

Now, this value is never actually returned from `deepcopy`, as it is only used as a
sentinel, but our analysis is not clever enough to see this. Thus, it thinks that this
mutable default is returned, and hence the result of any call to `deepcopy` is a
potential source.

To remedy this, I opted to simply exclude all sources that originate from within the
standard library. It is very unlikely for any of the sources in the standard library
to be legit.

Secondly, `deepcopy` -- by virtue of being a function that we model as preserving
values -- admits data-flow through its calls, but this is not correct for the mutable
default query, as it is here the _identity_ of the default value in question that is
important. Thus, we get spurious flow through `deepcopy` for this specific query.
2024-01-22 15:21:57 +00:00
Taus
4742481070 Python: Consolidate "mutable default" tests
Moves the existing tests into the `ModificationOfParameterWithDefault` subdirectory
which already contained a bunch more tests. In the process, I also removed some
duplicated test cases.
2024-01-22 13:50:33 +00:00
Max Schaefer
98178458d0 Python: Add support for more URL redirect sanitisers.
Since some sanitisers don't handle backslashes correctly, I updated the data-flow configuration to incorporate a flow state tracking whether or not backslashes have been eliminated or converted to forward slashes.
2024-01-22 13:24:18 +00:00
Sid Shankar
fb660b8f05 Py: Report any extracted file as successfully extracted 2024-01-08 22:20:51 +00:00
Rasmus Lerchedahl Petersen
e091ae84ab Merge branch 'main' of https://github.com/github/codeql into python/remove-ssa-nodes-from-dataflow-graph 2023-12-04 14:05:40 +01:00
Rasmus Wriedt Larsen
2fed0adde7 Merge pull request #8457 from RasmusWL/add-dataflow-consistency-query
Python: Add dataflow consistency query
2023-12-04 12:50:46 +01:00
Taus
95e9284d08 Python: Add support for extraction filters
Adds support for extraction filters as defined in
https://peps.python.org/pep-0706/
and implemented in Python 3.12.

By my reading, setting the filter to `'data'` or `'tar'` is probably
safe, whereas `'fully_trusted'` or the default (which is the same as
`None`) is not.

For now, I have just added this modelling to the tarslip query. We could
also share it with the modelling of `shutil.unpack_archive` (which has also
gained a `filter` argument), but it was unclear to me where we should put
this modelling in that case. Perhaps the best solution would be to merge
the experimental `py/tarslip-extended` query into the existing query (in
which case the current location is perhaps not too bad).
2023-11-27 14:11:17 +00:00
Rasmus Wriedt Larsen
2ec1822e9c Python: Accept consistency-errors in django-orm 2023-11-21 12:44:42 +01:00
Rasmus Lerchedahl Petersen
11c71fdd18 Python: remove EssaNodes
This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
  x = expr
  y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.

Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
2023-11-20 21:35:32 +01:00
Rasmus Wriedt Larsen
25d3af9236 Merge branch 'main' into clean-tests 2023-11-16 11:21:01 +01:00
Rasmus Wriedt Larsen
e02c32f3d4 Python: options file was not enough, split into 2/3
I reckon this is due to the Python 3 version used by the Python 2 tests
is different from 3.12, so even with --lang=3 the tests are still using
an incompatible version :(
2023-11-15 14:24:11 +01:00
Rasmus Wriedt Larsen
0f1dc9b2d9 Python: Add missing options file 2023-11-15 13:24:08 +01:00
Rasmus Wriedt Larsen
23419ee634 Python: Update .expected to support Python 3.12
You might wonder why the number of lines changed, but it's due to `tty`
module receiving its' first update since 2001, so the actual number of
lines DID change :phew:

https://github.com/python/cpython/commits/3.12/Lib/tty.py

Since there is now a difference between Python 2 and Python 3, we need to restrict the lines of code test to only run as Python 3.
2023-11-15 11:42:38 +01:00
Rasmus Wriedt Larsen
55f5b26ba6 Python: Accept new ordering of query predicates in .expected 2023-11-15 10:09:54 +01:00
Rasmus Wriedt Larsen
721bde1ce8 Python: Delete orphaned .expected files 2023-11-15 09:59:26 +01:00
Max Schaefer
104700f6d3 Address review comment. 2023-10-27 10:19:28 +01:00
Max Schaefer
3939167ba2 Include more details in the message for py/weak-cryptographic-algorithm.
Specifically, we add a link to the location where the cryptographic algorithm is configured, which can be far away from its use.
2023-10-26 11:28:09 +01:00
Rasmus Wriedt Larsen
2d947a4f53 Merge pull request #13781 from maikypedia/maikypedia/python-unsafe-deserialization
Python: Add unsafe deserialization sinks (CWE-502)
2023-10-10 13:30:38 +02:00
Erik Krogh Kristensen
625e889c62 Merge pull request #14339 from erik-krogh/range-printing
JS/PY/RB/Java: escape unicode chars in overly-large-range
2023-10-09 14:22:38 +02:00
yoff
dbecb1bd0f Merge pull request #14070 from yoff/python/promote-nosql-query
Python: promote nosql query
2023-09-29 14:21:22 +02:00
Rasmus Wriedt Larsen
9b73bbfc31 Python: Add keyword argument support
and a fair bit of refactoring
2023-09-29 13:54:21 +02:00
Rasmus Wriedt Larsen
3676262313 Python: Clean trailing whitespace 2023-09-29 13:54:21 +02:00
Rasmus Wriedt Larsen
16e1a00e88 Python: NoSQLInjection -> NoSqlInjection 2023-09-29 13:52:51 +02:00
Rasmus Lerchedahl Petersen
2d845e3e55 Python: nicer paths
turn "the long jump" that would end up
straight at the argument into a short jump
that ends up at the dictionary being written to.
Dataflow takes care of the rest of the path.
2023-09-29 12:02:16 +02:00
erik-krogh
5d4b542995 escape unicode chars in overly-large-range 2023-09-28 20:16:09 +02:00
Rasmus Lerchedahl Petersen
d5b64c5ff2 Python: update test expectations 2023-09-28 14:20:30 +02:00
Rasmus Lerchedahl Petersen
3fb579eaff Python: add test for type tracking 2023-09-28 12:14:12 +02:00
yoff
8156fa9a4d Apply naming suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2023-09-28 11:47:10 +02:00
Rasmus Wriedt Larsen
f3acc89900 Python: Accept .expected 2023-09-28 10:41:16 +02:00
Rasmus Lerchedahl Petersen
db95eade64 Python: accept improved test output 2023-09-26 20:58:51 +02:00
Rasmus Wriedt Larsen
db7b1eea55 Merge branch 'main' into maikypedia/python-unsafe-deserialization 2023-09-25 10:29:18 +02:00
Max Schaefer
dfec1620ea Update expected test output. 2023-09-22 11:28:50 +01:00
Rasmus Lerchedahl Petersen
4ec8b3f02f Python: Model map_reduce 2023-09-20 15:44:12 +02:00
Rasmus Lerchedahl Petersen
7c085ecc61 Python: Add test for map_reduce
Also log requirement for old versions of `pymongo`
2023-09-20 15:23:18 +02:00
Rasmus Lerchedahl Petersen
30c37ca8cb Python: model §accumulator
also slightly rearrange the modelling
2023-09-19 22:21:14 +02:00
Rasmus Lerchedahl Petersen
5611bda7ee Python: add test for $accumulator 2023-09-19 17:04:28 +02:00
Erik Krogh Kristensen
cd5973764b Merge pull request #14112 from erik-krogh/pyAllowedHosts
Py: add sanitizer guard for `url_has_allowed_host_and_scheme`
2023-09-13 12:59:38 +02:00
Rasmus Lerchedahl Petersen
a063d7d510 Python: sinks -> decodings
Query operators that interpret JavaScript
are no longer considered sinks.
Instead they are considered decodings
and the output is the tainted dictionary.
The state changes to `DictInput` to reflect
that the user now controls a dangerous dictionary.

This fixes the spurious result and moves the error reporting
to a more logical place.
2023-09-11 16:33:20 +02:00
Rasmus Lerchedahl Petersen
d9f63e1ed3 Python: Split modelling of query operators
`$where` and `$function` behave quite differently.
2023-09-11 15:54:00 +02:00
Rasmus Lerchedahl Petersen
154a36934d Python: Add test for function 2023-09-11 14:49:03 +02:00
Rasmus Lerchedahl Petersen
d91cd21204 Python: rename file 2023-09-08 13:37:54 +02:00