I'm beginning to realise why I didn't do the `toString` overriding way
back when. Thankfully, now that all of our tests are in the same place,
this is actually not a terrible ordeal.
This commit renames the files relating to the diagnostic query that produces information on the number of files extracted. The files have been renamed from "SuccessfullExtractedFiles.*" to "ExtractedFiles.*". All related tests and test files have been renamed too.
The `@tags` and `@id` attributes of the queries have been left untouched, consistent with the `@tags` and `@id` for similar queries in other languages.
Or, more generally, any copy step, as these presumably do not preserve
object identity.
(Arguably, `copy` could still be susceptible to interior mutability, but
I think that's outside the scope of this query anyway.)
There are two issues with `deepcopy` here. Firstly, the `deepcopy` function itself
has a mutable default value in its parameter `_nil` (set to the empty list by default).
Now, this value is never actually returned from `deepcopy`, as it is only used as a
sentinel, but our analysis is not clever enough to see this. Thus, it thinks that this
mutable default is returned, and hence the result of any call to `deepcopy` is a
potential source.
To remedy this, I opted to simply exclude all sources that originate from within the
standard library. It is very unlikely for any of the sources in the standard library
to be legit.
Secondly, `deepcopy` -- by virtue of being a function that we model as preserving
values -- admits data-flow through its calls, but this is not correct for the mutable
default query, as it is here the _identity_ of the default value in question that is
important. Thus, we get spurious flow through `deepcopy` for this specific query.
Moves the existing tests into the `ModificationOfParameterWithDefault` subdirectory
which already contained a bunch more tests. In the process, I also removed some
duplicated test cases.
Since some sanitisers don't handle backslashes correctly, I updated the data-flow configuration to incorporate a flow state tracking whether or not backslashes have been eliminated or converted to forward slashes.
Adds support for extraction filters as defined in
https://peps.python.org/pep-0706/
and implemented in Python 3.12.
By my reading, setting the filter to `'data'` or `'tar'` is probably
safe, whereas `'fully_trusted'` or the default (which is the same as
`None`) is not.
For now, I have just added this modelling to the tarslip query. We could
also share it with the modelling of `shutil.unpack_archive` (which has also
gained a `filter` argument), but it was unclear to me where we should put
this modelling in that case. Perhaps the best solution would be to merge
the experimental `py/tarslip-extended` query into the existing query (in
which case the current location is perhaps not too bad).
This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
x = expr
y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
I reckon this is due to the Python 3 version used by the Python 2 tests
is different from 3.12, so even with --lang=3 the tests are still using
an incompatible version :(
You might wonder why the number of lines changed, but it's due to `tty`
module receiving its' first update since 2001, so the actual number of
lines DID change :phew:
https://github.com/python/cpython/commits/3.12/Lib/tty.py
Since there is now a difference between Python 2 and Python 3, we need to restrict the lines of code test to only run as Python 3.