I'm beginning to realise why I didn't do the `toString` overriding way
back when. Thankfully, now that all of our tests are in the same place,
this is actually not a terrible ordeal.
Since some sanitisers don't handle backslashes correctly, I updated the data-flow configuration to incorporate a flow state tracking whether or not backslashes have been eliminated or converted to forward slashes.
Adds support for extraction filters as defined in
https://peps.python.org/pep-0706/
and implemented in Python 3.12.
By my reading, setting the filter to `'data'` or `'tar'` is probably
safe, whereas `'fully_trusted'` or the default (which is the same as
`None`) is not.
For now, I have just added this modelling to the tarslip query. We could
also share it with the modelling of `shutil.unpack_archive` (which has also
gained a `filter` argument), but it was unclear to me where we should put
this modelling in that case. Perhaps the best solution would be to merge
the experimental `py/tarslip-extended` query into the existing query (in
which case the current location is perhaps not too bad).
This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
x = expr
y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
turn "the long jump" that would end up
straight at the argument into a short jump
that ends up at the dictionary being written to.
Dataflow takes care of the rest of the path.
Query operators that interpret JavaScript
are no longer considered sinks.
Instead they are considered decodings
and the output is the tainted dictionary.
The state changes to `DictInput` to reflect
that the user now controls a dangerous dictionary.
This fixes the spurious result and moves the error reporting
to a more logical place.
This allows us to make more precise modelling
The query tests now pass.
I do wonder, if there is a cleaner approach, similar to
`TaintedObject` in JavaScript. I want the option to
get this query in the hands of the custumors before
such an investigation, though.