This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
x = expr
y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
copy, pop, get, getitem, setdefault
Also add read steps to taint tracking.
Reading from a tainted collection can be done in two situations:
1. There is an acces path
In this case a read step (possibly from a flow summary)
gives rise to a taint step.
2. There is no access path
In this case an explicit taint step (possibly via a flow
summary) should exist.
It's really hard to audit that this is all good.. I tried my best with
`icdiff` though -- and there is a problem with
ql/src/experimental/Security/CWE-348/ClientSuppliedIpUsedInSecurityCheck.ql
that needs to be fixed in the next commit
I am slightly concerned that the test now generates many more
intermediate results. I suppose that maes the analysis heavy.
Should the new library get a new name instead, so the old code
does not get evaluated?
Along with the root cause, which is the `StringConstCompare`
BarrierGuard, that does only allows `in <iterable literal>` and not
`in <variable referencing iterable literal>`
When I changed the taint modeling in 19b7ea8d85, that obviously also means that
some of the related locations for alerts will change. So that's why all the
examples needs to be updated.
Besides this, I had to fix a minor problem with having too many alerts. If
running a query agaisnt code like in the example below, there would be 3 alerts,
2 of them originating from the import.
```
from flask import Flask, request
app = Flask(__name__)
@app.route("/route")
def route():
SINK(request.args.get['input'])
```
The 2 import sources where:
- ControlFlowNode for ImportMember
- GSSA Variable request
I removed these from being a RemoteFlowSource, as seen in the diff.
I considered restricting `FlaskRequestSource` so it only extends
`DataFlow::CfgNode` (and make the logic a bit simpler), but I wasn't actually
sure if that was safe to do or not... If you know, please let me know :)
Not quite sure how to deal with these cases of safe if UNIX-only, otherwise not
safe.
If/when we actually try to deal with these, we also need to figure that
out. We _could_ split this queyr into 3: (1) for path injection on any
platform, (2) path injection on windows, (3) path injection on UNIX. Then
UNIX-only projects could disable the path-injection on windows query. -- that's
my best idea, if you have better ideas, DO tell 👍
This was a bit of a mess, since there was crosstalk between the
TarSlip and PathInjection queries. (Also one of these needs the
`options` file to be in one way, and the other not). To fix this, I
split these out into separate directories.