This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
x = expr
y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
This partly reverts the changes from https://github.com/github/codeql/pull/10252
Although consistency is nice, the new messages didn't sound as natural.
New alert message would read
> Insecure hashing algorithm (md5) depends on sensitive data (password). (...)
I'm not sure what it means that a hashing algorithm depends on data. So
for me, the original text below is much easier to understand.
> Sensitive data (password) is used in a hashing algorithm (md5) that is insecure (...)
Same goes for the other sensitive data queries.
Now there is a path from the _imports_ of the functions that would
return sensitive data, so we produce more alerts.
I'm not entirely happy about this "double reporting", but I'm not sure
how to get around it without either:
1. disabling the extra taint-step for calls. Not ideal since we would
loose good sources.
2. disabling the extra sources based on function name. Not ideal since
we would loose good sources.
3. disabling the extra sources based on function name, for those calls
that would be handled with the extra taint-step for calls. Not ideal
since that would require running the data-flow query initially to
prune these out :|
So for now, I think the best approach is to accept some risk on this,
and ship to learn :)