Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.
This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:
P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
P3: Add new shared-CFG-backed control flow graph (#21921).
P4: Add new shared-SSA-backed SSA adapter (#21923).
The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.
GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.
Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.
A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
pre/post-order pairs as distinct nodes.
Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.
Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.
Verification: all 367 lib + src + consistency-queries compile clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
I'm beginning to realise why I didn't do the `toString` overriding way
back when. Thankfully, now that all of our tests are in the same place,
this is actually not a terrible ordeal.
I reckon this is due to the Python 3 version used by the Python 2 tests
is different from 3.12, so even with --lang=3 the tests are still using
an incompatible version :(
since
- Python 3 is ok from 3.7 onwards
- support for Python 3.6 was just dropped
- we do not actually know the minor version of the analysed code
(only of the extractor)
For syntax errors, we simply report the major version.
For unused imports, we were getting a result for `typing.py` when run under
Python 3.7.3. To prevent this import from being considered, I've set the maximum
import depth to `0`.
Fixes#1969.
The points-to analysis does not know that the assignment `input = raw_input`
cannot fail under Python 2, and so there are two possible values that `input`
could point-to after exiting the exception handler: the built-in `input`, or the
built-in `raw_input`. In the latter case we do not want to report the alert, and
so adding a check that the given function does not point-to the built-in
`raw_input` suffices.
Due to internal PR#35123 we now actually run the tests under
`python/ql/test/2/...`
Since we haven't done this in a while, test output has changed a bit. These
changes look perfectly fine.