Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.
This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:
P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
P3: Add new shared-CFG-backed control flow graph (#21921).
P4: Add new shared-SSA-backed SSA adapter (#21923).
The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.
GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.
Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.
A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
pre/post-order pairs as distinct nodes.
Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.
Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.
Verification: all 367 lib + src + consistency-queries compile clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Since using `.DictionaryElementAny` doesn't actually do a store on the
source, (so we can later follow any dict read-steps).
I added the ensure_tainted steps to highlight that the result of the
WHOLE expression ends up "tainted", and that we don't just mark
`os.environ` as the source without further flow.
This is partly motivated by the MaD tests which looks much better now in
my opinion.
I also wanted this for testing argument passing. In Python we're
adopting the same argument positions as Ruby has
[here](4f3751dfea/ruby/ql/lib/codeql/ruby/dataflow/internal/DataFlowDispatch.qll (L508-L540))
So it would be nice if `arg[keyword foo]=...` was allowed, without
having to transform the `toString()` result of an argument position into
something without a space.
This enables us to keep the framework modeling tests under `/frameworks`
folder
I had hoped to use `mad-sink[<kind>]` syntax, but that was not allowed
:(
Maybe it oculd be allowed in the future, but for now I'll stick with the
more ugly solution of `mad-sink__<kind>`