Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.
This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:
P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
P3: Add new shared-CFG-backed control flow graph (#21921).
P4: Add new shared-SSA-backed SSA adapter (#21923).
The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.
GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.
Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.
A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
pre/post-order pairs as distinct nodes.
Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.
Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.
Verification: all 367 lib + src + consistency-queries compile clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds API graph support for observing that in
```python
def foo(x : Bar): ...
```
The variable `x` is likely to be an instance of the type `Bar` inside
this function.
In particular, we add `getInstanceFromAnnotation` as a predicate on API
graph nodes that tracks this step (corresponding to a new edge type
labeled with "annotation" in the API graph), and extend the existing
`getAnInstance` predicate to also include instances arising from type
annotations.
A more complete solution would also add support for annotated
assignments (`x : Foo = ...` or just `x : Foo`) as well as track types
through type aliases (`type Foo = Bar`). This turns out to be
non-trivial, however, as these type constructs don't have any CFG nodes
(and so no data-flow nodes by default either). In order to not have
perfect be the enemy of good, this commit is only targeting the type
parameter case (which is also likely to be the most common use case
anyway).
The tests for API graphs have been extended accordingly, including tests
for the kinds of type ascriptions that we _don't_ currently model in API
graphs (marked with `MISSING:` in the inline tests).
Factor out use-use flow in order to do this.
Also improve names and comments.
I also wanted to change the types in `difinitionFlowStep`, but
that broke the module instantiation.
type tracking and the API graph.
- In `TypeTrackerSpecific.qll` we add a jump step
- to every scope entry definition
- from the value of any defining `DefinitionNode`
(In our example, the definition is the class name, `Users`,
while the assigned value is the class definition, and it is
the latter which receives flow in this case.)
- In `LocalSources.qll` we allow scope entry definitions as local sources.
- This feels natural enough, as they are a local source for the value, they represent.
It is perhaps a bit funne to see an Ssa variable here,
rather than a control flow node.
- This is necessary in order for type tracking to see the local flow
from the scope entry definition.
- In `ApiGraphs.qll` we no longer restrict the result of `trackUseNode`
to be an `ExprNode`. To keep the positive formulation, we do not
prohibit module variable nodes. Instead we restrict to the new
`LocalSourceNodeNotModule` which avoids those cases.
This is a condensed versio of the user reported example
found [here](eb377d5918/app.py (L278))
The `MISSING` annotation indicates where our API graph falls short.
A class decorator could change the class definition in any way.
In this specific case, it would be better if we allowed the subclass to
be found with API graphs still.
inspired by
c2250cfb80/tests/auth_tests/test_views.py (L40-L46)