Preparatory refactor for the shared-CFG dataflow migration. Adds the
new Python CFG library additively, without changing any production
behaviour.
Library additions:
- semmle.python.controlflow.internal.AstNodeImpl — mediates between
the Python AST and the shared codeql.controlflow.ControlFlowGraph
signature. Wraps Python's Stmt/Expr/Scope/Pattern and adds two
synthetic kinds of node (BlockStmt for body slots, intermediate
nodes for multi-operand boolean expressions).
- semmle.python.controlflow.internal.Cfg — public facade
re-exposing the same API surface as semmle/python/Flow.qll
(ControlFlowNode, CallNode, BasicBlock, NameNode, DefinitionNode,
CompareNode, ...), backed by the shared CFG.
- lib/printCfgNew.ql — debug/visualisation query for the new CFG.
- consistency-queries/CfgConsistency.ql — consistency query running
the shared CFG's standard checks against Python.
Shared library:
- shared.controlflow.ControlFlowGraph — adds two defaulted
getWhileElse / getForeachElse predicates to AstSig so Python can
model while-else / for-else (no behavioural change for other
languages).
Test additions:
- ControlFlow/bindings/* — annotation-driven SSA-binding tests for
the new CFG (annassign, compound, comprehension, decorated,
except_handler, imports, match_pattern, parameters, simple,
type_params, walrus_starred, with_stmt, dead_under_no_raise).
- ControlFlow/store-load/* — basic store/load coverage.
- ControlFlow/evaluation-order/NewCfg*.ql — mirrors of the existing
OldCfg evaluation-order self-validation suite, run against the
new CFG via NewCfgImpl.qll.
- Minor extensions to existing test_if.py / test_boolean.py +
cosmetic .expected churn on a handful of OldCfg tests.
No dataflow, SSA, or production query is migrated yet — that lands in
follow-up PRs. The new CFG library has zero callers in lib/ and src/.
Verified by:
- All lib + src + consistency-queries compile clean (367 queries).
- All 56 ControlFlow library-tests pass.
- All 474 dataflow + PointsTo library-tests + consistency tests pass.
- syntax_error/CONSISTENCY/CfgConsistency passes.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
These tests consist of various Python constructions (hopefully a
somewhat comprehensive set) with specific timestamp annotations
scattered throughout. When the tests are run using the Python 3
interpreter, these annotations are checked and compared to the "current
timestamp" to see that they are in agreement. This is what makes the
tests "self-validating".
There are a few different kinds of annotations: the basic `t[4]` style
(meaning this is executed at timestamp 4), the `t[dead(4)]` variant
(meaning this _would_ happen at timestamp 4, but it is in a dead
branch), and `t[never]` (meaning this is never executed at all).
In addition to this, there is a query, MissingAnnotations, which checks
whether we have applied these annotations maximally. Many expression
nodes are not actually annotatable, so there is a sizeable list of
excluded nodes for that query.