The new CFG previously emitted exception edges only for explicit `raise`
and `assert` statements. As a result, code reachable only via the
exception path of an arbitrary expression (e.g. the body of an `except`
handler following a try-body whose `call()` could raise) was classified
as dead, which broke analyses such as StackTraceExposure,
FileNotAlwaysClosed, ExceptionInfo, UseOfExit, and CatchingBaseException.
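A minimal Python illustration of the pattern described above. The function name `parse_port` is hypothetical and not taken from the test suite. The `except` body is reachable only through the implicit exception raised by the `int(...)` call, so a CFG without that edge would treat the handler as dead code.

```python
def parse_port(raw):
    try:
        # No explicit `raise` here: the call itself is the exception source.
        port = int(raw)
    except ValueError:
        # Reachable only via the implicit exception edge out of `int(raw)`.
        return None
    return port
```

Any query that reasons about what happens inside the handler depends on this edge being present.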
This commit adds a `mayThrow` predicate covering expressions that are
known sources of implicit exceptions in Python (calls, attribute access,
subscripts, arithmetic and comparison operators, imports, `await`,
`yield`, and `yield from`), plus `from m import *` at the statement
level, and routes them through the shared CFG's
`beginAbruptCompletion(_, _, ExceptionSuccessor, always=false)` hook.
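The classification can be sketched in Python against the standard-library `ast` module. This is only an analogy, not the QL predicate itself: the names `IMPLICIT_RAISERS`, `may_throw`, and `throwing_kinds` are hypothetical, and the mapping onto `ast` classes is an assumption (in `ast`, both import forms are statements).

```python
import ast

# Expression kinds named in the commit message, mapped onto stdlib `ast`
# classes. This mapping is an assumption made for illustration.
IMPLICIT_RAISERS = (
    ast.Call, ast.Attribute, ast.Subscript,
    ast.BinOp, ast.Compare,
    ast.Await, ast.Yield, ast.YieldFrom,
)


def may_throw(node):
    """Return True if `node` is one of the implicit exception sources."""
    if isinstance(node, IMPLICIT_RAISERS):
        return True
    # Covers `import m` as well as `from m import *`.
    return isinstance(node, (ast.Import, ast.ImportFrom))


def throwing_kinds(source):
    """Sorted class names of all nodes in `source` that may throw."""
    tree = ast.parse(source)
    return sorted({type(n).__name__ for n in ast.walk(tree) if may_throw(n)})
```

For example, `throwing_kinds("x = obj.attr[0] + f(1)")` reports the attribute access, the subscript, the addition, and the call, while a plain `y = 1` reports nothing.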
The set of exception sources is restricted to nodes that are
syntactically inside a `try`/`with` statement in the same scope.
This mirrors Java's `ControlFlowGraph::mayThrow`, which only emits
exception edges where local handling can observe them. Outside such
contexts the edges add CFG complexity (weakening BarrierGuard
precision and breaking SSA continuity around augmented assignments and
subscript stores) without analysis benefit, since the exception simply
propagates to the function exit.
Net effect on the test suite: ~100 alerts restored across the
exception-related query tests (StackTraceExposure +29, ExceptionInfo +17,
FileNotAlwaysClosed +52, UseOfExit +1, CatchingBaseException restored)
with no precision regressions. Affected `.expected` files and the
regression-guard `dead_under_no_raise.py` are updated accordingly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Preparatory refactor for the shared-CFG dataflow migration. Adds the
new Python CFG library additively, without changing any production
behaviour.
Library additions:
- semmle.python.controlflow.internal.AstNodeImpl: mediates between
  the Python AST and the shared codeql.controlflow.ControlFlowGraph
  signature. Wraps Python's Stmt/Expr/Scope/Pattern and adds two
  synthetic kinds of node (BlockStmt for body slots, intermediate
  nodes for multi-operand boolean expressions).
- semmle.python.controlflow.internal.Cfg: public facade
  re-exposing the same API surface as semmle/python/Flow.qll
  (ControlFlowNode, CallNode, BasicBlock, NameNode, DefinitionNode,
  CompareNode, ...), backed by the shared CFG.
- lib/printCfgNew.ql: debug/visualisation query for the new CFG.
- consistency-queries/CfgConsistency.ql: consistency query running
  the shared CFG's standard checks against Python.
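To illustrate why multi-operand boolean expressions may need extra nodes, the sketch below uses the standard-library `ast` module, which likewise stores `a and b and c` as a single node with three operands. The helper names are hypothetical, and reading the synthetic intermediate nodes as the short-circuit points between operands is an assumption, not something the commit states.

```python
import ast


def bool_operands(source):
    """Return the operator name and operand texts of a boolean expression."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.BoolOp):
        raise ValueError("expected a top-level boolean expression")
    return type(node.op).__name__, [ast.unparse(v) for v in node.values]


def decision_points(source):
    """Operands after which evaluation may short-circuit (all but the last)."""
    _, operands = bool_operands(source)
    return operands[:-1]
```

A three-operand expression has two such points but only one AST node, so a CFG that wants one node per decision has to synthesise the rest.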
Shared library:
- shared.controlflow.ControlFlowGraph: adds two defaulted
  getWhileElse / getForeachElse predicates to AstSig so Python can
  model while-else / for-else (no behavioural change for other
  languages).
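For reference, the Python semantics that the two new predicates have to model: the `else` block of a loop runs only when the loop ends without hitting `break`. The function names below are hypothetical examples.

```python
def first_negative(values):
    for v in values:
        if v < 0:
            break            # skips the else block
    else:
        return None          # loop exhausted without a break
    return v


def search_down(start, target):
    n = start
    while n > 0:
        if n == target:
            break            # skips the else block
        n -= 1
    else:
        return "not found"   # condition became false without a break
    return "found"
```

The CFG therefore needs an edge from the loop's normal exit into the `else` body, while `break` bypasses it.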
Test additions:
- ControlFlow/bindings/*: annotation-driven SSA-binding tests for
  the new CFG (annassign, compound, comprehension, decorated,
  except_handler, imports, match_pattern, parameters, simple,
  type_params, walrus_starred, with_stmt, dead_under_no_raise).
- ControlFlow/store-load/*: basic store/load coverage.
- ControlFlow/evaluation-order/NewCfg*.ql: mirrors of the existing
  OldCfg evaluation-order self-validation suite, run against the
  new CFG via NewCfgImpl.qll.
- Minor extensions to existing test_if.py / test_boolean.py, plus
  cosmetic .expected churn on a handful of OldCfg tests.
No dataflow, SSA, or production query is migrated yet; that lands in
follow-up PRs. The new CFG library has zero callers in lib/ and src/.
Verified by:
- All lib + src + consistency-queries compile clean (367 queries).
- All 56 ControlFlow library-tests pass.
- All 474 dataflow + PointsTo library-tests + consistency tests pass.
- syntax_error/CONSISTENCY/CfgConsistency passes.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>