Commit Graph

3273 Commits

Author SHA1 Message Date
Copilot
f12307278a Python: wire match-pattern bindings into the shared CFG (green)
Adds concrete `Pattern` subclasses in `AstNodeImpl.qll` for every
`MatchPattern` AST kind, with `getChild` overrides that expose
sub-patterns and bound Names. Specifically:

- MatchCapturePattern (`case x:`) -> getVariable()
- MatchAsPattern (`case … as v:`) -> getPattern(), getAlias()
- MatchStarPattern (`case [*rest]:`) -> getTarget()
- MatchSequencePattern (`case [a, b]:`) -> getPattern(i)
- MatchClassPattern (`case Cls(p, q, k=v)`) -> getClass(), positional, keyword
- MatchMappingPattern (`case {k: v}:`) -> getMapping(i)
- MatchKeyValuePattern, MatchKeywordPattern, MatchDoubleStarPattern
- MatchOrPattern, MatchLiteralPattern, MatchValuePattern

Without these, every Name bound by a match pattern lacked a CFG node.
Removes the corresponding MISSING: annotations from match_pattern.py
(all 11 cases).

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
ba9dc9f5f1 Python: wire import-statement bindings into the shared CFG (green)
Adds `ImportStmt` and `ImportStarStmt` wrappers in `AstNodeImpl.qll`.
For each `Alias` in an import statement, both the value (module/member
expression) and the bound `asname` Name become children of the CFG node
for the import statement, in evaluation order.

Without this, every `Name` introduced by `import` / `from .. import ..`
lacked a CFG node, even though `Name.defines(v)` returns true for it on
the AST side. This was the highest-volume gap: 20,332 missing import
aliases across CPython.
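
As a rough Python-side illustration of which names an import statement binds, the sketch below uses the standard `ast` module rather than the CodeQL AST; `import_bindings` is a hypothetical helper and only looks at top-level statements.

```python
import ast


def import_bindings(source):
    """List the names bound by top-level import statements, in source order."""
    bound = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.asname is not None:
                    bound.append(alias.asname)      # `... as name` binds the alias
                elif alias.name == "*":
                    bound.append("*")               # star import: names not known statically
                else:
                    # `import a.b` binds only the top-level package name `a`
                    bound.append(alias.name.split(".")[0])
    return bound
```

For example, `import numpy as np` binds `np`, while `import os.path` binds `os`.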

Removes the corresponding MISSING: annotations from imports.py.

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
768ebc1e2d Python: wire parameters into the shared CFG (C# pattern)
Implements `AstSig::Parameter` and `callableGetParameter(c, i)` in
`AstNodeImpl.qll`, following the C# template
(`csharp/.../ControlFlowGraph.qll:147-156`) rather than Java's
`Parameter() { none() }`.

Each Python parameter (positional, *args, keyword-only, **kwargs) now
becomes a CFG node at a stable position in the enclosing callable's
entry sequence. Defaults still evaluate at function-definition time
via `FunctionDefExpr.getDefault` / `LambdaExpr.getDefault`, so
`Parameter::getDefaultValue()` returns `none()` (the shared CFG
library calls this to model the missing-argument fallback, which
Python does not surface at the CFG level).
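
The definition-time evaluation of defaults can be observed directly in Python. In this small sketch, `make_default` and `f` are hypothetical names; the recorded calls show that both defaults are evaluated when the `def` statement runs, not when `f` is called.

```python
calls = []


def make_default():
    calls.append("default evaluated")
    return 10


# Both default expressions run here, while the function object is being created.
def f(a, b=make_default(), *args, c=make_default(), **kwargs):
    return a, b, args, c, kwargs


evaluated_at_definition = list(calls)   # already two entries before any call
result = f(1)                           # uses the stored default values
evaluated_after_call = list(calls)      # unchanged by the call
```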

The bindings test now exercises parameters (the `py_expr_contexts(_, 4, ...)`
exclusion has been removed). A new `parameters.py` test case covers
positional, defaulted, vararg, kwarg, keyword-only, kitchen-sink,
method (self/cls), lambda, and PEP 570 positional-only parameters.
Several other test files were updated to annotate parameters that the
test had previously hidden (synthetic `.0` comprehension parameter,
method `self`, decorator `f`, etc.).

Verified:
- All 24 ControlFlow/evaluation-order tests still pass.
- CFG consistency query (`python/ql/consistency-queries/CfgConsistency.ql`)
  shows zero violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
5d60a0d7c1 Python: wire AnnAssign into the shared CFG (green)
Adds an `AnnAssignStmt` wrapper in `AstNodeImpl.qll` so that PEP 526
annotated assignments (`x: int = 1`, `x: int`) participate in the
control flow graph. Evaluation order follows CPython: annotation,
optional value, target binding.

Without this, `x: int = 1` had no CFG node for `x` even though
`Name.defines(v)` returns true for it on the AST side. SSA built on
the new CFG would therefore miss every annotated-assignment write.

Removes the corresponding MISSING: annotations from the CFG-binding
gap test:
- annassign.py — all four cases now green.
- match_pattern.py — class-body annotated fields (`x: int`, `y: int`).
- type_params.py — `item: T` inside class.

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
a70abdd007 Python: project via as* helpers outside characteristic predicates
Style cleanup: avoid naming newtype branch constructors (TPyStmt,
TPyExpr, TBlockStmt, TPattern, TBoolExprPair, TScope) outside the
char-preds that classify their wrappers. Method bodies and helper
predicates now use the as* projections instead:

  // Before: result = TBlockStmt(ifStmt.getBody())
  // After:  result.asStmtList() = ifStmt.getBody()

  // Before: result = TPyStmt(matchStmt.getCase(index))
  // After:  result.asStmt() = matchStmt.getCase(index)

Adds:

- AstNode.asStmtList() - the inverse of TBlockStmt(_).
- BinaryExpr.getIndex() - exposes the synthetic-pair index, used
  internally by getRightOperand to find the next pair without
  naming TBoolExprPair.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
115a762f4c Python: use newtype-branch constructors in characteristic predicates
Style cleanup: when a class's characteristic predicate binds via a
'cast' helper like

  IfStmt() { ifStmt = this.asStmt() }

prefer naming the newtype branch directly:

  IfStmt() { this = TPyStmt(ifStmt) }

This makes the wrapped representation explicit. Apply throughout:
~30 charpreds (every Stmt/Expr leaf wrapper, plus LoopStmt, BreakStmt,
ContinueStmt, BooleanLiteral, UnaryExpr, ArithUnaryExpr, Comprehension).

Method bodies that use asStmt/asExpr to project an underlying
Python AST node (Stmt.toString, BlockStmt.getEnclosingCallable,
UnaryExpr.getOperand, etc.) keep that form - they're projections,
not classifications.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
8f419d1050 Python: introduce TExpr union via newtype-branch alias
Mirror the TStmt refactor for the Expr hierarchy: rename the TExpr
newtype branch to TPyExpr and add

  private class TExpr = TPyExpr or TBoolExprPair;

This lets the public Expr class use TExpr directly:

  class Expr extends AstNodeImpl, TExpr { ... }

instead of

  class Expr extends AstNodeImpl {
    Expr() { this instanceof TExpr or this instanceof TBoolExprPair }
    ...
  }

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:42 +00:00
Copilot
8ca2a30dea Python: simplify TBlockStmt char pred via exclusion list
Replace the 14-disjunct allow-list with a 2-conjunct exclusion list.
Of the 17 Py::StmtList getters in AstGenerated.qll, only Try.getHandlers()
and MatchStmt.getCases() should not be wrapped as BlockStmts (they are
iterated individually by the shared library's Try/Switch logic via
getCatch(int) and getCase(int)). All other StmtLists are imperative
block bodies.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:41 +00:00
Copilot
fe394788d3 Python: introduce TStmt union via newtype-branch alias
Rename the TStmt newtype branch to TPyStmt, and add a private union
type alias

  private class TStmt = TPyStmt or TBlockStmt;

This lets the public Stmt class use TStmt directly in its extends
clause:

  class Stmt extends AstNodeImpl, TStmt { ... }

instead of the previous

  class Stmt extends AstNodeImpl {
    Stmt() { this instanceof TStmt or this instanceof TBlockStmt }
    ...
  }

The same pattern is used in cpp/.../TInstruction.qll and
rust/.../Synth.qll.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:41 +00:00
Copilot
761b3e38a2 Python: use private-abstract + final-alias pattern for AstNode
Convert AstNode from a concrete class with empty default predicates into
a private abstract class plus a final alias, matching the pattern used
in cpp/.../EdgeKind.qll and cpp/.../IRVariable.qll:

  abstract private class AstNodeImpl extends TAstNode {
    abstract string toString();
    abstract Py::Location getLocation();
    abstract Callable getEnclosingCallable();
    ...
  }

  final class AstNode = AstNodeImpl;

This makes the compiler enforce that every concrete subclass implements
toString/getLocation/getEnclosingCallable, replacing the brittle
'empty default + per-branch override' arrangement. Sister classes
inside the module now extend AstNodeImpl instead of AstNode (which is
final and cannot be extended).

The empty Parameter stub gains explicit none() overrides for the
three abstract members, since QL requires them statically even when
the class has no instances.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:41 +00:00
Copilot
72d74ae9dc Python: document why Assignment subclasses are empty
Explain that the shared library's Assignment / CompoundAssignment
hierarchy extends BinaryExpr, so it cannot host Python's statement-
level assignment forms (Assign, AugAssign), and that Python has no
short-circuiting compound operators (&&=, ||=, ??=) so all
subclasses remain empty.

No behaviour change; doc comments only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:41 +00:00
Copilot
1e51c8250b Python: index TBlockStmt by Py::StmtList instead of (parent, slot)
Replace the two-key TBlockStmt(Py::AstNode parent, string slot) newtype
branch with the simpler TBlockStmt(Py::StmtList sl). Each Py::StmtList
that represents an imperative block (function/class/module body, if/
while/for branch, try/except/finally body, case body, except/except*
body) becomes one BlockStmt directly. The slot string disappears;
toString just defers to Py::StmtList.toString() ('StmtList').

The newtype branch keeps an explicit characteristic predicate listing
the slots that count as block bodies. This excludes Try.getHandlers(),
which is a Py::StmtList of ExceptStmt items already iterated by the
shared library's Try logic via getCatch(int) - including it would
produce parallel CFG edges (verified: a permissive
TBlockStmt(Py::StmtList sl) version regressed CPython to 1720
multipleSuccessors and 584 deadEnds before this restriction).

Drops the getBodyStmtList helper. Call sites now use the StmtList
accessor directly: TBlockStmt(ifStmt.getBody()),
TBlockStmt(tryStmt.getFinalbody()), etc.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:41 +00:00
Copilot
7176fd8dbc Python: unify Py::BoolExpr handling via TBoolExprPair
Previously a Py::BoolExpr appeared in two newtype branches: as TExpr(be)
(the outermost pair) and TBoolExprPair(be, i) for inner pairs of 3+
operand expressions. This forced BinaryExpr/LogicalAndExpr/LogicalOrExpr
to disjoin two cases, and the synthetic-pair handling spanned multiple
layers.

Restrict TExpr to non-BoolExpr Py::Expr, and extend TBoolExprPair to
cover every operand pair (index 0..n-2). Now every Py::BoolExpr is
represented uniformly as TBoolExprPair(_, 0) for the whole expression
and TBoolExprPair(_, i) for inner pairs.
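
A rough Python sketch of the pairing scheme, using the standard `ast` module in place of the CodeQL classes. `bool_pairs` is a hypothetical helper, and the right-hand side of each non-final pair pointing at the next pair is an assumption about how the pairs nest.

```python
import ast


def bool_pairs(source):
    """Split an n-operand `and`/`or` chain into n-1 (index, left, right) pairs."""
    expr = ast.parse(source, mode="eval").body
    if not isinstance(expr, ast.BoolOp):
        raise ValueError("expected an `and`/`or` expression")
    operands = [ast.unparse(value) for value in expr.values]
    last = len(operands) - 2             # index of the innermost pair
    pairs = []
    for i in range(len(operands) - 1):
        # Assumption: pair i's right operand is pair i+1, except for the
        # last pair, whose right operand is the final operand itself.
        right = operands[i + 1] if i == last else f"pair {i + 1}"
        pairs.append((i, operands[i], right))
    return pairs
```

Under this reading, `a and b and c` yields pair 0 (`a`, pair 1) for the whole expression and pair 1 (`b`, `c`) as the inner pair.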

Extend AstNode.asExpr() to also recover the underlying Py::BoolExpr
from TBoolExprPair(_, 0). This makes asExpr() the inverse of
construction: every 'result = TExpr(e)' turns into 'result.asExpr() = e',
which works uniformly for BoolExprs and non-BoolExprs alike.

Consequences:

- BinaryExpr now extends TBoolExprPair directly with a single uniform
  rule for left/right operands.
- LogicalAndExpr/LogicalOrExpr are one-line char preds via
  getBoolExpr().
- The private BoolExprPair wrapper class folds into BinaryExpr.
- 60+ leaf wrappers now read 'result.asExpr() = py_expr' instead of
  'result = TExpr(py_expr)'.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:41 +00:00
Copilot
19b9aa8ba8 Python: merge T*AstNode wrappers into matching public classes
Five of the six per-newtype-branch wrapper classes had a natural
public class corresponding to that branch:

  TStmtAstNode      -> Stmt        (TStmt subset; BlockStmt overrides for TBlockStmt)
  TExprAstNode      -> Expr        (TExpr subset; BoolExprPair overrides for TBoolExprPair)
  TScopeAstNode     -> Callable    (= TScope exactly)
  TPatternAstNode   -> Pattern     (= TPattern exactly)
  TBlockStmtAstNode -> BlockStmt   (= TBlockStmt exactly)

Move toString/getLocation/getEnclosingCallable onto these classes and
delete the wrappers.

The sixth wrapper (TBoolExprPair) has no exact public counterpart -
BinaryExpr is broader, including TExpr-branch BoolExprs - so it
remains as a small private class, renamed BoolExprPair.

No behaviour change: all 24 NewCfg evaluation-order tests pass; all
11 shared-CFG consistency queries report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
4dbd904365 Python: dispatch toString/getLocation/getEnclosingCallable per branch
Replace the three big disjunctive predicates on AstNode with empty
defaults plus per-newtype-branch override classes:

  AstNode.toString()              { none() }
  AstNode.getLocation()           { none() }
  AstNode.getEnclosingCallable()  { none() }

Six private subclasses (one per newtype branch — TStmt, TExpr,
TScope, TPattern, TBoolExprPair, TBlockStmt) override these with
the branch-specific implementation. This mirrors the per-class
dispatch already used for getChild.

No behaviour change: all 24 NewCfg evaluation-order tests pass and
all 11 shared-CFG consistency queries still report 0 violations on
CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
372944b4b9 Python: adapt to new shared CFG signature
Main added two new requirements to AstSig:
- A 'Parameter' class with a 'getDefaultValue()' method, plus a
  'callableGetParameter(Callable, int)' predicate.
- A 'CallableContext' class in InputSig1, replacing the previous
  'CallableBodyPartContext'.

Add stub implementations: 'Parameter' is empty (none()) and
'callableGetParameter' returns nothing, mirroring Java's TODO. Rename
'CallableBodyPartContext = Void' to 'CallableContext = Void' in the
Python Input module.

NewCfg evaluation-order tests still pass at the 22/24 baseline; all
11 shared-CFG consistency queries still report 0 violations on
CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
a3270ec9f5 Python: refactor getChild into per-class OO dispatch
Replace the single ~240-line top-level getChild predicate with one
override per AST class. AstNode declares a default

  AstNode getChild(int index) { none() }

and each subclass with children overrides it (41 classes total).
The top-level predicate becomes a one-line dispatch:

  AstNode getChild(AstNode n, int index) { result = n.getChild(index) }

No behaviour change: NewCfg evaluation-order tests still pass at the
same 22/24 baseline, and all 11 shared-CFG consistency queries still
report 0 violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
d375900809 Python: include try-else in getChild for completion propagation
The shared CFG library propagates abrupt completions from child to
parent via getChild(parent, _) = child. Python's try.getElse() was
wired into normal step rules but not listed in getChild(TryStmt, ...),
so return/break/continue/raise statements occurring inside a try-else
block had no parent path and ended up as dead-end CFG nodes.

Add the else block at index -2 (alongside finally at -1). This affects
only completion propagation; the normal-flow CFG is unchanged because
TryStmt has explicit step rules.
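
A minimal Python example of the affected shape: a `return` inside a try-else block, which is an abrupt completion that still has to pass through `finally`. The function name `classify` is hypothetical.

```python
def classify(value, log):
    """Return from inside a try-else block; the finally block still runs."""
    try:
        number = int(value)
    except ValueError:
        log.append("except")
        return "invalid"
    else:
        # Runs only when the try body raised nothing.
        log.append("else")
        return "even" if number % 2 == 0 else "odd"   # abrupt completion inside else
    finally:
        log.append("finally")
```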

Verified on a CPython database: all 11 shared-CFG consistency queries
now pass with 0 violations (deadEnd: 244 -> 0).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
577cf4a630 Shared CFG: support for-else and while-else loops
Add two default predicates to AstSig:

  default AstNode getWhileElse(WhileStmt loop) { none() }
  default AstNode getForeachElse(ForeachStmt loop) { none() }

When defined, the explicit-step rules for While/Do and Foreach
route the loop's normal-completion exits through the else block
before reaching the after-loop node:

  - WhileStmt: after-false condition -> before-else -> after-while
    (instead of directly after-while).
  - ForeachStmt: after-collection [empty] and the LoopHeader exit
    are both routed through before-else -> after-foreach.

Python's Ast module overrides the predicates to return the
synthetic BlockStmt for the orelse slot, replacing the previous
customisations in Input::step. This eliminates parallel direct
successors emitted by the previous Python-side step additions
(verified: multipleSuccessors on a CPython database goes from
1340 to 0).
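
For reference, a short Python sketch of the semantics being modelled: the else block runs when the loop finishes normally (condition false, or collection empty or exhausted) and is skipped on `break`. The function names are hypothetical.

```python
def find(items, target):
    for index, item in enumerate(items):
        if item == target:
            result = index
            break                  # leaves the loop and skips the else block
    else:
        result = -1                # collection empty or exhausted
    return result


def countdown(n):
    steps = []
    while n > 0:
        steps.append(n)
        n -= 1
    else:
        steps.append("else")       # condition evaluated to false
    return steps
```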

Java and C# CFG tests are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
158c81c06d Python: compact-renumber FunctionExpr/Lambda defaults
`Args.getDefault(int)` and `Args.getKwDefault(int)` are indexed by
argument position (with gaps for args without defaults), not by
default position. The CFG `getChild` predicate for FunctionDefExpr
and LambdaExpr therefore had gaps at low indices and collisions
where defaults and kwdefaults overlapped, producing parallel
edges before the FunctionExpr.

Use `rank` to compact-renumber `getDefault(n)` and `getKwDefault(n)`
in source order. Verified on a CPython database: removes 536
`multipleSuccessors` consistency results (1340 -> 804); the rest are
`for/else` and `while/else`.
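
The renumbering idea can be sketched in Python as follows. This is an illustration only: `compact_renumber` and the input layout (argument position mapped to a source offset and expression text) are hypothetical, not the QL implementation.

```python
def compact_renumber(defaults, kw_defaults):
    """Assign dense child indices 0..k-1 to sparse, possibly colliding slots.

    Both arguments map an argument position to a
    (source_offset, expression_text) pair; this layout is an assumption.
    """
    entries = list(defaults.values()) + list(kw_defaults.values())
    ordered = sorted(entries)            # rank by source offset
    return {index: text for index, (_, text) in enumerate(ordered)}
```

For a signature such as `def f(a, b=1, c=2, *, d, e=3)`, the sparse positions might look like `{1: ..., 2: ...}` for defaults and `{1: ...}` for keyword-only defaults, so position 1 collides and position 0 is empty. After ranking, the three defaults occupy indices 0, 1 and 2.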

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:39 +00:00
Copilot
2de3733fe3 Python: collapse two-layer AstNodeImpl into a single Ast module
Merge the previous `Ast` and `AstSigImpl` modules into a single
`module Ast implements AstSig<Py::Location>`. Classes now use the
signature names (IfStmt, WhileStmt, ForeachStmt, etc.) and signature
predicates (getCondition, getThen, getElse, etc.) directly, with no
intermediate renaming layer.

Drop the TStmtListNode newtype branch entirely. Replace it with a
synthetic TBlockStmt(parent, slot) keyed by a parent AST node and a
slot label string ('body', 'orelse', 'finally'). Py::StmtList no
longer appears in the newtype; the BlockStmt class provides indexed
access to the underlying body items via getStmt(n).

22 of the 24 evaluation-order tests still pass; the same 2
comprehension-related failures predate this refactor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:38 +00:00
yoff
0dabf47344 Python: add pattern nodes
Co-authored-by: Copilot <copilot@github.com>
2026-05-26 16:32:38 +00:00
Taus
661a77b415 Cleanup, printCFG
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:38 +00:00
Taus
71a547b0d3 Python: Handle dict unpacking in calls
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:38 +00:00
Taus
bac48b4914 Python: Fix exception issue
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:38 +00:00
Taus
852aba880d Python: Fix match
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:38 +00:00
Taus
356907990a Python: Support match
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:38 +00:00
Taus
024702e019 Python: More nodes
Not entirely sure about the `else:` blocks.

Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
98637bcdc7 Python: Comprehensions
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
abd7c2989d Python: Add with
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
6573eed42b Python: More simple statements
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
fc3940fb5d Python: assignments
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
319e49b955 Python: Attributes
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
da663da87b Python: Function calls
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
5680477179 Python: Assert statements
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
2b3df57eea Python: Support various literals
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:37 +00:00
Taus
2f2c071920 Python: More AstNodeImpl improvements
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:36 +00:00
Taus
28ebe21337 Python: Instantiate CFG module fully
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:36 +00:00
Taus
5519570157 Python: Use fields everywhere in new AST classes
Co-authored-by: yoff <yoff@github.com>
2026-05-26 16:32:36 +00:00
Taus
53f34376c0 Python: First stab at shared control-flow
2026-05-26 16:32:36 +00:00
Óscar San José
996e79131e Merge branch 'main' into post-release-prep/codeql-cli-2.25.5
2026-05-22 16:32:30 +02:00
github-actions[bot]
9f64000962 Post-release preparation for codeql-cli-2.25.5
2026-05-18 15:20:31 +00:00
github-actions[bot]
e38616a2ef Release preparation for version 2.25.5
2026-05-18 12:05:32 +00:00
Geoffrey White
a4b2c0f6fd Update change notes (Copilot's suggestions).
2026-05-15 09:24:29 +01:00
Geoffrey White
59dbd68a5e Add change notes.
2026-05-14 14:46:05 +01:00
github-actions[bot]
7610277199 Post-release preparation for codeql-cli-2.25.4
2026-05-05 10:10:06 +00:00
github-actions[bot]
88e1d86c27 Release preparation for version 2.25.4
2026-05-05 09:34:30 +00:00
Josef Svenningsson
68be006a29 Merge pull request #21641 from github/josefs/promptInjectionImprovements
Improve prompt inject for Python
2026-04-29 11:23:52 +01:00
Josef Svenningsson
bb18bb084c Improve prompt inject for Python
2026-04-28 18:24:16 +01:00
Owen Mansel-Chan
6efb21314a Merge pull request #21523 from owen-mc/docs/mad/barriers
Document models-as-data barriers and barrier guards and add change notes
2026-04-21 13:49:19 +01:00