Commit Graph

1331 Commits

Author SHA1 Message Date
yoff
558cd5b00c Python: test dead bindings under no-raise CFG abstraction
Adds 'dead_under_no_raise.py' to the bindings test suite, capturing the
three CPython patterns where bindings legitimately have no CFG node
because the surrounding code is unreachable under the 'no expressions
raise' abstraction:

  1. Statements after a 'try: return X; except: pass' block.
  2. The 'else:' clause of a try whose body always raises.
  3. Cache-lookup pattern 'try: return cache[k]; except: pass' followed
     by computation and store.

These bindings intentionally carry no 'cfgdefines=' annotations. If
raise modelling is later added to the CFG, the BindingsTest will surface
the new CFG nodes as unexpected results and this file will need to be
revisited.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-18 10:48:43 +00:00
yoff
77149be759 Python: wire PEP 695 type parameters into the shared CFG (green)
Adds CFG coverage for the binding 'Name's introduced by PEP 695
type-parameter syntax on functions, classes, and 'type' aliases:

  def func[T](...): ...
  class Box[T]: ...
  def multi[T: int, *Ts, **P](...): ...
  type Alias[T] = ...

For each parametrised AST node, the type-parameter names (and, for
'type' aliases, the alias name itself) are added as children of the
enclosing CFG node so that 'Name.defines(v)' has a corresponding
position. Bounds and defaults are intentionally not wired (they have
no SSA-relevant semantics for our purposes).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-18 09:58:27 +00:00
Copilot
aa8402d1b7 Python: wire match-pattern bindings into the shared CFG (green)
Adds concrete `Pattern` subclasses in `AstNodeImpl.qll` for every
`MatchPattern` AST kind, with `getChild` overrides that expose
sub-patterns and bound Names. Specifically:

- MatchCapturePattern (`case x:`) -> getVariable()
- MatchAsPattern (`case … as v:`) -> getPattern(), getAlias()
- MatchStarPattern (`case [*rest]:`) -> getTarget()
- MatchSequencePattern (`case [a, b]:`) -> getPattern(i)
- MatchClassPattern (`case Cls(p, q, k=v)`) -> getClass(), positional, keyword
- MatchMappingPattern (`case {k: v}:`) -> getMapping(i)
- MatchKeyValuePattern, MatchKeywordPattern, MatchDoubleStarPattern
- MatchOrPattern, MatchLiteralPattern, MatchValuePattern

Without these, every Name bound by a match pattern lacked a CFG node.
Removes the corresponding MISSING: annotations from match_pattern.py
(all 11 cases).

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 13:11:18 +00:00
Copilot
b8c093eefb Python: wire import-statement bindings into the shared CFG (green)
Adds `ImportStmt` and `ImportStarStmt` wrappers in `AstNodeImpl.qll`.
For each `Alias` in an import statement, both the value (module/member
expression) and the bound `asname` Name become children of the CFG node
for the import statement, in evaluation order.

Without this, every `Name` introduced by `import` / `from .. import ..`
lacked a CFG node, even though `Name.defines(v)` returns true for it on
the AST side. This was the highest-volume gap: 20,332 missing import
aliases across CPython.

Removes the corresponding MISSING: annotations from imports.py.

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 12:35:47 +00:00
Copilot
0742e1e901 Python: wire parameters into the shared CFG (C# pattern)
Implements `AstSig::Parameter` and `callableGetParameter(c, i)` in
`AstNodeImpl.qll`, following the C# template
(`csharp/.../ControlFlowGraph.qll:147-156`) rather than Java's
`Parameter() { none() }`.

Each Python parameter (positional, *args, keyword-only, **kwargs) now
becomes a CFG node at a stable position in the enclosing callable's
entry sequence. Defaults still evaluate at function-definition time
via `FunctionDefExpr.getDefault` / `LambdaExpr.getDefault`, so
`Parameter::getDefaultValue()` returns `none()` (the shared CFG
library calls this to model the missing-argument fallback, which
Python does not surface at the CFG level).

The bindings test now exercises parameters (the `py_expr_contexts(_, 4, ...)`
exclusion has been removed). A new `parameters.py` test case covers
positional, defaulted, vararg, kwarg, keyword-only, kitchen-sink,
method (self/cls), lambda, and PEP 570 positional-only parameters.
Several other test files were updated to annotate parameters that the
test had previously hidden (synthetic `.0` comprehension parameter,
method `self`, decorator `f`, etc.).

Verified:
- All 24 ControlFlow/evaluation-order tests still pass.
- CFG consistency query (`python/ql/consistency-queries/CfgConsistency.ql`)
  shows zero violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 12:29:28 +00:00
Copilot
a20af3cf41 Python: wire AnnAssign into the shared CFG (green)
Adds an `AnnAssignStmt` wrapper in `AstNodeImpl.qll` so that PEP 526
annotated assignments (`x: int = 1`, `x: int`) participate in the
control flow graph. Evaluation order follows CPython: annotation,
optional value, target binding.

Without this, `x: int = 1` had no CFG node for `x` even though
`Name.defines(v)` returns true for it on the AST side. SSA built on
the new CFG would therefore miss every annotated-assignment write.

Removes the corresponding MISSING: annotations from the CFG-binding
gap test:
- annassign.py — all four cases now green.
- match_pattern.py — class-body annotated fields (`x: int`, `y: int`).
- type_params.py — `item: T` inside class.

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 12:22:57 +00:00
Copilot
f077e7c27c Python: add CFG-binding gap tests (red)
Adds inline-expectation tests for the new shared CFG implementation in
python/ql/lib/semmle/python/controlflow/internal/AstNodeImpl.qll,
covering every Python binding construct that introduces a variable.

The test files use MISSING: annotations to record bindings whose
defining Name AST node is *not* currently reachable from the new CFG.
These are the 'red' half of red-green commit pairs: subsequent commits
will extend AstNodeImpl to cover each construct and remove the
corresponding MISSING: marker.

Confirmed-broken categories:
- Import aliases (from x import a)
- Annotated assignment (x: int = 1)
- Exception handler (except E as e)
- Match patterns (case x, case [a,b], case ... as v)
- PEP 695 type params (def f[T], class C[T])

Confirmed-working (no MISSING:):
- Compound targets, with-as, comprehensions, decorated def/class,
  walrus, starred.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-12 12:19:34 +00:00
Copilot
76724c5391 Shared CFG: support for-else and while-else loops
Add two default predicates to AstSig:

  default AstNode getWhileElse(WhileStmt loop) { none() }
  default AstNode getForeachElse(ForeachStmt loop) { none() }

When defined, the explicit-step rules for While/Do and Foreach
route the loop's normal-completion exits through the else block
before reaching the after-loop node:

  - WhileStmt: after-false condition -> before-else -> after-while
    (instead of directly after-while).
  - ForeachStmt: after-collection [empty] and the LoopHeader exit
    are both routed through before-else -> after-foreach.

Python's Ast module overrides the predicates to return the
synthetic BlockStmt for the orelse slot, replacing the previous
customisations in Input::step. This eliminates parallel direct
successors emitted by the previous Python-side step additions
(verified: multipleSuccessors on a CPython database goes from
1340 to 0).

Java and C# CFG tests are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-05 15:21:43 +00:00
Taus
a33b49a3f3 WIP2 2026-05-05 15:21:42 +00:00
Taus
1af415bec3 WIP 2026-05-05 15:21:42 +00:00
Taus
3be562929a Python: Ignore synthetic CFG nodes
We can only annotate the ones that correspond directly to AST nodes
anyway.

Co-authored-by: yoff <yoff@github.com>
2026-05-05 15:21:41 +00:00
Taus
2ed75e7ca7 Python: Instantiate CFG tests with new CFG library
Co-authored-by: yoff <yoff@github.com>
2026-05-05 15:21:40 +00:00
Taus
4582855de1 Python: Make CFG tests parameterised
Currently we only instantiate them with the old CFG library, but in the
future we'll want to do this with the new library as well.

Co-authored-by: yoff <yoff@github.com>
2026-05-05 15:21:40 +00:00
Taus
ba29e7e34d Python: Add ConsecutiveTimestamps test
This one is potentially a bit iffy -- it checks for a very powerful
propetry (that implies many of the other queries), but as the test
results show, it can produce false positives when there is in fact no
problem. We may want to get rid of it entirely, if it becomes too noisy.
2026-05-05 15:21:40 +00:00
Taus
f97bf38f3b Python: Add NeverReachable test
This looks for nodes annotated with `t.never` in the test that are
reachable in the CFG. This should not happen (it messes with various
queries, e.g. the "mixed returns" query), but the test shows that in a
few particular cases (involving the `match` statement where all cases
contain `return`s), we _do_ have reachable nodes that shouldn't be.
2026-05-05 15:21:40 +00:00
Taus
a8d136d3d6 Python: Add BasicBlockOrdering test
This one demonstrates a bug in the current CFG. In a dictionary
comprehension `{k: v for k, v in d.items()}`, we evaluate the value
before the key, which is incorrect. (A fix for this bug has been
implemented in a separate PR.)
2026-05-05 15:21:40 +00:00
Taus
710a43ac7f Python: Add some CFG-validation queries
These use the annotated, self-verifying test files to check various
consistency requirements.

Some of these may be expressing the same thing in different ways, but
it's fairly cheap to keep them around, so I have not attempted to
produce a minimal set of queries for this.
2026-05-05 15:21:40 +00:00
Taus
3402d0eaeb Python: Add self-validating CFG tests
These tests consist of various Python constructions (hopefully a
somewhat comprehensive set) with specific timestamp annotations
scattered throughout. When the tests are run using the Python 3
interpreter, these annotations are checked and compared to the "current
timestamp" to see that they are in agreement. This is what makes the
tests "self-validating".

There are a few different kinds of annotations: the basic `t[4]` style
(meaning this is executed at timestamp 4), the `t.dead[4]` variant
(meaning this _would_ happen at timestamp 4, but it is in a dead
branch), and `t.never` (meaning this is never executed at all).

In addition to this, there is a query, MissingAnnotations, which checks
whether we have applied these annotations maximally. Many expression
nodes are not actually annotatable, so there is a sizeable list of
excluded nodes for that query.
2026-05-05 15:21:39 +00:00
Taus
ac23e16786 Python: Move Python 3.15 data-flow tests to a separate file
We won't be able to run these tests until Python 3.15 is actually out
(and our CI is using it), so it seemed easiest to just put them in their
own test directory.
2026-04-17 13:16:46 +00:00
Taus
dc36609743 Python: Add data-flow tests
Alas, all these demonstrate is that we already don't fully support the
desugared `yield from` form.
2026-04-17 12:15:04 +00:00
Taus
8b1ecf05c9 Python: Update test output
This change reflects the `(value, key)` to `(key, value)` fix in an
earlier commit.
2026-04-14 13:27:31 +02:00
Taus
fa61f6f3df Python: Model @typing.overload in method resolution
Adds `hasOverloadDecorator` as a predicate on functions. It looks for
decorators called `overload` or `something.overload` (usually
`typing.overload` or `t.overload`). These are then filtered out in the
predicates that (approximate) resolving methods according to the MRO.

As the test introduced in the previous commit shows, this removes the
spurious resolutions we had before.
2026-03-05 22:20:03 +00:00
Taus
0561a63003 Python: Add test for overloaded __init__ resolution
Adds a test showing that `@typing.overload` stubs are spuriously
resolved as call targets alongside the actual `__init__` implementation.
2026-03-05 22:20:03 +00:00
Owen Mansel-Chan
99a4fe4828 Update expected test output column numbers 2026-03-04 15:02:53 +00:00
Owen Mansel-Chan
aa28c94562 Remove double space after $ in inline expectations tests 2026-03-04 14:12:42 +00:00
Owen Mansel-Chan
91b6801db1 py: Inline expectation should have space before $ 2026-03-04 13:11:38 +00:00
Owen Mansel-Chan
5a97348e78 python: Inline expectation should have space after $
This was a regex-find-replace from `# \$(?! )` (using a negative lookahead) to `# $ `.
2026-03-04 12:45:05 +00:00
yoff
600f585a31 Merge pull request #21296 from yoff/python/bool-comparison-guards
Python: Handle guards being compared to boolean literals
2026-02-26 21:13:51 +01:00
Taus
6bfb1e1fae Merge pull request #21344 from github/tausbn/python-remove-points-to-from-metrics-libraries
Python: Remove points-to from metrics library
2026-02-24 15:55:16 +01:00
yoff
7351e82c92 python: handle guards compared to boolean literals 2026-02-24 10:00:22 +01:00
yoff
8488039fb9 python: add tests for guards compared to booleans 2026-02-24 10:00:21 +01:00
Taus
e8de8433f4 Python: Update all metrics-dependant queries
The ones that no longer require points-to no longer import
`LegacyPointsTo`. The ones that do use the specific
`...MetricsWithPointsTo` classes that are applicable.
2026-02-19 12:32:27 +00:00
Taus
248932db7a Python: Fix frameworks/data/warnings.ql 2026-02-16 13:48:32 +00:00
Taus
df0f2f8ce4 Python: Simple dataflow annotations
None of these required any changes to the dataflow libraries, so it
seemed easiest to put them in their own commit.
2026-02-16 13:48:32 +00:00
Taus
958c798c3f Python: Accept dataflow test changes
New nodes means new results. Luckily we rarely have a test that selects
_all_ dataflow nodes.
2026-01-30 12:50:25 +00:00
Taus
ac5a74448f Python: Fix tests
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
2026-01-30 12:50:25 +00:00
Taus
34800d1519 Merge pull request #20945 from joefarebrother/python-websockets
Python: Model remote flow sources for the `websockets` library
2026-01-29 15:47:46 +01:00
Tom Hvitved
b974a84bef Merge pull request #21051 from hvitved/shared/flow-summary-provenance-filtering
Shared: Provenance-based filtering of flow summaries
2026-01-26 17:24:34 +01:00
Tom Hvitved
0adece7cde Python: Adapt to changes in FlowSummaryImpl 2026-01-26 12:40:19 +01:00
yoff
3dbfb9fa4b python: add machinery for MaD barriers
and reinstate previously removed barrier
now as a MaD row
2026-01-22 17:30:24 +01:00
yoff
1ac3706e75 Python support ListElement in MaD 2026-01-09 13:08:06 +01:00
yoff
5c6d83ed65 Merge pull request #20877 from joefarebrother/python-tornado-websocket
Python: Add models for websocket handlers for Tornado
2025-12-09 10:08:59 +01:00
Taus
1b519384d7 Merge pull request #20739 from github/tausbn/python-remove-top-level-points-to-imports
Python: Hide points-to imports in `python.qll`
2025-12-05 14:24:41 +01:00
Joe Farebrother
ac55cf9544 Update test and qldoc 2025-12-01 20:41:59 +00:00
Joe Farebrother
7cf3964e44 Update expectations 2025-12-01 20:27:48 +00:00
Joe Farebrother
384e17a4ef Implement websockets models 2025-12-01 16:24:59 +00:00
Taus
24a29f46be Python: Fix all metrics-related compilation failures
In hindsight, having a `.getMetrics()` method that just returns `this`
is somewhat weird. It's possible that it predates the existence of the
inline cast, however.
2025-11-26 21:28:51 +00:00
Taus
cd1619b43e Python: Fix queries and tests 2025-11-26 17:06:55 +00:00
Joe Farebrother
16018e91a2 Minor test fix 2025-11-26 15:47:56 +00:00
Taus
9dc774aaa3 Python: Remove points-to dependency from parts of SSA
For whatever reason, the CFG node for exceptions and exception groups
was placed with the points-to code. (Probably because a lot of the
predicates depended on points-to.)

However, as it turned out, two of the SSA modules only depended on
non-points-to properties of these nodes, and so it was fairly
straightforward to remove the imports of `LegacyPointsTo` for those
modules.

In the process, I moved the aforementioned CFG node types into
`Flow.qll`, and changed the classes in the `Exceptions` module to the
`...WithPointsTo` form that we introduced elsewhere.
2025-11-26 12:30:31 +00:00