Commit Graph

87757 Commits

Author SHA1 Message Date
yoff
08e2f37757 Python: visit function parameter and return annotations in new CFG
The new (shared-CFG-based) Python control flow graph in
`semmle.python.controlflow.internal.Cfg` previously did not emit CFG
nodes for parameter type annotations (`def f(x: T): ...`) or for the
return type annotation (`-> T`). The legacy CFG emitted both, and a
small number of framework models rely on this: `LocalSources.qll`'s
`annotatedInstance` walks the parameter annotation expression by way
of its CFG node to track that a parameter receives an instance of the
annotated class.

After the dataflow flip to the new CFG/SSA this regression manifested
as lost flows in any test exercising annotation-based parameter
tracking: FastAPI `Depends()` receivers, Pydantic request bodies,
Starlette `WebSocket`, the call-graph type-annotation test, and so on.
Extend `FunctionDefExpr` to visit each annotation as a child of the
function-def expression, in CPython evaluation order: positional
parameter annotations, `*args` annotation, keyword-only parameter
annotations, `**kwargs` annotation, then the return annotation. (Lambda
expressions have no annotations in Python syntax, so `LambdaExpr` is
unchanged.) PEP 695 type parameters remain out of scope; they belong
to the inner annotation scope, not the enclosing CFG.

Restored test results across `framework/aiohttp`, `framework/fastapi`,
`framework/lxml`, the `CallGraph-type-annotations` test, and
`CWE-022-PathInjection`. Two FastAPI list-comprehension MISSING markers
become positive (`taint_test.py:41,55`). CPython CFG consistency
remains clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:36 +00:00
yoff
cada7e98d2 Python: model exception edges for raise-prone expressions inside try/with
The new CFG previously only emitted exception edges for explicit `raise`
and `assert` statements. As a result, code that became reachable only
via the exception path of an arbitrary expression (e.g., the body of an
`except` handler following a try-body whose `call()` could raise) was
classified as dead, breaking analyses like StackTraceExposure,
FileNotAlwaysClosed, ExceptionInfo, UseOfExit, and CatchingBaseException.

This commit adds a `mayThrow` predicate over expressions that are known
sources of implicit exceptions in Python (calls, attribute access,
subscripts, arithmetic/comparison operators, imports, await/yield/yield
from) plus `from m import *` at the statement level, and routes them
through the shared CFG's `beginAbruptCompletion(_, _, ExceptionSuccessor,
always=false)` hook.

The set of exception sources is restricted to nodes that are
syntactically inside a `try`/`with` statement in the same scope.
This mirrors Java's `ControlFlowGraph::mayThrow`, which only emits
exception edges where local handling can observe them — outside such
contexts, the edges add CFG complexity (weakening BarrierGuard
precision and breaking SSA continuity around augmented assignments and
subscript stores) without analysis benefit, since exceptions just
propagate to the function exit anyway.

Net effect on the test suite: ~100 alerts restored across the exception-
related query tests (StackTraceExposure +29, ExceptionInfo +17,
FileNotAlwaysClosed +52, UseOfExit +1, CatchingBaseException restored)
with no precision regressions. Affected `.expected` files and the
regression-guard `dead_under_no_raise.py` are updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:36 +00:00
yoff
3c9b0f770b Python: switch dataflow library to new (shared) CFG + SSA
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:35 +00:00
Copilot
e93ef0f9d3 Python: add new shared-SSA-backed SSA adapter
Preparatory refactor for the shared-CFG dataflow migration. Adds the
new Python SSA adapter additively, without changing any production
behaviour.

Library additions:

- semmle.python.dataflow.new.internal.SsaImpl — Python SSA
  implementation built on the new (shared) CFG. Mirrors the Java SSA
  adapter (java/ql/lib/semmle/code/java/dataflow/internal/SsaImpl.qll):
  an InputSig is defined in terms of positional (BasicBlock, int)
  variable references, and the shared
  codeql.ssa.Ssa::Make<Location, Cfg, Input> module is then
  instantiated.

  SourceVariable is the AST-level Py::Variable. Variable references
  are looked up via the new CFG facade's NameNode.defines/uses/deletes
  predicates (added in the preceding PR), which themselves are
  one-line bridges to AST-level Name.defines/uses/deletes.

  Implicit-entry definitions are inserted for non-local/global/builtin
  reads, captured variables, and (when needed) parameters.

Test additions:

- library-tests/dataflow-new-ssa/ — exercises the new SSA over a
  representative test corpus and checks expected def/use chains.

- library-tests/dataflow-new-ssa-vs-legacy/ — runs both new SSA and
  legacy ESSA over the same corpus and diffs the results, so any
  semantic divergence shows up as a test failure.

Production impact:

None. The new SSA adapter has zero callers in lib/ and src/ — the
legacy ESSA SSA (semmle/python/essa/*) remains the default. The
dataflow library is not migrated yet; that lands in a follow-up PR.

Verified by:
- All 367 lib + src + consistency-queries compile clean.
- All 641 ControlFlow + PointsTo + dataflow + essa + consistency
  library-tests pass.
- Both new dataflow-new-ssa[/vs-legacy] test packs pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:34 +00:00
Copilot
6248d3ad9d Python: add new shared-CFG-backed control flow graph
Preparatory refactor for the shared-CFG dataflow migration. Adds the
new Python CFG library additively, without changing any production
behaviour.

Library additions:

- semmle.python.controlflow.internal.AstNodeImpl — mediates between
  the Python AST and the shared codeql.controlflow.ControlFlowGraph
  signature. Wraps Python's Stmt/Expr/Scope/Pattern and adds two
  synthetic kinds of node (BlockStmt for body slots, intermediate
  nodes for multi-operand boolean expressions).

- semmle.python.controlflow.internal.Cfg — public facade
  re-exposing the same API surface as semmle/python/Flow.qll
  (ControlFlowNode, CallNode, BasicBlock, NameNode, DefinitionNode,
  CompareNode, ...), backed by the shared CFG.

- lib/printCfgNew.ql — debug/visualisation query for the new CFG.

- consistency-queries/CfgConsistency.ql — consistency query running
  the shared CFG's standard checks against Python.

Shared library:

- shared.controlflow.ControlFlowGraph — adds two defaulted
  getWhileElse / getForeachElse predicates to AstSig so Python can
  model while-else / for-else (no behavioural change for other
  languages).

Test additions:

- ControlFlow/bindings/* — annotation-driven SSA-binding tests for
  the new CFG (annassign, compound, comprehension, decorated,
  except_handler, imports, match_pattern, parameters, simple,
  type_params, walrus_starred, with_stmt, dead_under_no_raise).

- ControlFlow/store-load/* — basic store/load coverage.

- ControlFlow/evaluation-order/NewCfg*.ql — mirrors of the existing
  OldCfg evaluation-order self-validation suite, run against the
  new CFG via NewCfgImpl.qll.

- Minor extensions to existing test_if.py / test_boolean.py +
  cosmetic .expected churn on a handful of OldCfg tests.

No dataflow, SSA, or production query is migrated yet — that lands in
follow-up PRs. The new CFG library has zero callers in lib/ and src/.

Verified by:
- All lib + src + consistency-queries compile clean (367 queries).
- All 56 ControlFlow library-tests pass.
- All 474 dataflow + PointsTo library-tests + consistency tests pass.
- syntax_error/CONSISTENCY/CfgConsistency passes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:34 +00:00
yoff
f0cdfbd9f4 Shared CFG: add defaulted getWhileElse/getForeachElse/getCatchType to AstSig
Adds three new defaulted signature predicates to the shared CFG library:

- getWhileElse / getForeachElse: `else` block of a while/for loop, if
  any (used by Python's `while-else` / `for-else` constructs).
- getCatchType: type expression of a catch clause, if any (used by
  Python's `except SomeExpr:` where the catch type is a runtime
  expression that needs CFG evaluation).

Each predicate defaults to `none()`, so behaviour is unchanged for any
language that doesn't override it (verified by re-running
java/ql/test/library-tests/controlflow/).

The Make0 succession rules are extended:
- WhileStmt/ForeachStmt: route the loop-exit edge through the else
  block before reaching the after-position.
- CatchClause: route the matching-evaluation through the type
  expression (if present) before reaching the after-value position.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:34 +00:00
Owen Mansel-Chan
91a57f7773 Update tests 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
e38a5a52b5 Model panicking log functions better 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
6bc6051ef0 Taint doesn't flow through panicking functions 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
8402153711 Accept changed test output 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
22cefa0f46 Improve test for non-returning fns 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
87f3c92a7f Add missing QLDocs 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
06bf40cfc6 (Trivial) Fix QLDoc grammar 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
2d6a80c70f Add change note 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
db229689eb Improve glog and klog tests 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
024fb74410 Improve glog (and klog) modelling 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
f80d9cab9b Improve noretFunctions test 2026-06-18 15:17:33 +00:00
Owen Mansel-Chan
9053c67e95 Recognize more non-returning logging functions 2026-06-18 15:17:32 +00:00
Tom Hvitved
f2bb30ee0f Address review comments 2026-06-18 15:17:32 +00:00
Tom Hvitved
fd7fbd4993 Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-18 15:17:32 +00:00
Tom Hvitved
5c3b231a01 Ruby: Adopt shared local name resolution library 2026-06-18 15:17:32 +00:00
Tom Hvitved
a08c24a38f Ruby: Add more variable tests 2026-06-18 15:17:32 +00:00
Tom Hvitved
51d752b2ca Rust: Run codegen 2026-06-18 15:17:32 +00:00
Tom Hvitved
15cd643f8a Rust: Adopt shared local name resolution library 2026-06-18 15:17:32 +00:00
Tom Hvitved
9fd266e576 Rust: More local variable tests 2026-06-18 15:17:32 +00:00
Tom Hvitved
996777dfe9 Shared: Add local name binding library 2026-06-18 15:17:32 +00:00
yoff
52d2309186 Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-18 15:17:31 +00:00
Copilot
b99e4898b8 Python: qualify Flow.qll's AST references with Py:: prefix
Preparatory refactor for the shared-CFG dataflow migration. Switches
'import python' to 'import python as Py' inside Flow.qll, and qualifies
every AST-class reference (Expr, Bytes, Dict, AssignExpr, Compare,
Module, Scope, Call, Attribute, SsaVariable, AugAssign, etc.) with the
Py:: prefix.

Flow.qll's own CFG types (ControlFlowNode, BasicBlock, CallNode,
NameNode, DefinitionNode, CompareNode, ...) keep their unqualified
names — they remain the public CFG API exported from this file.

This is a semantic noop: the qualification was applied mechanically by
script and no name resolution changes. Verified by:
- All 361 lib/ + src/ queries compile clean.
- All 186 ControlFlow + PointsTo + dataflow library-tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:31 +00:00
yoff
3bce589d45 Python: deprecate Function.getAReturnValueFlowNode() and rewrite internal callers
Follow-up to the getAFlowNode deprecation in the same PR: same AST→legacy-CFG
bridge pattern. The 11 internal call sites (across objects/, types/,
frameworks/, and TypeTrackingImpl) are rewritten to bind a `Return ret`
explicitly, then constrain via `ret.getScope() = f and n.getNode() = ret.getValue()`.

The predicate itself is preserved with a deprecation note so external
users do not experience churn.

Semantic noop.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:12 +00:00
Copilot
a13dfaa44f Python: deprecate AstNode.getAFlowNode() and rewrite internal callers
Preparatory refactor for the shared-CFG dataflow migration.

Deprecates the AstNode.getAFlowNode() cached predicate on the public
Python QL API and rewrites all ~140 internal callers across lib/, src/,
test/, and tools/ from `expr.getAFlowNode() = cfgNode` to
`cfgNode.getNode() = expr`, using ControlFlowNode.getNode() which
already exists in Flow.qll.

The predicate itself is preserved (with a deprecation note pointing at
the new pattern) so external users do not experience churn — they can
migrate at their own pace and the AST/CFG hierarchies still get the
intended untangling once the deprecation eventually elapses.

Semantic noop verified by:
- All 361 lib/ + src/ queries compile clean.
- All 122 ControlFlow + PointsTo library-tests pass.
- All 64 dataflow library-tests pass.
- All 113 Variables/Exceptions/Expressions/Statements/Functions/Imports/
  Security/CWE-798/ModificationOfParameterWithDefault query-tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-02 08:37:30 +00:00
yoff
ac5fa629ef Python: inline init_module_submodule_defn into ImportResolution
The new-dataflow ImportResolution module only used
semmle.python.essa.SsaDefinitions for the 5-line helper predicate
SsaSource::init_module_submodule_defn. Inline it locally and drop the
dependency on legacy SsaDefinitions. This is the only remaining direct
import of semmle.python.essa.* in the new dataflow stack, so dropping
it makes the layering cleaner.

Semantic noop on the current SSA: SsaSourceVariable.getName() and
GlobalVariable.getId() both project the same DB column
(variable(_,_,result)), and the old call's 'init.getEntryNode() = f'
join was just constraining init = package via Scope.getEntryNode()'s
functional uniqueness. RA dump of accesses.ql confirms only the
expected predicate-rename shuffle; all 70 dataflow + ApiGraphs library
tests pass.

This factors out commit 8cab5a20f2 from the larger shared-CFG
migration #21925.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-02 08:24:17 +00:00
yoff
5fb75ac987 Python: simplify decorator-detection predicates to pure AST match
The internal predicates that identify `@staticmethod`, `@classmethod` and
`@property` decorators previously required the decorator's `NameNode` to
satisfy `isGlobal()` (i.e. no SSA def reaches the decorator's name use).
That filter was correct but unnecessarily indirect: these three names
are builtins, and even when a class body redefines one, the class body
has not started executing at the decorator position, so Python uses the
builtin.

Match the decorator's AST `Name` directly instead, dropping the CFG/SSA
detour. The slight semantic change — `isGlobal()` would have rejected
module-level shadowing of these builtins — is negligible in practice
and explicitly documented in the change note.

`hasContextmanagerDecorator` and `hasOverloadDecorator` keep the
`NameNode.isGlobal()` check because their target names (`contextmanager`,
`overload`) are imported, not builtin, and local shadowing is a real
concern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-01 14:04:43 +00:00
Jeroen Ketema
ab4a575243 Merge pull request #21899 from MathiasVP/use-new-prototype-extensionals
C++: Use the new `prototype`-related extensionals in MaD
2026-06-01 10:24:19 +02:00
Mathias Vorreiter Pedersen
22b08f1ea4 C++: Add a test with a kind of "partial function template" instantiation. 2026-05-31 12:47:31 +02:00
Mathias Vorreiter Pedersen
e18448dd59 C++: Add more tests. 2026-05-29 18:22:13 +02:00
Henry Mercer
a16f1c555c Merge pull request #21912 from github/post-release-prep/codeql-cli-2.25.6
Post-release preparation for codeql-cli-2.25.6
2026-05-29 14:43:56 +01:00
Geoffrey White
43c1152634 Merge pull request #21905 from geoffw0/swiftflow2
Swift: Update the new metatype sinks
2026-05-29 14:18:45 +01:00
github-actions[bot]
cfb18c2477 Post-release preparation for codeql-cli-2.25.6 2026-05-29 12:04:35 +00:00
Henry Mercer
1a82a682e9 Merge pull request #21911 from github/release-prep/2.25.6
Release preparation for version 2.25.6
codeql-cli/latest codeql-cli/v2.25.6
2026-05-29 12:28:44 +01:00
github-actions[bot]
8b6f969cdb Release preparation for version 2.25.6 2026-05-29 11:27:54 +00:00
Henry Mercer
f4da0df3c7 Merge pull request #21910 from github/revert-21892-release-prep/2.25.6
Revert "Release preparation for version 2.25.6"
2026-05-29 12:25:55 +01:00
Henry Mercer
9bc0c1b1ab Revert "Release preparation for version 2.25.6" 2026-05-29 12:13:50 +01:00
Anders Schack-Mulligen
4c31866910 Merge pull request #21867 from aschackmull/ruby/callable-body
Ruby: Split callable and its body into two AST nodes.
2026-05-29 10:16:19 +02:00
Taus
6165623cbf Merge pull request #21724 from github/tausbn/python-add-self-validating-cfg-tests 2026-05-28 22:07:55 +02:00
Michael Nebel
2eac8890d3 Merge pull request #21893 from michaelnebel/cshar/updateroslyn
C#: Update Roslyn and other pinned depenencies.
2026-05-28 13:49:29 +02:00
Mathias Vorreiter Pedersen
2d581504f7 C++: Fix Copilot comments. 2026-05-28 13:34:18 +02:00
Mathias Vorreiter Pedersen
9f211cebd5 C++: Accept test changes. 2026-05-28 13:34:16 +02:00
Mathias Vorreiter Pedersen
8393b40b59 C++: Use the new extensionals to map template functions and classes to their fully templated versions. 2026-05-28 13:34:12 +02:00
Geoffrey White
f8ab76e1ba Swift: Update the new metatype sinks to not rely on name matching '.Type'. 2026-05-28 12:14:10 +01:00
Geoffrey White
34d4e9a8e2 Merge pull request #21898 from geoffw0/swiftflow
Swift: Extend swift/weak-sensitive-data-hashing, swift/weak-password-hashing sinks
2026-05-28 11:52:32 +01:00