Commit Graph

9988 Commits

Author SHA1 Message Date
Copilot
c24e476879 Shared CFG: support for-else and while-else loops
Add two default predicates to AstSig:

  default AstNode getWhileElse(WhileStmt loop) { none() }
  default AstNode getForeachElse(ForeachStmt loop) { none() }

When defined, the explicit-step rules for While/Do and Foreach
route the loop's normal-completion exits through the else block
before reaching the after-loop node:

  - WhileStmt: after-false condition -> before-else -> after-while
    (instead of directly after-while).
  - ForeachStmt: after-collection [empty] and the LoopHeader exit
    are both routed through before-else -> after-foreach.

Python's Ast module overrides the predicates to return the
synthetic BlockStmt for the orelse slot, replacing the previous
customisations in Input::step. This eliminates parallel direct
successors emitted by the previous Python-side step additions
(verified: multipleSuccessors on a CPython database goes from
1340 to 0).

Java and C# CFG tests are unaffected.
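
For reference, a minimal Python sketch of the source-level behaviour these
edges model: the else block runs when the loop completes normally (including
when the collection is empty or the condition is false on entry) and is
skipped on `break`. The function names are hypothetical and only illustrate
the semantics, not the QL implementation.

```python
def trace_for_else(values):
    """Record which branches of a for/else loop run."""
    trace = []
    for v in values:
        if v < 0:
            trace.append("break")
            break
    else:
        # Reached only on normal completion, including an empty `values`.
        trace.append("else")
    trace.append("after")
    return trace


def trace_while_else(n):
    """Record which branches of a while/else loop run."""
    trace = []
    while n > 0:
        n -= 1
    else:
        # Reached when the condition evaluates to false.
        trace.append("else")
    trace.append("after")
    return trace
```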

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:40 +00:00
Copilot
2a04316d69 Python: compact-renumber FunctionExpr/Lambda defaults
`Args.getDefault(int)` and `Args.getKwDefault(int)` are indexed by
argument position (with gaps for args without defaults), not by
default position. The CFG `getChild` predicate for FunctionDefExpr
and LambdaExpr therefore had gaps at low indices and collisions
where defaults and kwdefaults overlapped, producing parallel
edges before the FunctionExpr.

Use `rank` to compact-renumber `getDefault(n)` and `getKwDefault(n)`
in source order. Verified on a CPython database: removes ~536
`multipleSuccessors` consistency results (1340 -> 804); the rest are
`for/else` and `while/else`.
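
The renumbering idea, sketched in Python rather than QL. This is an
illustration under assumptions: `compact_children` is a hypothetical helper,
the dictionaries stand in for `getDefault(n)` and `getKwDefault(n)`, and
placing defaults before kwdefaults is assumed to match source order. It uses
0-based indices, whereas QL's `rank` is 1-based.

```python
def compact_children(defaults, kw_defaults):
    """Assign dense child indices to sparsely indexed default expressions.

    Both inputs map an argument position to a default expression. The
    positions may have gaps, and the two maps may share positions.
    """
    # Order each group by argument position, positional defaults first.
    ordered = [defaults[i] for i in sorted(defaults)]
    ordered += [kw_defaults[i] for i in sorted(kw_defaults)]
    # Dense, collision-free indices: 0, 1, 2, ...
    return dict(enumerate(ordered))
```

For example, `compact_children({1: "b", 2: "c"}, {1: "e"})` yields
`{0: "b", 1: "c", 2: "e"}`: the gap at index 0 and the collision at index 1
both disappear.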

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:40 +00:00
Copilot
7912e1b257 Python: collapse two-layer AstNodeImpl into a single Ast module
Merge the previous `Ast` and `AstSigImpl` modules into a single
`module Ast implements AstSig<Py::Location>`. Classes now use the
signature names (IfStmt, WhileStmt, ForeachStmt, etc.) and signature
predicates (getCondition, getThen, getElse, etc.) directly, with no
intermediate renaming layer.

Drop the TStmtListNode newtype branch entirely. Replace it with a
synthetic TBlockStmt(parent, slot) keyed by a parent AST node and a
slot label string ('body', 'orelse', 'finally'). Py::StmtList no
longer appears in the newtype; the BlockStmt class provides indexed
access to the underlying body items via getStmt(n).

22 of 24 evaluation-order tests still pass; the same 2
comprehension-related failures predate this refactor.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:39 +00:00
yoff
5746ed713f python: add consistency checks
Co-authored-by: aschackmull <aschackmull@github.com>
2026-05-28 21:09:39 +00:00
yoff
768bdb5937 Python: add pattern nodes
Co-authored-by: Copilot <copilot@github.com>
2026-05-28 21:09:39 +00:00
Taus
41b5589460 Cleanup, printCFG
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:39 +00:00
Taus
498aece892 WIP2 2026-05-28 21:09:39 +00:00
Taus
8d814e1fbf WIP 2026-05-28 21:09:39 +00:00
Taus
655f84ed0d Python: Handle dict unpacking in calls
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:39 +00:00
Taus
aaf9cc52d4 Python: Fix exception issue
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:39 +00:00
Taus
cc77f0bcfa Python: Fix match
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:39 +00:00
Taus
146a3a929d Python: Support match
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
d83d943f68 Python: More nodes
Not entirely sure about the `else:` blocks.

Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
5b1de9eacd Python: Comprehensions
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
9f93d6c902 Python: Add with
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
cc09df27ba Python: More simple statements
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
da408d7c75 Python: assignments
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
999b8f23cb Python: Attributes
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
4336b07d48 Python: Function calls
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
b8bc230a38 Python: Assert statements
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
4e3a633f14 Python: Support various literals
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
f89a773b80 Python: Ignore synthetic CFG nodes
We can only annotate the ones that correspond directly to AST nodes
anyway.

Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
4583244ec6 Python: More AstNodeImpl improvements
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
e66bf87f22 Python: Instantiate CFG tests with new CFG library
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
6b3a790015 Python: Instantiate CFG module fully
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
b5df1886ea Python: Use fields everywhere in new AST classes
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
30f28bab8d Python: First stab at shared control-flow 2026-05-28 21:09:37 +00:00
Taus
019e6f233f Python: Make CFG tests parameterised
Currently we only instantiate them with the old CFG library, but in the
future we'll want to do this with the new library as well.

Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
df6d0cad5e Python: Add ConsecutiveTimestamps test
This one is potentially a bit iffy -- it checks for a very powerful
property (that implies many of the other queries), but as the test
results show, it can produce false positives when there is in fact no
problem. We may want to get rid of it entirely, if it becomes too noisy.
2026-05-28 21:09:37 +00:00
Taus
6d829d6cc8 Python: Add NeverReachable test
This looks for nodes annotated with `t.never` in the test that are
reachable in the CFG. This should not happen (it messes with various
queries, e.g. the "mixed returns" query), but the test shows that in a
few particular cases (involving the `match` statement where all cases
contain `return`s), we _do_ have reachable nodes that shouldn't be.
2026-05-28 21:09:37 +00:00
Taus
661fd3156f Python: Add BasicBlockOrdering test
This one demonstrates a bug in the current CFG. In a dictionary
comprehension `{k: v for k, v in d.items()}`, we evaluate the value
before the key, which is incorrect. (A fix for this bug has been
implemented in a separate PR.)
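
The expected order can be observed directly. A small sketch, assuming
Python 3.8 or later (where the key of a dictionary comprehension is evaluated
before its value); `note` is a hypothetical tracing helper.

```python
def evaluation_order():
    """Return the comprehension result and the order its parts were evaluated in."""
    trace = []

    def note(label, value):
        # Record that this sub-expression was evaluated, then pass it through.
        trace.append(label)
        return value

    result = {note("key", k): note("value", v) for k, v in [("a", 1), ("b", 2)]}
    return result, trace
```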
2026-05-28 21:09:37 +00:00
Taus
cc471fd672 Python: Add some CFG-validation queries
These use the annotated, self-verifying test files to check various
consistency requirements.

Some of these may be expressing the same thing in different ways, but
it's fairly cheap to keep them around, so I have not attempted to
produce a minimal set of queries for this.
2026-05-28 21:09:37 +00:00
Taus
9a4fb5c971 Python: Add self-validating CFG tests
These tests consist of various Python constructions (hopefully a
somewhat comprehensive set) with specific timestamp annotations
scattered throughout. When the tests are run using the Python 3
interpreter, these annotations are checked and compared to the "current
timestamp" to see that they are in agreement. This is what makes the
tests "self-validating".

There are a few different kinds of annotations: the basic `t[4]` style
(meaning this is executed at timestamp 4), the `t.dead[4]` variant
(meaning this _would_ happen at timestamp 4, but it is in a dead
branch), and `t.never` (meaning this is never executed at all).

In addition to this, there is a query, MissingAnnotations, which checks
whether we have applied these annotations maximally. Many expression
nodes are not actually annotatable, so there is a sizeable list of
excluded nodes for that query.
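
One way such annotations could be checked at run time, as a rough sketch
only. The `Timer` class below is hypothetical and is not the helper used by
the actual tests; it assumes that evaluating `t[n]` compares `n` with a
running counter and then advances it, and that evaluating `t.dead[n]` or
`t.never` is always an error.

```python
class _Dead:
    """Annotations for dead branches: evaluating one is always an error."""

    def __init__(self, timer):
        self._timer = timer

    def __getitem__(self, expected):
        self._timer.errors.append(f"t.dead[{expected}] was executed")
        return expected


class Timer:
    """Hypothetical timestamp checker for annotated test code."""

    def __init__(self):
        self.now = 0          # the "current timestamp"
        self.errors = []      # collected disagreements
        self.dead = _Dead(self)

    def __getitem__(self, expected):
        # `t[n]` must be evaluated exactly when the clock reads n.
        if expected != self.now:
            self.errors.append(f"expected t[{expected}], clock is at {self.now}")
        self.now += 1
        return expected

    @property
    def never(self):
        # `t.never` must not be evaluated at all.
        self.errors.append("t.never was executed")
        return None


def run_example():
    """Run a tiny annotated snippet and return any disagreements found."""
    t = Timer()
    t[0]
    if t[1] >= 0:
        t[2]
    else:
        t.dead[2]
    return t.errors
```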
2026-05-28 21:09:36 +00:00
Taus
6165623cbf Merge pull request #21724 from github/tausbn/python-add-self-validating-cfg-tests 2026-05-28 22:07:55 +02:00
Taus
35faec3db1 Python: Address review comments
- Get rid of unnecessary parentheses
- Use call syntax in the relevant test
- Get rid of `dead(2)` annotation
2026-05-27 15:27:19 +00:00
Óscar San José
996e79131e Merge branch 'main' into post-release-prep/codeql-cli-2.25.5 2026-05-22 16:32:30 +02:00
Geoffrey White
3aa660663e Merge pull request #21806 from geoffw0/extsensitive
Shared: Improvements to SensitiveDataHeuristics.qll
2026-05-19 16:22:03 +01:00
github-actions[bot]
9f64000962 Post-release preparation for codeql-cli-2.25.5 2026-05-18 15:20:31 +00:00
github-actions[bot]
e38616a2ef Release preparation for version 2.25.5 2026-05-18 12:05:32 +00:00
Geoffrey White
a4b2c0f6fd Update change notes (Copilot's suggestions). 2026-05-15 09:24:29 +01:00
Geoffrey White
59dbd68a5e Add change notes. 2026-05-14 14:46:05 +01:00
Geoffrey White
c8196e439f Merge branch 'main' into extsensitive 2026-05-13 13:04:48 +01:00
Paolo Tranquilli
ee13ea0f6b Harden _relative_path for Windows and mixed-form inputs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 11:35:02 +02:00
Paolo Tranquilli
d28792537b Python extractor: use relative paths in diagnostic locations
Diagnostic `Location.file` fields contained absolute filesystem paths,
causing the GitHub UI to generate broken file links with runner paths
like `/home/runner/work/...`. Now paths are relativized against the
source root (`LGTM_SRC` or cwd), falling back to absolute if the file
is outside the source root.
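
A minimal sketch of the fallback behaviour described above. `relativize` is
a hypothetical name, not the extractor's function, and the sketch uses
`PurePosixPath` so it behaves the same on every platform; Windows and
mixed-form inputs are out of scope here. The source root is passed in
explicitly rather than read from `LGTM_SRC` or the cwd.

```python
from pathlib import PurePosixPath


def relativize(file_path, source_root):
    """Return file_path relative to source_root, or unchanged if it lies outside."""
    path = PurePosixPath(file_path)
    root = PurePosixPath(source_root)
    try:
        return str(path.relative_to(root))
    except ValueError:
        # The file is not under the source root: keep the absolute path.
        return str(path)
```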

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 10:32:05 +02:00
Owen Mansel-Chan
0b808e1170 Merge pull request #21807 from owen-mc/java/improve-qhelp-unsafe-deserialization
Shared: improve qhelp for unsafe deserialization queries
2026-05-12 22:22:49 +01:00
Taus
1ef557c972 Python: Address Copilot's comments 2026-05-12 15:27:14 +00:00
Taus
f5c3b63a4a Python: Add ConsecutiveTimestamps test
This one is potentially a bit iffy -- it checks for a very powerful
property (that implies many of the other queries), but as the test
results show, it can produce false positives when there is in fact no
problem. We may want to get rid of it entirely, if it becomes too noisy.
2026-05-12 12:54:26 +00:00
Taus
c30d6ae3aa Python: Add NeverReachable test
This looks for nodes annotated with `t[never]` in the test that are
reachable in the CFG. This should not happen (it messes with various
queries, e.g. the "mixed returns" query), but the test shows that in a
few particular cases (involving the `match` statement where all cases
contain `return`s), we _do_ have reachable nodes that shouldn't be.
2026-05-12 12:54:26 +00:00
Taus
fc2bc26f36 Python: Add BasicBlockOrdering test
This one demonstrates a bug in the current CFG. In a dictionary
comprehension `{k: v for k, v in d.items()}`, we evaluate the value
before the key, which is incorrect. (A fix for this bug has been
implemented in a separate PR.)
2026-05-12 12:54:25 +00:00
Taus
3a979ac2f8 Python: Add some CFG-validation queries
These use the annotated, self-verifying test files to check various
consistency requirements.

Some of these may be expressing the same thing in different ways, but
it's fairly cheap to keep them around, so I have not attempted to
produce a minimal set of queries for this.
2026-05-12 12:54:25 +00:00