Commit Graph

4167 Commits

Author SHA1 Message Date
Copilot
89180671c8 Python: wire import-statement bindings into the shared CFG (green)
Adds `ImportStmt` and `ImportStarStmt` wrappers in `AstNodeImpl.qll`.
For each `Alias` in an import statement, both the value (module/member
expression) and the bound `asname` Name become children of the CFG node
for the import statement, in evaluation order.

Without this, every `Name` introduced by `import` / `from .. import ..`
lacked a CFG node, even though `Name.defines(v)` returns true for it on
the AST side. This was the highest-volume gap: 20,332 missing import
aliases across CPython.

Removes the corresponding MISSING: annotations from imports.py.

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:43 +00:00
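To make concrete which names an import statement binds, here is a minimal sketch using Python's standard `ast` module rather than the CodeQL library. The helper `bound_import_names` and the sample source are hypothetical and only illustrate the binding rules; they are not part of this commit.

```python
import ast


def bound_import_names(source: str) -> list[str]:
    """Collect the names that import statements in `source` bind."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    # Star imports bind names that are only known at run time.
                    continue
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                names.append(alias.asname or alias.name.split(".")[0])
    return names


SAMPLE = """
import os
import os.path
import numpy as np
from collections import OrderedDict, defaultdict as dd
from math import *
"""

# The source is only parsed, never executed, so numpy need not be installed.
print(bound_import_names(SAMPLE))
```

Each name collected here corresponds to one `Alias` whose bound name needs a definition point in the CFG.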
Copilot
17c6d10c66 Python: wire parameters into the shared CFG (C# pattern)
Implements `AstSig::Parameter` and `callableGetParameter(c, i)` in
`AstNodeImpl.qll`, following the C# template
(`csharp/.../ControlFlowGraph.qll:147-156`) rather than Java's
`Parameter() { none() }`.

Each Python parameter (positional, *args, keyword-only, **kwargs) now
becomes a CFG node at a stable position in the enclosing callable's
entry sequence. Defaults still evaluate at function-definition time
via `FunctionDefExpr.getDefault` / `LambdaExpr.getDefault`, so
`Parameter::getDefaultValue()` returns `none()` (the shared CFG
library calls this to model the missing-argument fallback, which
Python does not surface at the CFG level).

The bindings test now exercises parameters (the `py_expr_contexts(_, 4, ...)`
exclusion has been removed). A new `parameters.py` test case covers
positional, defaulted, vararg, kwarg, keyword-only, kitchen-sink,
method (self/cls), lambda, and PEP 570 positional-only parameters.
Several other test files were updated to annotate parameters that the
test had previously hidden (synthetic `.0` comprehension parameter,
method `self`, decorator `f`, etc.).

Verified:
- All 24 ControlFlow/evaluation-order tests still pass.
- CFG consistency query (`python/ql/consistency-queries/CfgConsistency.ql`)
  shows zero violations on CPython.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:43 +00:00
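The two runtime facts this commit relies on (the parameter kinds Python distinguishes, and defaults being evaluated when the `def` executes) can be observed with the standard `inspect` module. This is an illustrative sketch; `kitchen_sink` and `make_default` are hypothetical names, not taken from `parameters.py`.

```python
import inspect

calls = []


def make_default():
    # Records every evaluation of the default expression.
    calls.append("default evaluated")
    return 0


def kitchen_sink(a, /, b, c=make_default(), *args, d, e=2, **kwargs):
    return a, b, c, args, d, e, kwargs


# The default expression ran once, when the `def` statement executed.
assert calls == ["default evaluated"]

# Calling the function does not re-evaluate it.
kitchen_sink(1, 2, d=3)
kitchen_sink(1, 2, d=3)
assert calls == ["default evaluated"]

# One entry per parameter, in declaration order.
kinds = [(p.name, p.kind.name) for p in inspect.signature(kitchen_sink).parameters.values()]
for name, kind in kinds:
    print(f"{name}: {kind}")
```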
Copilot
bd20042636 Python: wire AnnAssign into the shared CFG (green)
Adds an `AnnAssignStmt` wrapper in `AstNodeImpl.qll` so that PEP 526
annotated assignments (`x: int = 1`, `x: int`) participate in the
control flow graph. Evaluation order follows CPython: annotation,
optional value, target binding.

Without this, `x: int = 1` had no CFG node for `x` even though
`Name.defines(v)` returns true for it on the AST side. SSA built on
the new CFG would therefore miss every annotated-assignment write.

Removes the corresponding MISSING: annotations from the CFG-binding
gap test:
- annassign.py — all four cases now green.
- match_pattern.py — class-body annotated fields (`x: int`, `y: int`).
- type_params.py — `item: T` inside class.

Verified: all 24 ControlFlow/evaluation-order tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:43 +00:00
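As a small illustration of the two PEP 526 forms mentioned above, the following sketch uses Python's standard `ast` module (not the CodeQL library) to show that both parse to `AnnAssign` nodes, and that only the form with a value binds a name at run time. The helper `annotated_targets` is hypothetical.

```python
import ast

SOURCE = "x: int = 1\ny: int\n"


def annotated_targets(source: str) -> list[tuple[str, bool]]:
    """Return (target name, has value) for each simple annotated assignment."""
    result = []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
            result.append((stmt.target.id, stmt.value is not None))
    return result


print(annotated_targets(SOURCE))

# At run time, a bare annotation such as `y: int` does not bind `y`.
namespace = {}
exec(SOURCE, namespace)
assert "x" in namespace
assert "y" not in namespace
```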
Copilot
93bd4e3b85 Python: add CFG-binding gap tests (red)
Adds inline-expectation tests for the new shared CFG implementation in
python/ql/lib/semmle/python/controlflow/internal/AstNodeImpl.qll,
covering every Python binding construct that introduces a variable.

The test files use MISSING: annotations to record bindings whose
defining Name AST node is *not* currently reachable from the new CFG.
These are the 'red' half of red-green commit pairs: subsequent commits
will extend AstNodeImpl to cover each construct and remove the
corresponding MISSING: marker.

Confirmed-broken categories:
- Import aliases (from x import a)
- Annotated assignment (x: int = 1)
- Exception handler (except E as e)
- Match patterns (case x, case [a,b], case ... as v)
- PEP 695 type params (def f[T], class C[T])

Confirmed-working (no MISSING:):
- Compound targets, with-as, comprehensions, decorated def/class,
  walrus, starred.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:42 +00:00
Copilot
c24e476879 Shared CFG: support for-else and while-else loops
Add two default predicates to AstSig:

  default AstNode getWhileElse(WhileStmt loop) { none() }
  default AstNode getForeachElse(ForeachStmt loop) { none() }

When defined, the explicit-step rules for While/Do and Foreach
route the loop's normal-completion exits through the else block
before reaching the after-loop node:

  - WhileStmt: after-false condition -> before-else -> after-while
    (instead of directly after-while).
  - ForeachStmt: after-collection [empty] and the LoopHeader exit
    are both routed through before-else -> after-foreach.

Python's Ast module overrides the predicates to return the
synthetic BlockStmt for the orelse slot, replacing the previous
customisations in Input::step. This eliminates parallel direct
successors emitted by the previous Python-side step additions
(verified: multipleSuccessors on a CPython database goes from
1340 to 0).

Java and C# CFG tests are unaffected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:09:40 +00:00
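For reference, the Python semantics that the new routing models: the `else` block runs when the loop finishes normally (including when the collection is empty or the condition is false on entry), and is skipped on `break`. The functions below are illustrative only.

```python
def find_first_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            result = f"found {n}"
            break  # skips the else block
    else:
        # Reached when the loop is exhausted, including for an empty collection.
        result = "no even number"
    return result


def countdown(start):
    trace = []
    n = start
    while n > 0:
        trace.append(n)
        n -= 1
    else:
        # Reached when the condition becomes false.
        trace.append("else")
    return trace


print(find_first_even([1, 3, 4]))
print(find_first_even([]))
print(countdown(2))
```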
Taus
498aece892 WIP2 2026-05-28 21:09:39 +00:00
Taus
8d814e1fbf WIP 2026-05-28 21:09:39 +00:00
Taus
f89a773b80 Python: Ignore synthetic CFG nodes
We can only annotate the ones that correspond directly to AST nodes
anyway.

Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:38 +00:00
Taus
e66bf87f22 Python: Instantiate CFG tests with new CFG library
Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
019e6f233f Python: Make CFG tests parameterised
Currently we only instantiate them with the old CFG library, but in the
future we'll want to do this with the new library as well.

Co-authored-by: yoff <yoff@github.com>
2026-05-28 21:09:37 +00:00
Taus
df6d0cad5e Python: Add ConsecutiveTimestamps test
This one is potentially a bit iffy -- it checks for a very powerful
property (that implies many of the other queries), but as the test
results show, it can produce false positives when there is in fact no
problem. We may want to get rid of it entirely, if it becomes too noisy.
2026-05-28 21:09:37 +00:00
Taus
6d829d6cc8 Python: Add NeverReachable test
This looks for nodes annotated with `t.never` in the test that are
reachable in the CFG. This should not happen (it messes with various
queries, e.g. the "mixed returns" query), but the test shows that in a
few particular cases (involving the `match` statement where all cases
contain `return`s), we _do_ have reachable nodes that shouldn't be.
2026-05-28 21:09:37 +00:00
Taus
661fd3156f Python: Add BasicBlockOrdering test
This one demonstrates a bug in the current CFG. In a dictionary
comprehension `{k: v for k, v in d.items()}`, we evaluate the value
before the key, which is incorrect. (A fix for this bug has been
implemented in a separate PR.)
2026-05-28 21:09:37 +00:00
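The expected order can be checked directly in the interpreter. The sketch below assumes CPython 3.8 or later, where a dictionary comprehension evaluates the key before the value; `key` and `value` are hypothetical recording helpers.

```python
order = []


def key(x):
    order.append(f"key({x})")
    return x


def value(x):
    order.append(f"value({x})")
    return x * 10


result = {key(x): value(x) for x in (1, 2)}

# For each element, the key expression runs before the value expression.
print(order)
```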
Taus
cc471fd672 Python: Add some CFG-validation queries
These use the annotated, self-verifying test files to check various
consistency requirements.

Some of these may be expressing the same thing in different ways, but
it's fairly cheap to keep them around, so I have not attempted to
produce a minimal set of queries for this.
2026-05-28 21:09:37 +00:00
Taus
9a4fb5c971 Python: Add self-validating CFG tests
These tests consist of various Python constructions (hopefully a
somewhat comprehensive set) with specific timestamp annotations
scattered throughout. When the tests are run using the Python 3
interpreter, these annotations are checked and compared to the "current
timestamp" to see that they are in agreement. This is what makes the
tests "self-validating".

There are a few different kinds of annotations: the basic `t[4]` style
(meaning this is executed at timestamp 4), the `t.dead[4]` variant
(meaning this _would_ happen at timestamp 4, but it is in a dead
branch), and `t.never` (meaning this is never executed at all).

In addition to this, there is a query, MissingAnnotations, which checks
whether we have applied these annotations maximally. Many expression
nodes are not actually annotatable, so there is a sizeable list of
excluded nodes for that query.
2026-05-28 21:09:36 +00:00
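A rough idea of how such annotations could check themselves at run time is sketched below. This is a hypothetical, simplified checker: the class names, the error format, and the way annotations are written as standalone expressions are all assumptions, and the helper used by the actual tests may work quite differently.

```python
class _Dead:
    """Hypothetical target of `t.dead[n]`: it must never be evaluated."""

    def __init__(self, errors):
        self._errors = errors

    def __getitem__(self, would_be):
        self._errors.append(f"t.dead[{would_be}] was evaluated")


class Timestamps:
    """Hypothetical checker: `t[n]` must be the n-th annotation evaluated."""

    def __init__(self):
        self.now = 0
        self.errors = []

    def __getitem__(self, expected):
        # Compare the annotation with the current timestamp, then advance it.
        if expected != self.now:
            self.errors.append(f"expected t[{expected}], clock was {self.now}")
        self.now += 1

    @property
    def dead(self):
        return _Dead(self.errors)

    @property
    def never(self):
        self.errors.append("t.never was evaluated")


t = Timestamps()
t[0]
if True:
    t[1]
else:
    t.dead[1]  # dead branch: would have been timestamp 1
t[2]

# No mismatches were recorded, so the annotations agree with the execution.
assert t.errors == []
```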
Taus
6165623cbf Merge pull request #21724 from github/tausbn/python-add-self-validating-cfg-tests 2026-05-28 22:07:55 +02:00
Taus
35faec3db1 Python: Address review comments
- Get rid of unnecessary parentheses
- Use call syntax in the relevant test
- Get rid of `dead(2)` annotation
2026-05-27 15:27:19 +00:00
Taus
1ef557c972 Python: Address Copilot's comments 2026-05-12 15:27:14 +00:00
Taus
f5c3b63a4a Python: Add ConsecutiveTimestamps test
This one is potentially a bit iffy -- it checks for a very powerful
property (that implies many of the other queries), but as the test
results show, it can produce false positives when there is in fact no
problem. We may want to get rid of it entirely, if it becomes too noisy.
2026-05-12 12:54:26 +00:00
Taus
c30d6ae3aa Python: Add NeverReachable test
This looks for nodes annotated with `t[never]` in the test that are
reachable in the CFG. This should not happen (it messes with various
queries, e.g. the "mixed returns" query), but the test shows that in a
few particular cases (involving the `match` statement where all cases
contain `return`s), we _do_ have reachable nodes that shouldn't be.
2026-05-12 12:54:26 +00:00
Taus
fc2bc26f36 Python: Add BasicBlockOrdering test
This one demonstrates a bug in the current CFG. In a dictionary
comprehension `{k: v for k, v in d.items()}`, we evaluate the value
before the key, which is incorrect. (A fix for this bug has been
implemented in a separate PR.)
2026-05-12 12:54:25 +00:00
Taus
3a979ac2f8 Python: Add some CFG-validation queries
These use the annotated, self-verifying test files to check various
consistency requirements.

Some of these may be expressing the same thing in different ways, but
it's fairly cheap to keep them around, so I have not attempted to
produce a minimal set of queries for this.
2026-05-12 12:54:25 +00:00
Taus
71cd5be513 Python: Add self-validating CFG tests
These tests consist of various Python constructions (hopefully a
somewhat comprehensive set) with specific timestamp annotations
scattered throughout. When the tests are run using the Python 3
interpreter, these annotations are checked and compared to the "current
timestamp" to see that they are in agreement. This is what makes the
tests "self-validating".

There are a few different kinds of annotations: the basic `t[4]` style
(meaning this is executed at timestamp 4), the `t[dead(4)]` variant
(meaning this _would_ happen at timestamp 4, but it is in a dead
branch), and `t[never]` (meaning this is never executed at all).

In addition to this, there is a query, MissingAnnotations, which checks
whether we have applied these annotations maximally. Many expression
nodes are not actually annotatable, so there is a sizeable list of
excluded nodes for that query.
2026-05-12 12:42:29 +00:00
Geoffrey White
1c704a0912 Python: Accept test changes (improvement). 2026-05-07 10:28:19 +01:00
Josef Svenningsson
68be006a29 Merge pull request #21641 from github/josefs/promptInjectionImprovements
Improve prompt inject for Python
2026-04-29 11:23:52 +01:00
Josef Svenningsson
25a8aa97b2 Fix openai prompt injection tests 2026-04-28 18:24:26 +01:00
Josef Svenningsson
a05e191518 Add tests for anthropic prompt injection models 2026-04-28 18:24:22 +01:00
Josef Svenningsson
e069c9c2ee Fix tests 2026-04-28 18:24:19 +01:00
Taus
ac23e16786 Python: Move Python 3.15 data-flow tests to a separate file
We won't be able to run these tests until Python 3.15 is actually out
(and our CI is using it), so it seemed easiest to just put them in their
own test directory.
2026-04-17 13:16:46 +00:00
Taus
dc36609743 Python: Add data-flow tests
Alas, all these demonstrate is that we already don't fully support the
desugared `yield from` form.
2026-04-17 12:15:04 +00:00
Taus
8b1ecf05c9 Python: Update test output
This change reflects the `(value, key)` to `(key, value)` fix in an
earlier commit.
2026-04-14 13:27:31 +02:00
Taus
de900fc3b5 Python: Add QL test for comprehensions with unpacking 2026-04-14 13:27:31 +02:00
Taus
c748fdf8ee Merge pull request #21694 from github/tausbn/python-add-support-for-pep-810
Python: Add support for PEP 810
2026-04-14 13:27:08 +02:00
Taus
2eeb31b472 Python: Add tests for lazy from ... import * as well 2026-04-13 11:49:06 +00:00
Taus
6b7d47ee7d Python: Add QL test for the new syntax 2026-04-10 14:39:13 +00:00
Taus
e3688444d7 Python: Also exclude class scope
Changing the `locals()` dictionary actually _does_ change the attributes
of the class being defined, so we shouldn't alert in this case.
2026-04-07 23:46:03 +02:00
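The difference between the two scopes can be seen in a short example (CPython behaviour; the class and function names are illustrative):

```python
class Config:
    # In a class body, locals() is the namespace that becomes the class attributes.
    locals()["timeout"] = 30


def function_scope():
    value = 1
    locals()["value"] = 2  # does not rebind the local variable
    return value


assert Config.timeout == 30
assert function_scope() == 1
```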
Taus
16683aee0e Merge pull request #21590 from github/tausbn/python-improve-bind-all-interfaces-query
Python: Improve "bind all interfaces" query
2026-04-07 17:59:48 +02:00
Taus
187f7c7bcf Python: Move isNetworkBind check into isSink 2026-03-27 22:45:26 +00:00
Taus
4f74d421b9 Python: Exclude AF_UNIX sockets from BindToAllInterfaces
Looking at the results of the previous DCA run, there were a bunch of
false positives where `bind` was being used with an `AF_UNIX` socket (a
filesystem path encoded as a string), not a `(host, port)` tuple. These
results should be excluded from the query, as they are not vulnerable.

Ideally, we would just add `.TupleElement[0]` to the MaD sink, except we
don't actually support this in Python MaD...

So, instead I opted for a more low-tech solution: check that the
argument in question flows from a tuple in the local scope.

This eliminates a bunch of false positives on `python/cpython` leaving
behind four true positive results.
2026-03-27 16:55:10 +00:00
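The distinction between the two address shapes can be illustrated at run time. The sketch below is only a runtime analogue of the idea, not the QL implementation; the helper `binds_all_interfaces` and the set of hosts treated as "all interfaces" are assumptions made for the example.

```python
# Assumed set of hosts meaning "every interface" (IPv4 any, empty host, IPv6 any).
ALL_INTERFACES = {"", "0.0.0.0", "::"}


def binds_all_interfaces(address) -> bool:
    """True only for (host, port, ...) tuples whose host means every interface."""
    return (
        isinstance(address, tuple)
        and len(address) >= 2
        and address[0] in ALL_INTERFACES
    )


# AF_INET-style address: a (host, port) tuple.
print(binds_all_interfaces(("0.0.0.0", 8080)))
# AF_UNIX-style address: a filesystem path, so never an "all interfaces" bind.
print(binds_all_interfaces("/tmp/app.sock"))
```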
Taus
47d24632e6 Python: Port ShouldUseWithStatement.ql
Only trivial test changes.
2026-03-27 12:34:20 +00:00
Taus
c9832c330a Python: Convert BindToAllInterfaces to path-problem
Now that we're using global data-flow, we might as well make use of the
fact that we know where the source is.
2026-03-26 21:10:43 +00:00
Taus
c439fc5d45 Python: Replace type tracking with global data-flow
This takes care of most of the false negatives from the preceding
commit.

Additionally, we add models for some known wrappers of `socket.socket`
from the `gevent` and `eventlet` packages.
2026-03-26 15:35:33 +00:00
Taus
1ecd9e83b8 Python: Add test cases for BindToAllInterfaces FNs
Adds test cases from github/codeql#21582 demonstrating false negatives:
- Address stored in class attribute (`self.bind_addr`)
- `os.environ.get` with insecure default value
- `gevent.socket` (alternative socket module)
2026-03-26 14:57:24 +00:00
Taus
824d004a27 Python: Convert BindToAllInterfaces test to inline expectations 2026-03-26 14:56:57 +00:00
Taus
1ffcdc9293 Python: Select property instead of function
in PropertyInOldStyleClass. This matches the previous behaviour more
closely.
2026-03-23 14:55:28 +00:00
Taus
3584ad1905 Python: Port DeprecatedSliceMethod.ql
Only trivial test changes.
2026-03-20 13:30:29 +00:00
Taus
283231bdbc Python: Port ShouldBeContextManager.ql
Only trivial test changes.
2026-03-20 13:28:45 +00:00
Taus
8cfdea2001 Python: Port PropertyInOldStyleClass.ql
Only trivial test changes.
2026-03-20 13:28:45 +00:00
Taus
3d20050c0a Python: Port SlotsInOldStyleClass.ql
Only trivial test changes.
2026-03-20 13:28:45 +00:00
Taus
a99b3f2c3b Merge pull request #21459 from github/tausbn/python-fix-missing-relative-imports
Python: Fix resolution of relative imports from namespace packages
2026-03-16 14:59:44 +01:00