Commit Graph

988 Commits

Author SHA1 Message Date
yoff
92e03318ea Python: model exception edges for raise-prone expressions inside try/with
The new CFG previously only emitted exception edges for explicit `raise`
and `assert` statements. As a result, code that became reachable only
via the exception path of an arbitrary expression (e.g., the body of an
`except` handler following a try-body whose `call()` could raise) was
classified as dead, breaking analyses like StackTraceExposure,
FileNotAlwaysClosed, ExceptionInfo, UseOfExit, and CatchingBaseException.

This commit adds a `mayThrow` predicate over expressions that are known
sources of implicit exceptions in Python (calls, attribute access,
subscripts, arithmetic/comparison operators, imports, await/yield/yield
from) plus `from m import *` at the statement level, and routes them
through the shared CFG's `beginAbruptCompletion(_, _, ExceptionSuccessor,
always=false)` hook.

The set of exception sources is restricted to nodes that are
syntactically inside a `try`/`with` statement in the same scope.
This mirrors Java's `ControlFlowGraph::mayThrow`, which only emits
exception edges where local handling can observe them. Outside such
contexts, the edges add CFG complexity (weakening BarrierGuard
precision and breaking SSA continuity around augmented assignments and
subscript stores) without analysis benefit, since exceptions just
propagate to the function exit anyway.
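
As an illustration of the pattern described above, here is a minimal sketch
(function names are hypothetical, not taken from the test suite) in which
the `except` body is reachable only through the implicit exception edge of
a call inside the `try` body:

```python
def parse_port(text):
    # int() is a call, so it is an implicit exception source (ValueError).
    return int(text)


def read_port(text):
    try:
        port = parse_port(text)
    except ValueError:
        # Reachable only via the exception edge out of the call above.
        # Without such an edge, a CFG would classify this line as dead.
        return None
    return port
```

`read_port("8080")` takes the normal path, while `read_port("abc")` takes
the exception path into the handler.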

Net effect on the test suite: ~100 alerts restored across the exception-
related query tests (StackTraceExposure +29, ExceptionInfo +17,
FileNotAlwaysClosed +52, UseOfExit +1, CatchingBaseException restored)
with no precision regressions. Affected `.expected` files and the
regression-guard `dead_under_no_raise.py` are updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 13:46:51 +00:00
yoff
408ba6218f Python: switch dataflow library to new (shared) CFG + SSA
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.
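
A small sketch of the first two bullets, seen from the Python side (the
function names are hypothetical and chosen only for illustration): an
augmented assignment both reads and rebinds its target, and a star import
binds names that are not visible in the source text.

```python
def augmented(total, amount):
    # `total += amount` first loads the current value of `total`,
    # then stores the result back into the same name.
    total += amount
    return total


def star_import_names():
    # The names bound by a star import cannot be read off the syntax;
    # they are only known once the imported module is inspected.
    namespace = {}
    exec("from math import *", namespace)
    return namespace
```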

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 13:46:43 +00:00
Owen Mansel-Chan
451fc2e4e7 Undo conversion for queries that import LegacyPointsTo 2026-06-19 12:22:42 +01:00
Owen Mansel-Chan
5497f2c5fe Convert Python qlref tests to inline expectations 2026-06-19 12:22:40 +01:00
Owen Mansel-Chan
dd61dd2d74 Fix FP for py/modification-of-locals 2026-06-17 14:24:18 +01:00
Owen Mansel-Chan
47c2c9e763 Add test for FP for py/modification-of-locals 2026-06-17 14:22:42 +01:00
Owen Mansel-Chan
415857cacb Fix FP for py/should-use-with 2026-06-17 13:01:36 +01:00
Owen Mansel-Chan
d72144646a Add test for FP for py/should-use-with 2026-06-17 12:55:17 +01:00
Owen Mansel-Chan
199fd864ad Fix FP for py/file-not-closed 2026-06-17 12:36:04 +01:00
Owen Mansel-Chan
890969433f Add test for FP for py/file-not-closed 2026-06-17 12:19:03 +01:00
Owen Mansel-Chan
9c65082189 Fix MISSING alert 2026-06-15 00:14:52 +01:00
Owen Mansel-Chan
434a99447e Add thorough tests, including one MISSING alert 2026-06-12 13:45:02 +01:00
Owen Mansel-Chan
d389ea4039 Convert sql-injection test to inline expectations 2026-06-12 13:44:56 +01:00
Owen Mansel-Chan
20ce679d61 Accept changed edges in test output
No changes to alerts
2026-06-02 16:15:08 +01:00
Owen Mansel-Chan
f62ebef9e0 Adjust expected test output 2026-06-02 16:15:06 +01:00
Owen Mansel-Chan
e8779295ee Update test results 2026-05-22 11:43:18 +01:00
Rasmus Lerchedahl Petersen
3275c814bd Python: reset test expectations 2026-05-21 16:59:11 +01:00
Rasmus Lerchedahl Petersen
93e7ab52b7 Python: adjust test expectations
We now find an alert on this line, as we hoped.
It is not an alert for _full_ SSRF, though, since that configuration cannot handle multiple substitutions.
2026-05-21 16:58:51 +01:00
Rasmus Lerchedahl Petersen
b67694b2ab Python: Remove imprecise container steps
- remove `tupleStoreStep` and `dictStoreStep` from `containerStep`.
  These steps are imprecise compared to the precise content tracking.
- add implicit reads to recover taint at sinks
- add implicit read steps for decoders,
  supplementing the `AdditionalTaintStep`
  that now only covers the case where the full container is tainted.
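
To illustrate why whole-container steps are less precise than content
tracking, here is a runtime analogy in Python. It is only a sketch: the
`Tainted` marker type and the helper names are hypothetical and are not
part of the CodeQL library.

```python
class Tainted(str):
    """Hypothetical marker type standing in for attacker-controlled data."""


def source():
    return Tainted("user input")


def tainted_indices(container):
    # Element-precise view: report only the positions that hold tainted values,
    # rather than treating the whole container as tainted.
    return [i for i, item in enumerate(container) if isinstance(item, Tainted)]


pair = (source(), "constant")
```

Here only `pair[0]` is tainted. Treating the whole tuple as tainted would
also flag `pair[1]`.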
2026-05-21 16:57:44 +01:00
Geoffrey White
1c704a0912 Python: Accept test changes (improvement). 2026-05-07 10:28:19 +01:00
Taus
e3688444d7 Python: Also exclude class scope
Changing the `locals()` dictionary actually _does_ change the attributes
of the class being defined, so we shouldn't alert in this case.
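
A minimal sketch of the difference in CPython (the class and function
names are hypothetical): writing to `locals()` in a class body defines a
class attribute, whereas the same write in a function does not create a
local variable.

```python
class Settings:
    # In a class body, locals() is the namespace the class is built from,
    # so this write becomes a real class attribute.
    locals()["timeout"] = 30


def function_scope():
    # In a function, writing to the locals() dict does not create a local.
    locals()["injected_value"] = 1
    try:
        return injected_value  # looked up as a global, which does not exist
    except NameError:
        return None
```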
2026-04-07 23:46:03 +02:00
Taus
16683aee0e Merge pull request #21590 from github/tausbn/python-improve-bind-all-interfaces-query
Python: Improve "bind all interfaces" query
2026-04-07 17:59:48 +02:00
Taus
187f7c7bcf Python: Move isNetworkBind check into isSink 2026-03-27 22:45:26 +00:00
Taus
4f74d421b9 Python: Exclude AF_UNIX sockets from BindToAllInterfaces
Looking at the results of the previous DCA run, there were a number of
false positives where `bind` was being used with an `AF_UNIX` socket (a
filesystem path encoded as a string), not a `(host, port)` tuple. These
results should be excluded from the query, as they are not vulnerable.

Ideally, we would just add `.TupleElement[0]` to the MaD sink, except we
don't actually support this in Python MaD...

So, instead I opted for a more low-tech solution: check that the
argument in question flows from a tuple in the local scope.

This eliminates a bunch of false positives on `python/cpython` leaving
behind four true positive results.
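
The distinction between the two address shapes can be sketched at runtime
as follows. This is only an analogy for the static check described above:
both helper functions are hypothetical, and the actual query reasons about
data flow from tuple expressions rather than inspecting values.

```python
def is_host_port_tuple(address):
    """Hypothetical helper: True for AF_INET-style (host, port) addresses."""
    return isinstance(address, tuple) and len(address) >= 2


def binds_all_interfaces(address):
    # Only tuple addresses can name a network interface; a plain string
    # is an AF_UNIX filesystem path and is never a network bind.
    return is_host_port_tuple(address) and address[0] in ("", "0.0.0.0", "::")
```

With these definitions, `("0.0.0.0", 8080)` is reported, while
`"/tmp/app.sock"` and `("127.0.0.1", 8080)` are not.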
2026-03-27 16:55:10 +00:00
Taus
47d24632e6 Python: Port ShouldUseWithStatement.ql
Only trivial test changes.
2026-03-27 12:34:20 +00:00
Taus
c9832c330a Python: Convert BindToAllInterfaces to path-problem
Now that we're using global data-flow, we might as well make use of the
fact that we know where the source is.
2026-03-26 21:10:43 +00:00
Taus
c439fc5d45 Python: Replace type tracking with global data-flow
This takes care of most of the false negatives from the preceding
commit.

Additionally, we add models for some known wrappers of `socket.socket`
from the `gevent` and `eventlet` packages.
2026-03-26 15:35:33 +00:00
Taus
1ecd9e83b8 Python: Add test cases for BindToAllInterfaces FNs
Adds test cases from github/codeql#21582 demonstrating false negatives:
- Address stored in class attribute (`self.bind_addr`)
- `os.environ.get` with insecure default value
- `gevent.socket` (alternative socket module)
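
The first two patterns might look roughly like the sketch below. The class
name, the method name, and the `APP_BIND_HOST` environment variable are
assumptions made for illustration; they are not copied from the linked
issue or the test files.

```python
import os


class Server:
    def __init__(self):
        # The address is stored in an attribute and only used later,
        # so a purely local analysis does not see where it comes from.
        self.bind_addr = ("0.0.0.0", 8080)

    def address(self):
        return self.bind_addr


def host_from_env():
    # Insecure default: falls back to all interfaces when the variable is unset.
    return os.environ.get("APP_BIND_HOST", "0.0.0.0")
```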
2026-03-26 14:57:24 +00:00
Taus
824d004a27 Python: Convert BindToAllInterfaces test to inline expectations 2026-03-26 14:56:57 +00:00
Taus
3584ad1905 Python: Port DeprecatedSliceMethod.ql
Only trivial test changes.
2026-03-20 13:30:29 +00:00
Taus
283231bdbc Python: Port ShouldBeContextManager.ql
Only trivial test changes.
2026-03-20 13:28:45 +00:00
Owen Mansel-Chan
91b6801db1 py: Inline expectation should have space before $ 2026-03-04 13:11:38 +00:00
Owen Mansel-Chan
5a97348e78 python: Inline expectation should have space after $
This was a regex-find-replace from `# \$(?! )` (using a negative lookahead) to `# $ `.
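
For reference, the same substitution can be sketched with Python's `re`
module. The pattern and replacement are the ones quoted above; the function
name and the sample lines are hypothetical.

```python
import re

# "# $" that is not followed by a space (negative lookahead).
MISSING_SPACE = re.compile(r"# \$(?! )")


def normalise_expectation(line):
    # Insert the missing space after "$"; already-correct lines are unchanged.
    return MISSING_SPACE.sub("# $ ", line)
```

Because the lookahead rejects matches that already have the space, running
the replacement twice gives the same result as running it once.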
2026-03-04 12:45:05 +00:00
REDMOND\brodes
4d4e7a1b5c Pretty print for tests. 2026-02-12 08:28:08 -05:00
REDMOND\brodes
9f9c353806 Update expected files. Copilot suggestions broke unit test expected results (column numbers). 2026-02-10 11:47:23 -05:00
REDMOND\brodes
4bb110beb8 More copilot suggestions. 2026-02-10 11:46:16 -05:00
REDMOND\brodes
a91cf6b7cb Applying copilot PR suggestions. 2026-02-10 11:37:11 -05:00
Ben Rodes
9f8ed710e2 Update python/ql/test/query-tests/Security/CWE-918-ServerSideRequestForgery/test_path_validation.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-10 11:09:25 -05:00
REDMOND\brodes
f6c302b68c Removing commented out test cases. 2026-02-06 11:28:48 -05:00
REDMOND\brodes
97f19d03ad Updating test case expected alerts. 2026-02-06 11:20:13 -05:00
REDMOND\brodes
97ddab0724 Added support for new URIValidator in AntiSSRF library. Updated test cases to use postprocessing results. Results for partial SSRF still need work: the query currently flags cases where the URL is fully controlled but sanitized. I'm not sure yet whether these should be flagged. 2026-02-06 11:20:11 -05:00
Ben Rodes
08b72d0a86 Update python/ql/test/query-tests/Security/CWE-918-ServerSideRequestForgery/test_azure_client.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-06 11:18:51 -05:00
Ben Rodes
46a2a249f9 Update python/ql/test/query-tests/Security/CWE-918-ServerSideRequestForgery/test_azure_client.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-06 11:18:49 -05:00
REDMOND\brodes
9912aaaf1a Adding azure sdk test cases and updated test expected file. 2026-02-06 11:18:16 -05:00
REDMOND\brodes
0a88425170 Python: Altering SSRF MaD to use 'request-forgery' tag. Update to test cases expected results, off by one line. Changed to using ModelOutput::sinkNode. 2026-02-04 09:04:22 -05:00
Ben Rodes
7ddfa80399 Merge branch 'main' into azure_python_sdk_url_summary_upstream 2026-02-02 09:00:35 -05:00
Owen Mansel-Chan
ad6f800022 Pretty print model numbers in tests 2026-01-30 09:21:24 +00:00
yoff
3dbfb9fa4b python: add machinery for MaD barriers
and reinstate previously removed barrier
now as a MaD row
2026-01-22 17:30:24 +01:00
yoff
699ed50432 python: remove barrier that can be expressed in MaD 2026-01-22 17:30:24 +01:00
Taus
1b519384d7 Merge pull request #20739 from github/tausbn/python-remove-top-level-points-to-imports
Python: Hide points-to imports in `python.qll`
2025-12-05 14:24:41 +01:00