Commit Graph

650 Commits

Author SHA1 Message Date
yoff
893301098c Python: visit function parameter and return annotations in new CFG
The new (shared-CFG-based) Python control flow graph in
`semmle.python.controlflow.internal.Cfg` previously did not emit CFG
nodes for parameter type annotations (`def f(x: T): ...`) or for the
return type annotation (`-> T`). The legacy CFG emitted both, and a
small number of framework models rely on this: `LocalSources.qll`'s
`annotatedInstance` walks the parameter annotation expression by way
of its CFG node to track that a parameter receives an instance of the
annotated class.

After the dataflow flip to the new CFG/SSA this regression manifested
as lost flows in any test exercising annotation-based parameter
tracking: FastAPI `Depends()` receivers, Pydantic request bodies,
Starlette `WebSocket`, the call-graph type-annotation test, and so on.
Extend `FunctionDefExpr` to visit each annotation as a child of the
function-def expression, in CPython evaluation order: positional
parameter annotations, `*args` annotation, keyword-only parameter
annotations, `**kwargs` annotation, then the return annotation. (Lambda
expressions have no annotations in Python syntax, so `LambdaExpr` is
unchanged.) PEP 695 type parameters remain out of scope; they belong
to the inner annotation scope, not the enclosing CFG.

Restored test results across `framework/aiohttp`, `framework/fastapi`,
`framework/lxml`, the `CallGraph-type-annotations` test, and
`CWE-022-PathInjection`. Two FastAPI list-comprehension MISSING markers
become positive (`taint_test.py:41,55`). CPython CFG consistency
remains clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 13:47:04 +00:00
yoff
408ba6218f Python: switch dataflow library to new (shared) CFG + SSA
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-22 13:46:43 +00:00
Owen Mansel-Chan
5497f2c5fe Convert Python qlref tests to inline expectations 2026-06-19 12:22:40 +01:00
copilot-swe-agent[bot]
73bc2d70ae Model instance-attribute type flow
Use a field level step like JS and Ruby.
2026-06-11 14:48:55 +02:00
copilot-swe-agent[bot]
a4585d8d94 Add test documenting missing PEP249 alerts for connection stored in self attribute 2026-06-11 05:48:40 +00:00
Owen Mansel-Chan
b27d08ee32 Update edges in expected test output 2026-06-02 18:29:56 +01:00
Owen Mansel-Chan
f62ebef9e0 Adjust expected test output 2026-06-02 16:15:06 +01:00
Owen Mansel-Chan
ec13e1bcd3 Add wildcard ContentSets to avoid performance problems 2026-05-27 15:28:07 +01:00
Rasmus Lerchedahl Petersen
fa758d6bf5 python: fix test 2026-05-21 16:59:19 +01:00
Rasmus Lerchedahl Petersen
fa9426c749 Python: extra tests for comprehension 2026-05-21 16:59:18 +01:00
Rasmus Lerchedahl Petersen
b67694b2ab Python: Remove imprecise container steps
- remove `tupleStoreStep` and `dictStoreStep` from `containerStep`
   These are imprecise compared to the content being precise.
- add implicit reads to recover taint at sinks
- add implicit read steps for decoders
  to supplement the `AdditionalTaintStep`
  that now only covers when the full container is tainted.
2026-05-21 16:57:44 +01:00
Owen Mansel-Chan
aa28c94562 Remove double space after $ in inline expectations tests 2026-03-04 14:12:42 +00:00
Owen Mansel-Chan
91b6801db1 py: Inline expectation should have space before $ 2026-03-04 13:11:38 +00:00
Owen Mansel-Chan
5a97348e78 python: Inline expectation should have space after $
This was a regex-find-replace from `# \$(?! )` (using a negative lookahead) to `# $ `.
2026-03-04 12:45:05 +00:00
Taus
248932db7a Python: Fix frameworks/data/warnings.ql 2026-02-16 13:48:32 +00:00
Taus
ac5a74448f Python: Fix tests
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
2026-01-30 12:50:25 +00:00
Taus
34800d1519 Merge pull request #20945 from joefarebrother/python-websockets
Python: Model remote flow sources for the `websockets` library
2026-01-29 15:47:46 +01:00
yoff
3dbfb9fa4b python: add machinery for MaD barriers
and reinstate previously removed barrier
now as a MaD row
2026-01-22 17:30:24 +01:00
yoff
1ac3706e75 Python support ListElement in MaD 2026-01-09 13:08:06 +01:00
yoff
5c6d83ed65 Merge pull request #20877 from joefarebrother/python-tornado-websocket
Python: Add models for websocket handlers for Tornado
2025-12-09 10:08:59 +01:00
Joe Farebrother
ac55cf9544 Update test and qldoc 2025-12-01 20:41:59 +00:00
Joe Farebrother
7cf3964e44 Update expectations 2025-12-01 20:27:48 +00:00
Joe Farebrother
384e17a4ef Implement websockets models 2025-12-01 16:24:59 +00:00
Joe Farebrother
16018e91a2 Minor test fix 2025-11-26 15:47:56 +00:00
Joe Farebrother
eb7fe71557 Fix namespace instances and update tests 2025-11-26 10:51:16 +00:00
Joe Farebrother
83eadbad60 Add namespace models 2025-11-25 16:56:36 +00:00
Joe Farebrother
b0be8184ac Add taint test 2025-11-24 16:54:21 +00:00
Joe Farebrother
dada49f402 Fix qldoc and tests 2025-11-24 13:57:43 +00:00
Joe Farebrother
a83c70f99d Add tests 2025-11-24 11:03:16 +00:00
Joe Farebrother
cdc44c3267 Model tornado websockets 2025-11-20 10:49:30 +00:00
Taus
f89fae39c5 Merge pull request #20276 from github/tausbn/python-model-psycopg2-connection-pools
Python: Add support for Psycopg2 database connection pools
2025-08-29 13:52:59 +02:00
Taus
1008ca9744 Python: Add psycopg2.pool tests 2025-08-25 14:14:16 +00:00
Napalys Klicius
638f6498f0 Removed lxml.etree.XMLParser from xml bomb sinks 2025-07-15 13:43:00 +02:00
Sylwia Budzynska
55c70a4cae Fix nitpicks 2025-05-27 13:44:21 +02:00
Sylwia Budzynska
84228e0ec8 Add Pandas SQLi sinks 2025-05-27 13:10:39 +02:00
Napalys Klicius
f652686607 Merge pull request #19444 from Napalys/python/hdbcli
Python: modeling of `hdbcli`
2025-05-01 17:58:31 +02:00
Napalys Klicius
e1fc0ca051 Added implementation hdbcli as part of PEP249::PEP249ModuleApiNode 2025-05-01 14:18:02 +02:00
Napalys Klicius
0325f368fe Added test case for hdbcli 2025-05-01 13:57:14 +02:00
yoff
531f2a15a4 python: model send_header from http.server 2025-04-30 19:58:14 +02:00
Joe Farebrother
a7fb73a2b2 Merge pull request #18185 from joefarebrother/python-lxml
Python: Model additional flow steps for the lxml framework
2025-01-10 13:40:16 +00:00
Joe Farebrother
35961e454b Fix tests to check for the correct type 2025-01-07 15:23:07 +00:00
Rasmus Wriedt Larsen
34631a8784 Python: Model FastAPI requests
Co-authored-by: Joe Farebrother <joefarebrother@github.com>
2024-12-18 15:58:51 +01:00
Rasmus Wriedt Larsen
79dfbf7b21 Python: Add FastAPI request test
Co-authored-by: Joe Farebrother <joefarebrother@github.com>
2024-12-18 15:48:29 +01:00
Joe Farebrother
dcbcf7e2bd Add additional tests demonstrating false negative flow 2024-12-12 15:55:36 +00:00
Michael Nebel
2321ca59f6 Python: Update all test util paths to point to the new location. 2024-12-12 13:54:30 +01:00
Joe Farebrother
2019ddfa7f Qldoc improvements + add a few extra tests 2024-12-11 12:25:40 +00:00
Joe Farebrother
bcb08bbc7b Update test output 2024-12-10 19:24:05 +00:00
Joe Farebrother
29a90235e8 Improve tests and use API graphs 2024-12-10 19:09:45 +00:00
Joe Farebrother
d2ed92d6d0 Added tests 2024-12-10 19:09:20 +00:00
Joe Farebrother
f82fa20249 Update test outputs 2024-12-09 20:37:11 +00:00