codeql

mirror of https://github.com/github/codeql.git synced 2026-06-03 04:40:14 +02:00

Author	SHA1	Message	Date
Copilot	4ed5722e3e	Python: switch dataflow library to new (shared) CFG + SSA Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll) and legacy ESSA SSA (semmle/python/essa/) to the new shared CFG facade (semmle.python.controlflow.internal.Cfg) and the new SSA adapter (semmle.python.dataflow.new.internal.SsaImpl), both introduced additively in the preceding PRs in this stack. This is the trunk-flip equivalent of the original draft PR #21894 (kept around as documentation), rebased on top of the four preparatory PRs: P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919). P2: Qualify Flow.qll's AST references with Py:: prefix (#21920). P3: Add new shared-CFG-backed control flow graph (#21921). P4: Add new shared-SSA-backed SSA adapter (#21923). The Python dataflow library (semmle/python/dataflow/new/) now imports the new CFG facade and SSA adapter. All CFG-typed predicates (ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are qualified with the Cfg:: prefix; SSA references switch from EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable. GuardNode is redesigned to use the new CFG's outcome-node model (isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock + flipped indirection. Only BarrierGuard<...> is preserved as public API. Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib, ...) are updated to take CFG nodes from the new facade. A handful of dataflow consistency tweaks for the new CFG: - Augmented-assignment targets are treated as both load and store. - 'from X import *' produces uncertain SSA writes for unknown names. - CFG nodes are canonicalised so dataflow does not see equivalent pre/post-order pairs as distinct nodes. Two AST tweaks for the new CFG: - AstNodeImpl: omit PEP 695 type-parameter names from FunctionDefExpr / ClassDefExpr children. - ImportResolution: drop the legacy essa import. Test churn (~175 files): reblessed library- and query-test .expected files reflect slightly different CFG granularity, different toString output, and a handful of true alert deltas in security queries. Verification: all 367 lib + src + consistency-queries compile clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-02 14:09:45 +00:00
Copilot	b56488c982	Python: add new shared-SSA-backed SSA adapter Preparatory refactor for the shared-CFG dataflow migration. Adds the new Python SSA adapter additively, without changing any production behaviour. Library additions: - semmle.python.dataflow.new.internal.SsaImpl — Python SSA implementation built on the new (shared) CFG. Mirrors the Java SSA adapter (java/ql/lib/semmle/code/java/dataflow/internal/SsaImpl.qll): an InputSig is defined in terms of positional (BasicBlock, int) variable references, and the shared codeql.ssa.Ssa::Make<Location, Cfg, Input> module is then instantiated. SourceVariable is the AST-level Py::Variable. Variable references are looked up via the new CFG facade's NameNode.defines/uses/deletes predicates (added in the preceding PR), which themselves are one-line bridges to AST-level Name.defines/uses/deletes. Implicit-entry definitions are inserted for non-local/global/builtin reads, captured variables, and (when needed) parameters. Test additions: - library-tests/dataflow-new-ssa/ — exercises the new SSA over a representative test corpus and checks expected def/use chains. - library-tests/dataflow-new-ssa-vs-legacy/ — runs both new SSA and legacy ESSA over the same corpus and diffs the results, so any semantic divergence shows up as a test failure. Production impact: None. The new SSA adapter has zero callers in lib/ and src/ — the legacy ESSA SSA (semmle/python/essa/*) remains the default. The dataflow library is not migrated yet; that lands in a follow-up PR. Verified by: - All 367 lib + src + consistency-queries compile clean. - All 641 ControlFlow + PointsTo + dataflow + essa + consistency library-tests pass. - Both new dataflow-new-ssa[/vs-legacy] test packs pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-02 14:09:41 +00:00
Copilot	e68f2fd717	Python: add new shared-CFG-backed control flow graph Preparatory refactor for the shared-CFG dataflow migration. Adds the new Python CFG library additively, without changing any production behaviour. Library additions: - semmle.python.controlflow.internal.AstNodeImpl — mediates between the Python AST and the shared codeql.controlflow.ControlFlowGraph signature. Wraps Python's Stmt/Expr/Scope/Pattern and adds two synthetic kinds of node (BlockStmt for body slots, intermediate nodes for multi-operand boolean expressions). - semmle.python.controlflow.internal.Cfg — public facade re-exposing the same API surface as semmle/python/Flow.qll (ControlFlowNode, CallNode, BasicBlock, NameNode, DefinitionNode, CompareNode, ...), backed by the shared CFG. - lib/printCfgNew.ql — debug/visualisation query for the new CFG. - consistency-queries/CfgConsistency.ql — consistency query running the shared CFG's standard checks against Python. Shared library: - shared.controlflow.ControlFlowGraph — adds two defaulted getWhileElse / getForeachElse predicates to AstSig so Python can model while-else / for-else (no behavioural change for other languages). Test additions: - ControlFlow/bindings/* — annotation-driven SSA-binding tests for the new CFG (annassign, compound, comprehension, decorated, except_handler, imports, match_pattern, parameters, simple, type_params, walrus_starred, with_stmt, dead_under_no_raise). - ControlFlow/store-load/* — basic store/load coverage. - ControlFlow/evaluation-order/NewCfg*.ql — mirrors of the existing OldCfg evaluation-order self-validation suite, run against the new CFG via NewCfgImpl.qll. - Minor extensions to existing test_if.py / test_boolean.py + cosmetic .expected churn on a handful of OldCfg tests. No dataflow, SSA, or production query is migrated yet — that lands in follow-up PRs. The new CFG library has zero callers in lib/ and src/. Verified by: - All lib + src + consistency-queries compile clean (367 queries). - All 56 ControlFlow library-tests pass. - All 474 dataflow + PointsTo library-tests + consistency tests pass. - syntax_error/CONSISTENCY/CfgConsistency passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-02 14:09:28 +00:00
Copilot	a13dfaa44f	Python: deprecate AstNode.getAFlowNode() and rewrite internal callers Preparatory refactor for the shared-CFG dataflow migration. Deprecates the AstNode.getAFlowNode() cached predicate on the public Python QL API and rewrites all ~140 internal callers across lib/, src/, test/, and tools/ from `expr.getAFlowNode() = cfgNode` to `cfgNode.getNode() = expr`, using ControlFlowNode.getNode() which already exists in Flow.qll. The predicate itself is preserved (with a deprecation note pointing at the new pattern) so external users do not experience churn — they can migrate at their own pace and the AST/CFG hierarchies still get the intended untangling once the deprecation eventually elapses. Semantic noop verified by: - All 361 lib/ + src/ queries compile clean. - All 122 ControlFlow + PointsTo library-tests pass. - All 64 dataflow library-tests pass. - All 113 Variables/Exceptions/Expressions/Statements/Functions/Imports/ Security/CWE-798/ModificationOfParameterWithDefault query-tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-02 08:37:30 +00:00
Taus	35faec3db1	Python: Address review comments - Get rid of unnecessary parentheses - Use call syntax in the relevant test - Get rid of `dead(2)` annotation	2026-05-27 15:27:19 +00:00
Taus	1ef557c972	Python: Address Copilot's comments	2026-05-12 15:27:14 +00:00
Taus	f5c3b63a4a	Python: Add ConsecutiveTimestamps test This one is potentially a bit iffy -- it checks for a very powerful property (that implies many of the other queries), but as the test results show, it can produce false positives when there is in fact no problem. We may want to get rid of it entirely, if it becomes too noisy.	2026-05-12 12:54:26 +00:00
Taus	c30d6ae3aa	Python: Add NeverReachable test This looks for nodes annotated with `t[never]` in the test that are reachable in the CFG. This should not happen (it messes with various queries, e.g. the "mixed returns" query), but the test shows that in a few particular cases (involving the `match` statement where all cases contain `return`s), we _do_ have reachable nodes that shouldn't be.	2026-05-12 12:54:26 +00:00
Taus	fc2bc26f36	Python: Add BasicBlockOrdering test This one demonstrates a bug in the current CFG. In a dictionary comprehension `{k: v for k, v in d.items()}`, we evaluate the value before the key, which is incorrect. (A fix for this bug has been implemented in a separate PR.)	2026-05-12 12:54:25 +00:00
Taus	3a979ac2f8	Python: Add some CFG-validation queries These use the annotated, self-verifying test files to check various consistency requirements. Some of these may be expressing the same thing in different ways, but it's fairly cheap to keep them around, so I have not attempted to produce a minimal set of queries for this.	2026-05-12 12:54:25 +00:00
Taus	71cd5be513	Python: Add self-validating CFG tests These tests consist of various Python constructions (hopefully a somewhat comprehensive set) with specific timestamp annotations scattered throughout. When the tests are run using the Python 3 interpreter, these annotations are checked and compared to the "current timestamp" to see that they are in agreement. This is what makes the tests "self-validating". There are a few different kinds of annotations: the basic `t[4]` style (meaning this is executed at timestamp 4), the `t[dead(4)]` variant (meaning this _would_ happen at timestamp 4, but it is in a dead branch), and `t[never]` (meaning this is never executed at all). In addition to this, there is a query, MissingAnnotations, which checks whether we have applied these annotations maximally. Many expression nodes are not actually annotatable, so there is a sizeable list of excluded nodes for that query.	2026-05-12 12:42:29 +00:00
Taus	ac23e16786	Python: Move Python 3.15 data-flow tests to a separate file We won't be able to run these tests until Python 3.15 is actually out (and our CI is using it), so it seemed easiest to just put them in their own test directory.	2026-04-17 13:16:46 +00:00
Taus	dc36609743	Python: Add data-flow tests Alas, all these demonstrate is that we already don't fully support the desugared `yield from` form.	2026-04-17 12:15:04 +00:00
Taus	8b1ecf05c9	Python: Update test output This change reflects the `(value, key)` to `(key, value)` fix in an earlier commit.	2026-04-14 13:27:31 +02:00
Taus	fa61f6f3df	Python: Model `@typing.overload` in method resolution Adds `hasOverloadDecorator` as a predicate on functions. It looks for decorators called `overload` or `something.overload` (usually `typing.overload` or `t.overload`). These are then filtered out in the predicates that (approximately) resolve methods according to the MRO. As the test introduced in the previous commit shows, this removes the spurious resolutions we had before.	2026-03-05 22:20:03 +00:00
Taus	0561a63003	Python: Add test for overloaded `__init__` resolution Adds a test showing that `@typing.overload` stubs are spuriously resolved as call targets alongside the actual `__init__` implementation.	2026-03-05 22:20:03 +00:00
Owen Mansel-Chan	99a4fe4828	Update expected test output column numbers	2026-03-04 15:02:53 +00:00
Owen Mansel-Chan	aa28c94562	Remove double space after $ in inline expectations tests	2026-03-04 14:12:42 +00:00
Owen Mansel-Chan	91b6801db1	py: Inline expectation should have space before $	2026-03-04 13:11:38 +00:00
Owen Mansel-Chan	5a97348e78	python: Inline expectation should have space after $ This was a regex-find-replace from `# \$(?! )` (using a negative lookahead) to `# $ `.	2026-03-04 12:45:05 +00:00
yoff	600f585a31	Merge pull request #21296 from yoff/python/bool-comparison-guards Python: Handle guards being compared to boolean literals	2026-02-26 21:13:51 +01:00
Taus	6bfb1e1fae	Merge pull request #21344 from github/tausbn/python-remove-points-to-from-metrics-libraries Python: Remove points-to from metrics library	2026-02-24 15:55:16 +01:00
yoff	7351e82c92	python: handle guards compared to boolean literals	2026-02-24 10:00:22 +01:00
yoff	8488039fb9	python: add tests for guards compared to booleans	2026-02-24 10:00:21 +01:00
Taus	e8de8433f4	Python: Update all metrics-dependent queries The ones that no longer require points-to no longer import `LegacyPointsTo`. The ones that do use the specific `...MetricsWithPointsTo` classes that are applicable.	2026-02-19 12:32:27 +00:00
Taus	248932db7a	Python: Fix `frameworks/data/warnings.ql`	2026-02-16 13:48:32 +00:00
Taus	df0f2f8ce4	Python: Simple dataflow annotations None of these required any changes to the dataflow libraries, so it seemed easiest to put them in their own commit.	2026-02-16 13:48:32 +00:00
Taus	958c798c3f	Python: Accept dataflow test changes New nodes means new results. Luckily we rarely have a test that selects _all_ dataflow nodes.	2026-01-30 12:50:25 +00:00
Taus	ac5a74448f	Python: Fix tests With `ModuleVariableNode`s now appearing for _all_ global variables (not just the ones that actually seem to be used), some of the tests changed a bit. Mostly this was in the form of new flow (because of new nodes that popped into existence). For some inline expectation tests, I opted to instead exclude these results, as there was no suitable location to annotate. For the normal tests, I just accepted the output (after having vetted it carefully, of course).	2026-01-30 12:50:25 +00:00
Taus	34800d1519	Merge pull request #20945 from joefarebrother/python-websockets Python: Model remote flow sources for the `websockets` library	2026-01-29 15:47:46 +01:00
Tom Hvitved	b974a84bef	Merge pull request #21051 from hvitved/shared/flow-summary-provenance-filtering Shared: Provenance-based filtering of flow summaries	2026-01-26 17:24:34 +01:00
Tom Hvitved	0adece7cde	Python: Adapt to changes in `FlowSummaryImpl`	2026-01-26 12:40:19 +01:00
yoff	3dbfb9fa4b	python: add machinery for MaD barriers and reinstate previously removed barrier now as a MaD row	2026-01-22 17:30:24 +01:00
yoff	1ac3706e75	Python support `ListElement` in MaD	2026-01-09 13:08:06 +01:00
yoff	5c6d83ed65	Merge pull request #20877 from joefarebrother/python-tornado-websocket Python: Add models for websocket handlers for Tornado	2025-12-09 10:08:59 +01:00
Taus	1b519384d7	Merge pull request #20739 from github/tausbn/python-remove-top-level-points-to-imports Python: Hide points-to imports in `python.qll`	2025-12-05 14:24:41 +01:00
Joe Farebrother	ac55cf9544	Update test and qldoc	2025-12-01 20:41:59 +00:00
Joe Farebrother	7cf3964e44	Update expectations	2025-12-01 20:27:48 +00:00
Joe Farebrother	384e17a4ef	Implement websockets models	2025-12-01 16:24:59 +00:00
Taus	24a29f46be	Python: Fix all metrics-related compilation failures In hindsight, having a `.getMetrics()` method that just returns `this` is somewhat weird. It's possible that it predates the existence of the inline cast, however.	2025-11-26 21:28:51 +00:00
Taus	cd1619b43e	Python: Fix queries and tests	2025-11-26 17:06:55 +00:00
Joe Farebrother	16018e91a2	Minor test fix	2025-11-26 15:47:56 +00:00
Taus	9dc774aaa3	Python: Remove points-to dependency from parts of SSA For whatever reason, the CFG node for exceptions and exception groups was placed with the points-to code. (Probably because a lot of the predicates depended on points-to.) However, as it turned out, two of the SSA modules only depended on non-points-to properties of these nodes, and so it was fairly straightforward to remove the imports of `LegacyPointsTo` for those modules. In the process, I moved the aforementioned CFG node types into `Flow.qll`, and changed the classes in the `Exceptions` module to the `...WithPointsTo` form that we introduced elsewhere.	2025-11-26 12:30:31 +00:00
Taus	e09840426c	Python: Get rid of points-to from `Definitions.qll` Turns out the `ImportTime` module (despite living in `semmle.python.types`) does not actually depend on points-to, so some of the `LegacyPointsTo` imports could be replaced or removed.	2025-11-26 12:30:31 +00:00
Taus	7176898503	Python: Fix library tests	2025-11-26 12:30:31 +00:00
Taus	f0465f441f	Python: Get rid of some `get...Object` methods This frees `Class.qll`, `Exprs.qll`, and `Function.qll` from the clutches of points-to. For the somewhat complicated setup with `getLiteralObject` (an abstract method), I opted for a slightly ugly but workable solution of just defining a predicate on `ImmutableLiteral` that inlines each predicate body, special-cased to the specific instance to which it applies.	2025-11-26 12:30:30 +00:00
Joe Farebrother	eb7fe71557	Fix namespace instances and update tests	2025-11-26 10:51:16 +00:00
Joe Farebrother	83eadbad60	Add namespace models	2025-11-25 16:56:36 +00:00
Joe Farebrother	b0be8184ac	Add taint test	2025-11-24 16:54:21 +00:00
Joe Farebrother	dada49f402	Fix qldoc and tests	2025-11-24 13:57:43 +00:00
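
The self-validating CFG tests described in commit 71cd5be513 rely on timestamp annotations (`t[4]`, `t[dead(4)]`, `t[never]`) that are checked when the test file is run under the Python 3 interpreter. The following is only a minimal sketch of how such a checker might work; it is not the implementation in the repository. The `Timer` class, the `dead` and `never` helpers, and the use of the `@` operator to attach an annotation to an expression are all assumptions made for illustration. The last statement also exercises the evaluation order discussed in commit fc2bc26f36: on Python 3.8 and later, a dictionary comprehension evaluates the key before the value.

```python
class dead:
    """Hypothetical marker: the annotated code sits in a dead branch."""

    def __init__(self, timestamp):
        self.timestamp = timestamp


# Hypothetical marker: the annotated code must never be executed.
never = object()


class Timer:
    """Hypothetical clock that checks annotations as they are evaluated."""

    def __init__(self):
        self.now = 0
        self.errors = []

    def __getitem__(self, expected):
        if expected is never:
            self.errors.append("reached code annotated as never executed")
        elif isinstance(expected, dead):
            self.errors.append(
                f"reached dead code annotated dead({expected.timestamp})"
            )
        else:
            if expected != self.now:
                self.errors.append(
                    f"expected timestamp {expected}, clock is at {self.now}"
                )
            # Every live annotation advances the clock by one tick.
            self.now += 1
        return self

    def __rmatmul__(self, value):
        # `value @ t[n]` evaluates `value`, then `t[n]`, then yields `value`.
        return value


def run_example():
    t = Timer()
    x = 1 @ t[0]
    y = (x + 1) @ t[1]
    if y > 5:
        # Never reached at run time, so the annotation is never evaluated.
        y = 0 @ t[dead(2)]
    pairs = {"a": 1}
    # Iterable first, then key, then value (Python 3.8+).
    copied = {k @ t[3]: v @ t[4] for k, v in pairs.items() @ t[2]}
    return t, copied
```

Calling `run_example()` under these assumptions leaves `errors` empty, because every live annotation matches the clock at the moment it is evaluated. A wrongly numbered annotation, or a `dead`/`never` annotation that is actually reached, would be recorded as an error instead.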

1324 Commits
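
Commit 5a97348e78 describes a regex find-and-replace from `# \$(?! )` to `# $ `, which inserts a space after the `$` of an inline expectation comment when one is missing. A small sketch of the same substitution using Python's `re` module is shown here; the function name `normalize_expectation` and the sample lines are hypothetical, and the original change may well have been made with an editor rather than a script.

```python
import re

# The pattern from the commit message: "# $" not already followed by a space.
MISSING_SPACE = re.compile(r"# \$(?! )")


def normalize_expectation(line: str) -> str:
    """Insert a space after '$' in an inline expectation comment if absent."""
    return MISSING_SPACE.sub("# $ ", line)
```

The negative lookahead keeps the substitution idempotent: a line that already has the space after `$` does not match and is returned unchanged.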