Commit Graph

33 Commits

Author SHA1 Message Date
Copilot
4ed5722e3e Python: switch dataflow library to new (shared) CFG + SSA
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-02 14:09:45 +00:00
Owen Mansel-Chan
91b6801db1 py: Inline expectation should have space before $ 2026-03-04 13:11:38 +00:00
Michael Nebel
2321ca59f6 Python: Update all test util paths to point to the new location. 2024-12-12 13:54:30 +01:00
Jeroen Ketema
c3ea883b11 Python: Update expected test results 2024-12-03 19:18:57 +01:00
Taus
f1392712ee Python: Add .copy() as a copy step 2024-02-22 13:09:27 +00:00
Taus
5125973f9b Python: Add test case for .copy() as a copy step 2024-02-22 13:01:03 +00:00
Anders Schack-Mulligen
088a0a54ba Python: Add empty provenance column to expected files. 2024-02-09 11:32:08 +01:00
Taus
d6d59377d3 Python: Fix flow through deepcopy
Or, more generally, any copy step, as these presumably do not preserve
object identity.

(Arguably, `copy` could still be susceptible to interior mutability, but
I think that's outside the scope of this query anyway.)
2024-01-22 15:40:30 +00:00
Taus
14c958ac4d Python: Remove mutable default sources from inside stdlib 2024-01-22 15:23:52 +00:00
Taus
411c107660 Python: Add tests for deepcopy FPs
There are two issues with `deepcopy` here. Firstly, the `deepcopy` function itself
has a mutable default value in its parameter `_nil` (set to the empty list by default).

Now, this value is never actually returned from `deepcopy`, as it is only used as a
sentinel, but our analysis is not clever enough to see this. Thus, it thinks that this
mutable default is returned, and hence the result of any call to `deepcopy` is a
potential source.

To remedy this, I opted to simply exclude all sources that originate from within the
standard library. It is very unlikely for any of the sources in the standard library
to be legit.

Secondly, `deepcopy` -- by virtue of being a function that we model as preserving
values -- admits data-flow through its calls, but this is not correct for the mutable
default query, as it is here the _identity_ of the default value in question that is
important. Thus, we get spurious flow through `deepcopy` for this specific query.
2024-01-22 15:21:57 +00:00
Taus
4742481070 Python: Consolidate "mutable default" tests
Moves the existing tests into the `ModificationOfParameterWithDefault` subdirectory
which already contained a bunch more tests. In the process, I also removed some
duplicated test cases.
2024-01-22 13:50:33 +00:00
Rasmus Wriedt Larsen
55f5b26ba6 Python: Accept new ordering of query predicates in .expected 2023-11-15 10:09:54 +01:00
Rasmus Wriedt Larsen
ce6335866b Python: Move ModificationOfParameterWithDefault to new dataflow API 2023-08-28 16:19:47 +02:00
Jeroen Ketema
8f599faf85 Python: Rewrite inline expectation tests to use parameterized module 2023-06-09 10:42:29 +02:00
Kasper Svendsen
d9f29a85d6 Python: Enable implicit this warnings 2023-05-04 10:16:52 +02:00
erik-krogh
944ca4a0da fix some more style-guide violations in the alert-messages 2022-10-07 11:23:34 +02:00
erik-krogh
0de0325c8e change the alert-message for py/modification-of-default-value 2022-09-05 13:30:56 +02:00
erik-krogh
089ce5a8a4 change alert messages of path queries to use the same template 2022-09-02 14:45:40 +02:00
Rasmus Lerchedahl Petersen
2eb11731e2 Python: Subpaths in test output 2021-09-10 14:04:57 +02:00
Rasmus Lerchedahl Petersen
7cfa08abc8 Python: Do not use BarrierGuards
They are simply not right for this problem.
We should not even make them available as an extension point.
2021-09-10 12:48:24 +02:00
Rasmus Lerchedahl Petersen
b20232db3c Python: Simplify guards as suggested 2021-09-10 10:31:48 +02:00
Rasmus Lerchedahl Petersen
b48caaf465 Python: fix reference to PrintNode.qll 2021-09-07 10:19:42 +02:00
Rasmus Lerchedahl Petersen
29cb067769 Python: Remember to update test expectations 2021-09-07 10:13:17 +02:00
Rasmus Lerchedahl Petersen
4998a48f99 Python: Fix simple guards 2021-09-06 22:40:30 +02:00
Rasmus Lerchedahl Petersen
913990bc62 Python: Add suggested comments and test case 2021-09-03 14:40:16 +02:00
yoff
c6eb795e76 Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2021-09-03 14:23:57 +02:00
Rasmus Lerchedahl Petersen
a762373ad6 Python: Implement simple barrier guard
The one found in the original test case
2021-08-30 11:04:27 +02:00
Rasmus Lerchedahl Petersen
49ae549e89 Python: Implement modifying syntax 2021-08-26 14:29:18 +02:00
Rasmus Lerchedahl Petersen
097c23e437 Python: add inline expectations test
Consider removing the original test
2021-08-26 14:08:52 +02:00
Rasmus Lerchedahl Petersen
d834cec9b9 Python: test simple sanitizer 2021-08-26 11:31:20 +02:00
Rasmus Lerchedahl Petersen
8614563b42 Python: More tests of syntactic constructs 2021-08-26 10:56:41 +02:00
Rasmus Lerchedahl Petersen
e865a290de Python: straight port of query
The old query uses `pointsTo` to limit the sinks
to methods on lists and dictionaries.
That constraint is omitted here which could hurt performance.
2021-08-24 16:35:11 +02:00
Rasmus Lerchedahl Petersen
e3765ced78 Python: Add tests for modification of defaults 2021-08-24 16:35:11 +02:00