Commit Graph

14 Commits

Author SHA1 Message Date
Copilot
4ed5722e3e Python: switch dataflow library to new (shared) CFG + SSA
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-02 14:09:45 +00:00
Anders Schack-Mulligen
987d5712b8 Python: Accept qltest .expected file changes. 2024-05-22 15:43:49 +02:00
Anders Schack-Mulligen
088a0a54ba Python: Add empty provenance column to expected files. 2024-02-09 11:32:08 +01:00
Rasmus Lerchedahl Petersen
11c71fdd18 Python: remove EssaNodes
This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
  x = expr
  y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.

Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
2023-11-20 21:35:32 +01:00
Rasmus Wriedt Larsen
ca93f4d223 Python: Accept .expected changes 2023-08-11 10:36:05 +02:00
Anders Schack-Mulligen
ae24d68b5d C/C++/C#/Java/Python/Ruby/Swift: Adjust expected output. 2023-07-19 11:41:15 +02:00
Rasmus Wriedt Larsen
d73289ac4e Python: Accept .expected changes 2023-04-27 11:54:39 +02:00
Rasmus Wriedt Larsen
f3937a4a12 Python: Update .expected from PostUpdateNode commit 2023-03-30 10:17:33 +02:00
Rasmus Wriedt Larsen
86333e3ba5 Python: Remove duplicate results from azure blob query 2023-03-29 11:47:29 +02:00
Rasmus Wriedt Larsen
32d52c023e Python: Allow any order for azure blob query
By only allowing the sink in the state where encryption v1 is used, we
can handle the new case where the order of attribute assignment is
flipped.

However, we get a few too many paths because we can have multiple
sources reaching the same sink... let's fix in next commit.
2023-03-29 11:42:01 +02:00
Rasmus Wriedt Larsen
480f171d9b Python: Add azure blob tests with swapped order
Just shows we need to use some state in the query to get the correct
behavior.
2023-03-29 11:25:37 +02:00
Rasmus Wriedt Larsen
683985a00a Python: Expand azure blob modeling
Now we can differentiate between the classes
2023-03-29 11:24:36 +02:00
Rasmus Wriedt Larsen
8ea6b6f256 Python: Update py/azure-storage/unsafe-client-side-encryption-in-use to use datafow 2023-03-28 10:09:22 +02:00
Rasmus Wriedt Larsen
691ffcd3a4 Python: Add tests of py/azure-storage/unsafe-client-side-encryption-in-use
Notice that it doesn't find the potentially unsafe version, or the vuln that spans calls.
2023-03-28 10:05:09 +02:00