Commit Graph

73 Commits

Author SHA1 Message Date
yoff
b6f1421f3c Python: model from X import * as uncertain SSA writes
Add a 4th disjunct to `SsaImplInput::variableWrite` in the shared-SSA
adapter that mirrors legacy ESSA's `ImportStarRefinement`: every
variable whose scope is the import-star's scope, OR which is used in
the import-star's scope, gets an uncertain write at the `import *`
position.

Uncertain writes do not kill prior definitions; shared SSA's
`SsaUncertainWrite` joins the new value with the immediately-preceding
definition via `uncertainWriteDefinitionInput`. This is the equivalent
of legacy ESSA's two-input refinement.

Cannot depend on `ImportStar` / `ImportResolution` (those modules
import `SsaImpl`), so the predicate uses the structural heuristic on
`Cfg::ImportStarNode` directly.

This closes the two remaining failing dataflow library-tests:

- `import-star/global` — `module_export` chains via `from X import *`
  re-exports now resolve: the importing module has an SSA def of every
  re-exported name, so `lastUseVar` finds the read at the use site.
- `typetracking_imports/highlight_problem` — a direct `from .foo import
  foo` immediately followed by `from .other import *` is now correctly
  marked as dead at the direct import.

Two scope-entry-def noise rows in `highlight_problem.expected` are also
dropped — legacy ESSA needed them as refinement inputs, but shared SSA
handles uncertain writes without an explicit prior def. They were
always tagged `no use to normal exit` (dead).

Dataflow library-tests: 62/64 → 64/64 passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:44 +00:00
yoff
1a2be46cb5 Python: update dataflow tests for new CFG + shared SSA
Test-side changes accompanying the dataflow migration:

  * Test queries (.ql) and shared test harness (TestSummaries,
    TestTaintLib) qualify CFG / SSA types with Cfg:: / SsaImpl::,
    bridge via AST (Name, Call, ...) instead of legacy NameNode /
    CallNode, and switch GlobalSsaVariable / EssaVariable usages
    to the new adapter API.

  * .expected files updated for legitimate precision and toString
    changes:
      - phi-node def-use edges newly exposed in def_use_counts.
      - scope-exit synthetic use surfaces one extra implicit use
        in use-use-counts.
      - For [empty]/[non-empty] outcome rows added in
        EnclosingCallable.
      - SsaSourceVariable / Global Variable label cosmetics
        normalised throughout.

  * Inline annotations:
      - typetracking/test.py: removed MISSING:tracked on lines
        93/95 (now found), added SPURIOUS:tracked on line 108
        (decorator over-reach).
      - global-flow/test.py: added SPURIOUS writes=g_mod on line
        20 (correctly reports immediately-overwritten write).
      - tainttracking/customSanitizer/test.py: marked
        try/except: ensure_tainted(s) cases as MISSING: tainted
        (no-raise CFG abstraction does not connect try body to
        except body).
      - coverage/test.py: marked
        SINK(return_from_inner_scope([])) as
        MISSING: flow=... pending closer investigation.

  * regression/{dataflow,custom_dataflow}.expected: accept two
    if/else cond-correlation over-reaches (documented limitation;
    same imprecision applies under legacy semantics by design).

After this change the dataflow library-tests stand at 62 of 64
passing; the two remaining failures are tracked under the
ImportStarRefinement workstream.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:44 +00:00
yoff
26df3945d3 Python: migrate dataflow library to new CFG + shared SSA
Switches the trunk dataflow library and all in-tree consumers
(frameworks, ApiGraphs, Concepts, regexp, security customisations,
test harness) from the legacy Flow.qll/ESSA stack to the new
shared-CFG facade (Cfg.qll) and the ESSA-shaped adapter on the
shared-SSA library (SsaImpl.qll).

Highlights:

  * DataFlowPublic/Private/Dispatch, Attributes, VariableCapture,
    IterableUnpacking, ImportResolution, ImportStar, LocalSources,
    TaintTrackingPrivate, MatchUnpacking, TypeTrackingImpl,
    SsaImpl, Builtins all now qualify CFG/SSA references with
    Cfg:: / SsaImpl:: and stop pulling in semmle.python.essa.*.

  * AstNodeImpl.qll/Cfg.qll: ImportMember exposes its inner
    ImportExpr, DefinitionNode.getValue covers Alias / AnnAssign /
    AugAssign / AssignExpr / For-target / Parameter-default,
    ForNode is treated as an expression node, AnnotatedExitNode is
    canonical, and BoolExprNode.getAnOperand drops the dominance
    constraint that did not hold for short-circuit BBs.

  * SsaImpl.qll: parameters always get a ParameterDefinition (so
    unused parameters still have SSA defs), scope-entry defs for
    module globals require an actual store somewhere, scope-exit
    has a synthetic use so reaching-defs survives to module
    boundary, and the legacy SsaSourceVariable / EssaVariable
    surface (getName, getScope, getAUse, getASourceUse,
    getAnImplicitUse) is reinstated for downstream queries.

  * DataFlowPublic.qll: GuardNode redesigned around the new
    structural outcome nodes (isAfterTrue / isAfterFalse).  The
    legacy ConditionBlock + flipped indirection is gone;
    controlsBlock walks UP through 'not' / '==True' / 'is False'
    etc. via outcomeOfGuard, accumulating polarity cleanly.  Only
    BarrierGuard<...> is preserved as public API.

  * ModuleVariableNode.getAWrite and LocalFlow::definitionFlowStep
    bypass SSA and consult Cfg::NameNode.defines /
    Cfg::DefinitionNode.getValue directly, so that write defs
    pruned by shared SSA (because the variable has no in-scope
    read) still produce dataflow steps.

  * Frameworks + downstream consumers: replace
    EssaVariable.hasDefiningNode, getAReturnValueFlowNode,
    Parameter.getDefault, Scope.getEntryNode / getANormalExit etc.
    with CFG-side bridges through Cfg::ControlFlowNode.

The legacy Flow.qll / Essa.qll stack is untouched and remains
available for queries that import it directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 16:32:44 +00:00
Taus
ac23e16786 Python: Move Python 3.15 data-flow tests to a separate file
We won't be able to run these tests until Python 3.15 is actually out
(and our CI is using it), so it seemed easiest to just put them in their
own test directory.
2026-04-17 13:16:46 +00:00
Taus
dc36609743 Python: Add data-flow tests
Alas, all these demonstrate is that we already don't fully support the
desugared `yield from` form.
2026-04-17 12:15:04 +00:00
Taus
fa61f6f3df Python: Model @typing.overload in method resolution
Adds `hasOverloadDecorator` as a predicate on functions. It looks for
decorators called `overload` or `something.overload` (usually
`typing.overload` or `t.overload`). These are then filtered out in the
predicates that (approximate) resolving methods according to the MRO.

As the test introduced in the previous commit shows, this removes the
spurious resolutions we had before.
2026-03-05 22:20:03 +00:00
Taus
0561a63003 Python: Add test for overloaded __init__ resolution
Adds a test showing that `@typing.overload` stubs are spuriously
resolved as call targets alongside the actual `__init__` implementation.
2026-03-05 22:20:03 +00:00
Owen Mansel-Chan
99a4fe4828 Update expected test output column numbers 2026-03-04 15:02:53 +00:00
Owen Mansel-Chan
aa28c94562 Remove double space after $ in inline expectations tests 2026-03-04 14:12:42 +00:00
Owen Mansel-Chan
91b6801db1 py: Inline expectation should have space before $ 2026-03-04 13:11:38 +00:00
Owen Mansel-Chan
5a97348e78 python: Inline expectation should have space after $
This was a regex-find-replace from `# \$(?! )` (using a negative lookahead) to `# $ `.
2026-03-04 12:45:05 +00:00
yoff
7351e82c92 python: handle guards compared to boolean literals 2026-02-24 10:00:22 +01:00
yoff
8488039fb9 python: add tests for guards compared to booleans 2026-02-24 10:00:21 +01:00
Taus
df0f2f8ce4 Python: Simple dataflow annotations
None of these required any changes to the dataflow libraries, so it
seemed easiest to put them in their own commit.
2026-02-16 13:48:32 +00:00
Taus
958c798c3f Python: Accept dataflow test changes
New nodes means new results. Luckily we rarely have a test that selects
_all_ dataflow nodes.
2026-01-30 12:50:25 +00:00
Taus
ac5a74448f Python: Fix tests
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
2026-01-30 12:50:25 +00:00
Tom Hvitved
0adece7cde Python: Adapt to changes in FlowSummaryImpl 2026-01-26 12:40:19 +01:00
Napalys Klicius
8393ccf39d Python: Update globalVariableAttrPathAtDepth base case 2025-09-16 18:08:53 +02:00
Napalys Klicius
e60d0c88f1 Python: Add global variable nested field jump steps 2025-09-16 18:08:53 +02:00
Napalys Klicius
9d4b168977 Python: Added extra test for global variable nested attribute reads/writes. 2025-09-16 18:08:53 +02:00
Arthur Baars
5d3ec35e29 Remove non-breaking spaces from code 2025-09-05 09:41:15 +02:00
yoff
158430af82 Merge pull request #17765 from yoff/python/test-functional-behaviour
Python: Add tests for functional-like programming
2025-02-11 16:28:37 +01:00
Joe Farebrother
2aea356756 Add change note + fix tests 2025-01-15 10:24:18 +00:00
Joe Farebrother
6a6585e415 Add tests for zip and enumerate 2025-01-15 09:57:15 +00:00
Joe Farebrother
460de3f7d5 Reduce generality of map and zip for performance 2025-01-14 09:39:57 +00:00
Joe Farebrother
4e36008ed9 Add tests 2025-01-14 09:39:56 +00:00
Michael Nebel
2321ca59f6 Python: Update all test util paths to point to the new location. 2024-12-12 13:54:30 +01:00
Jeroen Ketema
c3ea883b11 Python: Update expected test results 2024-12-03 19:18:57 +01:00
yoff
44c94e02fe Merge pull request #18037 from joefarebrother/pythob-test-global-capture
Python: Add some test cases for flow involving global and captured variables
2024-11-22 11:33:31 +01:00
Joe Farebrother
52cd7f2c5c Add 2 more cases 2024-11-20 11:22:42 +00:00
Joe Farebrother
9b4b01a442 Fix typo 2024-11-20 10:59:27 +00:00
Joe Farebrother
a398f707fe Add some test cases for flow involving global variables and captured variables 2024-11-19 16:34:59 +00:00
yoff
cec0544ca5 Merge pull request #17789 from aschackmull/python/resolvecall-refactor
Python: Refactor references to NormalCall.
2024-11-01 14:20:34 +01:00
Tom Hvitved
7c4d5981dd Shared: Add missing spaces in inline test expectation output 2024-10-25 13:23:03 +02:00
Anders Schack-Mulligen
5950c336e2 Python: Refactor references to NormalCall. 2024-10-16 16:04:31 +02:00
Taus
65dbc1de91 Python: Add copy.replace test to list of runnable tests 2024-10-15 18:17:00 +02:00
yoff
9ed8fe5dd0 Update python/ql/test/library-tests/dataflow/coverage/functional.py
Co-authored-by: Taus <tausbn@github.com>
2024-10-15 17:35:36 +02:00
Taus
778b96aa39 Python: Update test expectations 2024-10-15 12:14:19 +00:00
Taus
e16405c675 Python: Add test for copy.replace
This test demonstrates the current state of affairs: that `copy.replace`
essentially blocks all flow of taint through it, because it has not been
modelled yet.
2024-10-15 11:48:43 +00:00
Rasmus Lerchedahl Petersen
195b70aca6 python: Add test for functional-like programming
This can also serve for a place to add tests for
constructs like threading.Thread, mulitprocess.Process, concurrent.futures.ThreadPoolExecutor, and concurrent.futures.ProcessPoolExecutor.
2024-10-15 12:54:30 +02:00
Rasmus Lerchedahl Petersen
bb78c2a67e Python: update test expectations 2024-10-11 15:36:44 +02:00
Rasmus Lerchedahl Petersen
6f5b949ec8 Python: adjust test expectations
note that we do retain precision in
`test_dict_from_keyword()`
2024-10-04 15:30:02 +02:00
Rasmus Lerchedahl Petersen
a4c1a622b7 Merge branch 'main' of https://github.com/github/codeql into python/add-comprehension-capture-flow 2024-10-04 14:53:03 +02:00
Rasmus Lerchedahl Petersen
38b1eb7c71 Python: just use ListElementContent for iterables 2024-10-01 16:24:15 +02:00
yoff
62509a10c2 Update python/ql/test/library-tests/dataflow/coverage/test_builtins.py
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2024-10-01 11:39:12 +02:00
Rasmus Lerchedahl Petersen
bd68986fa4 Python: add test showing dict can take multiple arguments 2024-10-01 10:01:22 +02:00
Rasmus Lerchedahl Petersen
fb07a56de6 Python: adjust test expectations 2024-09-30 13:26:59 +02:00
Rasmus Lerchedahl Petersen
f9f46f0f98 Python: update test expectations
We now have a new callable, yielding new enclosing callables
2024-09-30 12:00:38 +02:00
Rasmus Lerchedahl Petersen
d4ea62edec Python: flow through yield
- add yield as a dataflow return
- replace comprehension store step
   with a store step to the yield
2024-09-30 09:01:29 +02:00
Rasmus Lerchedahl Petersen
fc2dc28f87 python: capture flow through comprehensions
- add comprehension functions as `DataFlowCallable`s
- add comprehension call as `DataFlowCall`
- create capture argument node for comprehension calls
2024-09-25 10:02:31 +02:00