Commit Graph

15 Commits

Author SHA1 Message Date
yoff
cada7e98d2 Python: model exception edges for raise-prone expressions inside try/with
The new CFG previously only emitted exception edges for explicit `raise`
and `assert` statements. As a result, code that became reachable only
via the exception path of an arbitrary expression (e.g., the body of an
`except` handler following a try-body whose `call()` could raise) was
classified as dead, breaking analyses like StackTraceExposure,
FileNotAlwaysClosed, ExceptionInfo, UseOfExit, and CatchingBaseException.

This commit adds a `mayThrow` predicate over expressions that are known
sources of implicit exceptions in Python (calls, attribute access,
subscripts, arithmetic/comparison operators, imports, await/yield/yield
from) plus `from m import *` at the statement level, and routes them
through the shared CFG's `beginAbruptCompletion(_, _, ExceptionSuccessor,
always=false)` hook.

The set of exception sources is restricted to nodes that are
syntactically inside a `try`/`with` statement in the same scope.
This mirrors Java's `ControlFlowGraph::mayThrow`, which only emits
exception edges where local handling can observe them — outside such
contexts, the edges add CFG complexity (weakening BarrierGuard
precision and breaking SSA continuity around augmented assignments and
subscript stores) without analysis benefit, since exceptions just
propagate to the function exit anyway.

Net effect on the test suite: ~100 alerts restored across the exception-
related query tests (StackTraceExposure +29, ExceptionInfo +17,
FileNotAlwaysClosed +52, UseOfExit +1, CatchingBaseException restored)
with no precision regressions. Affected `.expected` files and the
regression-guard `dead_under_no_raise.py` are updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:36 +00:00
yoff
3c9b0f770b Python: switch dataflow library to new (shared) CFG + SSA
Flips the Python dataflow trunk from the legacy CFG (semmle/python/Flow.qll)
and legacy ESSA SSA (semmle/python/essa/*) to the new shared CFG facade
(semmle.python.controlflow.internal.Cfg) and the new SSA adapter
(semmle.python.dataflow.new.internal.SsaImpl), both introduced
additively in the preceding PRs in this stack.

This is the trunk-flip equivalent of the original draft PR #21894 (kept
around as documentation), rebased on top of the four preparatory PRs:

  P1: Remove AstNode.getAFlowNode() and rewrite callers (#21919).
  P2: Qualify Flow.qll's AST references with Py:: prefix (#21920).
  P3: Add new shared-CFG-backed control flow graph (#21921).
  P4: Add new shared-SSA-backed SSA adapter (#21923).

The Python dataflow library (semmle/python/dataflow/new/) now imports
the new CFG facade and SSA adapter. All CFG-typed predicates
(ControlFlowNode, CallNode, BasicBlock, NameNode, AttrNode, ...) are
qualified with the Cfg:: prefix; SSA references switch from
EssaVariable/EssaDefinition to SsaImpl::Definition/SourceVariable.

GuardNode is redesigned to use the new CFG's outcome-node model
(isAfterTrue / isAfterFalse) instead of the legacy ConditionBlock +
flipped indirection. Only BarrierGuard<...> is preserved as public
API.

Framework files (Bottle, FastApi, Django, Tornado, Pyramid, Stdlib,
...) are updated to take CFG nodes from the new facade.

A handful of dataflow consistency tweaks for the new CFG:
- Augmented-assignment targets are treated as both load and store.
- 'from X import *' produces uncertain SSA writes for unknown names.
- CFG nodes are canonicalised so dataflow does not see equivalent
  pre/post-order pairs as distinct nodes.

Two AST tweaks for the new CFG:
- AstNodeImpl: omit PEP 695 type-parameter names from
  FunctionDefExpr / ClassDefExpr children.
- ImportResolution: drop the legacy essa import.

Test churn (~175 files): reblessed library- and query-test .expected
files reflect slightly different CFG granularity, different toString
output, and a handful of true alert deltas in security queries.

Verification: all 367 lib + src + consistency-queries compile clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 15:17:35 +00:00
Rasmus Wriedt Larsen
3e878f5a0b Python: Model django response subclass relationship 2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
abe6f1639a Python: Add example of models subclassing problem
In reality, we only want to model this as a `rest_framework.response.Response`, since our .qll modeling is more precise for rest-framework responses than if we also modeled it as a basic django http response. (specifically, that default mime-type handling is way different).
2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
0fe29b6a86 Python: Recover subclass finder .expected after cherry picking commits from https://github.com/github/codeql/pull/15030 2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
e050f2e998 Python: Adjust subclass finder to no ESSA nodes
But the new test results looks very strange indeed!
2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
6ef9a2b11e Python: Fix problem if import is used
I fixed it in both predicates... I think we might still be able to remove
`newDirectAlias` -- but with it being better, it will allow us to better test if `newImportAlias` actually cover everything we need!
2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
fcdc8102e2 Python: Add test highlight problem is import is used :O 2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
f1fd9b4c7a Python: Fix underlying problem of not using Alias 2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
a956e1f613 Python: Use django View instead of MethodView
Due to the 'only model most specific spec' logic highlighted in previous
commit, I'm changing away from MethodView/View, and use Django view instead.

In practice this shouldn't matter at all, but for writing tests it would
have been a nice fix to only have the "same name but more specific"
logic apply when it's the same _definition_ location. We used to have
this information available, but right now we don't... so instead of
spending a lot of time rewriting the core library, I simply used a
different class :D :O :(
2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
03aa2e27df Python: Explain the funky logic in Find.ql 2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
1f8f6dd0ec Python: Ensure no deps visible in FindSubclass tests 2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
af2d783b38 Python: More examples of things to handle in find-subclass 2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
f19b672656 Python: Also capture alias with new name 2023-12-08 11:27:50 +01:00
Rasmus Wriedt Larsen
e7d55736b0 Python: Add test of find-subclass code 2023-12-08 11:27:50 +01:00