The shared CFG library overrides ControlFlowNode.toString() as 'final'
(shared/controlflow/codeql/controlflow/Cfg.qll:1217), so the legacy
'ControlFlowNode for X' prefix is gone — the new toString returns just
'X' for normal nodes and 'After X' for after-nodes. This produces a
large cosmetic diff in test expected files with no semantic change.
Mass-rebless 78 .expected files whose actual output differs from the
checked-in expected only by this rename. Each file was verified to be
identical after normalising 'ControlFlowNode for ' and 'After ' away
from both sides.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After the shared-CFG migration, DataFlow::Node.asCfgNode() returns
Cfg::ControlFlowNode rather than the legacy Flow::ControlFlowNode, and
funcValue.getACall() / dfCall.getNode() now return different CFG types
(legacy vs new). Update the two remaining test queries that still cast
to legacy NameNode/CallNode types to bridge through Cfg:: types or AST.
* experimental/import-resolution-namespace-relative/test.ql: cast to
Cfg::NameNode instead of legacy NameNode.
* experimental/library-tests/CallGraph/InlineCallGraphTest.ql: change
predicate signatures from CallNode to AST Call, and bridge to
legacy CallNode (points-to) and Cfg::CallNode (type-tracking)
via getNode() on each side.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sweep the last few uses of legacy AstNode.getAFlowNode() in tests over to
explicit ControlFlowNode joins after the shared-CFG migration. importflow.ql
needs the new Cfg::ControlFlowNode/CompareNode types because DataFlow::Node.
asCfgNode() now returns the shared-CFG node.
Also extend ImportResolution::allowedEssaImportStep to walk back through
uncertain-write SSA inputs, so that a later 'from X import *' does not hide
the preceding explicit (re)assignment from module-export resolution. Without
this, a reassigned name that survives a wildcard import was no longer
recognised as the module export. Rebless ModuleExport.expected to drop the
legacy 'ControlFlowNode for' toString prefix and pick up the two correct rows
exposed by the fix.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Switches the trunk dataflow library and all in-tree consumers
(frameworks, ApiGraphs, Concepts, regexp, security customisations,
test harness) from the legacy Flow.qll/ESSA stack to the new
shared-CFG facade (Cfg.qll) and the ESSA-shaped adapter on the
shared-SSA library (SsaImpl.qll).
Highlights:
* DataFlowPublic/Private/Dispatch, Attributes, VariableCapture,
IterableUnpacking, ImportResolution, ImportStar, LocalSources,
TaintTrackingPrivate, MatchUnpacking, TypeTrackingImpl,
SsaImpl, Builtins all now qualify CFG/SSA references with
Cfg:: / SsaImpl:: and stop pulling in semmle.python.essa.*.
* AstNodeImpl.qll/Cfg.qll: ImportMember exposes its inner
ImportExpr, DefinitionNode.getValue covers Alias / AnnAssign /
AugAssign / AssignExpr / For-target / Parameter-default,
ForNode is treated as an expression node, AnnotatedExitNode is
canonical, and BoolExprNode.getAnOperand drops the dominance
constraint that did not hold for short-circuit BBs.
* SsaImpl.qll: parameters always get a ParameterDefinition (so
unused parameters still have SSA defs), scope-entry defs for
module globals require an actual store somewhere, scope-exit
has a synthetic use so reaching-defs survives to module
boundary, and the legacy SsaSourceVariable / EssaVariable
surface (getName, getScope, getAUse, getASourceUse,
getAnImplicitUse) is reinstated for downstream queries.
* DataFlowPublic.qll: GuardNode redesigned around the new
structural outcome nodes (isAfterTrue / isAfterFalse). The
legacy ConditionBlock + flipped indirection is gone;
controlsBlock walks UP through 'not' / '==True' / 'is False'
etc. via outcomeOfGuard, accumulating polarity cleanly. Only
BarrierGuard<...> is preserved as public API.
* ModuleVariableNode.getAWrite and LocalFlow::definitionFlowStep
bypass SSA and consult Cfg::NameNode.defines /
Cfg::DefinitionNode.getValue directly, so that write defs
pruned by shared SSA (because the variable has no in-scope
read) still produce dataflow steps.
* Frameworks + downstream consumers: replace
EssaVariable.hasDefiningNode, getAReturnValueFlowNode,
Parameter.getDefault, Scope.getEntryNode / getANormalExit etc.
with CFG-side bridges through Cfg::ControlFlowNode.
The legacy Flow.qll / Essa.qll stack is untouched and remains
available for queries that import it directly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The fix may look a bit obscure, so here's what's going on.
When we see `from . import helper`, we create an `ImportExpr` with level
equal to 1 (corresponding to the number of dots). To resolve such
imports, we compute the name of the enclosing package, as part of
`ImportExpr.qualifiedTopName()`. For this form of import expression, it
is equivalent to `this.getEnclosingModule().getPackageName()`. But
`qualifiedTopName` requires that `valid_module_name` holds for its
result, and this was _not_ the case for namespace packages.
To fix this, we extend `valid_module_name` to include the module names
of _any_ folder, not just regular package (which are the ones where
there's a `__init__.py` in the folder). Note that this doesn't simply
include all folders -- only the ones that result in valid module names
in Python.
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
This pull request introduces a new CodeQL query for detecting prompt injection vulnerabilities in Python code targeting AI prompting APIs such as agents and openai. The changes includes a new experimental query, new taint flow and type models, a customizable dataflow configuration, documentation, and comprehensive test coverage.
Also fixes an issue with the return type annotations that caused these
to not work properly.
Currently, annotated assignments don't work properly, due to the fact
that our flow relation doesn't consider flow going to the "type" part of
an annotated assignment. This means that in `x : Foo`, we do correctly
note that `x` is annotated with `Foo`, but we have no idea what `Foo`
is, since it has no incoming flow.
To fix this we should probably just extend the flow relation, but this
may need to be done with some care, so I have left it as future work.