This join had badness 1127 on the project FiacreT/M-moire, producing ~31
million tuples in order to end up with only ~27k tuples later in the
pipeline. With the fix, we reduce this by roughly the full 31 million
(the new materialised helper predicate accounting for roughly 130k
tuples on its own).
Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
The ones that no longer require points-to no longer import
`LegacyPointsTo`. The ones that do use the specific
`...MetricsWithPointsTo` classes that are applicable.
Fixes the test failures that arose from making `ExtractedArgumentNode`
local.
For the consistency checks, we now explicitly exclude the
`ExtractedArgumentNode`s (now much more plentiful due to the
overapproximation) that don't have a corresponding `getCallArg` tuple.
For various queries/tests using `instanceof ArgumentNode`, we instead us
`isArgumentNode`, which explicitly filters out the ones for which
`isArgumentOf` doesn't hold (which, again, is the case for most of the
nodes in the overapproximation).
This pull request introduces a new CodeQL query for detecting prompt injection vulnerabilities in Python code targeting AI prompting APIs such as agents and openai. The changes includes a new experimental query, new taint flow and type models, a customizable dataflow configuration, documentation, and comprehensive test coverage.
See https://docs.python.org/3/library/compression.zstd.html for
information about this library.
As far as I can tell, the `zstd` library is not vulnerable to things
like ZipSlip, but it _could_ be vulnerable to a decompression bomb
attack, so I extended those models accordingly.
In hindsight, having a `.getMetrics()` method that just returns `this`
is somewhat weird. It's possible that it predates the existence of the
inline cast, however.
Fixed 73 .ql query files where the @name metadata contained an ending period.
This ensures consistency with the CodeQL query metadata style guidelines.
For whatever reason, the CFG node for exceptions and exception groups
was placed with the points-to code. (Probably because a lot of the
predicates depended on points-to.)
However, as it turned out, two of the SSA modules only depended on
non-points-to properties of these nodes, and so it was fairly
straightforward to remove the imports of `LegacyPointsTo` for those
modules.
In the process, I moved the aforementioned CFG node types into
`Flow.qll`, and changed the classes in the `Exceptions` module to the
`...WithPointsTo` form that we introduced elsewhere.
Turns out the `ImportTime` module (despite living in
`semmle.python.types` does not actually depend on points-to, so some of
the `LegacyPointsTo` imports could be replaced or removed.
For now, these have just been made into `private` imports. After doing
this, I went through all of the (now not compiling) files and added in
private imports to the modules that they actually depended on.
I also added an explicit import of `LegacyPointsTo` (even though it may
be unnecessary) in cases where the points-to dependency was somewhat
surprising (and one we want to get rid of). This was primarily inside
the various SSA layers.
For modules inside `semmle.python.{types, objects, pointsto}` I did not
bother, as these are fairly clearly related to points-to.