Explicitly adds a bunch of nodes that were previously (using a global
analysis) identified as `ExtractedArgumentNode`s. These are then
filtered out in `argumentOf` (which is global) by putting the call to
`getCallArg` there instead of in the charpred.
With `ModuleVariableNode`s now appearing for _all_ global variables (not
just the ones that actually seem to be used), some of the tests changed
a bit. Mostly this was in the form of new flow (because of new nodes
that popped into existence). For some inline expectation tests, I opted
to instead exclude these results, as there was no suitable location to
annotate. For the normal tests, I just accepted the output (after having
vetted it carefully, of course).
This pull request introduces a new experimental CodeQL query that detects prompt injection vulnerabilities in Python code that calls AI prompting APIs such as `agents` and `openai`. The changes include the query itself, new taint flow and type models, a customizable dataflow configuration, documentation, and comprehensive test coverage.
See https://peps.python.org/pep-0758/ for more details.
We implement this by extending the syntax for exceptions and exception
groups so that the `type` field can now contain either an expression
(which matches the old behaviour), or a comma-separated list of at least
two elements (representing the new behaviour).
We model the latter case using a new node type `exception_list`, which
in `tsg-python` is simply mapped to a tuple. This means it matches the
existing behaviour (when the tuple is surrounded by parentheses)
exactly, hence we don't need to change any other code.
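
As a rough illustration of why mapping the unparenthesised list to a
tuple is enough, the sketch below compares the two spellings using
CPython's own `ast` module. This is an assumption made for the sake of
a runnable example: it is not the `tsg-python` pipeline, and the helper
name `handler_type_dump` is hypothetical. On interpreters that predate
PEP 758 the new spelling does not parse, so the helper returns `None`
in that case.

```python
import ast

# Parenthesised tuple: accepted by every Python 3 version.
OLD_STYLE = "try:\n    pass\nexcept (KeyError, IndexError):\n    pass\n"
# PEP 758 spelling: only accepted by interpreters that implement the PEP.
NEW_STYLE = "try:\n    pass\nexcept KeyError, IndexError:\n    pass\n"


def handler_type_dump(source):
    """Dump the `type` expression of the first handler, or None if unparsable."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None  # this interpreter does not accept the syntax
    return ast.dump(tree.body[0].handlers[0].type)


old = handler_type_dump(OLD_STYLE)
new = handler_type_dump(NEW_STYLE)

print(old)
if new is None:
    print("unparenthesised form not supported by this interpreter")
else:
    print("same AST as the parenthesised form:", new == old)
```
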
As a consequence of this, however, we cannot directly parse the Python
2.7 syntax `except Foo, e: ...` as `except Foo as e: ...`, as this would
introduce an ambiguity in the grammar. Thus, we have removed support for
the (deprecated) 2.7-style syntax, and only allow `as` to indicate
binding of the exception. The syntax `except Foo, e: ...` continues to
be parsed (in particular, it's not suddenly a syntax error), but it will
be parsed as if it were `except (Foo, e): ...`, which may not give the
correct results.
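
To make the difference concrete, here is a minimal sketch that again
uses CPython's `ast` module as a stand-in for the extractor (an
assumption for illustration only; `handler_shape` is a hypothetical
helper). It shows what is lost when `except Foo, e: ...` is read as
`except (Foo, e): ...`: the handler's type becomes a tuple and no name
is bound, whereas the `as` form binds `e`.

```python
import ast


def handler_shape(source):
    """Return (class name of the handler's type node, bound name or None)."""
    tree = ast.parse(source)
    handler = tree.body[0].handlers[0]
    return type(handler.type).__name__, handler.name


# How the 2.7-style `except Foo, e:` is now interpreted.
PARENTHESISED = "try:\n    pass\nexcept (Foo, e):\n    pass\n"
# What the 2.7-style syntax originally meant.
WITH_AS = "try:\n    pass\nexcept Foo as e:\n    pass\n"

print(handler_shape(PARENTHESISED))  # ('Tuple', None)
print(handler_shape(WITH_AS))        # ('Name', 'e')
```

In the tuple reading, `e` is treated as a second exception type to
match rather than as the variable holding the caught exception.
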
In principle we could extend the QL libraries to account for this case
(specifically when analysing Python 2 code). In practice, however, I
expect this to have only a minor impact on results, so it is not worth
the additional investment at this time.