We won't be able to run these tests until Python 3.15 is actually out
(and our CI is using it), so it seemed easiest to just put them in their
own test directory.
First, we extend the various location overriding hacks to also accept
list and dict splats in various places. Having done this, we then have
to tackle how to actually desugar these new comprehension forms (as this
is what we currently do for the old forms).
As a reminder, a list comprehension like `[x for x in y]` currently gets
desugared into a small local function, something like
```python
def listcomp(a):
for x in a:
yield x
listcomp(y)
```
For `[*x for x in y]`, the behaviour we want is that we unpack `x`
before yielding its elements in turn. This is essentially what we would
get if we were to use `yield from x` instead of `yield x` in the above
desugaring, so that's what we do. This also works for set
comprehensions.
For dict comprehensions, it's slightly more complicated. Here, the
generator function instead yields a stream of `(key, value)` tuples.
(And apparently the old parser got this wrong and emitted `(value, key)`
pairs instead, which we faithfully recreated in the new parser as well.
We fix that bug in both parsers while we're at it). So, a bare `yield
from` is not enough, we also need a `.items()` call to get the
double-starred expression to emit its items as a stream of tuples (that
we then `yield from`.
To make this (hopefully) less verbose in the implementation, we defer
the decision of whether to use `yield` or `yield from` by introducing a
`yield_kind` scoped variable that determines the type of the actual AST
node. And of course for dict comprehensions with unpacking we need to
synthesise the extra machinery mentioned above.
On the plus side, this means we don't have to mess with control-flow, as
the existing machinery should be able to handle the desugared syntax
just fine.
Adds a new `isLazy` predicate to the relevant classes, and adds the
relevant dbscheme (and up/downgrade) changes. On upgrades we do nothing,
and on downgrades we remove the `is_lazy` bits.
As defined in PEP-810. We implement this in much the same way as how we
handle `async` annotations currently. The relevant nodes get an
`is_lazy` field that defaults to being false.
This allows us to build and test the extractor (for actual QL extraction
-- not just the extractor unit tests) entirely from within the
`github/codeql` repo, just as we do with Ruby. All that's needed is a
`--search-path` argument that points to the repo root.
Looking at the results of the the previous DCA run, there was a bunch of
false positives where `bind` was being used with a `AF_UNIX` socket (a
filesystem path encoded as a string), not a `(host, port)` tuple. These
results should be excluded from the query, as they are not vulnerable.
Ideally, we would just add `.TupleElement[0]` to the MaD sink, except we
don't actually support this in Python MaD...
So, instead I opted for a more low-tech solution: check that the
argument in question flows from a tuple in the local scope.
This eliminates a bunch of false positives on `python/cpython` leaving
behind four true positive results.