Particularly in value and literal patterns.
This is getting a little bit into the guards aspect of matching.
We could similarly add reverse flow in terms of
sub-patterns storing to a sequence pattern,
a flow step from alternatives to an-or-pattern, etc..
It does not seem too likely that sources are embedded in patterns
to begin with, but for secrets perhaps?
It is illustrated by the literal test. The value test still fails.
I believe we miss flow in general from the static attribute.
The idea behind optional results is that there may be instances where
each line of source code has many results and you don't want to annotate
all of them, but you still want to ensure that any annotations you do
have are correct.
This change makes that possible by exposing a new predicate
`hasOptionalResult`, which has the same signature as `hasResult`.
Results produced by `hasOptionalResult` will be matched against any
annotations, but the lack of a matching annotation will not cause a
failure.
We will use this in the inline tests for the API edge getASubclass,
because for each API path that uses getASubclass there is always a
shorter path that does not use it, and thus we can't use the normal
shortest-path matching approach that works for other API Graph tests.
- also update `validTest.py`, but commented out for now
otherwise CI will fail until we force it to run with Python 3.10
- added debug utility for dataflow (`dataflowTestPaths.ql`)
Adds a test demostrating the false positive observed by andersfugmann.
Note that this does not change the `.expected` file, and so the tests
will fail. This is expected.
I've added 2 queries:
- one that detects full SSRF, where an attacker can control the full URL,
which is always bad
- and one for partial SSRF, where an attacker can control parts of an
URL (such as the path, query parameters, or fragment), which is not a
big problem in many cases (but might still be exploitable)
full SSRF should run by default, and partial SSRF should not (but makes
it easy to see the other results).
Some elements of the full SSRF queries needs a bit more polishing, like
being able to detect `"https://" + user_input` is in fact controlling
the full URL.
I think `getUrl` is a bit too misleading, since from the name, I would
only ever expect ONE result for one request being made.
`getAUrlPart` captures that there could be multiple results, and that
they might not constitute a whole URl.
Which is the same naming I used when I tried to model this a long time ago
a80860cdc6/python/ql/lib/semmle/python/web/Http.qll (L102-L111)
Also adjusts test slightly. Writing
`clientRequestDisablesCertValidation=False` to mean that certificate
validation was disabled by the `False` expression is just confusing, as
it easily reads as _certificate validate was NOT disabled_ :|
The new one ties to each request that is being made, which seems like
the right setup.
For the snippet below, our current query is able to show _why_ we
consider `var` to be a falsey value that would disable SSL/TLS
verification. I'm not sure we're going to need the part that Ruby did,
for being able to specify _where_ the verification was removed, but
we'll see.
```
requests.get(url, verify=var)
```
Taken from Ruby, except that `getURL` member predicate was changed to
`getUrl` to keep consistency with the rest of our concepts, and stick
to our naming convention.