A few change notes slipped through the cracks of my previous change. These are now in the proper locations: `old-change-notes` for older notes, and `<lang>\ql\[src|lib]\change-notes` for current change notes.
I've added 2 queries:
- one that detects full SSRF, where an attacker can control the full URL,
which is always bad
- and one for partial SSRF, where an attacker can control parts of an
URL (such as the path, query parameters, or fragment), which is not a
big problem in many cases (but might still be exploitable)
full SSRF should run by default, and partial SSRF should not (but makes
it easy to see the other results).
Some elements of the full SSRF queries needs a bit more polishing, like
being able to detect `"https://" + user_input` is in fact controlling
the full URL.
I think `getUrl` is a bit too misleading, since from the name, I would
only ever expect ONE result for one request being made.
`getAUrlPart` captures that there could be multiple results, and that
they might not constitute a whole URl.
Which is the same naming I used when I tried to model this a long time ago
a80860cdc6/python/ql/lib/semmle/python/web/Http.qll (L102-L111)
Also adjusts test slightly. Writing
`clientRequestDisablesCertValidation=False` to mean that certificate
validation was disabled by the `False` expression is just confusing, as
it easily reads as _certificate validate was NOT disabled_ :|
The new one ties to each request that is being made, which seems like
the right setup.
For the snippet below, our current query is able to show _why_ we
consider `var` to be a falsey value that would disable SSL/TLS
verification. I'm not sure we're going to need the part that Ruby did,
for being able to specify _where_ the verification was removed, but
we'll see.
```
requests.get(url, verify=var)
```
Taken from Ruby, except that `getURL` member predicate was changed to
`getUrl` to keep consistency with the rest of our concepts, and stick
to our naming convention.
CWE-185: Incorrect Regular Expression
The software specifies a regular expression in a way that causes data to
be improperly matched or compared.
https://cwe.mitre.org/data/definitions/185.html
CWE-186: Overly Restrictive Regular Expression
> A regular expression is overly restrictive, which prevents dangerous values from being detected.
>
> (...) [this CWE] is about a regular expression that does not match all
> values that are intended. (...)
https://cwe.mitre.org/data/definitions/186.html
From my understanding,
CWE-625: Permissive Regular Expression, is not applicable. (since this
is about accepting a regex match where there should not be a match).