I've added 2 queries:
- one that detects full SSRF, where an attacker can control the full URL,
which is always bad
- and one for partial SSRF, where an attacker can control parts of an
URL (such as the path, query parameters, or fragment), which is not a
big problem in many cases (but might still be exploitable)
full SSRF should run by default, and partial SSRF should not (but makes
it easy to see the other results).
Some elements of the full SSRF queries needs a bit more polishing, like
being able to detect `"https://" + user_input` is in fact controlling
the full URL.
I think `getUrl` is a bit too misleading, since from the name, I would
only ever expect ONE result for one request being made.
`getAUrlPart` captures that there could be multiple results, and that
they might not constitute a whole URl.
Which is the same naming I used when I tried to model this a long time ago
a80860cdc6/python/ql/lib/semmle/python/web/Http.qll (L102-L111)
Now that change notes are per-package, new change notes should be created in the `change-notes` folder under the affected pack (e.g., `cpp/ql/src/change-notes` for C++ query change notes. I've moved all of the change note files that were added before we started publishing them in packs to an `old-change-notes` directory under each language, to reduce the temptation to add new change notes there.
I'm working on a document to describe how and when to create change notes for packs separately.
Also adjusts test slightly. Writing
`clientRequestDisablesCertValidation=False` to mean that certificate
validation was disabled by the `False` expression is just confusing, as
it easily reads as _certificate validate was NOT disabled_ :|
The new one ties to each request that is being made, which seems like
the right setup.
For the snippet below, our current query is able to show _why_ we
consider `var` to be a falsey value that would disable SSL/TLS
verification. I'm not sure we're going to need the part that Ruby did,
for being able to specify _where_ the verification was removed, but
we'll see.
```
requests.get(url, verify=var)
```
Taken from Ruby, except that `getURL` member predicate was changed to
`getUrl` to keep consistency with the rest of our concepts, and stick
to our naming convention.
CWE-185: Incorrect Regular Expression
The software specifies a regular expression in a way that causes data to
be improperly matched or compared.
https://cwe.mitre.org/data/definitions/185.html
CWE-186: Overly Restrictive Regular Expression
> A regular expression is overly restrictive, which prevents dangerous values from being detected.
>
> (...) [this CWE] is about a regular expression that does not match all
> values that are intended. (...)
https://cwe.mitre.org/data/definitions/186.html
From my understanding,
CWE-625: Permissive Regular Expression, is not applicable. (since this
is about accepting a regex match where there should not be a match).
Unrolling the transitive closure had slightly better performance here.
Also, we exclude names of builtins, since those will be handled by a
separate case of `isDefinedLocally`.