Really, this boils down to "Port `re` library model to use API graphs
instead of points-to", which is what this PR actually does.
Instead of using points-to to track flags, we use a type tracker. To
handle multiple flags at the same time, we add additional flow from
`x` to `x | y` and `y | x`
and, as an added bonus, the above with `+` instead of `|`, neatly
fixing https://github.com/github/codeql/issues/4707
I had to modify the `Qualified.ql` test slightly, as it now had a
result stemming from the standard library (in `warnings.py`) that
points-to previously ignored.
It might be possible to implement this as a type tracker on
`LocalSourceNode`s, but with the added steps for the above operations,
this was not obvious to me, and so I opted for the simpler
"`smallstep`" variant.
Since we need to reserve the flexibility to change this setup within the next
few months, we don't want to commit to keeping this extension point around for
the 12 months that the normal API deprecation cycle requires.
Although it is becoming non-trivial to get an overview of what tests we have and
don't have, I didn't find any that highlighted this one
I used all 3 variants of parameters, just to be sure :)
Since WeakCrypto always makes me think that it's about all weak crypto (like
using MD5, or completely broken ciphers such as ARC4 ro DES) and not just about
weak key generation.
instead of points-to.
Looking at query results also made me realize I didn't supply a very good
"origin" for ECC in cryptography package, so I improved that 👍 -- maybe that
sohuld have been split into multiple commits... too late :(
* Removed backend arugment that is not required
* Added DSA constants (they are just accidentially the same as RSA right now)
* Removed FakeWeakEllipticCurve and used a real weak elliptic curve instead
I did spend some time to figure out how to best write `minimumSecureKeySize`
predicate. I wanted to write once and for all the recommended sizes for each
cryptosystem.
I considered making the predicate such as
```codeql
int minimumSecureKeySize() {
this.getName() = "RSA" and result = 2048
or
this.getName() = "DSA" and result = 2048
or
this.getName() = "ECC" and result = 244
}
```
but then it would be impossible to add a new model without also being able to
modify the body of this predicate -- which seems like a bad way to start off a
brand new way of modeling things.
So I considered if we could add it to the non-range class, such as
```codeql
class RSAKeyGeneration extends KeyGeneration {
RSAKeyGeneration() { this.getName() = "RSA" }
override int minimumSecureKeySize() { result = 2048 }
}
```
This has the major problem that when you're writing the models for a new
API (and therefore extending KeyGeneration::Range), there is no way for you to
see that you need to take this extra step :| (also problem about how we should
define `minimumSecureKeySize` on `KeyGeneration` class then, since if we make it
abstract, we effectively disable the ability to refine `KeyGeneration` since any
subclass must provide an implementation.)
So, therefore I ended up with this solution ;)
Tests working can be verified by running
```
ls ql/python/ql/test/experimental/library-tests/frameworks/crypto*/*.py | xargs -L1 sh -c 'python $0 || exit 255'
```
When I changed the taint modeling in 19b7ea8d85, that obviously also means that
some of the related locations for alerts will change. So that's why all the
examples needs to be updated.
Besides this, I had to fix a minor problem with having too many alerts. If
running a query agaisnt code like in the example below, there would be 3 alerts,
2 of them originating from the import.
```
from flask import Flask, request
app = Flask(__name__)
@app.route("/route")
def route():
SINK(request.args.get['input'])
```
The 2 import sources where:
- ControlFlowNode for ImportMember
- GSSA Variable request
I removed these from being a RemoteFlowSource, as seen in the diff.
I considered restricting `FlaskRequestSource` so it only extends
`DataFlow::CfgNode` (and make the logic a bit simpler), but I wasn't actually
sure if that was safe to do or not... If you know, please let me know :)