This gives muche nicer path explanations on some snapshots.
It is achieved by making stepped-to nodes `CastNode`s.
This seems somewhat reasonable as types then to change, when we move
between content and container.
We could probably refine it, though.
These were increased because of the indirection needed to get to the
regex flags, but as we no longer rely on this, we can make do with a
smaller import depth.
Really, this boils down to "Port `re` library model to use API graphs
instead of points-to", which is what this PR actually does.
Instead of using points-to to track flags, we use a type tracker. To
handle multiple flags at the same time, we add additional flow from
`x` to `x | y` and `y | x`
and, as an added bonus, the above with `+` instead of `|`, neatly
fixing https://github.com/github/codeql/issues/4707
I had to modify the `Qualified.ql` test slightly, as it now had a
result stemming from the standard library (in `warnings.py`) that
points-to previously ignored.
It might be possible to implement this as a type tracker on
`LocalSourceNode`s, but with the added steps for the above operations,
this was not obvious to me, and so I opted for the simpler
"`smallstep`" variant.
Since we need to reserve the flexibility to change this setup within the next
few months, we don't want to commit to keeping this extension point around for
the 12 months that the normal API deprecation cycle requires.
Although it is becoming non-trivial to get an overview of what tests we have and
don't have, I didn't find any that highlighted this one
I used all 3 variants of parameters, just to be sure :)
When I changed the taint modeling in 19b7ea8d85, that obviously also means that
some of the related locations for alerts will change. So that's why all the
examples needs to be updated.
Besides this, I had to fix a minor problem with having too many alerts. If
running a query agaisnt code like in the example below, there would be 3 alerts,
2 of them originating from the import.
```
from flask import Flask, request
app = Flask(__name__)
@app.route("/route")
def route():
SINK(request.args.get['input'])
```
The 2 import sources where:
- ControlFlowNode for ImportMember
- GSSA Variable request
I removed these from being a RemoteFlowSource, as seen in the diff.
I considered restricting `FlaskRequestSource` so it only extends
`DataFlow::CfgNode` (and make the logic a bit simpler), but I wasn't actually
sure if that was safe to do or not... If you know, please let me know :)
We don't actually need it for anything right now, but I have plans for the
future where would need it.
Although it would be nice to have it as an `API::Node`, and we could re-write
implementations so we could provide it in this instance, I'm not convinced we
can do that in general right now.
For example, if <n'th> parameter of a function has to be modeled as belonging to
a certain type, I don't see any way to specify that as an API::Node.
For me, that's ok. Until we _can_ specify things like this as API::Nodes in the
future, I would like to keep things consistent, and use `DataFlow::Node` as the
result type.
This was a good time to do this, so we don't have 2 different ways of doing the
same thing.
I needed to do this to figure out if we should expose
`API::moduleImport("flask").getMember("request")` in a helper predicate or
not. I think I ended up using more refenreces to this in the end. Although it's
not unreasonable to let someone do this themselves, I also think it's reasonable
that we provide a helper predicate for this.