I had to rewrite the SINK1-SINK7 definitions, since this new requirement
complained that we had to add this `MISSING: flow` annotation :D
Doing this implementation also revealed that there was a bug, since I
did not compare files when checking for these `MISSING:` annotations. So
fixed that up in the implementation for inline taint tests as well.
(extra whitespace in argumentPassing.py to avoid changing line numbers
for other tests)
I went with NormalDataflowTest to signify that if you don't know what
you're looking for, this is probably the one. I did not want to just
call it DataflowTest, since that becomes a big vague when there are also
`FlowTest.qll` and `MaximalFlowTest.qll` -- I'm open to renaming this
though 👍
- move from custom concept `LogOutput` to standard concept `Logging`
- remove `Log.qll` from experimental frameworks
- fold models into standard models (naively for now)
- stdlib:
- make Logger module public
- broaden definition of instance
- add `extra` keyword as possible source
- flak: add app.logger as logger instance
- django: `add django.utils.log.request_logger` as logger instance
(should we add the rest?)
- remove LogOutput from experimental concepts
I am slightly concerned that the test now generates many more
intermediate results. I suppose that maes the analysis heavy.
Should the new library get a new name instead, so the old code
does not get evaluated?
Particularly in value and literal patterns.
This is getting a little bit into the guards aspect of matching.
We could similarly add reverse flow in terms of
sub-patterns storing to a sequence pattern,
a flow step from alternatives to an-or-pattern, etc..
It does not seem too likely that sources are embedded in patterns
to begin with, but for secrets perhaps?
It is illustrated by the literal test. The value test still fails.
I believe we miss flow in general from the static attribute.
The idea behind optional results is that there may be instances where
each line of source code has many results and you don't want to annotate
all of them, but you still want to ensure that any annotations you do
have are correct.
This change makes that possible by exposing a new predicate
`hasOptionalResult`, which has the same signature as `hasResult`.
Results produced by `hasOptionalResult` will be matched against any
annotations, but the lack of a matching annotation will not cause a
failure.
We will use this in the inline tests for the API edge getASubclass,
because for each API path that uses getASubclass there is always a
shorter path that does not use it, and thus we can't use the normal
shortest-path matching approach that works for other API Graph tests.
- also update `validTest.py`, but commented out for now
otherwise CI will fail until we force it to run with Python 3.10
- added debug utility for dataflow (`dataflowTestPaths.ql`)
Adds a test demostrating the false positive observed by andersfugmann.
Note that this does not change the `.expected` file, and so the tests
will fail. This is expected.
I've added 2 queries:
- one that detects full SSRF, where an attacker can control the full URL,
which is always bad
- and one for partial SSRF, where an attacker can control parts of an
URL (such as the path, query parameters, or fragment), which is not a
big problem in many cases (but might still be exploitable)
full SSRF should run by default, and partial SSRF should not (but makes
it easy to see the other results).
Some elements of the full SSRF queries needs a bit more polishing, like
being able to detect `"https://" + user_input` is in fact controlling
the full URL.
I think `getUrl` is a bit too misleading, since from the name, I would
only ever expect ONE result for one request being made.
`getAUrlPart` captures that there could be multiple results, and that
they might not constitute a whole URl.
Which is the same naming I used when I tried to model this a long time ago
a80860cdc6/python/ql/lib/semmle/python/web/Http.qll (L102-L111)