In `x.arg = TAINTED_STRING` there is a store step to the attribute `arg`
of `x`. In our taint modeling, we allow _any_ store step with the code
below. This means that we also say there is a taint-step directly from
`TAINTED_STRING` to `x` :|
```codeql
// construction by literal
// TODO: Not limiting the content argument here feels like a BIG hack, but we currently get nothing for free :|
DataFlowPrivate::storeStep(nodeFrom, _, nodeTo)
```
This means that DataFlowCall is only for resolvable calls, which might not seem
like a big thing in itself, but enables the next commit to actually work :P
Notice the strange thing with treating `mypkg.foo(42)` as a ClassCall,
but completely ignoring `mypkg.subpkg.bar(43)` -- due to having the two
`ClassValue`s:
- `Missing module attribute mypkg.foo`
- `Missing module attribute mypkg.subpkg`
But not `Missing module attribute mypkg.subpkg` with the current import
structure.
I had to rewrite the SINK1-SINK7 definitions, since this new requirement
complained that we had to add this `MISSING: flow` annotation :D
Doing this implementation also revealed that there was a bug, since I
did not compare files when checking for these `MISSING:` annotations. So
fixed that up in the implementation for inline taint tests as well.
(extra whitespace in argumentPassing.py to avoid changing line numbers
for other tests)
I went with NormalDataflowTest to signify that if you don't know what
you're looking for, this is probably the one. I did not want to just
call it DataflowTest, since that becomes a big vague when there are also
`FlowTest.qll` and `MaximalFlowTest.qll` -- I'm open to renaming this
though 👍
Particularly in value and literal patterns.
This is getting a little bit into the guards aspect of matching.
We could similarly add reverse flow in terms of
sub-patterns storing to a sequence pattern,
a flow step from alternatives to an-or-pattern, etc..
It does not seem too likely that sources are embedded in patterns
to begin with, but for secrets perhaps?
It is illustrated by the literal test. The value test still fails.
I believe we miss flow in general from the static attribute.
- also update `validTest.py`, but commented out for now
otherwise CI will fail until we force it to run with Python 3.10
- added debug utility for dataflow (`dataflowTestPaths.ql`)
The easiest way to implement this was to change the definition of
`module_export` to account for chains of `import *`. We reuse the
machinery from `ImportStar.qll` for this, naturally.
TL;DR: We were missing out on flow in the following situation:
`mod1.py`:
```python
foo = SOURCE
```
`mod2.py`:
```python
from mod1 import *
```
`test.py`:
```python
from mod2 import foo
SINK(foo)
```
This is because there's no node at which a read of `foo` takes place
within `test.py`, and so the added reads make no difference.
Unfortunately, this means the previous test was a bit too simplistic,
since it only looks for module variable reads and writes. Because of
this, we change the test to be a more traditional "all flow" style
(though restricted to `CfgNode`s).
Adds result for `ModuleVariableNode::getARead` corresponding to reads
that go through (chains of) `import *`.
This required a bit of a change to _which_ module variables we define.
Previously, we only included variables that were accessed elsewhere in
the same file, but now we must ensure to also include variables that may
be accessed through `import *`.
Along with the root cause, which is the `StringConstCompare`
BarrierGuard, that does only allows `in <iterable literal>` and not
`in <variable referencing iterable literal>`