mostly removing of nodes from the graph.
One result lost:
```
check("submodule.submodule_attr", submodule.submodule_attr, "submodule_attr", globals()) #$ MISSING:prints=submodule_attr
```
This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
x = expr
y = x + 2
```
we used to have flow from `expr` to an SSA variable representing x and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambigous.
- Some tests have been rewritten to accomodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
It's nice that it fixes the `InsecureProtocol` test-case (which maybe
should have been a test-case for the import resolution library in the
first place?)
But it's not quite right:
1. it adds spurious flow for `clashing_attr`
2. it runs into huge problems for typetracking_imports/tracked.expected
3. it runs into the problem for
https://github.com/github/codeql/pull/10176 with an `from <pkg>
import *` blocking flow from previously defined variable, that is NOT
overridden. (simplistic_reexport.bar_attr)
Just like the one added for `py/insecure-protocol` in fb425b7, but
instead added in the import-resolution tests, such that we don't have to
remember it's in a completely different directory.
It's not very useful to look at, and it's a mess when you change any
tests to see all the changes lines in the expected output that you
really do not care about!
I guess we could have done this at the very start of introducing this
test in this PR, but I think the last commit was mostly inspired from
looking at all the things that evidently was re-exported from the trace
import, even when I knew they were not available because of the
`__all__` definition.
However, we can see that `from <pkg> import *` and `import pkg` are
handled differently. Would have liked `has_defined_all_indirection` to
behave in the same way no matter how the import was made.
Notice that `has_defined_all_indirection` all have both
`all_defined_bar_copy` and `all_defined_foo_copy` marked as exported,
even though only `all_defined_foo_copy` is available.
For `from <pkg> import <attr>` we would use to treat the `<pkg>`
(ImportExpr) as a definition of the name `<attr>`.
Since this removes bad import-flow, and nothing broke, I'm guessing this
was never intentional.
However, as illustrated by the `CWE-327-InsecureProtocol` test, this fix
is NOT good enough, since now even the `secure_context` is considered to
be insecure (for both versions). Ouch.
Will fix this in a later commit, since it was only discoverd late on.
This was causing issues with imports with many "dots" in the name.
Previously, the test added in this commit would not have the desired
result for the `check` call.
This was due to bad manual magic: restricting the attribute name makes
sense when we're talking about submodules of a package, but it doesn't
when we're talking about reexported modules.
Also (hopefully) fixes the tests so that the Python 3-specific bits are
ignored under Python 2.
Extends the tests to
1. Account parts of the test code that may be specific to Python 2 or 3,
2. Also track which arguments passed to `check` are references to
modules.
The latter revealed a bunch of spurious results, which I have annotated
accordingly.
I could have sworn I added all of them to the batch, but somehow these slipped through.
Co-authored-by: yoff <lerchedahl@gmail.com>
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
A slightly complicated test setup. I wanted to both make sure I captured
the semantics of Python and also the fact that the kinds of global flow
we expect to see are indeed present.
The code is executable, and prints out both when the execution reaches
certain files, and also what values are assigned to the various
attributes that are referenced throughout the program. These values are
validated in the test as well.
My original version used introspection to avoid referencing attributes
directly (thus enabling better error diagnostics), but unfortunately
that made it so that the model couldn't follow what was going on.
The current setup is a bit clunky (and Python's scoping rules makes it
especially so -- cf. the explicit calls to `globals` and `locals`), but
I think it does the job okay.