I'm beginning to realise why I didn't do the `toString` overriding way
back when. Thankfully, now that all of our tests are in the same place,
this is actually not a terrible ordeal.
Mostly removal of nodes from the graph.
One result lost:
```
check("submodule.submodule_attr", submodule.submodule_attr, "submodule_attr", globals()) #$ MISSING:prints=submodule_attr
```
This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
x = expr
y = x + 2
```
we used to have flow from `expr` to an SSA variable representing `x` and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
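As a rough illustration only, the two edge sets described above can be modelled as a toy graph. The node labels (`cfg:...`, `ssa:...`) and the `reachable` helper are made up for this sketch and do not correspond to actual CodeQL classes or predicates.

```python
# Toy model only: the node labels are illustrative and are not CodeQL classes.
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# Before: flow went through an SSA variable node for `x`.
edges_before: List[Edge] = [
    ("cfg:expr@line1", "ssa:x"),
    ("ssa:x", "cfg:x@line2"),
]

# After: flow goes through the control flow node for `x` at line 1.
edges_after: List[Edge] = [
    ("cfg:expr@line1", "cfg:x@line1"),
    ("cfg:x@line1", "cfg:x@line2"),
]


def reachable(edges: List[Edge], source: str) -> Set[str]:
    """Return every node reachable from `source` by following edges."""
    successors: Dict[str, List[str]] = {}
    for start, end in edges:
        successors.setdefault(start, []).append(end)
    seen: Set[str] = set()
    worklist = [source]
    while worklist:
        node = worklist.pop()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return seen
```

In both versions the use of `x` at line 2 is reachable from `expr`; only the intermediate node differs.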
Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNodes` in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambiguous.
- Some tests have been rewritten to accommodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated.
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.
This commit is a squash of 80 other commits. While developing, things
changed majorly 2-3 times, and it just wasn't feasible to go back and
write a really nice commit history.
My apologies for this HUGE commit.
Also, later on this is where I solved merge conflicts after the
flow-summaries PR was merged.
For your amusement, I've included the original commit messages below.
Python: Add proper argument/parameter positions
Python: Handle normal function calls
Python: Reduce dataflow-consistency warnings
Previously there were a lot of failures for `uniqueEnclosingCallable` and
`argHasPostUpdate`
Removing the override of `getEnclosingCallable` in ParameterNode is
probably the most controversial... although from my point of view it's a
change for the better, since we're able to provide data-flow
ParameterNodes for more of the AST parameter nodes.
Python: Adjust `dataflow/calls` test
Python: Implement `isParameterOf`/`argumentOf`/`OutNode`
This makes the tests under `dataflow/basic` work as well 👍
(initially I had these as separate commits, but it felt like it was too much noise)
Python: Accept fix for `dataflow/consistency`
Python: Changes to `coverage/argumentRoutingTest.ql`
Notice we gain a few new resolved arguments.
We lose out on stuff due to:
1. not handling `*` or `**` in either arguments/parameters (yet)
2. not handling special calls (yet)
Python: Small fix for `TestUtil/RoutingTest.qll`
Since the helper predicates do not depend on this, moved outside class.
Python: Accept changes to `dataflow/coverage/NormalDataflowTest.ql`
Most of this is due to:
- not handling any kinds of methods yet
- not handling `*` or `**`
Python: Small investigation of `test_deep_callgraph`
Python: Accept changes to `coverage/localFlow.ql`
I don't fully understand why the .expected file changed.
Since we still have the desired flow, I'm not going to worry too much
about it.
With this commit, the `dataflow/coverage` tests pass 👍
Python: Minor doc update
Python: Add staticmethod/classmethod to `dataflow/calls`
Python: Handle method calls on class instances
without trying to deal with any class inheritance, or
staticmethod/classmethod at all.
Notice that with this change, we only have a DataFlowCall for the calls
that we can actually resolve. I'm not 100% sure if we need to add a
`UnresolvedCall` subclass of `DataFlowCall` for MaD in the future, but
it should be easy to do.
I'm still unsure about the value of `classesCallGraph`, but have just
accepted the changes.
Python: Handle direct method calls `C.foo(C, arg0)`
Python: Handle `@staticmethod`
Python: Handle class method calls... but the code is shit
WIP todo
Rewrite method calls to be better
also fixed a problem with `self` being an argument to the `x.staticmethod()` call :|
Python: Add subclass tests
Python: Split `class_advanced` test
Python: Rewrite call-graph tests to be inline expectation (1/2)
This adds inline expectations, next commit will remove old annotations
code... but I thought it would be easier to review like this.
Minor fixup
Python: Add simple subclass support
Python: more precise subclass lookup
Still not 100% precise.. but it's better
New ambiguous
Python: Add test for `self.m()` and `cls.m()` calls
Python: Handle `self.m()` and `cls.m()` calls
Python: Add tests for `__init__` and `__new__`
Python: Handle class calls
Python: Fix `self` argument passing for class calls
Now field-flow tests also pass 💪 (although the crosstalk
fieldflow test changes were due to this specific commit)
I also copied much of the setup for pre/post update nodes from Ruby,
specifically having the abstract `PostUpdateNodeImpl` in DataFlowPrivate
seemed like a nice change.
Same for the setup with `TNode` definition having the specification
directly in the body, instead of a `NeedsSyntheticPostUpdateNode` class.
Python: Add new crosstalk test WIP
Maybe needs a bit of refactoring, and to see how it all behaves with points-to
Python: Add `super()` call-graph tests
Python: Refactor MethodCall char-pred
In anticipation of supporting `super(MyClass, self).foo()`, where the
`self` argument doesn't come from an AttrNode, but from the second
argument to super.
Without `pragma[inline]` the optimizer found a terrible join-order --
this won't guarantee a good join-order for the future, but for now it
was just so simple and could let me move on with life.
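For reference, a minimal Python sketch of the two `super` forms in question. The class and method names are made up for illustration; the point is only that in the explicit form the `self` argument is the second argument to `super`.

```python
class Base:
    def foo(self):
        return "Base.foo"


class MyClass(Base):
    def foo(self):
        return "MyClass.foo"

    def implicit(self):
        # Zero-argument form: `self` is supplied implicitly.
        return super().foo()

    def explicit(self):
        # Two-argument form: `self` comes from the second argument to `super`.
        return super(MyClass, self).foo()
```

Both `MyClass().implicit()` and `MyClass().explicit()` end up calling `Base.foo`.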
Python: Add basic `super()` support
I debated a little (with myself) whether I should really do
`superTracker`, but I thought "why not" and just rolled with it. I did
not confirm whether it was actually needed anywhere, that is if anyone
does `ref = super; ref().foo()` -- although I certainly doubt it's very
wide-spread.
Python: InlineCallGraphTest: Allow non-unique callable name in different files
Python: more MRO tests
Python: Add MRO approximation for `super()`
Although it's not 100% accurate, it seems to be on level with the one in
points-to.
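As a reminder of what is being approximated, here is a small sketch of how Python itself resolves `super()` along the MRO at runtime. The diamond hierarchy is made up for illustration and says nothing about how the approximation is implemented.

```python
class A:
    def m(self):
        return "A.m"


class B(A):
    def m(self):
        return "B.m -> " + super().m()


class C(A):
    def m(self):
        return "C.m -> " + super().m()


class D(B, C):
    def m(self):
        return "D.m -> " + super().m()


# The runtime MRO for D: each `super()` call moves one step along this list.
mro_names = [cls.__name__ for cls in D.__mro__]
```

Note that `super().m()` inside `B.m` targets `C.m` when called on a `D` instance, even though `C` is not a base class of `B`.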
Python: Remove some spurious targets for direct calls
removal of TODO from refactoring
remove TODOs class call support
Python: Add contrived subclass call example
Python: Remove more spurious call targets
NOTE: I initially forgot to use
`findFunctionAccordingToMroKnownStartingClass` instead of
`findFunctionAccordingToMro` for __init__ and __new__, and since I did
make that mistake myself, I wanted to add something to the test to
highlight this fact, and make it viewable by PR reviewer... this will be
fixed in the next commit.
Python: Proper fix for spurious __init__ targets
Python: Add call-graph example of class decorator
Python: Support decorated classes in new call-graph
Python: Add call-graph tests for `type(obj).meth()`
Python: support `type(obj).meth()`
Python: Add test for callable defined in function
Python: Add test for callable as argument
Currently we don't find these with type-tracking, which is super
mysterious. I did check that we have proper flow from the arguments to
the parameters.
Python: Found problem for callable as argument :| MAJOR WIP
WIP commit
IT WORKS AGAIN (but terrible performance)
remove pragma[inline]
remove oops
Fix performance problem
I tried to optimize it even further, but I didn't end up achieving anything :|
Fix call-graph comparison
add comparison version with easy lookup
incomplete missing call-graph tests
unhandled tests
trying to replicate missing call-edge due to missing imports ... but it's hard
also seems to be problems with the inline-expectation-value that I used, seems like it has both missing/unexpected results with same value
Python: Add import-problem test
Python: Add shadowing problem
some cleanup of rewrite fix
a little more cleanup
Add consistency queries to call-graph tests
Python: Add post-update nodes for `self` in implicit `super()` uses
But we do need to discuss whether this is the right approach :O
Fix for field-flow tests
This came from more precise argument passing
Fixed results in type-tracking
Comes from better argument passing with super() and handling of
functions with decorators
fix of inline call graph tests
Fixup call annotation test
Many minor cleanups/fixes
NewNormalCall -> NormalCall
Python: Major restructuring + qldoc writing
Python: Accept changes from pre/post update node .toString changes
Python: Reduce `super` complexity !! WIP !!
Python: Only pass self-reference if in same enclosing-callable
Python: Add call-graph test with nested class
This was inspired by the ImpliesDataflow test that showed missing flow
for q_super, but at least for the call-graph, I'm not able to reproduce
this missing result :|
Python: Restrict `super()` to function defined directly on class
Python: Accept fixes to ImpliesDataflow
Python: Expand field-flow crosstalk tests
Add a step from that `CfgNode` to the corresponding `EssaNode`.
The intended effect is seen in `ImpliesDataflow.expected`.
The effect seen in other `.expected`-files is that parameter nodes
change type, that the extra steps are seen, and that flow from
`EssaVar`s is mirrored in flow from `CfgNode`s.
There is one surprise, which is the `.0` node in
`coverage/localFlow.expected`.
How could the tests fail because of autoformatting, you may ask?
The answer is deprecation warnings. These specify the location of the deprecated
entity, and due to autoformatting these moved around.
This turned out to be a fairly simple, but easy to make, bug. When we want to
figure out the value pointed-to in a multi-assignment, we look at the left hand
side to see what value from the right hand side we should assign. Unfortunately,
we accidentally attempted to look up this information in the _left hand side_ of
the assignment, resulting in no points-to information at all. The only thing
needed to fix this was to properly link up the left and right hand sides: using
the left hand side to figure out what index to look at, and then looking up the
points-to information for the corresponding place in the right hand side.
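The intended linking can be sketched in plain Python as follows. This is only an illustration of the index-matching idea; `assigned_values` is a hypothetical helper and not part of the points-to implementation.

```python
def assigned_values(targets, values):
    """Pair each left-hand-side name with the right-hand-side value at the same index."""
    result = {}
    for index, name in enumerate(targets):
        # The left hand side only tells us which index to look at...
        # ...the value itself is looked up in the right hand side.
        result[name] = values[index]
    return result


# Models the multi-assignment `a, b = 1, "two"`.
mapping = assigned_values(["a", "b"], (1, "two"))
```

The bug corresponds to indexing into `targets` instead of `values` in the loop body.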
To ease the rollout of this test, currently we only report missing points-to
information for nodes that either
- appear as an argument in a call to a function named `check`, or
- appear inside a scope where the first line is annotated with a comment ending
in "check".
The idea behind the second version is that once we have points-to running at a
level where no node inside a scope that _ought_ to have points-to is missing
this information, we can simply remove all uses of `check(...)` from inside this
scope, and annotate the entire scope with `# check`. Once this has been done for
the entire file, we can then remove all the comments and just require
_everything_ to be checked.
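A sketch of what the two annotation styles might look like in a test file. The function names and the body of `check` are hypothetical, and this assumes that the "first line" of a scope is the `def` line.

```python
def check(*args):
    """Hypothetical marker function; only the name `check` matters to the test."""
    return args


def partially_checked():
    x = 1
    y = x + 1
    # Only the argument `y` is required to have points-to information.
    check(y)


def fully_checked():  # check
    # Every relevant node in this scope is required to have points-to information.
    x = 1
    y = x + 1
    return y
```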
Note that I don't expect all nodes to have the need for points-to information.
For instance, there are nodes representing scope entry and exit, and for these
it doesn't make sense to require that they "point-to" anything. Similarly,
`NameNode`s appearing in a "store" (i.e. as the left hand side of an assignment)
do not strictly need to have points-to information, although it might be more
intuitive if they did.
Thus, the `relevant_node` predicate will almost certainly need to be extended to
exclude these kinds of nodes.