Commit Graph

267 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
9c275c177a Python: Implement call-graph with type-trackers
This commit is a squash of 80 other commits. While developing, things
changed majorly 2-3 times, and it just wasn't feasible to go back and
write a really nice commit history.

My apologies for this HUGE commit.

Also, later on this is where I solved merge conflicts after flow-summaries
PR was merged.

For your amusement, I've included the original commit messages below.

Python: Add proper argument/parameter positions

Python: Handle normal function calls

Python: Reduce dataflow-consistency warnings

Previously there was a lot of failures for `uniqueEnclosingCallable` and
`argHasPostUpdate`

Removing the override of `getEnclosingCallable` in ParameterNode is
probably the most controversial... although from my point of view it's a
change for the better, since we're able to provide data-flow
ParameterNodes for more of the AST parameter nodes.

Python: Adjust `dataflow/calls` test

Python: Implement `isParameterOf`/`argumentOf`/`OutNode`

This makes the tests under `dataflow/basic` work as well 👍

(initially I had these as separate commits, but it felt like it was too much noise)

Python: Accept fix for `dataflow/consistency`

Python: Changes to `coverage/argumentRoutingTest.ql`

Notice we gain a few new resolved arguments.

We loose out on stuff due to:

1. not handling `*` or `**` in either arguments/parameters (yet)
2. not handling special calls (yet)

Python: Small fix for `TestUtil/RoutingTest.qll`

Since the helper predicates do not depend on this, moved outside class.

Python: Accept changes to `dataflow/coverage/NormalDataflowTest.ql`

Most of this is due to:

- not handling any kinds of methods yet
- not handling `*` or `**`

Python: Small investigation of `test_deep_callgraph`

Python: Accept changes to `coverage/localFlow.ql`

I don't fully understand why the .expected file changed.

Since we still have the desired flow, I'm not going to worry too much
about it.

with this commit, the `dataflow/coverage` tests passes 👍

Python: Minor doc update

Python: Add staticmethod/classmethod to `dataflow/calls`

Python: Handle method calls on class instances

without trying to deal with any class inheritance, or
staticmethod/classmethod at all.

Notice that with this change, we only have a DataFlowCall for the calls
that we can actually resolve. I'm not 100% sure if we need to add a
`UnresolvedCall` subclass of `DataFlowCall` for MaD in the future, but
it should be easy to do.

I'm still unsure about the value of `classesCallGraph`, but have just
accepted the changes.

Python: Handle direct method calls `C.foo(C, arg0)`

Python: Handle `@staticmethod`

Python: Handle class method calls... but the code is shit

WIP todo

Rewrite method calls to be better

also fixed a problem with `self` being an argument to the `x.staticmethod()` call :|

Python: Add subclass tests

Python: Split `class_advanced` test

Python: Rewrite call-graph tests to be inline expectation (1/2)

This adds inline expectations, next commit will remove old annotations
code... but I thought it would be easier to review like this.

Minor fixup

Python: Add simple subclass support

Python: more precise subclass lookup

Still not 100% precise.. but it's better

New ambiguous

Python: Add test for `self.m()` and `cls.m()` calls

Python: Handle `self.m()` and `cls.m()` calls

Python: Add tests for `__init__` and `__new__`

Python: Handle class calls

Python: Fix `self` argument passing for class calls

Now field-flow tests also pass 💪 (although the crosstalk
fieldflow test changes were due to this specific commit)

I also copied much of the setup for pre/post update nodes from Ruby,
specifically having the abstract `PostUpdateNodeImpl` in DataFlowPrivate
seemed like a nice change.

Same for the setup with `TNode` definition having the specification
directly in the body, instead of a `NeedsSyntheticPostUpdateNode` class.

Python: Add new crosstalk test WIP

Maybe needs a bit of refactoring, and to see how it all behaves with points-to

Python: Add `super()` call-graph tests

Python: Refactor MethodCall char-pred

In anticipation of supporting `super(MyClass, self).foo()`, where the
`self` argument doesn't come from an AttrNode, but from the second
argument to super.

Without `pragma[inline]` the optimizer found a terrible join-order --
this won't guarantee a good join-order for the future, but for now it
was just so simple and could let me move on with life.

Python: Add basic `super()` support

I debated a little (with myself) whether I should really do
`superTracker`, but I thought "why not" and just rolled with it. I did
not confirm whether it was actually needed anywhere, that is if anyone
does `ref = super; ref().foo()` -- although I certainly doubt it's very
wide-spread.

Python: InlineCallGraphTest: Allow non-unique callable name in different files

Python: more MRO tests

Python: Add MRO approximation for `super()`

Although it's not 100% accurate, it seems to be on level with the one in
points-to.

Python: Remove some spurious targets for direct calls

removal of TODO from refactoring

remove TODOs class call support

Python: Add contrived subclass call example

Python: Remove more spurious call targets

NOTE: I initially forgot to use
`findFunctionAccordingToMroKnownStartingClass` instead of
`findFunctionAccordingToMro` for __init__ and __new__, and since I did
make that mistake myself, I wanted to add something to the test to
highlight this fact, and make it viewable by PR reviewer... this will be
fixed in the next commit.

Python: Proper fix for spurious __init__ targets

Python: Add call-graph example of class decorator

Python: Support decorated classes in new call-graph

Python: Add call-graph tests for `type(obj).meth()`

Python: support `type(obj).meth()`

Python: Add test for callable defined in function

Python: Add test for callable as argument

Current'y we don't find these with type-tracking, which is super
mysterious. I did check that we have proper flow from the arguments to
the parameters.

Python: Found problem for callable as argument :| MAJOR WIP

WIP commit

IT WORKS AGAIN (but terrible performance)

remove pragma[inline]

remove oops

Fix performance problem

I tried to optimize it even further, but I didn't end up achieving anything :|

Fix call-graph comparison

add comparison version with easy lookup

incomplete missing call-graph tests

unhandled tests

trying to replicate missing call-edge due to missing imports ... but it's hard

also seems to be problems with the inline-expectation-value that I used, seems like it has both missing/unexpected results with same value

Python: Add import-problem test

Python: Add shadowing problem

some cleanup of rewrite fix

a little more cleanup

Add consistency queries to call-graph tests

Python: Add post-update nodes for `self` in implicit `super()` uses

But we do need to discuss whether this is the right approach :O

Fix for field-flow tests

This came from more precise argument passing

Fixed results in type-tracking

Comes from better argument passing with super() and handling of
functions with decorators

fix of inline call graph tests

Fixup call annotation test

Many minor cleanups/fixes

NewNormalCall -> NormalCall

Python: Major restructuring + qldoc writing

Python: Accept changes from pre/post update node .toString changes

Python: Reduce `super` complexity !! WIP !!

Python: Only pass self-reference if in same enclosing-callable

Python: Add call-graph test with nested class

This was inspired by the ImpliesDataflow test that showed missing flow
for q_super, but at least for the call-graph, I'm not able to reproduce
this missing result :|

Python: Restrict `super()` to function defined directly on class

Python: Accept fixes to ImpliesDataflow

Python: Expand field-flow crosstalk tests
2022-11-22 14:46:29 +01:00
Rasmus Wriedt Larsen
1b30cf8eca Merge branch 'main' into call-graph-tests 2022-11-22 13:39:27 +01:00
Rasmus Wriedt Larsen
88f703af1f DataFlow: Accept changes to .expected 2022-11-10 22:13:34 +01:00
Rasmus Wriedt Larsen
0f34752f8f Python: Delete classesCallGraph.ql
I don't see the value from this, so just going to outright delete it.
(it actually stayed alive for quite some time in the original git history,
but never seemed to be that useful.)
2022-10-28 09:31:01 +02:00
Rasmus Wriedt Larsen
7d8c0c663f Python: Remove dataflow/coverage/dataflow.ql
The selected edges is covered by `NormalDataflowTest.ql` now... and
reading the test-output changes in `edges` is just going to make commits
larger while not providing any real value.
2022-10-28 09:29:32 +02:00
Rasmus Wriedt Larsen
609a4cfd42 Python: validate tests in datamodel.py
And adopt argument passing tests as well.

turns out that `C.staticmethod.__func__` doesn't actually work :O
2022-10-28 09:29:32 +02:00
Rasmus Wriedt Larsen
39081e9c1c Python: Fix staticmethod datamodel test 2022-10-28 09:29:32 +02:00
Tom Hvitved
f4b82cb2e8 Python: Update expected test output 2022-09-22 15:01:40 +02:00
yoff
ea743173d5 Merge pull request #8781 from yoff/python-dataflow/flow-summaries-from-scratch
Python dataflow: flow summaries restart
2022-09-20 14:08:31 +02:00
Rasmus Lerchedahl Petersen
245baa51a3 Python: rename summary map -> list_map,
since map resolves to a class call

also fix test expectation
2022-09-14 11:21:16 +02:00
Rasmus Lerchedahl Petersen
efc5cfb852 Merge branch 'main' of github.com:github/codeql into python-dataflow/flow-summaries-from-scratch 2022-09-12 19:56:16 +02:00
Rasmus Lerchedahl Petersen
0f95992b2f Python: remove NonLibraryDataFlowCallable
this required managing parameters and their pre-update nodes a bit
2022-09-12 15:17:29 +02:00
Rasmus Wriedt Larsen
4296ac1ac0 Python: Allow CallNode.getArgByName for keyword args after **kwargs 2022-09-12 15:03:13 +02:00
erik-krogh
1d1aa7c8b4 update some expected output 2022-08-25 20:52:30 +02:00
yoff
8bf60301da python: we have hidden isParameterOf
but now allow a clear alternative
2022-06-23 08:57:50 +00:00
yoff
fe0c5d8ee5 python: make ArgumentNode publicly usable
- add `getCall`
2022-06-23 08:48:55 +00:00
yoff
cedf9ef538 python: make DataFlowCall "publicly usable"
- add `getCallable`, `getArg` and `getNode`
- these are `none` for summary calls
- revert "external" uses (they had been changed to `DataFlowSourceCall`)
2022-06-23 08:32:23 +00:00
yoff
dd69100dcd python: ParameterNode -> SourceParameterNode 2022-06-21 12:55:22 +00:00
Rasmus Lerchedahl Petersen
506efcf051 python: refactor TDataFlowCall
- Branch predicates are made simple. In particular, they do not try to detect library calls.
- All branches based on `CallNode`s are gathered into one.
- That branch has been given a class `NonSpecialCall`, which is the new parent of call classes based on `CallNode`s. (Those classes now have more involved charpreds.)
- A new such class, 'LambdaCall` has been split out from `FunctionCall` to allow the latter to replace its
  general `CallNode` field with a specific `FunctionValue` one.
- `NonSpecialCall` is not an abstract class, but it has some abstract overrides. Therefor, it is not
  considered a resolved call in the test `UnresolvedCalls.qll`.
2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen
828db3a392 python: Add summary nodes
allowing more `OutNode`s (not restricting to `CallNode`s),
gives more flow in the `classesCallGraph` test
2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen
80175a9af5 Python: Compiles and mostly pass tests
- add flowsummaries shared files
- register in indentical files
- fix initial non-monotonic recursions
  - add DataFlowSourceCall
  - add resolvedCall
  - add SourceParameterNode

failing tests:
- 3/library-tests/with/test.ql
2022-05-10 12:48:42 +00:00
Rasmus Wriedt Larsen
01d426dc58 Python: Replace rest of from testlib import *
I think we should write our tests in a way that puts points-to in the
best condition to resolve calls. Although this specific change did not
change much, it should help set us up for success in the future 👍
2022-02-28 10:58:44 +01:00
Rasmus Wriedt Larsen
d2cd77aefb Merge branch 'main' into dataflow-improvements 2022-02-21 14:49:40 +01:00
Rasmus Wriedt Larsen
67ca14876a Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2022-02-18 13:47:07 +01:00
Rasmus Wriedt Larsen
a8edd44a3c Python: Update .expected 2022-02-08 11:12:34 +01:00
Rasmus Wriedt Larsen
cc4fe38fbd Python: Delete dedicated argumentRouting<N> tests
I feel like they don't bring any value anymore, since we have the nice
inline expectation tests. If I'm wrong, happy to revert this commit
though.
2022-02-01 17:51:33 +01:00
Rasmus Wriedt Larsen
54f53c828e Python: Refactor argumentRoutingTest.ql to be more generic
I checked to see that the tests still works. If I deleted the `arg5`
annotation, it got failures:

```diff
diff --git a/python/ql/test/experimental/dataflow/coverage/argumentPassing.py b/python/ql/test/experimental/dataflow/coverage/argumentPassing.py
index e218bdde9b..71816c1e01 100644
--- a/python/ql/test/experimental/dataflow/coverage/argumentPassing.py
+++ b/python/ql/test/experimental/dataflow/coverage/argumentPassing.py
@@ -46,7 +46,7 @@ def argument_passing(
     c,
     d=arg4,  #$ arg4 func=argument_passing
     *,
-    e=arg5,  #$ arg5 func=argument_passing
+    e=arg5,
     f,
     **g,
 ):
diff --git a/python/ql/test/experimental/dataflow/coverage/argumentRoutingTest.expected b/python/ql/test/experimental/dataflow/coverage/argumentRoutingTest.expected
index e69de29bb2..22037a40c3 100644
--- a/python/ql/test/experimental/dataflow/coverage/argumentRoutingTest.expected
+++ b/python/ql/test/experimental/dataflow/coverage/argumentRoutingTest.expected
@@ -0,0 +1,2 @@
+| argumentPassing.py:49:7:49:10 | ControlFlowNode for arg5 | Unexpected result: arg5= |
+| argumentPassing.py:49:7:49:10 | ControlFlowNode for arg5 | Unexpected result: func=argument_passing |
```
2022-02-01 17:50:06 +01:00
Rasmus Wriedt Larsen
76f3d74fed Python: Remove extra whitespace from argumentPassing.py 2022-02-01 17:48:16 +01:00
Rasmus Wriedt Larsen
5ee755db09 Python: Require MISSING: flow annotations for normal data-flow tests
I had to rewrite the SINK1-SINK7 definitions, since this new requirement
complained that we had to add this `MISSING: flow` annotation :D

Doing this implementation also revealed that there was a bug, since I
did not compare files when checking for these `MISSING:` annotations. So
fixed that up in the implementation for inline taint tests as well.

(extra whitespace in argumentPassing.py to avoid changing line numbers
for other tests)
2022-02-01 17:46:53 +01:00
Rasmus Wriedt Larsen
2bc4a60496 Python: Unify normal dataflow test setup
I went with NormalDataflowTest to signify that if you don't know what
you're looking for, this is probably the one. I did not want to just
call it DataflowTest, since that becomes a big vague when there are also
`FlowTest.qll` and `MaximalFlowTest.qll` -- I'm open to renaming this
though 👍
2022-02-01 17:31:31 +01:00
Rasmus Wriedt Larsen
8444388ec7 Python: Update .expected 2021-10-11 09:48:56 +02:00
Rasmus Wriedt Larsen
a50b193c40 Python: Model data-flow for x or y and x and y 2021-10-08 18:32:30 +02:00
Rasmus Wriedt Larsen
15476c2513 Python: Add data-flow tests for BoolExp
> 6.11. Boolean operations

> The expression x and y first evaluates x; if x is false, its value is
> returned; otherwise, y is evaluated and the resulting value is
> returned.

> The expression x or y first evaluates x; if x is true, its value is
> returned; otherwise, y is evaluated and the resulting value is
> returned.
2021-10-08 18:29:06 +02:00
Rasmus Lerchedahl Petersen
baca9edbb1 Merge branch 'main' of github.com:github/codeql into python-add-parameter-default-value-flow-step 2021-09-08 14:48:13 +02:00
Rasmus Lerchedahl Petersen
4a5f70e6c8 Python: Reclassify defaultValueFlowStep
as a `jumpStep`.
2021-09-08 10:05:31 +02:00
Anders Schack-Mulligen
f30dad7705 Dataflow: Update test expected outputs. 2021-09-07 13:02:20 +02:00
Taus
53711dc82f Merge pull request #5238 from RasmusWL/no-flow-default-value
Python: Highlight missing flow from default value in functions
2021-02-23 13:27:41 +01:00
Rasmus Wriedt Larsen
5249b54a9b Python: Highlight missing flow from default value in functions
Although it is becoming non-trivial to get an overview of what tests we have and
don't have, I didn't find any that highlighted this one

I used all 3 variants of parameters, just to be sure :)
2021-02-22 14:52:51 +01:00
Rasmus Lerchedahl Petersen
d23a8ad016 Python: elide test output 2021-02-21 13:12:54 +01:00
Rasmus Lerchedahl Petersen
46faba69ff Python: Fix for-iteration of tuples 2021-02-21 12:41:16 +01:00
Rasmus Lerchedahl Petersen
0aecf33fe6 Python: test iteration through overflow parameters
These are in a tuple, so the for-step does not fire
2021-02-21 12:33:04 +01:00
Taus
634041d2d7 Merge pull request #5047 from yoff/python-dataflow-unpacking-unifying-experiments
Python: dataflow, unify iterated unpacking
2021-02-04 12:57:43 +01:00
Rasmus Lerchedahl Petersen
a7ca065411 Python: Fix ForTarget 2021-02-03 22:14:15 +01:00
Rasmus Lerchedahl Petersen
27fd46b855 Python: Update test expectation 2021-02-01 08:55:20 +01:00
Rasmus Lerchedahl Petersen
f6fa1276a6 Python: Add consistency checks
to all data-flow test floders
2021-01-29 21:28:43 +01:00
Rasmus Lerchedahl Petersen
05a138694d Python: Fix crashing test 2021-01-29 21:12:44 +01:00
Rasmus Lerchedahl Petersen
182d435dc6 Python: Replace comprehension read-step by for
read-step. Add a version targetting sequence nodes.
2021-01-29 17:31:59 +01:00
Taus
cb195a0dc4 Merge pull request #4752 from yoff/python-dataflow-unpacking-assignment
Python: Dataflow, unpacking assignment
2021-01-29 14:15:28 +01:00
Rasmus Wriedt Larsen
902bade5ae Merge pull request #5015 from yoff/python-add-missing-postupdate-nodes
Python: add missing postupdate nodes
2021-01-26 14:39:29 +01:00
Rasmus Lerchedahl Petersen
7b9ca7171a Python: update test expectations 2021-01-26 09:47:48 +01:00