Commit Graph

3059 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
d151e21f15 Python: Move ControlFlowNode.toString() to AST cached stage
This means points-to is no longer evaluated for sql injection 🎉

Thanks @asgerf 💪
2022-11-24 10:14:39 +01:00
Erik Krogh Kristensen
1eec067474 Merge pull request #11294 from erik-krogh/fileDoc
QL: improve the "this block-comment should have been a QLDoc"-query
2022-11-23 22:23:36 +01:00
erik-krogh
95f35196e4 add missing additional keywords 2022-11-23 20:45:51 +01:00
Asger F
abf0c0f296 Python: update more comments referring to the package column 2022-11-23 15:02:08 +01:00
Asger F
1c910550e6 Python: merge package/type columns 2022-11-23 11:17:42 +01:00
Rasmus Wriedt Larsen
69b43f147a Python: Fix ql4ql alerts
The rest will be ignored.
2022-11-22 16:24:47 +01:00
Rasmus Wriedt Larsen
5866af413f Merge pull request #11347 from tausbn/python-clean-up-import-resolution
Python: Add change note for module resolution
2022-11-22 15:28:38 +01:00
Rasmus Wriedt Larsen
04a68f8d52 Merge pull request #11372 from RasmusWL/getpass
Python: Model `getpass.getpass` as source of passwords
2022-11-22 14:49:04 +01:00
Rasmus Wriedt Larsen
c0ad870949 Python: Exclude synthetic generator functions from DataFlowCallable 2022-11-22 14:46:33 +01:00
Rasmus Wriedt Larsen
36e8b8bfb9 Python: Add call-graph to cached dataflow stage
I didn't do any performance investigation on this, since it just seems
so much like the right approach.
2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
fc0545561e Python: Introduce points-to cached stage
With points-to not being used for the call-graph any longer, it's time
to split them.
2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
bd46b7deaa Python: Cache a few call-graph predicates
We DON'T want to recompute these ones for sure!
2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
6646e98d20 Python: Fix results outside DB for StackTraceExposure 2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
a301c93ebf Python: Fix results outside DB for CleartextLogging 2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
39ce50fadc Python: Fix problems with sinks in pathlib
This must mean that we did not have this flow with the old call-graph,
which means the new call-graph is doing a better job (yay).
2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
478f5ffe96 Python: Limit self argument for PotentialLibraryCall
Using the object from `MethodCallNode` meant that in the code below,
`lib` from the import expression would be considered a self argument

(this showed up in dataflow-consistency query results, that were not
comitted... sorry)

```
from lib import func
func()
```
2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
c4122275dc Python: Bring back support for flow-summaries
Also needed to fix up `TestUtil/UnresolvedCalls.qll` after a bad merge
conflict resolution. Since all calls are now DataFlowCall, and not JUST
the ones that can be resolved, we need to put in the restriction that
the callable can also be resolved.
2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
8a56b48357 Python: Support super().__new__(cls) 2022-11-22 14:46:32 +01:00
Rasmus Wriedt Larsen
a4e6433942 Python: add support for type(self)() 2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
1e96ced3ab Python: Ignore functions with @property decorator for now 2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
b33f02f9dc Python: Fix self-passing problems
This also fixes performance problems for pandas-dev/pandas
2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
5e5bab5a7c Python: Don't pass synthetic class instance to __new__ on class calls 2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
9949824810 Python: Expand implicit classmethods 2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
6fefd54533 Python: Consider __new__ a classmethod 2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
57c7dc8ea9 Python: Allow cls passing to classmethod 2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
8e0bb62516 Python: Remove pragma[inline] from parameterMatch
It's gotten complex enough that it doesn't by definition seem necessary
to inline it. (in the range of ~2200 results for django and pandas)
2022-11-22 14:46:31 +01:00
Rasmus Wriedt Larsen
98a849405f Python: Add support for late *args arguments 2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
035d083515 Python: Support flow to *args param from positional arg 2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
db921ac036 Python: Add basic support for *args 2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
c687df4ddc Python: Support flow to keyword param from **kwargs arg
When resolving merge conflict after flow-summaries was merged, this is
the original commit where I introduced ParameterNodeImpl, so this is the
commit where differences in that implementation was committed...

I removed TParameterNode, since I could not see we we gain anything from
having it.
2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
215a03d948 Python: Support flow to **kwargs param from keyword arg 2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
503ad544e9 Python: Remove impossible flow for **kwargs params 2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
5722d231bd Python: Add basic support for **kwargs
For now this is JUST from `**kwargs` in arguments, to `**kwargs`
parameters, and this part is based on field-flow

Note that dataflow-library complains about missing post update nodes for
these. This needs to be ignored, since post update nodes for `**kwargs`
arguments doesn't make sense, it's not possible to alter the dictionary
inside the method.
2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
7014be2047 Python: Reduce size of attrReadTracker
On pallets/flask, this reduced the number of tuples from
100866 results => 33060 results
2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
a5c3e850f1 Python: Handle __call__ 2022-11-22 14:46:30 +01:00
Rasmus Wriedt Larsen
9c275c177a Python: Implement call-graph with type-trackers
This commit is a squash of 80 other commits. While developing, things
changed majorly 2-3 times, and it just wasn't feasible to go back and
write a really nice commit history.

My apologies for this HUGE commit.

Also, later on this is where I solved merge conflicts after flow-summaries
PR was merged.

For your amusement, I've included the original commit messages below.

Python: Add proper argument/parameter positions

Python: Handle normal function calls

Python: Reduce dataflow-consistency warnings

Previously there was a lot of failures for `uniqueEnclosingCallable` and
`argHasPostUpdate`

Removing the override of `getEnclosingCallable` in ParameterNode is
probably the most controversial... although from my point of view it's a
change for the better, since we're able to provide data-flow
ParameterNodes for more of the AST parameter nodes.

Python: Adjust `dataflow/calls` test

Python: Implement `isParameterOf`/`argumentOf`/`OutNode`

This makes the tests under `dataflow/basic` work as well 👍

(initially I had these as separate commits, but it felt like it was too much noise)

Python: Accept fix for `dataflow/consistency`

Python: Changes to `coverage/argumentRoutingTest.ql`

Notice we gain a few new resolved arguments.

We loose out on stuff due to:

1. not handling `*` or `**` in either arguments/parameters (yet)
2. not handling special calls (yet)

Python: Small fix for `TestUtil/RoutingTest.qll`

Since the helper predicates do not depend on this, moved outside class.

Python: Accept changes to `dataflow/coverage/NormalDataflowTest.ql`

Most of this is due to:

- not handling any kinds of methods yet
- not handling `*` or `**`

Python: Small investigation of `test_deep_callgraph`

Python: Accept changes to `coverage/localFlow.ql`

I don't fully understand why the .expected file changed.

Since we still have the desired flow, I'm not going to worry too much
about it.

with this commit, the `dataflow/coverage` tests passes 👍

Python: Minor doc update

Python: Add staticmethod/classmethod to `dataflow/calls`

Python: Handle method calls on class instances

without trying to deal with any class inheritance, or
staticmethod/classmethod at all.

Notice that with this change, we only have a DataFlowCall for the calls
that we can actually resolve. I'm not 100% sure if we need to add a
`UnresolvedCall` subclass of `DataFlowCall` for MaD in the future, but
it should be easy to do.

I'm still unsure about the value of `classesCallGraph`, but have just
accepted the changes.

Python: Handle direct method calls `C.foo(C, arg0)`

Python: Handle `@staticmethod`

Python: Handle class method calls... but the code is shit

WIP todo

Rewrite method calls to be better

also fixed a problem with `self` being an argument to the `x.staticmethod()` call :|

Python: Add subclass tests

Python: Split `class_advanced` test

Python: Rewrite call-graph tests to be inline expectation (1/2)

This adds inline expectations, next commit will remove old annotations
code... but I thought it would be easier to review like this.

Minor fixup

Python: Add simple subclass support

Python: more precise subclass lookup

Still not 100% precise.. but it's better

New ambiguous

Python: Add test for `self.m()` and `cls.m()` calls

Python: Handle `self.m()` and `cls.m()` calls

Python: Add tests for `__init__` and `__new__`

Python: Handle class calls

Python: Fix `self` argument passing for class calls

Now field-flow tests also pass 💪 (although the crosstalk
fieldflow test changes were due to this specific commit)

I also copied much of the setup for pre/post update nodes from Ruby,
specifically having the abstract `PostUpdateNodeImpl` in DataFlowPrivate
seemed like a nice change.

Same for the setup with `TNode` definition having the specification
directly in the body, instead of a `NeedsSyntheticPostUpdateNode` class.

Python: Add new crosstalk test WIP

Maybe needs a bit of refactoring, and to see how it all behaves with points-to

Python: Add `super()` call-graph tests

Python: Refactor MethodCall char-pred

In anticipation of supporting `super(MyClass, self).foo()`, where the
`self` argument doesn't come from an AttrNode, but from the second
argument to super.

Without `pragma[inline]` the optimizer found a terrible join-order --
this won't guarantee a good join-order for the future, but for now it
was just so simple and could let me move on with life.

Python: Add basic `super()` support

I debated a little (with myself) whether I should really do
`superTracker`, but I thought "why not" and just rolled with it. I did
not confirm whether it was actually needed anywhere, that is if anyone
does `ref = super; ref().foo()` -- although I certainly doubt it's very
wide-spread.

Python: InlineCallGraphTest: Allow non-unique callable name in different files

Python: more MRO tests

Python: Add MRO approximation for `super()`

Although it's not 100% accurate, it seems to be on level with the one in
points-to.

Python: Remove some spurious targets for direct calls

removal of TODO from refactoring

remove TODOs class call support

Python: Add contrived subclass call example

Python: Remove more spurious call targets

NOTE: I initially forgot to use
`findFunctionAccordingToMroKnownStartingClass` instead of
`findFunctionAccordingToMro` for __init__ and __new__, and since I did
make that mistake myself, I wanted to add something to the test to
highlight this fact, and make it viewable by PR reviewer... this will be
fixed in the next commit.

Python: Proper fix for spurious __init__ targets

Python: Add call-graph example of class decorator

Python: Support decorated classes in new call-graph

Python: Add call-graph tests for `type(obj).meth()`

Python: support `type(obj).meth()`

Python: Add test for callable defined in function

Python: Add test for callable as argument

Current'y we don't find these with type-tracking, which is super
mysterious. I did check that we have proper flow from the arguments to
the parameters.

Python: Found problem for callable as argument :| MAJOR WIP

WIP commit

IT WORKS AGAIN (but terrible performance)

remove pragma[inline]

remove oops

Fix performance problem

I tried to optimize it even further, but I didn't end up achieving anything :|

Fix call-graph comparison

add comparison version with easy lookup

incomplete missing call-graph tests

unhandled tests

trying to replicate missing call-edge due to missing imports ... but it's hard

also seems to be problems with the inline-expectation-value that I used, seems like it has both missing/unexpected results with same value

Python: Add import-problem test

Python: Add shadowing problem

some cleanup of rewrite fix

a little more cleanup

Add consistency queries to call-graph tests

Python: Add post-update nodes for `self` in implicit `super()` uses

But we do need to discuss whether this is the right approach :O

Fix for field-flow tests

This came from more precise argument passing

Fixed results in type-tracking

Comes from better argument passing with super() and handling of
functions with decorators

fix of inline call graph tests

Fixup call annotation test

Many minor cleanups/fixes

NewNormalCall -> NormalCall

Python: Major restructuring + qldoc writing

Python: Accept changes from pre/post update node .toString changes

Python: Reduce `super` complexity !! WIP !!

Python: Only pass self-reference if in same enclosing-callable

Python: Add call-graph test with nested class

This was inspired by the ImpliesDataflow test that showed missing flow
for q_super, but at least for the call-graph, I'm not able to reproduce
this missing result :|

Python: Restrict `super()` to function defined directly on class

Python: Accept fixes to ImpliesDataflow

Python: Expand field-flow crosstalk tests
2022-11-22 14:46:29 +01:00
Rasmus Wriedt Larsen
716576b1d6 Python: Minimal type-tracking call-graph
That does absolutely nothing so far, but compiles
2022-11-22 14:46:29 +01:00
Rasmus Wriedt Larsen
6f5007b810 Python: Rename -> DataFlowDispatch
So diff can make more sense when introducing blank state for type-tracking based call-graph
2022-11-22 14:46:29 +01:00
Rasmus Wriedt Larsen
9195b73d84 Python: Model getpass.getpass as source of passwords 2022-11-22 14:11:52 +01:00
Rasmus Wriedt Larsen
80e71b202a Python: Cleartext queires: Remove flow from getpass.py 2022-11-22 14:08:00 +01:00
Taus
18be30d177 Python: Apply suggestion from review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2022-11-22 13:46:45 +01:00
Edoardo Pirovano
6c33ddcd47 Merge pull request #11349 from github/edoardo/2.11.4-mergeback
Merge `rc/3.8` into `main`
2022-11-21 18:08:27 +00:00
Taus
a385e87273 Python: Add change note for module resolution
Also adapts the version-specific tests to support results specific to
Python 2 (though at the moment there are no such tests).
2022-11-21 14:29:39 +00:00
Taus
8f4eb7107a Merge pull request #10861 from tausbn/python-clean-up-import-resolution
Python: Clean up import resolution
2022-11-21 15:18:08 +01:00
Tom Hvitved
99e70e9a50 Data flow: Sync files 2022-11-20 10:19:23 +01:00
Porcupiney Hairs
db231a111c Python : Improve the PAM authentication bypass query
The current PAM auth bypass query which was contributed by me a few months back, alert on a vulenrable function but does not check if the function is actually function. This leads to a lot of fasle positives.

With this PR, I add a taint-tracking configuration to check if the username parameter can actually be supplied by an attacker.

This should bring the FP's significantly down.
2022-11-19 01:29:25 +05:30
Taus
d79eed533b Python: Remove unwanted recursion
Depending on `localFlowStep` meant that this predicate ended up being
recursive with itself (by way of flow summaries which depend on API
graphs, which in turn depend on import resolution).

Changing this to use the simple local flow step predicate that we use
for type tracking should fix this issue.
2022-11-18 13:50:50 +00:00
github-actions[bot]
5b14ebf22a Post-release preparation for codeql-cli-2.11.4 2022-11-18 11:26:00 +00:00
Taus
e76ab8c78c Merge branch 'main' into python-clean-up-import-resolution 2022-11-17 22:47:50 +00:00
erik-krogh
468a879c1f Python: delete dead code. thanks QL-for-QL 2022-11-17 22:12:51 +01:00