Commit Graph

1016 Commits

Author SHA1 Message Date
Taus Brock-Nannestad
1251bc57f5 Python: Fix bad join in TObject::literal_instantiation
Here, `context.appliesTo(n)` was being distributed across all of the
disjuncts, which caused poor performance.

The new helper predicate, `literal_node_class` should be fairly small,
since it only applies to a subset of `ControlFlowNode`s, and only
assigns a limited set of `ClassObjectInternal`s to these nodes.
2020-11-05 16:40:29 +01:00
Taus Brock-Nannestad
35a63e2411 Python: Fix bad join in regex::used_as_regex
Since the number of relevant attributes in the `re` module is fairly
small, it made sense to factor this out in a separate predicate, and
the join order also became more sensible.
2020-11-05 16:33:59 +01:00
Taus Brock-Nannestad
035e747ad5 Python: Fix slow use of regexCapture in Builtin::strValue
This is only _really_ expensive when there are a _lot_ of strings in
the database, but for this case, where we're always extracting the
same substring of the string, it's easier -- and faster -- to just
make a substring operation directly.
2020-11-05 16:33:33 +01:00
Taus Brock-Nannestad
83ba8c9bf5 Python: Add LocalSourceNode and flowsTo
This fixes the major performance problem with type tracking on
some (pathological) databases.

The interface could probably be improved a bit. In particular, I'm
thinking that we might want to have `DataFlow::exprNode` return a
`LocalSourceNode` so that a cast isn't necessary in order to use
`flowsTo`.

I have added two `cached` annotations. The one on `flowsTo` is
crucial, as performance regresses without it. The one on
`simpleLocalFlowStep` may not be needed, but Java has a similar
annotation, and to me it makes sense to have this relation cached.
2020-11-05 16:26:03 +01:00
yoff
79fcf598f3 Merge pull request #4608 from RasmusWL/patch-1
Python: Remove unnecessary cached annotation from adjacentRefUse
2020-11-04 16:08:30 +01:00
Rasmus Wriedt Larsen
31247739d7 Python: Remove unnecessary cached annotation from adjacentRefUse
As discussed in https://github.com/github/codeql/pull/4544#pullrequestreview-516575676
2020-11-04 15:16:08 +01:00
yoff
62cb4ec974 Merge pull request #4605 from RasmusWL/python-fix-django-response-modeling
Python: fix django response modeling
2020-11-04 15:00:52 +01:00
Rasmus Wriedt Larsen
5cf8285717 Python: Fix default mimetype for django FileResponse 2020-11-04 12:28:51 +01:00
Rasmus Wriedt Larsen
826aedeb85 Python: Remove resolved TODO 2020-11-04 12:17:31 +01:00
Rasmus Wriedt Larsen
353505ec6c Python: Handle content of Django redirects correctly 2020-11-04 12:10:58 +01:00
Taus
180373c41d Merge pull request #4597 from yoff/python-fix-ql-doc
Python: Fix ql doc
2020-11-04 11:37:32 +01:00
Rasmus Wriedt Larsen
92dc7dc2f3 Python: Use mimetype instead of content-type in django modeling
This enables the XSS query to actually find results from django responses.
2020-11-04 11:34:20 +01:00
Anders Schack-Mulligen
92494441a7 Merge pull request #4554 from aschackmull/dataflow/reverse-partial
Dataflow: Add support reverse partial flow exploration.
2020-11-03 15:34:30 +01:00
Rasmus Lerchedahl Petersen
1023b239e4 Python: Simplify doc 2020-11-03 12:10:00 +01:00
yoff
d6a33a1253 Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2020-11-03 12:04:43 +01:00
Rasmus Lerchedahl Petersen
b71ea40dbd Python: QL doc for Werkzeug 2020-11-03 11:44:48 +01:00
Rasmus Lerchedahl Petersen
1773cc3a38 Python: QL doc for MySQLdb 2020-11-03 11:39:28 +01:00
Rasmus Lerchedahl Petersen
01783acca6 Python: QL doc for RemoteFlowSources 2020-11-03 11:37:34 +01:00
Rasmus Lerchedahl Petersen
f44cbf4b6c Python: QL doc for TypeTracker 2020-11-03 11:32:57 +01:00
Rasmus Lerchedahl Petersen
50eb51b6fe Python: QL doc for StepSummary 2020-11-03 11:30:52 +01:00
Rasmus Lerchedahl Petersen
6103dbcfff Python: QL doc for Node 2020-11-03 11:13:58 +01:00
Rasmus Lerchedahl Petersen
2bb1917733 Python: QlDoc for content 2020-11-03 11:10:33 +01:00
Rasmus Wriedt Larsen
cac336d053 Python: Import Customizations into python
Using the pattern from JS and Java to make this the _first_ import in `<lang>.qll`
2020-11-03 10:23:05 +01:00
Anders Schack-Mulligen
2971784f9c Dataflow: Add missing qldoc and sync. 2020-11-03 09:21:48 +01:00
Anders Schack-Mulligen
7eb64aa998 Dataflow: Code review fixes. 2020-11-03 09:16:20 +01:00
Anders Schack-Mulligen
1ae76a80aa Dataflow: Fix qldoc. 2020-11-03 09:16:20 +01:00
Anders Schack-Mulligen
d5be4d7b92 Dataflow: Add support reverse partial flow exploration. 2020-11-03 09:16:19 +01:00
Taus Brock-Nannestad
8752b1af1e Python: Fix up remaining data-flow library copies 2020-11-02 23:02:04 +01:00
Taus Brock-Nannestad
b7773849d7 Python: Fix up some comments 2020-11-02 22:57:40 +01:00
Taus Brock-Nannestad
d8c554ed4f Python: Add redirects to old data-flow libraries 2020-11-02 22:20:16 +01:00
Taus Brock-Nannestad
a5121babc8 Python: The one with changes that don't look like renames anymore 2020-11-02 22:19:15 +01:00
Taus Brock-Nannestad
5156bf756d Python: Promote data-flow libraries
Step 1: Moving stuff around. Also includes a bit of import renaming.
2020-11-02 22:15:38 +01:00
yoff
c8bb0509e5 Merge pull request #4563 from tausbn/python-remove-refersto-from-regex-libs
Python: Remove `refersTo` from `regex.qll`
2020-10-28 13:37:14 +01:00
Taus Brock-Nannestad
1503c5ea16 Python: Remove refersTo from regex.qll
This was causing the old `Object` API stuff to be evaluated when using
our new library models (specifically the Django model).
2020-10-28 12:41:17 +01:00
Rasmus Wriedt Larsen
7993a83750 Merge pull request #4544 from tausbn/python-fix-bad-join-in-use-use-ssa
Python: Fix bad join order in `adjacentUseUseSameVar`
2020-10-23 14:37:27 +02:00
Taus Brock-Nannestad
6d81ca12c4 Python: Fix bad join order in adjacentUseUseSameVar 2020-10-23 14:08:45 +02:00
Erik Krogh Kristensen
e89e99deaa Merge pull request #4461 from erik-krogh/pyPrint
Python: implement printAst for Python
2020-10-22 09:37:10 +02:00
Erik Krogh Kristensen
e18cf08d99 documentation changes based on review 2020-10-21 09:45:16 +02:00
Erik Krogh Kristensen
c1dba2ee9f add a few shouldPrint calls to improve performance 2020-10-21 09:37:53 +02:00
Erik Krogh Kristensen
3306b59a14 Update python/ql/src/semmle/python/PrintAst.qll
Co-authored-by: yoff <lerchedahl@gmail.com>
2020-10-20 23:19:47 +02:00
Erik Krogh Kristensen
d629eea54e aggregate the arguments of a call into a synthetic node 2020-10-15 13:35:19 +02:00
Erik Krogh Kristensen
5770d0256f fixing printing of NameConstants 2020-10-15 13:32:22 +02:00
Erik Krogh Kristensen
2a5dd2c8a3 fix pretty-printing of number literals 2020-10-15 13:04:52 +02:00
Erik Krogh Kristensen
1d4a605517 remove location for synthetic nodes 2020-10-15 12:57:46 +02:00
Erik Krogh Kristensen
9da8c23717 change the order of the children from FunctionDef 2020-10-15 12:57:17 +02:00
Rasmus Wriedt Larsen
c5810d623b Merge pull request #4474 from tausbn/python-fix-tostring-divergence
Python: Fix divergence in tuple/subscripted type `toString`
2020-10-15 10:29:33 +02:00
Taus Brock-Nannestad
f8190feef2 Python: Fix divergence in tuple/subscripted type toString
A slightly more complicated version of the situation in
https://github.com/github/codeql/pull/2507 could cause the `toString`
calculation to diverge. Although the previous PR took tuples nested
inside tuples into account (and subscripted types cannot be nested
inside each other in our modelling), it did not account for having
this nesting be interleaved, and this is what caused the divergence.

I have not done the usual "test case first to show the problem
exists", since this would also diverge and take forever to fail. The
instance observed in `scipy` was likely caused by something akin to

```python
x = ()
while True:
    x = x[(x,)]
```

Finally, to prevent this from happening with other types, I went
through and checked each instance where the string representation of
an `ObjectInternal` might potentially contain a reference to
itself (and thus explode). I encapsulated this in a
`bounded_toString` helper predicate, and used this in all the cases
where I was able to determine that the above _could_ happen.
2020-10-14 16:13:03 +02:00
Erik Krogh Kristensen
9604705f64 remove pretty printing of bytes (unstable between minor versions) 2020-10-12 22:32:37 +02:00
Erik Krogh Kristensen
9b7c59f4b4 implement printAst for Python 2020-10-12 21:17:46 +02:00
Rasmus Wriedt Larsen
67c5c590d2 Python: Expose getParameter on ParameterNode 2020-10-07 12:28:35 +02:00