Commit Graph

943 Commits

Author SHA1 Message Date
Rasmus Lerchedahl Petersen
e5f087518e Python: stay in control flow layer 2022-09-06 14:16:48 +02:00
Rasmus Lerchedahl Petersen
af08c6eb08 Python: remove repeated test file 2022-09-05 20:44:55 +02:00
yoff
9aa8b46cbf Python: remove redundant code 2022-08-25 12:48:08 +00:00
yoff
54dde41329 Python: remove example code 2022-08-25 12:19:12 +00:00
yoff
800165d63c python: udate deprecated call 2022-08-25 09:49:46 +00:00
yoff
d9444d8b08 Python: update synced file FlowSummaryImpl.qll 2022-08-25 09:31:45 +00:00
yoff
0b5d4c59dd Merge branch 'main' of https://github.com/github/codeql into python-dataflow/flow-summaries-from-scratch
synced files have changed
2022-08-25 09:24:05 +00:00
yoff
4a5fa5993d Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2022-08-25 10:47:16 +02:00
Ian Lynagh
501a9b3c6b Make *.qll non-executable 2022-08-24 16:36:15 +01:00
erik-krogh
82d9180892 only have one deprecated alias for XmlDtd 2022-08-23 10:38:23 +02:00
Erik Krogh Kristensen
7704a9eeac apply suggestions from Python review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2022-08-23 10:38:10 +02:00
erik-krogh
28083ebe09 run the implicit-this patch 2022-08-22 21:23:31 +02:00
erik-krogh
e89e0eb7fb make some acronyms camelCase 2022-08-22 21:22:35 +02:00
erik-krogh
2ac5441aec rename the XMLDTD class to XmlDTD 2022-08-22 14:09:19 +02:00
erik-krogh
1a89ddae5d update some comments from XML to Xml 2022-08-22 14:09:19 +02:00
erik-krogh
ce9f69a639 rename all occurrences of XML to Xml 2022-08-22 14:08:31 +02:00
Taus
c904ba1d16 Merge pull request #9852 from tausbn/python-add-str-class
Python: Add `Str` class
2022-08-22 10:55:01 +02:00
Tom Hvitved
1b29bddb73 Python: Revert change to AnyNode 2022-08-19 14:08:21 +02:00
Tom Hvitved
663096fe3a Remove redundant overrides 2022-08-19 13:57:41 +02:00
Erik Krogh Kristensen
e93ff8672c Merge pull request #10075 from erik-krogh/depOld
delete old deprecations
2022-08-17 21:21:57 +02:00
yoff
78756bdc6a Merge pull request #9859 from tausbn/python-fix-another-bad-value-transfer-join 2022-08-17 20:47:00 +02:00
Taus
1c15fc5600 Python: Define Str as an alias of StrConst 2022-08-17 13:36:32 +00:00
erik-krogh
5586c9a17e delete old deprecations 2022-08-16 22:27:15 +02:00
yoff
e7c6c04076 Merge pull request #9858 from tausbn/python-fix-bad-getOuterVariable-join
Python: Fix bad join in `getOuterVariable`
2022-08-16 14:40:42 +02:00
yoff
3006fa60c6 Merge pull request #9856 from tausbn/python-fix-bad-ScopeEntryDefinition-charpred-join
Python: Fix bad join in `ScopeEntryDefinition`
2022-08-16 14:37:53 +02:00
Taus
1f5176d623 Python: Simplify class_var_scope
Co-authored-by: yoff <lerchedahl@gmail.com>
2022-08-16 14:02:47 +02:00
Taus
b17e74dfe8 Python: Simplify binding fix
Co-authored-by: yoff <yoff@github.com>
2022-08-16 11:41:43 +00:00
erik-krogh
8e6a36256c import the non-deprecated NfaUtils in the overly-large-range query 2022-08-16 11:21:43 +02:00
Erik Krogh Kristensen
f106e064fa Merge pull request #9422 from erik-krogh/refacReDoS
Refactorizations of the ReDoS libraries
2022-08-16 09:32:08 +02:00
Erik Krogh Kristensen
0adb588fe8 Merge pull request #9712 from erik-krogh/badRange
JS/RB/PY/Java: add suspicious range query
2022-08-15 13:55:44 +02:00
Anders Schack-Mulligen
a3fb54c9de Merge pull request #10007 from aschackmull/dataflow/source-node-identity
Dataflow: Fix identification of source PathNodes in the presence of source-to-source flow
2022-08-15 10:39:17 +02:00
erik-krogh
3a4a3437b5 fix some QL-for-QL warnings 2022-08-12 20:38:50 +02:00
erik-krogh
b54f037424 Merge branch 'main' into refacReDoS 2022-08-12 20:28:30 +02:00
erik-krogh
b9e96fb078 sync changes to other languages 2022-08-12 20:28:12 +02:00
Erik Krogh Kristensen
73df8e4c7d Merge pull request #9832 from erik-krogh/misspellings
Fix lots of misspellings
2022-08-11 12:43:26 +02:00
Rasmus Wriedt Larsen
ff23f8ef86 Merge pull request #9855 from tausbn/python-fix-bad-scope_entry_transfer-join
Python: Fix bad join in scope entry transfer
2022-08-11 11:55:51 +02:00
Erik Krogh Kristensen
887f6557ed fix common misspellings throughout github/codeql 2022-08-10 23:21:41 +02:00
Anders Schack-Mulligen
abad133ab5 Dataflow: Fix identification of source PathNodes in the presence of source-to-source flow. 2022-08-10 15:02:56 +02:00
yoff
b8931d36ca python: give InterpretNode empty charpred
InterpreNode is going away, but we need a dummy implementation.
However, we do not need any instances, and some tests get confused.
2022-08-10 10:57:30 +00:00
Rasmus Wriedt Larsen
40d25cb34c Merge pull request #9849 from tausbn/python-fix-bad-essa-getInput-join
Python: Fix bad join in ESSA `getInput`
2022-08-10 11:45:23 +02:00
yoff
75ac24a847 Merge branch 'main' into python-dataflow/flow-summaries-from-scratch 2022-08-10 10:57:59 +02:00
Rasmus Wriedt Larsen
b541103b7f Merge pull request #9846 from tausbn/python-fix-bad-syntactic_call_count-join
Python: Fix bad join in `syntactic_call_count`
2022-08-10 10:09:51 +02:00
Erik Krogh Kristensen
559ec7ba56 Merge branch 'main' into repeatedWord 2022-08-09 21:22:47 +02:00
Erik Krogh Kristensen
49276b1f38 Merge branch 'main' into refacReDoS 2022-08-09 16:18:46 +02:00
Rasmus Wriedt Larsen
f89b32183f Merge branch 'main' into typetracker-decorators 2022-08-08 11:52:09 +02:00
Anders Schack-Mulligen
3d47875b60 Dataflow: Generate shorter RA/DIL names. 2022-08-05 11:00:56 +02:00
Anders Schack-Mulligen
d3dcc3ce3a Dataflow: Sync. 2022-08-05 11:00:56 +02:00
Rasmus Wriedt Larsen
1737d08145 Merge pull request #9579 from yoff/python/more-logic-tests
Python: Improve `BarrierGuard`
2022-08-01 11:36:11 +02:00
Taus
2436b060f1 Python: Fix another bad "value transfer" join
The culprit:

```
Tuple counts for PointsTo::InterProceduralPointsTo::scope_entry_value_transfer_from_earlier#741b54e2#ffff#join_rhs/5@eb1340iv after 12.6s:
72973    ~3%     {2} r1 = JOIN PointsToContext::TImportContext#cf3039a0#f WITH Definitions::NonEscapingGlobalVariable#class#486534ab#f CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 'arg1'
537932   ~0%     {3} r2 = JOIN r1 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg2', Lhs.1 'arg1', Lhs.0
982333   ~0%     {4} r3 = JOIN r2 WITH Essa::EssaVariable::getAUse#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'arg1', Lhs.0 'arg2', Rhs.1 'arg0'
37029774 ~0%     {4} r4 = JOIN r3 WITH Essa::TEssaNodeDefinition#24e22a14#ffff ON FIRST 1 OUTPUT Rhs.3 'arg3', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.3 'arg0'
35956211 ~0%     {5} r5 = JOIN r4 WITH Essa::ScopeEntryDefinition::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.3 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.0 'arg3', Rhs.1 'arg4'
                return r5
```

You may notice that this is a predicate that's _materialised_, but it's
never actually used anywhere. It's the old "standard order" bringing
much sadness.

The problem here is that in the standard order (which we never actually
use here), we end up with a join between the bits above, `getRootCall`,
and `appliesToScope`. The `join_rhs` bit is joined twice, once with
`getRootCall#prev` and `appliesToScope#prev_delta` (in that order), and
once with `prev` and `prev_delta` swapped.

So to fix this, I used the unbinding pragma to force `appliesToScope` to
appear first in the join order. This was enough to make the compiler
_not_ push the common context into its own `join_rhs` predicate (and
the join-order is still decent.)
2022-07-19 17:18:07 +00:00
Taus
b5cac9285e Python: Fix bad join in getOuterVariable
Much sadness:

```
Tuple counts for ImportTime::ImportTimeScope::getOuterVariable#dispred#f0820431#fff/3@64d04d33 after 7.6s:
19624    ~1%     {1} r1 = SCAN py_Classes OUTPUT In.0 'this'
19531    ~1%     {1} r2 = JOIN r1 WITH ImportTime::ImportTimeScope#class#7851b601#f ON FIRST 1 OUTPUT Lhs.0 'this'
19531    ~0%     {2} r3 = JOIN r2 WITH Scope::Scope::getEnclosingModule#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'this', Rhs.1
296389   ~0%     {3} r4 = JOIN r3 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'this', Lhs.1
296389   ~0%     {3} r5 = JOIN r4 WITH Variables::LocalVariable#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'var', Lhs.1 'this', Lhs.2
296389   ~1%     {4} r6 = JOIN r5 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'this', Lhs.0 'var', Rhs.1
62294919 ~0%     {4} r7 = JOIN r6 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.1 'this', Lhs.2 'var', Lhs.3
62294919 ~0%     {4} r8 = JOIN r7 WITH Variables::GlobalVariable#class#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'result', Lhs.3, Lhs.1 'this', Lhs.2 'var'
639      ~0%     {3} r9 = JOIN r8 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.3 'var', Lhs.0 'result'
                return r9
```

Clearly we _shouldn't_ be joining on `getId` as the last thing, as this
means we're building tuples of completely unrelated variables (not even
with the same name!) which obviously blows up.

A standard way of fixing this is to correlate as much information about
these variables as possible in a `nomagic`ked helper predicate. This is
what we do here, grouping together the variable with its scope and name
(both of which are uniquely determined by the variable). This results
in a much nicer join order:

```
Tuple counts for ImportTime::ImportTimeScope::getOuterVariable#dispred#f0820431#fff/3@82866b6p after 42ms:
23867  ~4%     {2} r1 = JOIN Scope::Scope::getEnclosingModule#dispred#f0820431#ff WITH ImportTime::ImportTimeScope#class#7851b601#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1
296389 ~0%     {4} r2 = JOIN r1 WITH ImportTime::class_var_scope#7851b601#fff ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this', Rhs.2 'var'
639    ~0%     {3} r3 = JOIN r2 WITH ImportTime::global_var_scope#7851b601#fff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.3 'var', Rhs.2 'result'
                return r3
```
```
Tuple counts for ImportTime::class_var_scope#7851b601#fff/3@366258vr after 47ms:
19624  ~1%     {1} r1 = SCAN py_Classes OUTPUT In.0 'scope'
296743 ~0%     {2} r2 = JOIN r1 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'scope'
296743 ~0%     {2} r3 = JOIN r2 WITH Variables::LocalVariable#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'var', Lhs.1 'scope'
296743 ~2%     {3} r4 = JOIN r3 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'scope', Rhs.1 'name', Lhs.0 'var'
                return r4
```
```
Tuple counts for ImportTime::global_var_scope#7851b601#fff/3@718e4bpm after 18ms:
108173 ~0%     {2} r1 = JOIN Variables::GlobalVariable#class#3aa06bbf#f WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'var', Rhs.1 'name'
108173 ~0%     {3} r2 = JOIN r1 WITH Variables::Variable::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'name', Rhs.1 'scope', Lhs.0 'var'
                return r2
```

(You may be wondering what's up with the order of arguments for the two
helper predicates. By ordering the arguments this way, there's no need
to reorder the resulting relations when used in `getOuterVariable.)
2022-07-19 17:14:37 +00:00