Commit Graph

2989 Commits

Author SHA1 Message Date
Anders Schack-Mulligen
a3fb54c9de Merge pull request #10007 from aschackmull/dataflow/source-node-identity
Dataflow: Fix identification of source PathNodes in the presence of source-to-source flow
2022-08-15 10:39:17 +02:00
erik-krogh
3a4a3437b5 fix some QL-for-QL warnings 2022-08-12 20:38:50 +02:00
erik-krogh
b54f037424 Merge branch 'main' into refacReDoS 2022-08-12 20:28:30 +02:00
erik-krogh
b9e96fb078 sync changes to other languages 2022-08-12 20:28:12 +02:00
github-actions[bot]
21d0c78376 Post-release preparation for codeql-cli-2.10.3 2022-08-11 23:20:39 +00:00
github-actions[bot]
57c4f9145b Release preparation for version 2.10.3 2022-08-11 11:12:15 +00:00
Erik Krogh Kristensen
73df8e4c7d Merge pull request #9832 from erik-krogh/misspellings
Fix lots of misspellings
2022-08-11 12:43:26 +02:00
Rasmus Wriedt Larsen
ff23f8ef86 Merge pull request #9855 from tausbn/python-fix-bad-scope_entry_transfer-join
Python: Fix bad join in scope entry transfer
2022-08-11 11:55:51 +02:00
Erik Krogh Kristensen
887f6557ed fix common misspellings throughout github/codeql 2022-08-10 23:21:41 +02:00
Anders Schack-Mulligen
abad133ab5 Dataflow: Fix identification of source PathNodes in the presence of source-to-source flow. 2022-08-10 15:02:56 +02:00
yoff
b8931d36ca python: give InterpretNode empty charpred
InterpreNode is going away, but we need a dummy implementation.
However, we do not need any instances, and some tests get confused.
2022-08-10 10:57:30 +00:00
Rasmus Wriedt Larsen
40d25cb34c Merge pull request #9849 from tausbn/python-fix-bad-essa-getInput-join
Python: Fix bad join in ESSA `getInput`
2022-08-10 11:45:23 +02:00
yoff
75ac24a847 Merge branch 'main' into python-dataflow/flow-summaries-from-scratch 2022-08-10 10:57:59 +02:00
Rasmus Wriedt Larsen
b541103b7f Merge pull request #9846 from tausbn/python-fix-bad-syntactic_call_count-join
Python: Fix bad join in `syntactic_call_count`
2022-08-10 10:09:51 +02:00
Erik Krogh Kristensen
559ec7ba56 Merge branch 'main' into repeatedWord 2022-08-09 21:22:47 +02:00
Erik Krogh Kristensen
49276b1f38 Merge branch 'main' into refacReDoS 2022-08-09 16:18:46 +02:00
Rasmus Wriedt Larsen
f89b32183f Merge branch 'main' into typetracker-decorators 2022-08-08 11:52:09 +02:00
Anders Schack-Mulligen
3d47875b60 Dataflow: Generate shorter RA/DIL names. 2022-08-05 11:00:56 +02:00
Anders Schack-Mulligen
d3dcc3ce3a Dataflow: Sync. 2022-08-05 11:00:56 +02:00
Alex Ford
8e3548efb3 Merge branch 'main' into post-release-prep/codeql-cli-2.10.2 2022-08-02 20:29:26 +01:00
Rasmus Wriedt Larsen
1737d08145 Merge pull request #9579 from yoff/python/more-logic-tests
Python: Improve `BarrierGuard`
2022-08-01 11:36:11 +02:00
github-actions[bot]
e8747d3176 Post-release preparation for codeql-cli-2.10.2 2022-07-28 20:00:09 +00:00
github-actions[bot]
212786ed91 Release preparation for version 2.10.2 2022-07-28 13:38:35 +00:00
Andrew Eisenberg
43ae5d4285 Merge pull request #9838 from github/aeisenberg/python-local-ref-def
Move python contextual queries to lib folders
2022-07-25 09:00:32 -07:00
Taus
2436b060f1 Python: Fix another bad "value transfer" join
The culprit:

```
Tuple counts for PointsTo::InterProceduralPointsTo::scope_entry_value_transfer_from_earlier#741b54e2#ffff#join_rhs/5@eb1340iv after 12.6s:
72973    ~3%     {2} r1 = JOIN PointsToContext::TImportContext#cf3039a0#f WITH Definitions::NonEscapingGlobalVariable#class#486534ab#f CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 'arg1'
537932   ~0%     {3} r2 = JOIN r1 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg2', Lhs.1 'arg1', Lhs.0
982333   ~0%     {4} r3 = JOIN r2 WITH Essa::EssaVariable::getAUse#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'arg1', Lhs.0 'arg2', Rhs.1 'arg0'
37029774 ~0%     {4} r4 = JOIN r3 WITH Essa::TEssaNodeDefinition#24e22a14#ffff ON FIRST 1 OUTPUT Rhs.3 'arg3', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.3 'arg0'
35956211 ~0%     {5} r5 = JOIN r4 WITH Essa::ScopeEntryDefinition::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.3 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.0 'arg3', Rhs.1 'arg4'
                return r5
```

You may notice that this is a predicate that's _materialised_, but it's
never actually used anywhere. It's the old "standard order" bringing
much sadness.

The problem here is that in the standard order (which we never actually
use here), we end up with a join between the bits above, `getRootCall`,
and `appliesToScope`. The `join_rhs` bit is joined twice, once with
`getRootCall#prev` and `appliesToScope#prev_delta` (in that order), and
once with `prev` and `prev_delta` swapped.

So to fix this, I used the unbinding pragma to force `appliesToScope` to
appear first in the join order. This was enough to make the compiler
_not_ push the common context into its own `join_rhs` predicate (and
the join-order is still decent.)
2022-07-19 17:18:07 +00:00
Taus
b5cac9285e Python: Fix bad join in getOuterVariable
Much sadness:

```
Tuple counts for ImportTime::ImportTimeScope::getOuterVariable#dispred#f0820431#fff/3@64d04d33 after 7.6s:
19624    ~1%     {1} r1 = SCAN py_Classes OUTPUT In.0 'this'
19531    ~1%     {1} r2 = JOIN r1 WITH ImportTime::ImportTimeScope#class#7851b601#f ON FIRST 1 OUTPUT Lhs.0 'this'
19531    ~0%     {2} r3 = JOIN r2 WITH Scope::Scope::getEnclosingModule#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'this', Rhs.1
296389   ~0%     {3} r4 = JOIN r3 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'this', Lhs.1
296389   ~0%     {3} r5 = JOIN r4 WITH Variables::LocalVariable#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'var', Lhs.1 'this', Lhs.2
296389   ~1%     {4} r6 = JOIN r5 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'this', Lhs.0 'var', Rhs.1
62294919 ~0%     {4} r7 = JOIN r6 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.1 'this', Lhs.2 'var', Lhs.3
62294919 ~0%     {4} r8 = JOIN r7 WITH Variables::GlobalVariable#class#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'result', Lhs.3, Lhs.1 'this', Lhs.2 'var'
639      ~0%     {3} r9 = JOIN r8 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.3 'var', Lhs.0 'result'
                return r9
```

Clearly we _shouldn't_ be joining on `getId` as the last thing, as this
means we're building tuples of completely unrelated variables (not even
with the same name!) which obviously blows up.

A standard way of fixing this is to correlate as much information about
these variables as possible in a `nomagic`ked helper predicate. This is
what we do here, grouping together the variable with its scope and name
(both of which are uniquely determined by the variable). This results
in a much nicer join order:

```
Tuple counts for ImportTime::ImportTimeScope::getOuterVariable#dispred#f0820431#fff/3@82866b6p after 42ms:
23867  ~4%     {2} r1 = JOIN Scope::Scope::getEnclosingModule#dispred#f0820431#ff WITH ImportTime::ImportTimeScope#class#7851b601#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1
296389 ~0%     {4} r2 = JOIN r1 WITH ImportTime::class_var_scope#7851b601#fff ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this', Rhs.2 'var'
639    ~0%     {3} r3 = JOIN r2 WITH ImportTime::global_var_scope#7851b601#fff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.3 'var', Rhs.2 'result'
                return r3
```
```
Tuple counts for ImportTime::class_var_scope#7851b601#fff/3@366258vr after 47ms:
19624  ~1%     {1} r1 = SCAN py_Classes OUTPUT In.0 'scope'
296743 ~0%     {2} r2 = JOIN r1 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'scope'
296743 ~0%     {2} r3 = JOIN r2 WITH Variables::LocalVariable#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'var', Lhs.1 'scope'
296743 ~2%     {3} r4 = JOIN r3 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'scope', Rhs.1 'name', Lhs.0 'var'
                return r4
```
```
Tuple counts for ImportTime::global_var_scope#7851b601#fff/3@718e4bpm after 18ms:
108173 ~0%     {2} r1 = JOIN Variables::GlobalVariable#class#3aa06bbf#f WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'var', Rhs.1 'name'
108173 ~0%     {3} r2 = JOIN r1 WITH Variables::Variable::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'name', Rhs.1 'scope', Lhs.0 'var'
                return r2
```

(You may be wondering what's up with the order of arguments for the two
helper predicates. By ordering the arguments this way, there's no need
to reorder the resulting relations when used in `getOuterVariable.)
2022-07-19 17:14:37 +00:00
Taus
cfacd015b9 Python: Fix bad join in ScopeEntryDefinition
Before:

```
Tuple counts for Essa::ScopeEntryDefinition#class#24e22a14#f/1@45e0d8dh after 10.5s:
2133368   ~1%     {2} r1 = Essa::TEssaNodeDefinition#24e22a14#ffff_03#join_rhs AND NOT Essa::ImplicitSubModuleDefinition#class#24e22a14#f(Lhs.1 'this')
534478950 ~0%     {2} r2 = JOIN r1 WITH Definitions::SsaSourceVariable::getScopeEntryDefinition#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'this', Rhs.1
581249    ~4%     {1} r3 = JOIN r2 WITH Essa::EssaNodeDefinition::getDefiningNode#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.0 'this'
                return r3
```

Let's see if pushing the `getDefiningNode` join further up improves the
number of intermediary tuples. (Intuitively it should, since there
should only be one defining node for any given `EssaNodeDefinition`.)

To do this, we unbind the `this.getSourceVariable()` part, which
encourages the compiler to put this join later.

After:

```
Tuple counts for Essa::ScopeEntryDefinition#class#24e22a14#f/1@30758cv4 after 300ms:
2133569 ~1%     {2} r1 = SCAN Essa::TEssaNodeDefinition#24e22a14#ffff OUTPUT In.0, In.3 'this'
2133368 ~1%     {2} r2 = r1 AND NOT Essa::ImplicitSubModuleDefinition#class#24e22a14#f(Lhs.1 'this')
2133368 ~0%     {2} r3 = JOIN r2 WITH Definitions::SsaSourceVariable#class#486534ab#f ON FIRST 1 OUTPUT Lhs.1 'this', Lhs.0
2133368 ~0%     {3} r4 = JOIN r3 WITH Essa::EssaNodeDefinition::getDefiningNode#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.0 'this'
581249  ~4%     {1} r5 = JOIN r4 WITH Definitions::SsaSourceVariable::getScopeEntryDefinition#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this'
                return r5
```

Much better (and our intuition is confirmed -- joining with
`getDefiningNode` did not increase the number of tuples).
2022-07-19 14:28:25 +00:00
Asger F
b9bdee6651 Merge branch 'main' into post-release-prep/codeql-cli-2.10.1 2022-07-19 16:24:35 +02:00
Taus
87960b6e42 Python: Fix bad join in scope entry transfer
How it started:

```
Tuple counts for Base::BaseFlow::scope_entry_value_transfer_from_earlier#f76ef5bb#ffff/4@f2af49f5 after 18s:
1526390  ~0%     {3} r1 = JOIN Base::BaseFlow::scope_entry_value_transfer_from_earlier#f76ef5bb#ffff#shared WITH Essa::EssaVariable::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Rhs.1 'pred_scope', Lhs.0 'pred_var', Lhs.1
7798319  ~0%     {4} r2 = JOIN r1 WITH Scope::Scope::precedes#dispred#f0820431#ff ON FIRST 1 OUTPUT Rhs.1 'succ_scope', Lhs.1 'pred_var', Lhs.2, Lhs.0 'pred_scope'

5427334  ~0%     {4} r3 = JOIN Base::BaseFlow::scope_entry_value_transfer_from_earlier#f76ef5bb#ffff#shared#1 WITH Scope::Scope::precedes#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'pred_var', Lhs.2, Lhs.0 'pred_scope', Rhs.1 'succ_scope'
5426883  ~0%     {4} r4 = r3 AND NOT Base::BaseFlow::scope_entry_value_transfer_from_earlier#f76ef5bb#ffff#antijoin_rhs(Lhs.0 'pred_var', Lhs.1, Lhs.2 'pred_scope', Lhs.3)
5426883  ~0%     {5} r5 = SCAN r4 OUTPUT In.3, "__init__", In.0 'pred_var', In.1, In.2 'pred_scope'
2002084  ~0%     {4} r6 = JOIN r5 WITH Scope::Scope::getName#dispred#f0820431#fb ON FIRST 2 OUTPUT Lhs.0, Lhs.2 'pred_var', Lhs.3, Lhs.4 'pred_scope'
39293988 ~2%     {4} r7 = JOIN r6 WITH Scope::Scope::precedes#dispred#f0820431#ff ON FIRST 1 OUTPUT Rhs.1 'succ_scope', Lhs.1 'pred_var', Lhs.2, Lhs.3 'pred_scope'

47092307 ~0%     {4} r8 = r2 UNION r7
94173236 ~7%     {5} r9 = JOIN r8 WITH Essa::ScopeEntryDefinition::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.2, Rhs.1 'succ_def', Lhs.1 'pred_var', Lhs.3 'pred_scope', Lhs.0 'succ_scope'
599441   ~1%     {4} r10 = JOIN r9 WITH Essa::TEssaNodeDefinition#24e22a14#ffff_03#join_rhs ON FIRST 2 OUTPUT Lhs.2 'pred_var', Lhs.3 'pred_scope', Lhs.1 'succ_def', Lhs.4 'succ_scope'
                return r10
```

How it ended:

```
Tuple counts for Base::essa_var_scope#f76ef5bb#fff/3@20fd243c after 153ms:
1526390 ~0%     {2} r1 = JOIN Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff WITH Base::BaseFlow::reaches_exit#f76ef5bb#f ON FIRST 1 OUTPUT Lhs.0 'pred_var', Lhs.1 'var'
1526390 ~5%     {3} r2 = JOIN r1 WITH Essa::EssaVariable::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'var', Rhs.1 'pred_scope', Lhs.0 'pred_var'
                return r2
```
```

Tuple counts for Base::scope_entry_def_scope#f76ef5bb#fff/3@34224fid after 40ms:
581249 ~1%     {3} r1 = JOIN Essa::TEssaNodeDefinition#24e22a14#ffff_30#join_rhs WITH Essa::ScopeEntryDefinition::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'var', Rhs.1 'succ_scope', Lhs.0 'succ_def'
                return r1
```
```
Tuple counts for Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff#shared/5@cb3c45lu after 76ms:
471230 ~0%     {3} r1 = JOIN Variables::GlobalVariable#class#3aa06bbf#f WITH Base::scope_entry_def_scope#f76ef5bb#fff ON FIRST 1 OUTPUT Rhs.1 'arg1', Lhs.0 'arg0', Rhs.2 'arg2'
313791 ~2%     {5} r2 = JOIN r1 WITH Base::step_through_init#f76ef5bb#fff ON FIRST 1 OUTPUT Lhs.1 'arg0', Lhs.0 'arg1', Lhs.2 'arg2', Rhs.1 'arg3', Rhs.2 'arg4'
                return r2
```
```
Tuple counts for Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff#antijoin_rhs/5@886d8bvr after 67ms:
508926 ~0%      {6} r1 = JOIN Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff#shared WITH Exprs::Name::defines#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.4 'arg4', Lhs.0 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.3 'arg3'
25     ~46%     {5} r2 = JOIN r1 WITH Exprs::Expr::getScope#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'arg0', Lhs.3 'arg1', Lhs.4 'arg2', Lhs.5 'arg3', Lhs.1 'arg4'
                return r2
```
```
Tuple counts for Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff/4@87ec703f after 80ms:
313774 ~2%     {5} r1 = Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff#shared AND NOT Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff#antijoin_rhs(Lhs.0, Lhs.1 'succ_scope', Lhs.2 'succ_def', Lhs.3 'pred_scope', Lhs.4)
313774 ~0%     {4} r2 = SCAN r1 OUTPUT In.3 'pred_scope', In.0, In.1 'succ_scope', In.2 'succ_def'
313774 ~4%     {4} r3 = JOIN r2 WITH @py_scope#f ON FIRST 1 OUTPUT Lhs.1, Lhs.0 'pred_scope', Lhs.2 'succ_scope', Lhs.3 'succ_def'
313778 ~0%     {4} r4 = JOIN r3 WITH Base::essa_var_scope#f76ef5bb#fff ON FIRST 2 OUTPUT Rhs.2 'pred_var', Lhs.1 'pred_scope', Lhs.3 'succ_def', Lhs.2 'succ_scope'
                return r4
```
```
Tuple counts for Base::step_through_init#f76ef5bb#fff/3@7ba1ee1c after 17ms:
11763  ~0%     {1} r1 = JOIN Scope::Scope::precedes#dispred#f0820431#ff#join_rhs WITH Scope::Scope::getName#dispred#f0820431#fb_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'init'
196671 ~4%     {2} r2 = JOIN r1 WITH Scope::Scope::precedes#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'init', Rhs.1 'succ_scope'
196671 ~6%     {3} r3 = JOIN r2 WITH Scope::Scope::precedes#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'succ_scope', Rhs.1 'pred_scope', Lhs.0 'init'
                return r3
```
```
Tuple counts for Base::BaseFlow::scope_entry_value_transfer_from_earlier#f76ef5bb#ffff/4@4892f93f after 426ms:
1526390 ~0%     {3} r1 = SCAN Base::essa_var_scope#f76ef5bb#fff OUTPUT In.1, In.0, In.2 'pred_var'
7798319 ~0%     {4} r2 = JOIN r1 WITH Scope::Scope::precedes#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1 'succ_scope', Rhs.0, Lhs.2 'pred_var'
285663  ~3%     {4} r3 = JOIN r2 WITH Base::scope_entry_def_scope#f76ef5bb#fff ON FIRST 2 OUTPUT Lhs.3 'pred_var', Lhs.2 'pred_scope', Rhs.2 'succ_def', Lhs.1 'succ_scope'

599441  ~1%     {4} r4 = Base::scope_entry_value_transfer_through_init#f76ef5bb#ffff UNION r3
                return r4
```

It's possible this could be improved even further, but I think this is
good enough. (I'm not entirely happy with how many helper predicates I
ended up needing, but it was the only way I could get the joins to
happen in a semi-sensible order.)
2022-07-19 13:46:55 +00:00
Taus
bde47836d0 Python: Add Str class
This makes the AST viewer (which annotates string constant nodes as
`Str`) a bit more consistent.
2022-07-19 12:25:10 +00:00
Taus
8c0725e8c6 Python: Fix bad join in ESSA getInput
Before:

```
Tuple counts for Essa::EssaEdgeRefinement::getInput#dispred#f0820431#ff/2@b84afc77 after 20.3s:
873421    ~0%     {3} r1 = JOIN Essa::TEssaEdgeDefinition#24e22a14#ffff_31#join_rhs WITH Essa::TEssaEdgeDefinition#24e22a14#ffff_30#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this'
181627951 ~0%     {3} r2 = JOIN r1 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1, Lhs.2 'this'
873418    ~0%     {2} r3 = JOIN r2 WITH Essa::EssaDefinition::reachesEndOfBlock#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.0 'result'
                return r3
```
It's perhaps not immediately obvious what's going on here (because of
the `...join_rhs` indirection), but basically we're joining together
`this` and `def` and their `getSourceVariable`, and only then actually
relating `this` and `def` through `reachesEndOfBlock`.

By unbinding `var`, we prevent this early join, which now encourages the
`reachesEndOfBlock` join to happen earlier:

```
Tuple counts for Essa::EssaEdgeRefinement::getInput#dispred#f0820431#ff/2@2d63e5lb after 2s
873421  ~0%     {2} r1 = SCAN Essa::TEssaEdgeDefinition#24e22a14#ffff OUTPUT In.3 'this', In.1
873421  ~0%     {3} r2 = JOIN r1 WITH Essa::TEssaEdgeDefinition#24e22a14#ffff_30#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this'
873421  ~0%     {3} r3 = JOIN r2 WITH Definitions::SsaSourceVariable#class#486534ab#f ON FIRST 1 OUTPUT Lhs.1, Lhs.2 'this', Lhs.0
8758877 ~0%     {3} r4 = JOIN r3 WITH Essa::EssaDefinition::reachesEndOfBlock#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.2, Lhs.1 'this'
873418  ~0%     {2} r5 = JOIN r4 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.0 'result'
                return r5
```
2022-07-18 20:21:39 +00:00
alexet
f9b6ca76e5 Python: Fix binding incorrect predicate. 2022-07-18 16:28:19 +01:00
Taus
bdd771989f Python: Fix bad join in syntactic_call_count
On certain databases, the evaluation of this predicate was running out
of memory due to the way the `count` aggregate was being used. Here's
an example of the tuple counts involved:

```
Tuple counts for PointsToContext::syntactic_call_count#cf3039a0#ff#antijoin_rhs/1@d2199bb8 after 1m27s:
595518502 ~521250%     {1} r1 = JOIN PointsToContext::syntactic_call_count#cf3039a0#ff#shared#3 WITH Flow::CallNode::getFunction#dispred#f0820431#ff_1#join_rhs ON FIRST 1 OUTPUT Lhs.1 'arg0'

26518709  ~111513%     {1} r2 = JOIN PointsToContext::syntactic_call_count#cf3039a0#ff#shared#2 WITH Flow::CallNode::getFunction#dispred#f0820431#ff_1#join_rhs ON FIRST 1 OUTPUT Lhs.1 'arg0'

622037211 ~498045%     {1} r3 = r1 UNION r2
                        return r3
```

and a timing report that looked like this:
```
time  | evals |   max @ iter | predicate
------|-------|--------------|----------
 5m8s |       |              | PointsToContext::syntactic_call_count#cf3039a0#ff#shared#2@6d98d1nd
4m38s |       |              | PointsToContext::syntactic_call_count#cf3039a0#ff#count_range@f5df1do4
3m51s |       |              | PointsToContext::syntactic_call_count#cf3039a0#ff#shared#3@da3b4abf
1m58s |  7613 |  37ms @ 4609 | MRO::ClassListList::removedClassParts#f0820431#fffff#reorder_2_3_4_0_1@8155axyi
1m37s |  7613 |  33ms @ 3904 | MRO::ClassListList::bestMergeCandidate#f0820431#2#fff@8155a83w
1m27s |       |              | PointsToContext::syntactic_call_count#cf3039a0#ff#antijoin_rhs@d2199bb8
 1m8s |  1825 |  63ms @ 404  | PointsTo::Expressions::equalityEvaluatesTo#741b54e2#fffff@8155aw7w
37.6s |       |              | PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs@e348fc1p
...
```

To make optimising this easier for the compiler, I moved the bodies of
the `count` aggregate into their own helper predicates (with size
linear in the number of `CallNode`s), and also factored out the many
calls to `f.getName()`.

The astute reader will notice that in writing this as a sum of `count`s
rather than a count of a disjunction, the intersection (if it exists)
will be counted twice, and so the semantics may be different. However,
since `method_call` and `function_call` require `AttrNode` and
`NameNode` functions respectively, and as these two types are disjoint,
there is no intersection, and so the semantics should be preserved.

After the change, the evaluation of `syntactic_call_count` now looks as
follows:
```
Tuple counts for PointsToContext::syntactic_call_count#cf3039a0#ff/2@662dd8s0 after 216ms:
23960  ~0%     {1} r1 = @py_scope#f AND NOT py_Functions_0#antijoin_rhs(Lhs.0 's')
23960  ~0%     {2} r2 = SCAN r1 OUTPUT In.0 's', 0

276309 ~7%     {2} r3 = SCAN @py_scope#f OUTPUT In.0 's', "__init__"
11763  ~0%     {2} r4 = JOIN r3 WITH Scope::Scope::getName#dispred#f0820431#fb ON FIRST 2 OUTPUT Lhs.0 's', 1

35723  ~0%     {2} r5 = r2 UNION r4

252349 ~0%     {2} r6 = JOIN @py_scope#f WITH Function::Function::getName#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 's', Rhs.1

240586 ~0%     {2} r7 = SELECT r6 ON In.1 != "__init__"

131727 ~4%     {2} r8 = r7 AND NOT project#PointsToContext::method_call#cf3039a0#ff(Lhs.1)
131727 ~0%     {3} r9 = SCAN r8 OUTPUT In.1, In.0 's', 0

240586 ~0%     {2} r10 = SCAN r7 OUTPUT In.1, In.0 's'

108859 ~0%     {3} r11 = JOIN r10 WITH PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs ON FIRST 1 OUTPUT Lhs.0, Lhs.1 's', Rhs.1

240586 ~0%     {3} r12 = r9 UNION r11
24100  ~0%     {2} r13 = JOIN r12 WITH PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs#1 ON FIRST 1 OUTPUT Lhs.1 's', (Rhs.1 + Lhs.2)

240586 ~0%     {2} r14 = SELECT r6 ON In.1 != "__init__"
131727 ~4%     {2} r15 = r14 AND NOT project#PointsToContext::method_call#cf3039a0#ff(Lhs.1)
131727 ~0%     {3} r16 = SCAN r15 OUTPUT In.0 's', In.1, 0

108859 ~4%     {3} r17 = JOIN r10 WITH PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs ON FIRST 1 OUTPUT Lhs.1 's', Lhs.0, Rhs.1

240586 ~4%     {3} r18 = r16 UNION r17
216486 ~2%     {3} r19 = r18 AND NOT project#PointsToContext::function_call#cf3039a0#ff(Lhs.1)
216486 ~0%     {2} r20 = SCAN r19 OUTPUT In.0 's', (0 + In.2)

240586 ~0%     {2} r21 = r13 UNION r20
276309 ~0%     {2} r22 = r5 UNION r21
                return r22
```
2022-07-18 13:58:00 +00:00
Andrew Eisenberg
b897a40228 Move python contextual queries to lib folders
This will ensure that python projects can use jump to ref/def in
vscode when the core libraries are not installed.
2022-07-15 13:12:17 -07:00
github-actions[bot]
0ee476129a Post-release preparation for codeql-cli-2.10.1 2022-07-14 14:38:49 +00:00
Erik Krogh Kristensen
85a652f3d1 remove a bunch of repeated words 2022-07-14 12:42:48 +02:00
github-actions[bot]
d1aa0d7dd3 Release preparation for version 2.10.1 2022-07-14 08:56:03 +00:00
Erik Krogh Kristensen
595875ff98 remove redundant not-equals check 2022-07-13 12:06:12 +02:00
Erik Krogh Kristensen
8e52fc97fc changes based on review by Shack 2022-07-12 16:02:50 +02:00
Erik Krogh Kristensen
aae3e2ddde other changes based on Esbens review 2022-07-12 16:02:50 +02:00
Erik Krogh Kristensen
ff25451699 rename query to overly-large-range, and rewrite the @description 2022-07-12 16:02:46 +02:00
yoff
f52d792b36 Merge branch 'main' of https://github.com/github/codeql into python-dataflow/flow-summaries-from-scratch 2022-07-01 12:01:07 +00:00
CodeQL CI
5b5a52fa25 Merge pull request #9551 from yoff/python/port-tarslip
Approved by RasmusWL
2022-07-01 12:58:25 +01:00
yoff
61523bd330 python: better names
- "Normal" instead of "NonSpecial"
- "NonLibrary" instead of "2"

I could not find a good replacement for "NonLibrary", nor for "Source",
but I added QLDocs in a few places to help the reading.
2022-07-01 11:55:20 +00:00
yoff
a0db438799 python: rename getACall2 -> getANonLibraryCall 2022-07-01 10:29:03 +00:00
yoff
f6af24894d python: recover isPackageUsed
- add `unknownAttribute` to pre-compute negation
- add `Node`-less formulation of "is imported"
2022-07-01 09:39:07 +00:00
yoff
3a80baf39c python: concession to get the code to compile
`isPackageUsed` now does no filtering
2022-07-01 07:06:09 +00:00
yoff
e54ada175d python: rewrite not away
A `LocalSourceNode` is either a `ModuleVariableNode`
or an `ExprNode`.
2022-07-01 07:03:14 +00:00
yoff
cf9b69b5f2 python: More helpful comment 2022-06-30 13:07:13 +00:00
yoff
b0a29b146a Update python/ql/lib/semmle/python/security/dataflow/TarSlipQuery.qll
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2022-06-30 14:54:01 +02:00