The culprit:
```
Tuple counts for PointsTo::InterProceduralPointsTo::scope_entry_value_transfer_from_earlier#741b54e2#ffff#join_rhs/5@eb1340iv after 12.6s:
72973 ~3% {2} r1 = JOIN PointsToContext::TImportContext#cf3039a0#f WITH Definitions::NonEscapingGlobalVariable#class#486534ab#f CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 'arg1'
537932 ~0% {3} r2 = JOIN r1 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg2', Lhs.1 'arg1', Lhs.0
982333 ~0% {4} r3 = JOIN r2 WITH Essa::EssaVariable::getAUse#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'arg1', Lhs.0 'arg2', Rhs.1 'arg0'
37029774 ~0% {4} r4 = JOIN r3 WITH Essa::TEssaNodeDefinition#24e22a14#ffff ON FIRST 1 OUTPUT Rhs.3 'arg3', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.3 'arg0'
35956211 ~0% {5} r5 = JOIN r4 WITH Essa::ScopeEntryDefinition::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.3 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Lhs.0 'arg3', Rhs.1 'arg4'
return r5
```
You may notice that this predicate is _materialised_, yet it's never
actually used anywhere. It's the old "standard order" bringing
much sadness.
The problem here is that in the standard order (which we never actually
use here), we end up with a join between the bits above, `getRootCall`,
and `appliesToScope`. The `join_rhs` bit is joined twice, once with
`getRootCall#prev` and `appliesToScope#prev_delta` (in that order), and
once with `prev` and `prev_delta` swapped.
So to fix this, I used the unbinding pragma to force `appliesToScope` to
appear first in the join order. This was enough to make the compiler
_not_ push the common context into its own `join_rhs` predicate (and
the resulting join order is still decent).
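For reference, the unbinding trick has roughly this shape. The snippet below is a minimal, self-contained illustration of the pattern using only public classes rather than the actual points-to predicates, and spelling "the unbinding pragma" as `pragma[only_bind_out]` is an assumption on my part:
```
import python

// Illustration of the pattern only; the real change applies it to
// `appliesToScope` and the points-to predicates shown in the trace above.
predicate functionInModuleWithClass(Function f, Module m) {
  // Only this conjunct is allowed to bind `m` ...
  m = f.getEnclosingModule() and
  // ... because wrapping `m` in the pragma means this conjunct cannot be used
  // to bind it, so the `getEnclosingModule` join has to come first.
  exists(Class c | c.getEnclosingModule() = pragma[only_bind_out](m))
}
```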
Much sadness:
```
Tuple counts for ImportTime::ImportTimeScope::getOuterVariable#dispred#f0820431#fff/3@64d04d33 after 7.6s:
19624 ~1% {1} r1 = SCAN py_Classes OUTPUT In.0 'this'
19531 ~1% {1} r2 = JOIN r1 WITH ImportTime::ImportTimeScope#class#7851b601#f ON FIRST 1 OUTPUT Lhs.0 'this'
19531 ~0% {2} r3 = JOIN r2 WITH Scope::Scope::getEnclosingModule#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'this', Rhs.1
296389 ~0% {3} r4 = JOIN r3 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'this', Lhs.1
296389 ~0% {3} r5 = JOIN r4 WITH Variables::LocalVariable#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'var', Lhs.1 'this', Lhs.2
296389 ~1% {4} r6 = JOIN r5 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'this', Lhs.0 'var', Rhs.1
62294919 ~0% {4} r7 = JOIN r6 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.1 'this', Lhs.2 'var', Lhs.3
62294919 ~0% {4} r8 = JOIN r7 WITH Variables::GlobalVariable#class#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'result', Lhs.3, Lhs.1 'this', Lhs.2 'var'
639 ~0% {3} r9 = JOIN r8 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.3 'var', Lhs.0 'result'
return r9
```
Clearly we _shouldn't_ be joining on `getId` as the last step, as that
means we're building tuples of completely unrelated variables (not even
ones with the same name!), which obviously blows up.
A standard way of fixing this is to correlate as much information about
these variables as possible in a `nomagic`ked helper predicate. This is
what we do here, grouping together the variable with its scope and name
(both of which are uniquely determined by the variable). This results
in a much nicer join order:
```
Tuple counts for ImportTime::ImportTimeScope::getOuterVariable#dispred#f0820431#fff/3@82866b6p after 42ms:
23867 ~4% {2} r1 = JOIN Scope::Scope::getEnclosingModule#dispred#f0820431#ff WITH ImportTime::ImportTimeScope#class#7851b601#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1
296389 ~0% {4} r2 = JOIN r1 WITH ImportTime::class_var_scope#7851b601#fff ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this', Rhs.2 'var'
639 ~0% {3} r3 = JOIN r2 WITH ImportTime::global_var_scope#7851b601#fff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.3 'var', Rhs.2 'result'
return r3
```
```
Tuple counts for ImportTime::class_var_scope#7851b601#fff/3@366258vr after 47ms:
19624 ~1% {1} r1 = SCAN py_Classes OUTPUT In.0 'scope'
296743 ~0% {2} r2 = JOIN r1 WITH Variables::Variable::getScope#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'scope'
296743 ~0% {2} r3 = JOIN r2 WITH Variables::LocalVariable#3aa06bbf#f ON FIRST 1 OUTPUT Lhs.0 'var', Lhs.1 'scope'
296743 ~2% {3} r4 = JOIN r3 WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'scope', Rhs.1 'name', Lhs.0 'var'
return r4
```
```
Tuple counts for ImportTime::global_var_scope#7851b601#fff/3@718e4bpm after 18ms:
108173 ~0% {2} r1 = JOIN Variables::GlobalVariable#class#3aa06bbf#f WITH Variables::Variable::getId#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 'var', Rhs.1 'name'
108173 ~0% {3} r2 = JOIN r1 WITH Variables::Variable::getScope#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'name', Rhs.1 'scope', Lhs.0 'var'
return r2
```
(You may be wondering what's up with the order of arguments for the two
helper predicates. By ordering the arguments this way, there's no need
to reorder the resulting relations when they are used in `getOuterVariable`.)
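For the curious, the two helpers look roughly like this (a sketch reconstructed from the tuple counts above; the exact signatures in the actual change may differ):
```
import python

/** Holds if `var` is a local variable named `name` whose scope is the class `scope`. */
pragma[nomagic]
private predicate class_var_scope(Class scope, string name, LocalVariable var) {
  var.getScope() = scope and
  var.getId() = name
}

/** Holds if `var` is a global variable named `name` whose scope is the module `scope`. */
pragma[nomagic]
private predicate global_var_scope(string name, Module scope, GlobalVariable var) {
  var.getId() = name and
  var.getScope() = scope
}
```
With these in place, `getOuterVariable` can relate the class-level and module-level variables through a single two-column join on the (name, module) pair, which is the `ON FIRST 2` join in the trace above.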
Before:
```
Tuple counts for Essa::ScopeEntryDefinition#class#24e22a14#f/1@45e0d8dh after 10.5s:
2133368 ~1% {2} r1 = Essa::TEssaNodeDefinition#24e22a14#ffff_03#join_rhs AND NOT Essa::ImplicitSubModuleDefinition#class#24e22a14#f(Lhs.1 'this')
534478950 ~0% {2} r2 = JOIN r1 WITH Definitions::SsaSourceVariable::getScopeEntryDefinition#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'this', Rhs.1
581249 ~4% {1} r3 = JOIN r2 WITH Essa::EssaNodeDefinition::getDefiningNode#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.0 'this'
return r3
```
Let's see if pushing the `getDefiningNode` join further up reduces the
number of intermediate tuples. (Intuitively it should, since there
should be only one defining node for any given `EssaNodeDefinition`, so
that join cannot blow up.)
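For context, the characteristic predicate being evaluated is roughly the following (a reconstruction from the tuple counts; the real source may differ in the details):
```
// Reconstructed shape of the charpred before the fix (not the verbatim source):
class ScopeEntryDefinition extends EssaNodeDefinition {
  ScopeEntryDefinition() {
    this.getDefiningNode() = this.getSourceVariable().getScopeEntryDefinition() and
    not this instanceof ImplicitSubModuleDefinition
  }
}
```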
To do this, we unbind the `this.getSourceVariable()` part, which
encourages the compiler to put this join later.
After:
```
Tuple counts for Essa::ScopeEntryDefinition#class#24e22a14#f/1@30758cv4 after 300ms:
2133569 ~1% {2} r1 = SCAN Essa::TEssaNodeDefinition#24e22a14#ffff OUTPUT In.0, In.3 'this'
2133368 ~1% {2} r2 = r1 AND NOT Essa::ImplicitSubModuleDefinition#class#24e22a14#f(Lhs.1 'this')
2133368 ~0% {2} r3 = JOIN r2 WITH Definitions::SsaSourceVariable#class#486534ab#f ON FIRST 1 OUTPUT Lhs.1 'this', Lhs.0
2133368 ~0% {3} r4 = JOIN r3 WITH Essa::EssaNodeDefinition::getDefiningNode#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.0 'this'
581249 ~4% {1} r5 = JOIN r4 WITH Definitions::SsaSourceVariable::getScopeEntryDefinition#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this'
return r5
```
Much better (and our intuition is confirmed -- joining with
`getDefiningNode` did not increase the number of tuples).
Before:
```
Tuple counts for Essa::EssaEdgeRefinement::getInput#dispred#f0820431#ff/2@b84afc77 after 20.3s:
873421 ~0% {3} r1 = JOIN Essa::TEssaEdgeDefinition#24e22a14#ffff_31#join_rhs WITH Essa::TEssaEdgeDefinition#24e22a14#ffff_30#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this'
181627951 ~0% {3} r2 = JOIN r1 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1, Lhs.2 'this'
873418 ~0% {2} r3 = JOIN r2 WITH Essa::EssaDefinition::reachesEndOfBlock#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.0 'result'
return r3
```
It's perhaps not immediately obvious what's going on here (because of
the `...join_rhs` indirection), but basically we're joining together
`this` and `def` and their `getSourceVariable`, and only then actually
relating `this` and `def` through `reachesEndOfBlock`.
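Reconstructed from the tuple counts, the predicate has roughly this shape (a sketch; the member predicates used here, in particular `getPredecessor()`, are assumptions):
```
// Roughly: the input of an edge refinement is the definition of the same
// source variable that reaches the end of the edge's source basic block.
EssaDefinition getInput() {
  exists(SsaSourceVariable var |
    var = this.getSourceVariable() and
    var = result.getSourceVariable() and
    result.reachesEndOfBlock(this.getPredecessor())
  )
}
```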
By unbinding `var`, we prevent this early join, which now encourages the
`reachesEndOfBlock` join to happen earlier:
```
Tuple counts for Essa::EssaEdgeRefinement::getInput#dispred#f0820431#ff/2@2d63e5lb after 2s
873421 ~0% {2} r1 = SCAN Essa::TEssaEdgeDefinition#24e22a14#ffff OUTPUT In.3 'this', In.1
873421 ~0% {3} r2 = JOIN r1 WITH Essa::TEssaEdgeDefinition#24e22a14#ffff_30#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 'this'
873421 ~0% {3} r3 = JOIN r2 WITH Definitions::SsaSourceVariable#class#486534ab#f ON FIRST 1 OUTPUT Lhs.1, Lhs.2 'this', Lhs.0
8758877 ~0% {3} r4 = JOIN r3 WITH Essa::EssaDefinition::reachesEndOfBlock#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.2, Lhs.1 'this'
873418 ~0% {2} r5 = JOIN r4 WITH Essa::EssaDefinition::getSourceVariable#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.0 'result'
return r5
```
On certain databases, the evaluation of this predicate was running out
of memory due to the way the `count` aggregate was being used. Here's
an example of the tuple counts involved:
```
Tuple counts for PointsToContext::syntactic_call_count#cf3039a0#ff#antijoin_rhs/1@d2199bb8 after 1m27s:
595518502 ~521250% {1} r1 = JOIN PointsToContext::syntactic_call_count#cf3039a0#ff#shared#3 WITH Flow::CallNode::getFunction#dispred#f0820431#ff_1#join_rhs ON FIRST 1 OUTPUT Lhs.1 'arg0'
26518709 ~111513% {1} r2 = JOIN PointsToContext::syntactic_call_count#cf3039a0#ff#shared#2 WITH Flow::CallNode::getFunction#dispred#f0820431#ff_1#join_rhs ON FIRST 1 OUTPUT Lhs.1 'arg0'
622037211 ~498045% {1} r3 = r1 UNION r2
return r3
```
and a timing report that looked like this:
```
time | evals | max @ iter | predicate
------|-------|--------------|----------
5m8s | | | PointsToContext::syntactic_call_count#cf3039a0#ff#shared#2@6d98d1nd
4m38s | | | PointsToContext::syntactic_call_count#cf3039a0#ff#count_range@f5df1do4
3m51s | | | PointsToContext::syntactic_call_count#cf3039a0#ff#shared#3@da3b4abf
1m58s | 7613 | 37ms @ 4609 | MRO::ClassListList::removedClassParts#f0820431#fffff#reorder_2_3_4_0_1@8155axyi
1m37s | 7613 | 33ms @ 3904 | MRO::ClassListList::bestMergeCandidate#f0820431#2#fff@8155a83w
1m27s | | | PointsToContext::syntactic_call_count#cf3039a0#ff#antijoin_rhs@d2199bb8
1m8s | 1825 | 63ms @ 404 | PointsTo::Expressions::equalityEvaluatesTo#741b54e2#fffff@8155aw7w
37.6s | | | PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs@e348fc1p
...
```
To make optimising this easier for the compiler, I moved the bodies of
the `count` aggregates into their own helper predicates (each with size
linear in the number of `CallNode`s), and also factored out the many
calls to `f.getName()`.
The astute reader will notice that in writing this as a sum of `count`s
rather than a count of a disjunction, the intersection (if it exists)
will be counted twice, and so the semantics may be different. However,
since `method_call` and `function_call` require `AttrNode` and
`NameNode` functions respectively, and as these two types are disjoint,
there is no intersection, and so the semantics should be preserved.
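The reworked predicate then looks roughly like this. This is a simplified sketch: the helper names follow the relations in the trace below, but the signatures and bodies are assumptions, and the `__init__` special-casing visible in the trace is omitted:
```
import python

/** Holds if `call` is a syntactic method call on an attribute named `name`. */
pragma[nomagic]
private predicate method_call(string name, CallNode call) {
  name = call.getFunction().(AttrNode).getName()
}

/** Holds if `call` is a syntactic call to a plain name `name`. */
pragma[nomagic]
private predicate function_call(string name, CallNode call) {
  name = call.getFunction().(NameNode).getId()
}

/** The number of syntactic calls to `name` in scope `s`, as a sum of two counts. */
private int syntactic_call_count(Scope s, string name) {
  // Summing two counts instead of counting a disjunction is safe here: a
  // call's function is an AttrNode or a NameNode, never both, so no call is
  // counted twice.
  result =
    count(CallNode call | method_call(name, call) and call.getScope() = s) +
    count(CallNode call | function_call(name, call) and call.getScope() = s)
}
```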
After the change, the evaluation of `syntactic_call_count` now looks as
follows:
```
Tuple counts for PointsToContext::syntactic_call_count#cf3039a0#ff/2@662dd8s0 after 216ms:
23960 ~0% {1} r1 = @py_scope#f AND NOT py_Functions_0#antijoin_rhs(Lhs.0 's')
23960 ~0% {2} r2 = SCAN r1 OUTPUT In.0 's', 0
276309 ~7% {2} r3 = SCAN @py_scope#f OUTPUT In.0 's', "__init__"
11763 ~0% {2} r4 = JOIN r3 WITH Scope::Scope::getName#dispred#f0820431#fb ON FIRST 2 OUTPUT Lhs.0 's', 1
35723 ~0% {2} r5 = r2 UNION r4
252349 ~0% {2} r6 = JOIN @py_scope#f WITH Function::Function::getName#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0 's', Rhs.1
240586 ~0% {2} r7 = SELECT r6 ON In.1 != "__init__"
131727 ~4% {2} r8 = r7 AND NOT project#PointsToContext::method_call#cf3039a0#ff(Lhs.1)
131727 ~0% {3} r9 = SCAN r8 OUTPUT In.1, In.0 's', 0
240586 ~0% {2} r10 = SCAN r7 OUTPUT In.1, In.0 's'
108859 ~0% {3} r11 = JOIN r10 WITH PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs ON FIRST 1 OUTPUT Lhs.0, Lhs.1 's', Rhs.1
240586 ~0% {3} r12 = r9 UNION r11
24100 ~0% {2} r13 = JOIN r12 WITH PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs#1 ON FIRST 1 OUTPUT Lhs.1 's', (Rhs.1 + Lhs.2)
240586 ~0% {2} r14 = SELECT r6 ON In.1 != "__init__"
131727 ~4% {2} r15 = r14 AND NOT project#PointsToContext::method_call#cf3039a0#ff(Lhs.1)
131727 ~0% {3} r16 = SCAN r15 OUTPUT In.0 's', In.1, 0
108859 ~4% {3} r17 = JOIN r10 WITH PointsToContext::syntactic_call_count#cf3039a0#ff#join_rhs ON FIRST 1 OUTPUT Lhs.1 's', Lhs.0, Rhs.1
240586 ~4% {3} r18 = r16 UNION r17
216486 ~2% {3} r19 = r18 AND NOT project#PointsToContext::function_call#cf3039a0#ff(Lhs.1)
216486 ~0% {2} r20 = SCAN r19 OUTPUT In.0 's', (0 + In.2)
240586 ~0% {2} r21 = r13 UNION r20
276309 ~0% {2} r22 = r5 UNION r21
return r22
```
- "Normal" instead of "NonSpecial"
- "NonLibrary" instead of "2"
I could not find a good replacement for "NonLibrary", nor for "Source",
but I added QLDocs in a few places to aid readability.
With this change, users are now able to run the View AST command in
vscode workspaces that do not include the core libraries. The relevant
core library only needs to be installed in the package cache.
Although this wasn't (as far as I know) causing any performance issues,
it was making the join-order badness report quite noisy, and so I
figured it was worth fixing.
Before:
```
Tuple counts for RegexTreeView::RegExpBackRef::getGroup#dispred#f0820431#ff/2@d3441d0b after 84ms:
1501195 ~3% {2} r1 = JOIN RegexTreeView::RegExpTerm::getLiteral#dispred#f0820431#ff_10#join_rhs WITH RegexTreeView::RegExpTerm::getLiteral#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 'result'
149 ~0% {5} r2 = JOIN r1 WITH RegexTreeView::RegExpBackRef#class#31aac2a7#ffff ON FIRST 1 OUTPUT Rhs.1, Rhs.2, Rhs.3, Lhs.1 'result', Lhs.0 'this'
149 ~1% {3} r3 = JOIN r2 WITH regex::RegexString::numbered_backreference#dispred#f0820431#ffff ON FIRST 3 OUTPUT Lhs.3 'result', Rhs.3, Lhs.4 'this'
4 ~0% {2} r4 = JOIN r3 WITH RegexTreeView::RegExpGroup::getNumber#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.2 'this', Lhs.0 'result'
1501195 ~3% {2} r5 = JOIN RegexTreeView::RegExpTerm::getLiteral#dispred#f0820431#ff_10#join_rhs WITH RegexTreeView::RegExpTerm::getLiteral#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1 'result', Rhs.1 'result'
42526 ~0% {5} r6 = JOIN r5 WITH RegexTreeView::RegExpGroup#31aac2a7#ffff ON FIRST 1 OUTPUT Lhs.1 'this', Lhs.0 'result', Rhs.1, Rhs.2, Rhs.3
22 ~0% {8} r7 = JOIN r6 WITH RegexTreeView::RegExpBackRef#class#31aac2a7#ffff ON FIRST 1 OUTPUT Lhs.2, Lhs.3, Lhs.4, Lhs.1 'result', Lhs.0 'this', Rhs.1, Rhs.2, Rhs.3
0 ~0% {6} r8 = JOIN r7 WITH regex::RegexString::getGroupName#dispred#f0820431#ffff ON FIRST 3 OUTPUT Lhs.5, Lhs.6, Lhs.7, Rhs.3, Lhs.3 'result', Lhs.4 'this'
0 ~0% {2} r9 = JOIN r8 WITH regex::RegexString::named_backreference#dispred#f0820431#ffff ON FIRST 4 OUTPUT Lhs.5 'this', Lhs.4 'result'
4 ~0% {2} r10 = r4 UNION r9
return r10
```
In this case I opted for a classical solution: tying together the
literal and number (or name) part of the backreference in order to
encourage a two-column join.
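The helpers tying these together look roughly like this. The predicate names follow the relations in the trace below, but I'm guessing at the signatures: in particular, the literal type (`StrConst` here) and the `getNumber()`/`getName()` accessors are assumptions, and in the actual change these may well be member predicates of `RegExpBackRef`:
```
/** Holds if back-reference `backref` occurs in literal `lit` and refers to group number `number`. */
pragma[nomagic]
private predicate hasLiteralAndNumber(RegExpBackRef backref, StrConst lit, int number) {
  lit = backref.getLiteral() and number = backref.getNumber()
}

/** Holds if back-reference `backref` occurs in literal `lit` and refers to group name `name`. */
pragma[nomagic]
private predicate hasLiteralAndName(RegExpBackRef backref, StrConst lit, string name) {
  lit = backref.getLiteral() and name = backref.getName()
}
```
`getGroup()` can then match a candidate group against the (literal, number) or (literal, name) pair in one go, which is what the two `ON FIRST 2` joins in the trace below correspond to.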
After:
```
Tuple counts for RegexTreeView::RegExpBackRef::getGroup#dispred#f0820431#ff/2@b0cc4d5n after 0ms:
898 ~1% {3} r1 = JOIN RegexTreeView::RegExpTerm::getLiteral#dispred#f0820431#ff WITH RegexTreeView::RegExpGroup::getNumber#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.0 'result'
4 ~0% {2} r2 = JOIN r1 WITH RegexTreeView::RegExpBackRef::hasLiteralAndNumber#f0820431#fff_120#join_rhs ON FIRST 2 OUTPUT Rhs.2 'this', Lhs.2 'result'
1110 ~0% {5} r3 = JOIN RegexTreeView::RegExpGroup#31aac2a7#ffff WITH RegexTreeView::RegExpTerm::getLiteral#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.0 'result', Rhs.1
146 ~0% {3} r4 = JOIN r3 WITH regex::RegexString::getGroupName#dispred#f0820431#ffff ON FIRST 3 OUTPUT Lhs.4, Rhs.3, Lhs.3 'result'
0 ~0% {2} r5 = JOIN r4 WITH RegexTreeView::RegExpBackRef::hasLiteralAndName#f0820431#fff_120#join_rhs ON FIRST 2 OUTPUT Rhs.2 'this', Lhs.2 'result'
4 ~0% {2} r6 = r2 UNION r5
return r6
```
This bad join was identified by the join-order badness report, which
showed that
`py/use-of-input:MRO::flatten_list#f4eaf05f#fff#9c5fe54whnlqffdgu65vhb8uhpg#` (order_500000)
calculated a whopping 212,820,108 tuples in order to produce an output of
size 55,516, roughly 3,833 times more effort than needed.
Here's a snippet of the slowest iteration of that predicate:
```
Tuple counts for MRO::flatten_list#f4eaf05f#fff/3@i1839#0265eb3w after 14ms:
0 ~0% {3} r1 = JOIN MRO::need_flattening#f4eaf05f#f#prev_delta WITH MRO::ConsList#f4eaf05f#fff#reorder_2_0_1#prev ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'list', Rhs.2
0 ~0% {3} r2 = JOIN r1 WITH MRO::ClassList::length#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'list', Rhs.1 'n'
0 ~0% {3} r3 = JOIN r2 WITH MRO::ClassListList::flatten#dispred#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.1 'list', Lhs.2 'n', Rhs.1 'result'
0 ~0% {3} r4 = SCAN MRO::ConsList#f4eaf05f#fff#prev_delta OUTPUT In.2 'list', In.0, In.1
0 ~0% {3} r5 = JOIN r4 WITH MRO::need_flattening#f4eaf05f#f#prev ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0 'list'
0 ~0% {3} r6 = JOIN r5 WITH MRO::ClassList::length#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.1, Lhs.2 'list', Rhs.1 'n'
0 ~0% {3} r7 = JOIN r6 WITH MRO::ClassListList::flatten#dispred#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.1 'list', Lhs.2 'n', Rhs.1 'result'
0 ~0% {3} r8 = r3 UNION r7
26355 ~2% {3} r9 = SCAN MRO::ConsList#f4eaf05f#fff#prev OUTPUT In.2 'list', In.0, In.1
0 ~0% {3} r10 = JOIN r9 WITH MRO::need_flattening#f4eaf05f#f#prev ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0 'list'
0 ~0% {3} r11 = JOIN r10 WITH MRO::ClassList::length#f0820431#ff#prev_delta ON FIRST 1 OUTPUT Lhs.1, Lhs.2 'list', Rhs.1 'n'
0 ~0% {3} r12 = JOIN r11 WITH MRO::ClassListList::flatten#dispred#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.1 'list', Lhs.2 'n', Rhs.1 'result'
...
```
(... and a bunch more lines. The same construction appears several times,
but the join order is the same each time.)
Clearly it would be better to start with whatever is in `need_flattening`,
and then do the other joins. This is what the present fix does (by
unbinding `list` in all but the `need_flattening` call).
After the fix, the slowest iteration is as follows:
```
Tuple counts for MRO::flatten_list#f4eaf05f#fff/3@i2617#8155ab3w after 9ms:
0 ~0% {2} r1 = SCAN MRO::need_flattening#f4eaf05f#f#prev_delta OUTPUT In.0 'list', In.0 'list'
0 ~0% {3} r2 = JOIN r1 WITH MRO::ConsList#f4eaf05f#fff#reorder_2_0_1#prev ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'list', Rhs.2
0 ~0% {3} r3 = JOIN r2 WITH MRO::ClassList::length#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'list', Rhs.1 'n'
0 ~0% {3} r4 = JOIN r3 WITH MRO::ClassListList::flatten#dispred#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.1 'list', Lhs.2 'n', Rhs.1 'result'
1 ~0% {2} r5 = SCAN MRO::need_flattening#f4eaf05f#f#prev OUTPUT In.0 'list', In.0 'list'
0 ~0% {3} r6 = JOIN r5 WITH MRO::ConsList#f4eaf05f#fff#reorder_2_0_1#prev_delta ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'list', Rhs.2
0 ~0% {3} r7 = JOIN r6 WITH MRO::ClassList::length#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.2, Lhs.1 'list', Rhs.1 'n'
0 ~0% {3} r8 = JOIN r7 WITH MRO::ClassListList::flatten#dispred#f0820431#ff#prev ON FIRST 1 OUTPUT Lhs.1 'list', Lhs.2 'n', Rhs.1 'result'
...
```
(... and so on. The remainder is 0 tuples all the way.)
In total, we went from
```
40.6s | 7614 | 15ms @ 1839 | MRO::flatten_list#f4eaf05f#fff@0265eb3w
```
to
```
7.8s | 7614 | 11ms @ 2617 | MRO::flatten_list#f4eaf05f#fff@8155ab3w
```