Commit Graph

3007 Commits

Author SHA1 Message Date
Koen Vlaswinkel
779795b421 Python: Exclude probable test files in model editor 2024-07-05 11:06:22 +02:00
Tom Hvitved
da0909c080 Merge pull request #16896 from hvitved/ssa/dataflow-integration-prep
SSA: Add `BasicBlock.{getNode/1,length/0}` to the input signature
2024-07-03 19:56:35 +02:00
Taus
b779341ba6 Merge pull request #16885 from github/tausbn/python-fix-bad-join-in-function-resolution-type-tracker
Python: Fix bad join in function resolution
2024-07-03 13:59:13 +02:00
Rasmus Wriedt Larsen
f9536e9a66 Merge pull request #16883 from github/tausbn/python-fix-bad-join-in-import-resolution
Python: Fix bad join in `getImmediateModuleReference`
2024-07-03 11:40:01 +02:00
Tom Hvitved
4ae8720930 SSA: Add BasicBlock.{getNode/1,length/0} to the input signature 2024-07-03 11:32:35 +02:00
Rasmus Wriedt Larsen
ce177c3450 Merge pull request #15655 from yoff/python/support-model-editor
Python: Support model editor
2024-07-02 16:28:58 +02:00
Taus
e5a15f34a4 Python: Fix bad join in function resolution
On a certain database, the evaluator was spending 17 seconds on a single
iteration, which is usually a bad sign.

Looking more closely, we find the following culprit:

```
(621s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#fb44303q after 17.8s:
79429084 ~0%     {3} r1 = JOIN `TypeTrackingImpl::TypeTracker.start/0#dispred#f8047cd1` WITH `DataFlowPublic::Node.asExpr/0#dispred#2845197a` CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 't', Rhs.1
59018875 ~7%     {3}    | JOIN WITH LocalSources::LocalSourceNode#2491029a ON FIRST 1 OUTPUT Lhs.2, Lhs.1 't', Lhs.0 'result'
7929     ~1%     {3}    | JOIN WITH `PoorMansFunctionResolution::lastDecoratorCall/1#152a5a8f_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1 't', Rhs.1 'func', Lhs.2 'result'

277654   ~5%     {3} r2 = JOIN `TypeTrackingImpl::TypeTracker.start/0#dispred#f8047cd1` WITH `Function::Function.getDefinition/0#dispred#6279b12c` CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 't', Rhs.1
                 {3}    | AND NOT `project#Function::Function.getADecorator/0#dispred#119184e3`(FIRST 1)
269725   ~0%     {3}    | SCAN OUTPUT In.2, In.1 't', In.0 'func'
270268   ~0%     {3}    | JOIN WITH `DataFlowPublic::Node.asExpr/0#dispred#2845197a_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 't', Lhs.2 'func'
270268   ~7%     {3}    | JOIN WITH LocalSources::LocalSourceNode#2491029a ON FIRST 1 OUTPUT Lhs.1 't', Lhs.2 'func', Lhs.0 'result'

278197   ~5%     {3} r3 = r1 UNION r2
                 return r3
```

This is clearly silly. We're creating a type tracker start spot for
_every expression_, and only then filtering them down to just final
decorator calls.

To fix this, I simply pushed the `.asExpr` into `lastDecoratorCall` and
made that `pragma[nomagic]`.

This resulted in a much more reasonable iteration:

```
(585s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#62ade299 after 2.1s:
61252093 ~0%     {2} r1 = JOIN `TypeTrackingImpl::TypeTracker.start/0#dispred#f8047cd1` WITH LocalSources::LocalSourceNode#2491029a CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 't'
7929     ~3%     {3}    | JOIN WITH `PoorMansFunctionResolution::lastDecoratorCall/1#152a5a8f_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1 't', Rhs.1 'func', Lhs.0 'result'

277654   ~5%     {3} r2 = JOIN `TypeTrackingImpl::TypeTracker.start/0#dispred#f8047cd1` WITH `Function::Function.getDefinition/0#dispred#6279b12c` CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 't', Rhs.1
                 {3}    | AND NOT `project#Function::Function.getADecorator/0#dispred#119184e3`(FIRST 1)
269725   ~0%     {3}    | SCAN OUTPUT In.2, In.1 't', In.0 'func'
270268   ~0%     {3}    | JOIN WITH `DataFlowPublic::Node.asExpr/0#dispred#2845197a_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 't', Lhs.2 'func'
270268   ~5%     {3}    | JOIN WITH LocalSources::LocalSourceNode#2491029a ON FIRST 1 OUTPUT Lhs.1 't', Lhs.2 'func', Lhs.0 'result'

278197   ~5%     {3} r3 = r1 UNION r2
                 return r3
```

That's better, but can we do better still? Ideally, we'll join with
`lastDecoratorCall` _first_, and only then consider `LocalSourceNode`s.

Time to add some pragmas:

```
(6s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#049abau6 after 51ms:
247936 ~0%     {3} r1 = SCAN TypeTrackingImpl::Cached::MkTypeTracker#ff50e2d8 OUTPUT In.1, In.0, In.2 't'
2      ~0%     {3}    | JOIN WITH `num#Option::Option<TypeTrackingImpl::TypeTrackingInput::Content>::TNone#364b4b8a` ON FIRST 1 OUTPUT Lhs.1, Lhs.2 't', _
               {2}    | REWRITE WITH Tmp.2 := false, TEST InOut.0 = Tmp.2 KEEPING 2
1      ~0%     {1}    | SCAN OUTPUT In.1 't'

7929   ~0%     {3} r2 = JOIN r1 WITH `PoorMansFunctionResolution::lastDecoratorCall/1#152a5a8f` CARTESIAN PRODUCT OUTPUT Rhs.1 'result', Lhs.0 't', Rhs.0
7929   ~3%     {3}    | JOIN WITH LocalSources::LocalSourceNode#2491029a ON FIRST 1 OUTPUT Lhs.1 't', Lhs.2 'func', Lhs.0 'result'

277654 ~5%     {3} r3 = JOIN r1 WITH `Function::Function.getDefinition/0#dispred#6279b12c` CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 't', Rhs.1
               {3}    | AND NOT `project#Function::Function.getADecorator/0#dispred#119184e3`(FIRST 1)
269725 ~0%     {3}    | SCAN OUTPUT In.2, In.1 't', In.0 'func'
270268 ~0%     {3}    | JOIN WITH `DataFlowPublic::Node.asExpr/0#dispred#2845197a#fb_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 't', Lhs.2 'func'
270268 ~5%     {3}    | JOIN WITH LocalSources::LocalSourceNode#2491029a ON FIRST 1 OUTPUT Lhs.1 't', Lhs.2 'func', Lhs.0 'result'

278197 ~5%     {3} r4 = r2 UNION r3
               return r4
```

Perfect! (Here, the optimiser has for some reason decided to inline
`t.start()`, but this is not important.)
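
The effect of join order in the three plans above can be illustrated with a toy model. This is only a sketch in Python, not the QL change itself: the relation sizes, the `func…` labels, and both function names are made up for illustration. It counts the intermediate tuples produced when every expression is enumerated first, versus when the small `lastDecoratorCall`-like relation is joined first.

```python
from itertools import product


def plan_expr_first(starts, exprs, local_sources, last_decorator_call):
    """Pair every start with every expression, then filter (like the original plan)."""
    step1 = [(t, e) for t, e in product(starts, exprs)]
    step2 = [(t, e) for t, e in step1 if e in local_sources]
    step3 = [(t, last_decorator_call[e], e) for t, e in step2 if e in last_decorator_call]
    # Total intermediate tuples is a rough stand-in for evaluation cost.
    return set(step3), len(step1) + len(step2) + len(step3)


def plan_decorator_first(starts, local_sources, last_decorator_call):
    """Join with the small decorator relation first, then check local sources."""
    step1 = [(t, func, e) for t, (e, func) in product(starts, last_decorator_call.items())]
    step2 = [row for row in step1 if row[2] in local_sources]
    return set(step2), len(step1) + len(step2)


# Hypothetical data: one tracker start, many expressions, few decorator calls.
starts = ["t0"]
exprs = range(10_000)
local_sources = set(range(0, 10_000, 2))
last_decorator_call = {e: f"func{e}" for e in range(0, 10_000, 1_000)}

result_a, work_a = plan_expr_first(starts, exprs, local_sources, last_decorator_call)
result_b, work_b = plan_decorator_first(starts, local_sources, last_decorator_call)
print(result_a == result_b, work_a, work_b)
```

Both plans compute the same result; only the amount of intermediate work differs, which is the same pattern as the tuple counts shown above.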

Iteration timings before:

```
(621s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#fb44303q after 17.8s:
(623s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i2#fb44303q after 1.9s:
(628s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i3#fb44303q after 5.2s:
(629s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i4#fb44303q after 696ms:
(629s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i5#fb44303q after 238ms:
(630s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i6#fb44303q after 401ms:
(630s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i7#fb44303q after 6ms:
(630s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i8#fb44303q after 3ms:
(630s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i9#fb44303q after 1ms:
(630s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i10#fb44303q after 525ms:
(631s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb/2@6a2459s7 after 518ms:
(631s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb_10#join_rhs/2@eab64fk2 after 42ms:
```

And after:

```
(585s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#62ade299 after 2.1s:
(585s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i2#62ade299 after 331ms:
(590s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i3#62ade299 after 4.8s:
(592s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i4#62ade299 after 1.9s:
(593s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i5#62ade299 after 504ms:
(593s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i6#62ade299 after 312ms:
(593s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i7#62ade299 after 6ms:
(593s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i8#62ade299 after 2ms:
(593s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i9#62ade299 after 1ms:
(593s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i10#62ade299 after 489ms:
(594s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb/2@7cac46aj after 522ms:
(594s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb_10#join_rhs/2@a15a372e after 41ms:
```

(Curiously, in the "before" run it appeared to run that tracker _twice_,
with the same performance characteristics the second time around. This
is not present in the "after" version.)

And finally with the pragma:

```
(596s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#7ef50eer after 636ms:
(596s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i2#7ef50eer after 180ms:
(598s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i3#7ef50eer after 1.9s:
(604s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i4#7ef50eer after 6.2s:
(606s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i5#7ef50eer after 1.4s:
(607s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i6#7ef50eer after 1.7s:
(607s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i7#7ef50eer after 55ms:
(607s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i8#7ef50eer after 8ms:
(607s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i9#7ef50eer after 2ms:
(608s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i10#7ef50eer after 619ms:
(609s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb/2@4cb536id after 566ms:
(609s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb_10#join_rhs/2@f0664dnr after 56ms:
(714s) Tuple counts for m#PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb/1@fe47420j after 14ms:
(718s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#769227t3 after 369ms:
(719s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i2#769227t3 after 623ms:
(724s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i3#769227t3 after 5.3s:
(727s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i4#769227t3 after 2.8s:
(727s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i5#769227t3 after 289ms:
(727s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i6#769227t3 after 285ms:
(727s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i7#769227t3 after 7ms:
(727s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i8#769227t3 after 3ms:
(727s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i9#769227t3 after 1ms:
(728s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i10#769227t3 after 672ms:
(731s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01_201#join_rhs/3@cacd816f after 2.5s:
(732s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb/2@14c0cfsk after 80ms:
(732s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/1#c7f86fe2#fb_10#join_rhs/2@fc0f040d after 0ms:
```

Note the double evaluation again. This will need to be investigated
further.
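
To compare runs like the ones above without adding up the numbers by hand, the per-iteration durations can be summed with a small script. This is a minimal sketch in Python that assumes the log lines have exactly the `Tuple counts for <predicate> after <duration>:` shape shown here, with durations given only in `s` or `ms`; the function name `total_seconds` is hypothetical.

```python
import re

# Matches e.g. "Tuple counts for Foo::bar/2#abc/3@i1#xyz after 17.8s:"
LINE_RE = re.compile(
    r"Tuple counts for (?P<pred>\S+) after (?P<value>[\d.]+)(?P<unit>ms|s):"
)


def total_seconds(log_text, predicate_prefix):
    """Sum the reported durations for all predicates starting with the prefix."""
    total = 0.0
    for match in LINE_RE.finditer(log_text):
        if match.group("pred").startswith(predicate_prefix):
            value = float(match.group("value"))
            # Normalise milliseconds to seconds.
            total += value / 1000 if match.group("unit") == "ms" else value
    return total


# Three lines copied from the "before" run above.
sample_log = """\
(621s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i1#fb44303q after 17.8s:
(623s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i2#fb44303q after 1.9s:
(629s) Tuple counts for PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01/3@i4#fb44303q after 696ms:
"""

prefix = "PoorMansFunctionResolution::poorMansFunctionTracker/2"
total = total_seconds(sample_log, prefix)
print(f"{total:.3f}s")
```

Grouping by the hash after the final `#` would additionally separate the two evaluations seen in the last run.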
2024-07-01 14:04:01 +00:00
Taus
446dbf67cc Python: Fix bad join in getImmediateModuleReference
The "most expensive predicates" report had the following line on a
certain database:

```
1m15s |    11 |   37s @ 4    | ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0@12bb4xdo
```

Investigating further revealed the following bad joins:

```
(388s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i4#b2325xoe after 36.5s:
0         ~0%        {2} r1 = JOIN `ImportResolution::ImportResolution::sys_modules_module_with_name/1#134529bf#prev_delta` WITH `ImportResolution::ImportResolution::getReferenceToModuleName/1#bc5da225` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 'm'

74884348  ~0%        {3} r2 = JOIN `ImportResolution::ImportResolution::getModuleReference/1#28368ea4#prev_delta` WITH `ImportResolution::ImportResolution::potential_module_export/2#19340171` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.0
5221604   ~0%        {3}    | JOIN WITH `Attributes::AttrRef.accesses/2#dispred#31929f12_120#join_rhs` ON FIRST 2 OUTPUT Rhs.2 'result', Lhs.2, Lhs.1
5219926   ~2%        {3}    | JOIN WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0 'result'
5300880   ~1%        {2}    | JOIN WITH `ImportResolution::ImportResolution::module_export/3#f2fc6a2a` ON FIRST 2 OUTPUT Rhs.2, Lhs.2 'result'
42211     ~5%        {2}    | JOIN WITH `ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0#prev` ON FIRST 1 OUTPUT Lhs.1 'result', Rhs.1 'm'

957042    ~4%        {3} r3 = JOIN `ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0#prev_delta` WITH `ImportResolution::ImportResolution::module_export/3#f2fc6a2a_201#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Rhs.2, Lhs.1 'm'
957035    ~0%        {3}    | JOIN WITH `ImportResolution::ImportResolution::potential_module_export/2#19340171` ON FIRST 2 OUTPUT Lhs.1, Lhs.2 'm', Lhs.0
236753257 ~1%        {4}    | JOIN WITH `Attributes::AttrRef.accesses/2#dispred#31929f12_201#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 'm', Lhs.2, Rhs.2
199557145 ~2%        {4}    | JOIN WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT Lhs.2, Lhs.3, Lhs.1 'm', Lhs.0 'result'
1         ~0%        {2}    | JOIN WITH `ImportResolution::ImportResolution::getModuleReference/1#28368ea4#prev` ON FIRST 2 OUTPUT Lhs.3 'result', Lhs.2 'm'

15199013  ~1951%     {2} r4 = JOIN `ImportResolution::ImportResolution::getModuleReference/1#28368ea4#prev_delta` WITH `Module::Module.getPackageName/0#dispred#bb0c3872` ON FIRST 1 OUTPUT Lhs.1, Rhs.1
14707604  ~2136%     {3}    | JOIN WITH `Attributes::AttrRef.accesses/2#dispred#31929f12_102#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1, Rhs.2

14623588  ~2190%     {4} r5 = JOIN r4 WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT _, Lhs.0 'result', Lhs.1, Lhs.2
14623588  ~2058%     {2}    | REWRITE WITH Tmp.0 := ".", Out.0 := (In.2 ++ Tmp.0 ++ In.3) KEEPING 2

14623588  ~2139%     {5} r6 = JOIN r4 WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT _, Lhs.0 'result', Lhs.1, Lhs.2, _
14623588  ~2092%     {2}    | REWRITE WITH Tmp.0 := ".", Tmp.0 := (In.2 ++ Tmp.0 ++ In.3), Tmp.4 := ".__init__", Out.0 := (Tmp.0 ++ Tmp.4) KEEPING 2

29247176  ~2099%     {2} r7 = r5 UNION r6
199786001 ~6922%     {2}    | JOIN WITH `Module::isPreferredModuleForName/2#5fb427f9_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'result'
199756923 ~7024%     {2}    | JOIN WITH `Module::Module.getFile/0#dispred#53eb9b1b_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1 'result', Rhs.1 'm'

199799135 ~6954%     {2} r8 = r1 UNION r2 UNION r3 UNION r7
199793992 ~6954%     {2}    | AND NOT `ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0#prev`(FIRST 2)
                     return r8
```

Clearly, waiting until the very end to join with `getModuleReference` is
not healthy. To fix this, I opted to simply create a helper predicate for
the `accesses` construct.

After this change, here are the iteration timings:

```
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i1#74f41yqa after 1.2s:
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i1#8a053ys7 after 1.3s:
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i2#74f41yqa after 20ms:
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i2#8a053ys7 after 20ms:
(337s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i4#74f41yqa after 8.5s:
(341s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i4#8a053ys7 after 3.2s:
(346s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i5#74f41yqa after 7.2s:
(349s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i6#74f41yqa after 3ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i5#8a053ys7 after 10s:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i8#74f41yqa after 37ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i9#74f41yqa after 0ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i10#74f41yqa after 0ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i11#74f41yqa after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i12#74f41yqa after 1ms:
(353s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i6#8a053ys7 after 1ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i8#8a053ys7 after 7ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i9#8a053ys7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i10#8a053ys7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i11#8a053ys7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i12#8a053ys7 after 0ms:
```

And the helper predicate itself is also quick to evaluate:

```
(327s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i1#74f41xqa after 0ms:
(327s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i1#8a053xs7 after 0ms:
(329s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i3#74f41xqa after 99ms:
(337s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i3#8a053xs7 after 98ms:
(338s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i4#74f41xqa after 679ms:
(341s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i4#8a053xs7 after 400ms:
(346s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i5#74f41xqa after 1ms:
(349s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i6#74f41xqa after 22ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i5#8a053xs7 after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i7#74f41xqa after 1.4s:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i8#74f41xqa after 8ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i9#74f41xqa after 0ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i10#74f41xqa after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i11#74f41xqa after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i12#74f41xqa after 1ms:
(353s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i13#74f41xqa after 806ms:
(353s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i6#8a053xs7 after 7ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i7#8a053xs7 after 870ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i8#8a053xs7 after 2ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i9#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i10#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i11#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i12#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i13#8a053xs7 after 276ms:
```

(I note that we appear to be evaluating this code twice, which is a bit
worrying. I'll leave that investigation for later.)
2024-07-01 12:53:04 +00:00
Taus
d9b337cb2c Merge pull request #16804 from github/tausbn/python-fix-bad-join-in-dataflow-dispatch
Python: Fix bad join in `DataFlowDispatch`
2024-07-01 13:14:28 +02:00
Joe Farebrother
b81d41ba7b Add django header write models for direct subscript write 2024-07-01 11:26:54 +01:00
Arthur Baars
b12b33c8f9 Merge remote-tracking branch 'upstream/main' into 'rc/3.14' 2024-06-28 19:50:35 +02:00
Rasmus Lerchedahl Petersen
5ddfe75a0d Python: Add value steps for sequence elements
It would be nice to simplify to a single sequence content type.
2024-06-28 15:10:08 +02:00
Rasmus Lerchedahl Petersen
59f953269a Python: remove strange sink
It is not clear from the code how this could happen, and
I do not remember the path I saw; perhaps it was unreasonable.
2024-06-28 14:42:24 +02:00
yoff
bbc3ff2dfe Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2024-06-28 14:39:03 +02:00
Rasmus Lerchedahl Petersen
6bc830dca4 Python: add qldoc 2024-06-27 16:55:29 +02:00
Rasmus Lerchedahl Petersen
9cca1b294c Python: Add test cases 2024-06-27 16:33:23 +02:00
Rasmus Lerchedahl Petersen
27301edc28 Python: address more review comments 2024-06-27 16:05:21 +02:00
yoff
c2141b62e0 Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2024-06-27 14:53:03 +02:00
Rasmus Lerchedahl Petersen
da03237b32 Python: fix typo pointed out in review but missed by me 2024-06-27 11:21:28 +02:00
Rasmus Lerchedahl Petersen
a3076f4f72 Python: fix test expectations, add missing sanitizer 2024-06-26 13:27:32 +02:00
Joe Farebrother
6538d22d3f Fix tornado model of httpheaders.add. 2024-06-26 09:21:53 +01:00
Rasmus Lerchedahl Petersen
571be8be3e Python: model more loggers 2024-06-26 01:00:38 +02:00
Rasmus Lerchedahl Petersen
eb32cbe8a5 Python: codecs.open 2024-06-26 00:57:59 +02:00
Rasmus Lerchedahl Petersen
bdc48088e6 Python: MaD summary models
Two of the generated summaries have been excluded:
 - ["re", "Member[split]", "Argument[0,pattern:]", "ReturnValue", "taint"]
   From the documentation, it is not clear why the pattern should figure in the return value, as it is the part denoting the split points, and thus all those instances are filtered out.
   From the implementation
     Split function: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L199
     _compile function being called by split: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L280
   we see that, if the pattern is already a compiled `Pattern`, it is returned directly from _compile and could thus be part of the return value of split. This is probably not possible for an attacker to arrange, and so an FP in practice.

 - ["urllib2", "Member[unquote]", "Argument[0,string:]", "ReturnValue", "taint"]
   urllib2 seems to be only in Python2 (e.g. https://docs.python.org/2.7/library/urllib2.html) and I cannot locate the function unquote.
2024-06-26 00:39:30 +02:00
yoff
58b6b3f601 Merge pull request #16789 from yoff/python/document-models-as-data
python: Document MaD format
2024-06-25 15:46:28 +02:00
Rasmus Lerchedahl Petersen
bc551174f9 Python: model copy.deepcopy as a value step 2024-06-25 14:53:06 +02:00
Rasmus Lerchedahl Petersen
501cda4e8c Python: model fnmatch.filter 2024-06-25 14:44:39 +02:00
Rasmus Lerchedahl Petersen
2118f233b9 Python: model optparse.OptionParser.parse_args 2024-06-25 14:40:23 +02:00
Rasmus Lerchedahl Petersen
b80a711b27 python: undo changes to qlpack 2024-06-25 14:13:59 +02:00
Rasmus Lerchedahl Petersen
1e97600c4a Python: move models 2024-06-25 14:13:56 +02:00
Rasmus Lerchedahl Petersen
d410136852 python: compress models 2024-06-25 14:13:52 +02:00
Rasmus Lerchedahl Petersen
c004ffaca8 python: move model to Stdlib.yml
There is already a model there so we add to that one.

We did observe that this existing model was blocked by the external MaD model.
This is concerning and needs to be cleared up.
2024-06-25 14:13:48 +02:00
Rasmus Lerchedahl Petersen
281ac05868 python: add modelling for urllib.parse
- `quote` together with `re.compile` recover regex injection alerts on haiwen/seahub
- `quote_plus` recovers the URL redirection alert on DemocracyClub/EveryElection
- `unquote` recovers path injection alerts on `cloudera/hue`
- it was tedious finding justifications for the rest.
2024-06-25 14:13:44 +02:00
Rasmus Lerchedahl Petersen
df406b4fca python: Start modelling using MaD
- empty models for now
- `summaryModel` of `codeql/python-all` will be added to shortly.
2024-06-25 14:13:41 +02:00
Rasmus Lerchedahl Petersen
b902dd5680 Python: add change note 2024-06-25 11:54:30 +02:00
github-actions[bot]
fd385736e6 Post-release preparation for codeql-cli-2.17.6 2024-06-25 06:39:45 +00:00
Joe Farebrother
0901b3d0a6 Add change note 2024-06-24 21:43:09 +01:00
Joe Farebrother
c404f00a9b Add additional header write models for aiohttp and tornado + added qldoc 2024-06-24 17:27:25 +01:00
Joe Farebrother
79c0ed6074 Add additional fastapi header write models 2024-06-24 17:27:21 +01:00
Joe Farebrother
5ced5c010c Add django header writes 2024-06-24 17:27:15 +01:00
Joe Farebrother
7704801e47 Change fastapi raw cookie header models to header write models 2024-06-24 17:27:12 +01:00
Joe Farebrother
a0201e9c4f Update tests for new cookie write from headers 2024-06-24 17:27:06 +01:00
Joe Farebrother
d11f58f768 Add cookie header write concept from experimental. 2024-06-24 17:26:56 +01:00
Joe Farebrother
b71ba7c30f Move Header Write derived concepts to Concepts 2024-06-24 17:26:51 +01:00
github-actions[bot]
e32a587078 Release preparation for version 2.17.6 2024-06-24 14:33:10 +00:00
Anders Schack-Mulligen
8c23e21073 Dataflow: Cache compatibleTypes. 2024-06-24 13:35:48 +02:00
Rasmus Lerchedahl Petersen
00fbada41d Python: recognize fabric.operations 2024-06-24 10:54:59 +02:00
Taus
6db7e72fb8 Python: Fix bad join in DataFlowDispatch
A case of bad magic. Rather than evaluating separately whether a class
has a method of some name, the compiler opted to magick in the fact
that this was done as part of the `findFunctionAccordingToMro`
predicate. Hilarity ensued.

However, _we_ know that magic really isn't needed in this case (the
number of results is bounded by `Class.getAMethod` since methods have
only a single name), so by factoring it out into a helper predicate, we
can help the join-orderer along.

Before
```
(377s) Starting to evaluate predicate _DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_DataFlowDispatch::getNextClassInMro/1#__#shared/3@i6#L3#f893bw2h (iteration 6)
(377s) Tuple counts for _DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_DataFlowDispatch::getNextClassInMro/1#__#shared/3@i6#L3#f893bw2h after 16ms:
33363  ~0%     {2} r1 = SCAN `DataFlowDispatch::getNextClassInMro/1#e1ee596a#prev_delta` OUTPUT In.1, In.0 'arg1'
159696 ~4%     {3}    | JOIN WITH `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev` ON FIRST 1 OUTPUT Rhs.1 'arg0', Lhs.1 'arg1', Rhs.2 'arg2'
               return r1
(377s) Starting to evaluate predicate _Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs/3@i6#L4#f893bw2h (iteration 6)
(382s) Tuple counts for _Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs/3@i6#L4#f893bw2h after 4.4s:
1770825904 ~4%     {4} r1 = JOIN `_DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_DataFlowDispatch::getNextClassInMro/1#__#shared` WITH `Function::Function.getName/0#dispred#033700ef_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1 'arg0', Rhs.1, Lhs.0 'arg1', Lhs.2 'arg2'
34558      ~3%     {3}    | JOIN WITH `Class::Class.getAMethod/0#dispred#66416e47` ON FIRST 2 OUTPUT Lhs.0 'arg0', Lhs.2 'arg1', Lhs.3 'arg2'
                   return r1
...
(382s) Starting to evaluate predicate DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3/3@i6#f893b1xh (iteration 6)
(382s)                     - DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3_delta has 125138 rows (order for disjuncts: delta=<standard>).
(382s) Tuple counts for DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3/3@i6#f893b1xh after 12ms:
33363  ~0%     {2} r1 = SCAN `DataFlowDispatch::getNextClassInMro/1#e1ee596a#prev_delta` OUTPUT In.1, In.0 'cls'
159696 ~0%     {3}    | JOIN WITH `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev` ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.1 'name', Rhs.2 'result'
125138 ~1%     {3}    | AND NOT `_Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs`(FIRST 3)

0      ~0%     {3} r2 = JOIN `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_delta` WITH `DataFlowDispatch::getNextClassInMro/1#e1ee596a#reorder_1_0#prev` ON FIRST 1 OUTPUT Lhs.1 'name', Lhs.2 'result', Rhs.1 'cls'
               {3}    | AND NOT `_Class::Class.getAMethod/0#dispred#66416e47_DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#__#antijoin_rhs`(FIRST 3)
0      ~0%     {3}    | SCAN OUTPUT In.2 'cls', In.0 'name', In.1 'result'

125138 ~1%     {3} r3 = r1 UNION r2
125138 ~1%     {3}    | AND NOT `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev`(FIRST 3)
               return r3
```

And now
```
(18s) Tuple counts for DataFlowDispatch::class_has_method/2#0d2ae9c0/2@ff66c1lr after 18ms:
202279 ~1%     {2} r1 = JOIN `Class::Class.getAMethod/0#dispred#66416e47_10#join_rhs` WITH `Function::Function.getName/0#dispred#033700ef` ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.1 'name'
               return r1
...
(490s) Tuple counts for DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3/3@i6#48b6c1xi after 54ms:
0      ~0%     {3} r1 = JOIN `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_delta` WITH `DataFlowDispatch::getNextClassInMro/1#e1ee596a#reorder_1_0#prev` ON FIRST 1 OUTPUT Rhs.1 'cls', Lhs.1 'name', Lhs.2 'result'
0      ~0%     {3}    | AND NOT `DataFlowDispatch::class_has_method/2#0d2ae9c0`(FIRST 2)

33363  ~0%     {2} r2 = SCAN `DataFlowDispatch::getNextClassInMro/1#e1ee596a#prev_delta` OUTPUT In.1, In.0 'cls'
159696 ~0%     {3}    | JOIN WITH `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev` ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.1 'name', Rhs.2 'result'
125138 ~1%     {3}    | AND NOT `DataFlowDispatch::class_has_method/2#0d2ae9c0`(FIRST 2)

125138 ~1%     {3} r3 = r1 UNION r2
125138 ~1%     {3}    | AND NOT `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev`(FIRST 3)
               return r3
```
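
The idea behind the helper predicate can be sketched outside QL as well. The following Python toy model is an assumption-laden illustration, not the library code: the class names, the `mro_next` mapping, and both functions are hypothetical. It precomputes the set of (class, method name) pairs once, which is bounded by the number of methods because each method has a single name, and then consults that set while walking the MRO.

```python
def build_class_has_method(methods_by_class):
    """Precompute (class, name) pairs once; the size is bounded by the number of methods."""
    return {
        (cls, name)
        for cls, methods in methods_by_class.items()
        for name in methods
    }


def find_function_according_to_mro(cls, name, mro_next, methods_by_class, class_has_method):
    """Walk the MRO from cls and return the first class's method with this name."""
    current = cls
    while current is not None:
        # Cheap membership test against the precomputed helper relation.
        if (current, name) in class_has_method:
            return methods_by_class[current][name]
        current = mro_next.get(current)
    return None


# Hypothetical two-class hierarchy: Child inherits from Base.
methods_by_class = {
    "Base": {"run": "Base.run", "stop": "Base.stop"},
    "Child": {"run": "Child.run"},
}
mro_next = {"Child": "Base", "Base": None}
class_has_method = build_class_has_method(methods_by_class)

print(find_function_according_to_mro("Child", "run", mro_next, methods_by_class, class_has_method))
print(find_function_according_to_mro("Child", "stop", mro_next, methods_by_class, class_has_method))
```

Keeping the "has a method of this name" check as its own small relation mirrors the `class_has_method` antijoin in the "And now" plan above.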
2024-06-21 12:16:27 +00:00
Rasmus Lerchedahl Petersen
280a9b4408 Python: Support Model Editor 2024-06-21 11:47:51 +02:00
Rasmus Lerchedahl Petersen
f0e68887d4 Python: autoformat 2024-06-20 10:59:39 +02:00