mirror of
https://github.com/github/codeql.git
synced 2026-01-10 05:00:29 +01:00
Python: Fix bad join in getImmediateModuleReference
The "most expensive predicates" report had the following line on a
certain database:
```
1m15s | 11 | 37s @ 4 | ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0@12bb4xdo
```
Investigating further revealed the following bad joins
```
(388s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i4#b2325xoe after 36.5s:
0 ~0% {2} r1 = JOIN `ImportResolution::ImportResolution::sys_modules_module_with_name/1#134529bf#prev_delta` WITH `ImportResolution::ImportResolution::getReferenceToModuleName/1#bc5da225` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 'm'
74884348 ~0% {3} r2 = JOIN `ImportResolution::ImportResolution::getModuleReference/1#28368ea4#prev_delta` WITH `ImportResolution::ImportResolution::potential_module_export/2#19340171` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.0
5221604 ~0% {3} | JOIN WITH `Attributes::AttrRef.accesses/2#dispred#31929f12_120#join_rhs` ON FIRST 2 OUTPUT Rhs.2 'result', Lhs.2, Lhs.1
5219926 ~2% {3} | JOIN WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0 'result'
5300880 ~1% {2} | JOIN WITH `ImportResolution::ImportResolution::module_export/3#f2fc6a2a` ON FIRST 2 OUTPUT Rhs.2, Lhs.2 'result'
42211 ~5% {2} | JOIN WITH `ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0#prev` ON FIRST 1 OUTPUT Lhs.1 'result', Rhs.1 'm'
957042 ~4% {3} r3 = JOIN `ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0#prev_delta` WITH `ImportResolution::ImportResolution::module_export/3#f2fc6a2a_201#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Rhs.2, Lhs.1 'm'
957035 ~0% {3} | JOIN WITH `ImportResolution::ImportResolution::potential_module_export/2#19340171` ON FIRST 2 OUTPUT Lhs.1, Lhs.2 'm', Lhs.0
236753257 ~1% {4} | JOIN WITH `Attributes::AttrRef.accesses/2#dispred#31929f12_201#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1 'm', Lhs.2, Rhs.2
199557145 ~2% {4} | JOIN WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT Lhs.2, Lhs.3, Lhs.1 'm', Lhs.0 'result'
1 ~0% {2} | JOIN WITH `ImportResolution::ImportResolution::getModuleReference/1#28368ea4#prev` ON FIRST 2 OUTPUT Lhs.3 'result', Lhs.2 'm'
15199013 ~1951% {2} r4 = JOIN `ImportResolution::ImportResolution::getModuleReference/1#28368ea4#prev_delta` WITH `Module::Module.getPackageName/0#dispred#bb0c3872` ON FIRST 1 OUTPUT Lhs.1, Rhs.1
14707604 ~2136% {3} | JOIN WITH `Attributes::AttrRef.accesses/2#dispred#31929f12_102#join_rhs` ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1, Rhs.2
14623588 ~2190% {4} r5 = JOIN r4 WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT _, Lhs.0 'result', Lhs.1, Lhs.2
14623588 ~2058% {2} | REWRITE WITH Tmp.0 := ".", Out.0 := (In.2 ++ Tmp.0 ++ In.3) KEEPING 2
14623588 ~2139% {5} r6 = JOIN r4 WITH Attributes::AttrRead#class#f6c3f431 ON FIRST 1 OUTPUT _, Lhs.0 'result', Lhs.1, Lhs.2, _
14623588 ~2092% {2} | REWRITE WITH Tmp.0 := ".", Tmp.0 := (In.2 ++ Tmp.0 ++ In.3), Tmp.4 := ".__init__", Out.0 := (Tmp.0 ++ Tmp.4) KEEPING 2
29247176 ~2099% {2} r7 = r5 UNION r6
199786001 ~6922% {2} | JOIN WITH `Module::isPreferredModuleForName/2#5fb427f9_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'result'
199756923 ~7024% {2} | JOIN WITH `Module::Module.getFile/0#dispred#53eb9b1b_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1 'result', Rhs.1 'm'
199799135 ~6954% {2} r8 = r1 UNION r2 UNION r3 UNION r7
199793992 ~6954% {2} | AND NOT `ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0#prev`(FIRST 2)
return r8
```
Clearly, waiting to joining with `getModuleReference` last is not
healthy. To fix this, I opted to simply create a helper predicate for
the `accesses` construct.
After this change, here are the iteration timings
```
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i1#74f41yqa after 1.2s:
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i1#8a053ys7 after 1.3s:
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i2#74f41yqa after 20ms:
(327s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i2#8a053ys7 after 20ms:
(337s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i4#74f41yqa after 8.5s:
(341s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i4#8a053ys7 after 3.2s:
(346s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i5#74f41yqa after 7.2s:
(349s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i6#74f41yqa after 3ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i5#8a053ys7 after 10s:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i8#74f41yqa after 37ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i9#74f41yqa after 0ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i10#74f41yqa after 0ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i11#74f41yqa after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i12#74f41yqa after 1ms:
(353s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i6#8a053ys7 after 1ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i8#8a053ys7 after 7ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i9#8a053ys7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i10#8a053ys7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i11#8a053ys7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::getImmediateModuleReference/1#3553e6c0#reorder_1_0/2@i12#8a053ys7 after 0ms:
```
And the helper predicate itself is also quick to evaluate:
```
(327s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i1#74f41xqa after 0ms:
(327s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i1#8a053xs7 after 0ms:
(329s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i3#74f41xqa after 99ms:
(337s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i3#8a053xs7 after 98ms:
(338s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i4#74f41xqa after 679ms:
(341s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i4#8a053xs7 after 400ms:
(346s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i5#74f41xqa after 1ms:
(349s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i6#74f41xqa after 22ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i5#8a053xs7 after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i7#74f41xqa after 1.4s:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i8#74f41xqa after 8ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i9#74f41xqa after 0ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i10#74f41xqa after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i11#74f41xqa after 1ms:
(352s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i12#74f41xqa after 1ms:
(353s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i13#74f41xqa after 806ms:
(353s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i6#8a053xs7 after 7ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i7#8a053xs7 after 870ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i8#8a053xs7 after 2ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i9#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i10#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i11#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i12#8a053xs7 after 0ms:
(354s) Tuple counts for ImportResolution::ImportResolution::module_reference_accesses/3#8f45b418#reorder_1_2_0/3@i13#8a053xs7 after 276ms:
```
(I note that we appear to be evaluating this code twice, which is a bit
worrying. I'll leave that investigaton for later.)
This commit is contained in:
@@ -278,6 +278,12 @@ module ImportResolution {
|
||||
)
|
||||
}
|
||||
|
||||
/** Join-order helper for `getImmediateModuleReference`. */
|
||||
pragma[nomagic]
|
||||
private predicate module_reference_accesses(DataFlow::AttrRead ar, Module p, string attr_name) {
|
||||
ar.accesses(getModuleReference(p), attr_name)
|
||||
}
|
||||
|
||||
/**
|
||||
* Gets a dataflow node that is an immediate reference to the module `m`.
|
||||
*
|
||||
@@ -294,16 +300,13 @@ module ImportResolution {
|
||||
)
|
||||
or
|
||||
// Reading an attribute on a module may return a submodule (or subpackage).
|
||||
exists(DataFlow::AttrRead ar, Module p, string attr_name |
|
||||
ar.accesses(getModuleReference(p), attr_name) and
|
||||
result = ar
|
||||
|
|
||||
exists(Module p, string attr_name | module_reference_accesses(result, p, attr_name) |
|
||||
m = getModuleFromName(p.getPackageName() + "." + attr_name)
|
||||
)
|
||||
or
|
||||
// This is also true for attributes that come from reexports.
|
||||
exists(Module reexporter, string attr_name |
|
||||
result.(DataFlow::AttrRead).accesses(getModuleReference(reexporter), attr_name) and
|
||||
module_reference_accesses(result, reexporter, attr_name) and
|
||||
module_reexport(reexporter, attr_name, m)
|
||||
)
|
||||
or
|
||||
|
||||
Reference in New Issue
Block a user