Since this code is shared between the AST CFG and the IR construction,
it seems right to have only one copy. That copy lives on a new class
`StaticStorageDurationVariable`, which may prove useful on its own.
Having that module in `Instruction.qll` slowed down the parsing of that
file both humans and the compiler.
This commit moves the `InstructionSanity` module to `IRSanity.qll`
without making any changes to its contents apart from adding some
imports.
On Wireshark with 6GB RAM, I've observed `definitionReachesRank` to be
the slowest predicate in the IR. It seems that the implementation was
slow because the optimizer failed to eliminate the common
`reachesRank - 1` subexpression. This led to context being pushed into
the `not`, which got implemented as `MATERIALIZE`. That wouldn't
normally be a disaster, but this is one of the largest predicates in the
IR SSA construction, and iteration 2 was very slow.
Before:
(1505s) Starting to evaluate predicate SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta/4[1]@93f592 (iteration 1)
(1535s) Tuple counts for SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta:
130670697 ~0% {4} r1 = SCAN project#SSAConstruction::DefUse::hasDefinitionAtRank#fffff AS I OUTPUT I.<0>, I.<1>, I.<2>, (I.<2> + 1)
130670697 ~6% {5} r2 = JOIN r1 WITH SSAConstruction::DefUse::exitRank#fff AS R ON FIRST 2 OUTPUT r1.<0>, r1.<1>, r1.<2>, r1.<3>, R.<2>
130670697 ~6% {5} r3 = SELECT r2 ON r2.<3> <= r2.<4>
130670697 ~0% {4} r4 = SCAN r3 OUTPUT r3.<0>, r3.<1>, r3.<2>, r3.<3>
return r4
(1535s) - SSAConstruction::DefUse::definitionReachesRank#ffff_delta has 130670697 rows (order for disjuncts: delta=<standard>).
(1535s) Starting to evaluate predicate SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta/4[2]@866c14 (iteration 2)
(1626s) Tuple counts for SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta:
261341394 ~107% {4} r1 = JOIN SSAConstruction::DefUse::definitionReachesRank#ffff#prev_delta AS L WITH SSAConstruction::DefUse::definitionReachesRank#ffff#join_rhs AS R ON FIRST 3 OUTPUT R.<0>, R.<1>, R.<2>, (1 + L.<3>)
261341394 ~107% {4} r2 = r1 AND NOT SSAConstruction::DefUse::definitionReachesRank#ffff#prev AS R(r1.<0>, r1.<1>, r1.<2>, r1.<3>)
130670697 ~0% {5} r3 = SCAN r2 OUTPUT r2.<0>, r2.<1>, (r2.<3> - 1), r2.<2>, r2.<3>
106034590 ~1% {4} r4 = JOIN r3 WITH project#SSAConstruction::DefUse::hasDefinitionAtRank#fffff AS R ON FIRST 3 OUTPUT r3.<0>, r3.<1>, r3.<3>, r3.<4>
106034590 {4} r5 = MATERIALIZE r4 AS antijoin_rhs
24636107 ~3% {4} r6 = r2 AND NOT r5(r2.<0>, r2.<1>, r2.<2>, r2.<3>)
24636107 ~0% {5} r7 = JOIN r6 WITH SSAConstruction::DefUse::exitRank#fff AS R ON FIRST 2 OUTPUT r6.<0>, r6.<1>, r6.<2>, r6.<3>, R.<2>
2749441 ~0% {5} r8 = SELECT r7 ON r7.<3> <= r7.<4>
2749441 ~4% {4} r9 = SCAN r8 OUTPUT r8.<0>, r8.<1>, r8.<2>, r8.<3>
return r9
(1626s) - SSAConstruction::DefUse::definitionReachesRank#ffff_delta has 2749441 rows (order for disjuncts: delta=<standard>).
After:
(12s) Tuple counts for SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta:
130670697 ~0% {4} r1 = SCAN project#SSAConstruction::DefUse::hasDefinitionAtRank#fffff AS I OUTPUT I.<0>, I.<1>, I.<2>, (I.<2> + 1)
return r1
(12s) - SSAConstruction::DefUse::definitionReachesRank#ffff_delta has 130670697 rows (order for disjuncts: delta=<standard>).
(12s) Starting to evaluate predicate SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta/4[2]@fff64c (iteration 2)
(34s) Tuple counts for SSAConstruction::DefUse::definitionReachesRank#ffff#cur_delta:
108784031 ~0% {4} r1 = SSAConstruction::DefUse::definitionReachesRank#ffff#prev_delta AS L AND NOT SSAConstruction::DefUse::exitRank#fff AS R(L.<0>, L.<1>, L.<3>)
2749441 ~5% {4} r2 = r1 AND NOT project#SSAConstruction::DefUse::hasDefinitionAtRank#fffff AS R(r1.<0>, r1.<1>, r1.<3>)
2749441 ~4% {4} r3 = SCAN r2 OUTPUT r2.<0>, r2.<1>, r2.<2>, (r2.<3> + 1)
2749441 ~4% {4} r4 = r3 AND NOT SSAConstruction::DefUse::definitionReachesRank#ffff#prev AS R(r3.<0>, r3.<1>, r3.<2>, r3.<3>)
return r4
(34s) - SSAConstruction::DefUse::definitionReachesRank#ffff_delta has 2749441 rows (order for disjuncts: delta=<standard>).
Note that the row counts are exactly the same before and after.
The charpred of class `GVN` in `ASTValueNumbering.qll` got inlined into
the member predicate `getAnInstruction` and caused a tuple explosion on
Wireshark in the query `StrncpyFlippedArgs.ql`.
I interrupted the predicate after 10 minutes and got these intermediate
tuple counts:
(5208s) Tuple counts for ASTValueNumbering::GVN::getAnInstruction_dispred#ff:
8754900909 ~5% {3} r1 = JOIN ValueNumberingInternal::tvalueNumber#ff_10#join_rhs AS L WITH ValueNumberingInternal::tvalueNumber#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, L.<1>, L.<0>
4390274632 ~150085% {2} r2 = JOIN r1 WITH project#SSAConstruction::Cached::getInstructionUnconvertedResultExpression AS R ON FIRST 1 OUTPUT r1.<2>, r1.<1>
return r2
After this change, the `getAnInstruction` predicate is itself inlined,
like it should be. The new non-inlined charpred takes 2.1s and has these
tuple counts:
(2s) Tuple counts for ASTValueNumbering::GVN#f:
9158442 ~117% {1} r1 = JOIN project#SSAConstruction::Cached::getInstructionUnconvertedResultExpression AS L WITH ValueNumberingInternal::tvalueNumber#ff@staged_ext AS R ON FIRST 1 OUTPUT R.<1>
return r1
The code is now quadratic in the number of statements in a basic block,
whereas before it was quadratic in the number of _control-flow nodes_ in
a basic block.