Without this fix, running the full LGTM suite would get the IR evaluated
twice. That's because we have multiple IPA types and constructors with
the same name (like `TInstruction` and `MkIRFunction`), and the QL
compiler chooses how to disambiguate those names differently depending
on import order.
I've tested that the IR is only evaluated once now by running the whole
suite on a tiny project (jbj/magicrescue) and looking at the output of
perl -ne 'print if /^RESULTS IN:/ .. /^\[/ and not /^\[/' runSnapshotQueries-debug.log | sort |uniq -c |sort -n |less
Instructions that are removed from the normal value numbering recursion
because they have a duplicated type or AST element get unique value
numbers rather than going unnumbered. This ensures comparisons of value
numbers using `!=` hold for filtered instructions.
On some snapshots, notably ffmpeg, the IR `ValueNumbering` recursion
would generate billions of tuples and eventually run out of space.
It turns out it was fairly common for an `Instruction` to get more than
one `ValueNumber` in the base cases for `VariableAddressInstruction` and
`InitializeParameterInstruction`, and it could also happen in an
instruction with more than one operand of the same `OperandTag`. When a
binary operation was applied to an instruction with `m` value numbers
and another instruction with `n` value numbers, the result would get
`m * n` value numbers. This led to doubly-exponential growth in the
number of value numbers in rare cases.
The underlying reason why a `VariableAddressInstruction` could get
multiple value numbers is that it was keyed on the associated
`IRVariable`, and the `IRVariable` is defined in part by the type of its
underlying `Variable` (or other AST element). If the extractor defines a
variable to have multiple types because of linker ambiguity, this leads
to the creation of multiple `IRVariable`s. That should ideally be solved
in `TIRVariable.qll`, but for now I've put a workaround in
`ValueNumberingInternal.qll` instead.
To remove the problem with instructions having multiple operands, the
construction in `Operand.qll` will now filter out any such operand. It
wasn't enough to apply that filter to the `raw` stage, so I've applied
it to all three stages.
This predicate got an unfortunate join order, leading to these tuple
counts on ElektraInitiative/libelektra:
(290s) Tuple counts for AliasedSSA::VariableMemoryLocation::coversEntireVariable_dispred#f:
57117 ~0% {3} r1 = SCAN IRType::IRType::getByteSize_dispred#ff AS I OUTPUT 0, (I.<1> * 8), I.<0>
421445272 ~0% {3} r2 = JOIN r1 WITH AliasedSSA::VariableMemoryLocation#fffffff_5601#join_rhs AS R ON FIRST 2 OUTPUT R.<3>, r1.<2>, R.<2>
103282 ~2% {1} r3 = JOIN r2 WITH AliasConfiguration::Allocation::getIRType_dispred#ff AS R ON FIRST 2 OUTPUT r2.<2>
return r3
With this commit, we get these tuple counts instead:
(0s) Tuple counts for AliasedSSA::VariableMemoryLocation::varIRTypeHasBitRange#bff:
361874 ~0% {3} r1 = SCAN AliasedSSA::VariableMemoryLocation#fffffff AS I OUTPUT I.<1>, 0, I.<0>
361874 ~0% {3} r2 = JOIN r1 WITH AliasConfiguration::Allocation::getIRType_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, 0, r1.<2>
361874 ~1% {3} r3 = JOIN r2 WITH IRType::IRType::getByteSize_dispred#ff AS R ON FIRST 1 OUTPUT r2.<2>, 0, (R.<1> * 8)
return r3
(0s) Tuple counts for AliasedSSA::VariableMemoryLocation::coversEntireVariable_dispred#f:
103282 ~2% {1} r1 = JOIN AliasedSSA::VariableMemoryLocation#fffffff_056#join_rhs AS L WITH AliasedSSA::VariableMemoryLocation::varIRTypeHasBitRange#bff AS R ON FIRST 3 OUTPUT L.<0>
103282 ~2% {1} r2 = STREAM DEDUP r1
return r2
We had these tuple counts on ElektraInitiative/libelektra (note that the
`modelFlow` predicate got inlined into
`simpleInstructionLocalFlowStep`):
(652s) Tuple counts for DataFlowUtil::simpleInstructionLocalFlowStep#ff:
...
19701 ~1% {4} r27 = JOIN r26 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r26.<2>, r26.<1>, r26.<0>
7908 ~0% {3} r28 = JOIN r27 WITH SSAConstruction::Cached::getInstructionIndex#ff@staged_ext AS R ON FIRST 2 OUTPUT r27.<0>, r27.<2>, r27.<3>
4023 ~0% {3} r29 = JOIN r28 WITH Instruction::WriteSideEffectInstruction#class#ff AS R ON FIRST 1 OUTPUT r28.<1>, r28.<2>, r28.<0>
...
1060807009 ~3% {3} r34 = JOIN r33 WITH SSAConstruction::Cached::getInstructionIndex#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r33.<1>, r33.<2>
15670 ~5% {2} r35 = JOIN r34 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 2 OUTPUT r34.<0>, r34.<2>
7973 ~0% {2} r36 = JOIN r35 WITH Instruction::ReadSideEffectInstruction::getSideEffectOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r35.<1>
...
In this predicate there are two cases (`WriteSideEffectInstruction` and
`ReadSideEffectInstruction`) where we need to join on both the call and
the argument index of a side effect. It works well enough for the first
case, `WriteSideEffectInstruction`, where the call is joined on before
the index, but it explodes in the second case,
`ReadSideEffectInstruction`, where the index is joined first. To fix the
second case, and to guard against future optimizer accidents in the
first case, this commit changes both of those cases to use a new helper
predicate that makes it possible to join on both columns at once. The
resulting tuple counts are:
(3s) Tuple counts for DataFlowUtil::simpleInstructionLocalFlowStep#ff:
...
7908 ~0% {3} r27 = JOIN r26 WITH DataFlowUtil::getSideEffectFor#fff AS R ON FIRST 2 OUTPUT R.<2>, r26.<2>, r26.<0>
4023 ~0% {3} r28 = JOIN r27 WITH Instruction::WriteSideEffectInstruction#class#ff AS R ON FIRST 1 OUTPUT r27.<1>, r27.<2>, r27.<0>
...
15670 ~5% {2} r33 = JOIN r32 WITH DataFlowUtil::getSideEffectFor#fff AS R ON FIRST 2 OUTPUT R.<2>, r32.<2>
7973 ~0% {2} r34 = JOIN r33 WITH Instruction::ReadSideEffectInstruction::getSideEffectOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r33.<1>
...
The bulge is now limited to a factor of two, and that's just because I
didn't write separate versions of `getSideEffectFor` for
`ReadSideEffectInstruction` and `WriteSideEffectInstruction`.