Now that `PhiInstruction.getAnInput` only has results for congruent
operands, a previous optimization I made to `getConstantValue` is no
longer sound. We have to check that all phi inputs give the same value,
not just the congruent ones. After this change, if there are any
non-congruent operands on a phi instruction, the whole aggregate will
have no result.
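
The rule can be sketched outside QL as follows. This is a minimal Python model, not the actual `getConstantValue` implementation; `phi_constant_value` and the `(is_congruent, constant)` operand encoding are hypothetical names chosen for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical model: each phi operand is (is_congruent, constant_or_None).
PhiOperand = Tuple[bool, Optional[int]]

def phi_constant_value(operands: Sequence[PhiOperand]) -> Optional[int]:
    """Return the constant shared by all operands, or None if there is none."""
    values = set()
    for is_congruent, constant in operands:
        # A non-congruent operand contributes no known constant, so the
        # whole phi instruction gets no result.
        if not is_congruent or constant is None:
            return None
        values.add(constant)
    # Exactly one distinct value means every input agrees.
    return values.pop() if len(values) == 1 else None
```

In this model, a single non-congruent operand is enough to suppress the result, even when all congruent operands agree.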
In the `SignAnalysis` abstract interpretation, "unknown sign"
corresponds to the set of _all_ `Sign` values, but using `getDef` leaves
an inexact operand with _no_ `Sign` at all. To fix that, we assign all
signs to inexact operands.
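
As a rough illustration of why "no sign" and "all signs" differ, here is a small Python model. The `Sign` members and the `operand_signs` helper are assumptions made for this sketch and do not mirror the QL library.

```python
from enum import Enum
from typing import FrozenSet

class Sign(Enum):
    NEG = "-"
    ZERO = "0"
    POS = "+"

# "Unknown sign" is the top of the lattice: every sign is possible.
ALL_SIGNS: FrozenSet[Sign] = frozenset(Sign)

def operand_signs(is_exact: bool, def_signs: FrozenSet[Sign]) -> FrozenSet[Sign]:
    """Signs an operand may have, given the signs of its definition."""
    if not is_exact:
        # An inexact operand reads only part of (or more than) its
        # definition, so nothing can be concluded about its sign.
        return ALL_SIGNS
    return def_signs
```

An empty set would wrongly mean "this operand can never have a value", which is why the inexact case returns `ALL_SIGNS` instead.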
The issue is now fixed in the extractor, and I've confirmed that the
workaround is no longer needed for g/an-tao/drogon.
This reverts commit 48a3385809.
This query seems to have been de-optimized by recent optimizer or stats
changes. On libretro/libretro-uae, the query took 1 second on a warm
cache with dist 89ad5f1 but took 9979 seconds with dist a3b9b6eb9.
The slowness was due to a Cartesian product in
`illDefined{Decr,Incr}ForStmt` between all the definitions and all the
uses of `Variable v`. This would be no problem with the right join
order, but that has apparently been lost. This commit factors out a pair
of `pragma[noinline]` helper predicates to make sure the definitions
(`v.getAnAssignedValue()`) and the uses (`v.getAnAccess()`) are queried
and filtered in separate predicates.
The performance problem can be seen in the tuple counts of this pipeline
I interrupted during evaluation of
`inconsistentLoopDirection::illDefinedDecrForStmt#ffff#shared`:

    89716 ~3% {2} r1 = SCAN Variable::Variable::getAnAssignedValue_dispred#ff OUTPUT FIELDS {Variable::Variable::getAnAssignedValue_dispred#ff.<1>,Variable::Variable::getAnAssignedValue_dispred#ff.<0>}
    89716 ~0% {3} r2 = JOIN r1 WITH DataFlowUtil::TExprNode#ff@staged_ext ON r1.<0>=DataFlowUtil::TExprNode#ff@staged_ext.<0> OUTPUT FIELDS {r1.<1>,DataFlowUtil::TExprNode#ff@staged_ext.<0>,DataFlowUtil::TExprNode#ff@staged_ext.<1>}
    502539405 ~0% {4} r3 = JOIN r2 WITH Variable::Variable::getAnAccess_dispred#fb ON r2.<0>=Variable::Variable::getAnAccess_dispred#fb.<0> OUTPUT FIELDS {Variable::Variable::getAnAccess_dispred#fb.<1>,r2.<1>,r2.<2>,r2.<0>}
    return r3
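
The effect of separating the definitions from the uses can be illustrated with a toy model in Python. This is only a sketch of the join-order idea, not of the QL predicates themselves; `naive_join`, `helper_join`, and the filter callbacks are hypothetical.

```python
from itertools import product

def naive_join(defs, uses, def_ok, use_ok):
    """Pair every definition with every use of a variable, then filter."""
    matches, tuples = set(), 0
    for var in defs.keys() & uses.keys():
        for d, u in product(defs[var], uses[var]):
            tuples += 1  # one intermediate tuple per (definition, use) pair
            if def_ok(d) and use_ok(u):
                matches.add(var)
    return matches, tuples

def helper_join(defs, uses, def_ok, use_ok):
    """Filter definitions and uses separately, then join on the variable."""
    # Each definition and each use is visited once, never as a pair.
    tuples = sum(len(ds) for ds in defs.values()) + sum(len(us) for us in uses.values())
    vars_with_def = {var for var, ds in defs.items() for d in ds if def_ok(d)}
    vars_with_use = {var for var, us in uses.items() for u in us if use_ok(u)}
    return vars_with_def & vars_with_use, tuples

defs = {"v": list(range(100))}
uses = {"v": list(range(100))}
is_zero = lambda x: x == 0
print(naive_join(defs, uses, is_zero, is_zero))
print(helper_join(defs, uses, is_zero, is_zero))
```

For one variable with 100 definitions and 100 uses, the naive version materializes 10,000 intermediate tuples while the helper version touches 200, and both report the same variable.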
The last commit introduced calls to two predicates that did not exist.
This commit adds `Instruction.getResultAddress` for the first call and
changes the other call back to use the predicate that does exist.
This query used `getASuccessor()` on the CFG, which worked in many cases
but became quadratic on certain projects including PostgreSQL and
MySQL. The problem was that there was just enough context for magic to
apply to the transitive closure, but the use of magic meant that the
fast transitive closure algorithm wasn't used. In projects where the
magic had little effect, that led to the
`#ControlFlowGraph::ControlFlowNode::getASuccessor_dispred#bfPlus`
predicate taking quadratic time and space.
This commit changes the query to use basic blocks to find successors,
which is much faster because (1) there are many more `ControlFlowNode`s
than `BasicBlock`s, and (2) the optimizer does not apply magic but uses
fast transitive closure instead.
Behavior changes slightly in the `isUsedInCorrectLeapYearCheck` case: we
now accept a `yfacheck` that comes _before_ `yfa` if they are in the
same basic block. I don't think that matters in practice.
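
The block-level check and its behavior change can be sketched in Python. The graph encoding below (`block_of`, `succ`) and the `may_follow` helper are assumptions made for illustration and do not reflect how the QL query is written.

```python
from typing import Dict, List, Set

def reachable_blocks(succ: Dict[str, List[str]], start: str) -> Set[str]:
    """Blocks reachable from `start` in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        block = stack.pop()
        for nxt in succ.get(block, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def may_follow(block_of: Dict[str, str], succ: Dict[str, List[str]],
               earlier: str, later: str) -> bool:
    """True if the block of `later` is reachable from the block of `earlier`."""
    # Positions inside a block are ignored, so two nodes in the same
    # block are accepted in either order.
    return block_of[later] in reachable_blocks(succ, block_of[earlier])

block_of = {"yfa": "B1", "yfacheck": "B1", "other": "B2"}
succ = {"B1": ["B2"], "B2": []}
```

Here `may_follow(block_of, succ, "yfa", "yfacheck")` holds even if `yfacheck` precedes `yfa` inside `B1`, which is the slight behavior change described above.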
Without this `pragma[noopt]`, `isInCycle` gets compiled into RA that
unpacks every tuple of the fast TC:

    0 ~0% {2} r1 = SELECT #Operand::getNonPhiOperandDef#3#ffPlus ON FIELDS #Operand::getNonPhiOperandDef#3#ffPlus.<0>=#Operand::getNonPhiOperandDef#3#ffPlus.<1>
    0 ~0% {1} r2 = SCAN r1 OUTPUT FIELDS {r1.<0>}
    return r2

With this change, it just becomes one lookup in the fast TC data
structure per instruction.
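
The difference between the two evaluation strategies can be modeled in Python. A dictionary of sets stands in for the fast TC data structure here, which is an assumption of this sketch; the function names are hypothetical.

```python
def transitive_closure(edges):
    """Map each node to the set of nodes reachable in one or more steps."""
    closure = {}
    for start in edges:
        seen, stack = set(), list(edges[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(edges.get(node, []))
        closure[start] = seen
    return closure

def in_cycle_by_scan(closure):
    """Unpack every (a, b) tuple of the closure and keep those with a == b."""
    return {a for a, reach in closure.items() for b in reach if a == b}

def in_cycle_by_lookup(closure, instructions):
    """Ask the closure one question per instruction: does it reach itself?"""
    return {i for i in instructions if i in closure.get(i, ())}

edges = {"a": ["b"], "b": ["a"], "c": ["a"]}
closure = transitive_closure(edges)
```

Both functions return the same set, but the scan walks every tuple of the closure while the lookup does one membership test per instruction.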
Just like `TInstruction` is cached to prevent re-numbering its tuples in
every IR query, I think `TOperand` should be cached too. I tested it on
the small comdb2 snapshot, where it only saves one second of work when
running a second IR query, but the savings should grow when snapshots
are larger and when there are more IR queries in a suite. Tuple
numbering is mildly quadratic, so it should be good to avoid repeating
it.
Adding these annotations adds three cached stages to the existing four
cached stages of the IR. The new cached stages are small and do not
appear to repeat any work from the other stages, so I see no advantage
to merging them with the existing stages.
The previous version of the test used `0 = 1;` to test an lvalue-typed
`ErrorExpr`, but the extractor replaced the whole assignment expression
with `ErrorExpr` instead of just the LHS. This variation of the test
only leads to an `ErrorExpr` for the part of the syntax that's supposed
to be an lvalue-typed expression, so that's an improvement.
Unfortunately it still doesn't demonstrate that we can `Store` into an
address computed by an `ErrorExpr`.
This doesn't fix the underlying problem that for some reason there are
cycles in the operand graph on our snapshots of the Linux kernel, but it
ensures that the cycles don't lead to non-termination of
`ConstantAnalysis` and `ValueNumbering`.
The `ValueNumbering` library is supposed to propagate value numberings
through a `CopyInstruction` only when it's _congruent_, meaning it must
have exact overlap with its source. A `CopyInstruction` can be a
`LoadInstruction`, a `StoreInstruction`, or a `CopyValueInstruction`.
The latter is also a `UnaryInstruction`, and the value numbering rule
for `UnaryInstruction` applied to it as well.
This meant that value numbering would propagate even through a
non-congruent `CopyValueInstruction`. That's semantically wrong but
probably only an issue in very rare circumstances, and it should get
corrected when we change the definition of `getUnary` to require
congruence.
Worse are the performance implications. It meant that the value
numbering IPA witness could take two different paths through every
`CopyValueInstruction`. If multiple `CopyValueInstruction`s were
chained, this would lead to an exponential number of value numbers
for the same `Instruction`, and we would run out of time and space
while performing value numbering.
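
The blow-up can be made concrete with a small Python sketch that enumerates witness terms for a chain of copies. The rule names `copy` and `unary` and the string encoding of witnesses are illustrative assumptions, not the real IPA type.

```python
def witnesses(chain_length, rules):
    """Enumerate witness terms for a chain of copies of one source value."""
    result = ["source"]
    for _ in range(chain_length):
        # Every applicable rule wraps every existing witness, so the
        # number of witnesses multiplies by len(rules) at each step.
        result = [f"{rule}({w})" for w in result for rule in rules]
    return result

both_rules = witnesses(10, ("copy", "unary"))
one_rule = witnesses(10, ("copy",))
```

With both rules applicable, a chain of 10 copies yields 2^10 = 1024 witnesses; with a single rule there is exactly one.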
This fixes the performance of `ValueNumbering.qll` on
https://github.com/asterisk/asterisk, although this project might also
require a separate change for fixing an infinite loop in the IR constant
analysis.