This is still quadratic in the number of MemoryLocations for a vvar, but
only for a single pipeline step, which is not materialized. It seems to be
fast enough in practice for the IR.
Now that `PhiInstruction.getAnInput` only has results for congruent
operands, a previous optimization I made to `getConstantValue` is no
longer sound. We have to check that all phi inputs give the same value,
not just the congruent ones. After this change, if there are any
non-congruent operands on a phi instruction, the whole aggregate will
have no result.
In the `SignAnalysis` abstract interpretation, "unknown sign"
corresponds to the set of _all_ `Sign`, but using `getDef` leads to the
operand having _no_ `Sign`. To fix that, we assign all signs to inexact
operands.
The issue is now fixed in the extractor, and I've confirmed that the
workaround is no longer needed for g/an-tao/drogon.
This reverts commit 48a3385809.
This query seems to have been de-optimized by recent optimizer or stats
changes. On libretro/libretro-uae, the query took 1 second on a warm
cache with dist 89ad5f1 but took 9979 seconds with dist a3b9b6eb9.
The slowness was due to a Cartesian product in
`illDefined{Decr,Incr}ForStmt` between all the definitions and all the
uses of `Variable v`. This would be no problem with the right join
order, but that has apparently been lost. This commit factors out a pair
of `pragma[noinline]` helper predicates to make sure the definitions
(`v.getAnAssignedValue()`) and the uses (`v.getAnAccess()`) are queried
and filtered in separate predicates.
The performance problem can be seen in the tuple counts of this pipeline
I interrupted during evaluation of
`inconsistentLoopDirection::illDefinedDecrForStmt#ffff#shared`:
89716 ~3% {2} r1 = SCAN Variable::Variable::getAnAssignedValue_dispred#ff OUTPUT FIELDS {Variable::Variable::getAnAssignedValue_dispred#ff.<1>,Variable::Variable::getAnAssignedValue_dispred#ff.<0>}
89716 ~0% {3} r2 = JOIN r1 WITH DataFlowUtil::TExprNode#ff@staged_ext ON r1.<0>=DataFlowUtil::TExprNode#ff@staged_ext.<0> OUTPUT FIELDS {r1.<1>,DataFlowUtil::TExprNode#ff@staged_ext.<0>,DataFlowUtil::TExprNode#ff@staged_ext.<1>}
502539405 ~0% {4} r3 = JOIN r2 WITH Variable::Variable::getAnAccess_dispred#fb ON r2.<0>=Variable::Variable::getAnAccess_dispred#fb.<0> OUTPUT FIELDS {Variable::Variable::getAnAccess_dispred#fb.<1>,r2.<1>,r2.<2>,r2.<0>}
return r3
The last commit introduced calls to two predicates that did not exist. I
created `Instruction.getResultAddress` so it now exists and changed the
other call back to use the predicate that does exist.
This query used `getASuccessor()` on the CFG, which worked in many cases
but became quadratic on certain projects including PostgreSQL and
MySQL. The problem was that there was just enough context for magic to
apply to the transitive closure, but the use of magic meant that the
fast transitive closure algorithm wasn't used. In projects where the
magic had little effect, that led to the
`#ControlFlowGraph::ControlFlowNode::getASuccessor_dispred#bfPlus`
predicate taking quadratic time and space.
This commit changes the query to use basic blocks to find successors,
which is much faster because (1) there are many more `ControlFlowNode`s
than `BasicBlocks`, and (2) the optimizer does not apply magic but uses
fast transitive closure instead.
Behavior changes slightly in the `isUsedInCorrectLeapYearCheck` case: we
now accept a `yfacheck` that comes _before_ `yfa` if they are in the
same basic block. I don't think that matters in practice.
Without this `pragma[noopt]`, `isInCycle` gets compiled into RA that
unpacks every tuple of the fast TC:
0 ~0% {2} r1 = SELECT #Operand::getNonPhiOperandDef#3#ffPlus ON FIELDS #Operand::getNonPhiOperandDef#3#ffPlus.<0>=#Operand::getNonPhiOperandDef#3#ffPlus.<1>
0 ~0% {1} r2 = SCAN r1 OUTPUT FIELDS {r1.<0>}
return r2
With this change, it just becomes one lookup in the fast TC data
structure per instruction.