C++: Speed up IRGuardCondition::controlsBlock

The `controlsBlock` predicate had some dramatic bulges in its tuple
counts. To make matters worse, those bulges were in materialized
intermediate predicates like `#shared` and `#antijoin_rhs`, not just in
the middle of a pipeline.

The problem was particularly evident on kamailio/kamailio, where
`controlsBlock` was the slowest predicate in the IR libraries:

    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#shared#4 ........ 58.8s
    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs .... 33.4s
    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs#1 .. 26.7s

The first of the above relations had 201M rows, and the others
had intermediate bulges of similar size.

The bulges could be observed even on small projects although they did
not cause measurable performance issues there. The
`controlsBlock_dispred#fff#shared#4` relation had 3M rows on git/git,
which is a lot for a project with only 1.5M IR instructions.

This commit borrows an efficient implementation from Java's
`Guards.qll`, tweaking it slightly to fit into `IRGuards`. Performance
is now much better:

    IRGuards::IRGuardCondition::controlsBlock_dispred#fff ................... 6.1s
    IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff .............. 616ms
    IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff#antijoin_rhs . 540ms

After this commit, the biggest bulge in `controlsBlock` is the size of
`IRBlock::dominates`. On kamailio/kamailio this is an intermediate tuple
count of 18M rows in the calculation of `controlsBlock`, which in the
end produces 11M rows.
This commit is contained in:
Jonas Jensen
2020-06-09 11:44:08 +02:00
parent cade3a3e23
commit 4642037dce

View File

@@ -13,6 +13,7 @@ import semmle.code.cpp.ir.IR
* has the AST for the `Function` itself, which tends to confuse mapping between the AST `BasicBlock`
* and the `IRBlock`.
*/
pragma[noinline]
private predicate isUnreachedBlock(IRBlock block) {
block.getFirstInstruction() instanceof UnreachedInstruction
}
@@ -405,16 +406,37 @@ class IRGuardCondition extends Instruction {
*/
private predicate controlsBlock(IRBlock controlled, boolean testIsTrue) {
not isUnreachedBlock(controlled) and
exists(IRBlock branchBlock | branchBlock.getAnInstruction() = branch |
exists(IRBlock succ |
this.hasBranchEdge(succ, testIsTrue) and
succ.dominates(controlled) and
forall(IRBlock pred | pred.getASuccessor() = succ |
pred = branchBlock or succ.dominates(pred) or not pred.isReachableFromFunctionEntry()
)
// The following implementation is ported from `controls` in Java's
// `Guards.qll`. See that file (at 398678a28) for further explanation and
// correctness arguments.
exists(IRBlock succ |
this.hasBranchEdge(succ, testIsTrue) and
this.hasDominatingEdgeTo(succ) and
succ.dominates(controlled)
)
}
/**
* Holds if `(this, succ)` is an edge that dominates `succ`, that is, all other
* predecessors of `succ` are dominated by `succ`. This implies that `this` is the
* immediate dominator of `succ`.
*
* This is a necessary and sufficient condition for an edge to dominate anything,
* and in particular `bb1.hasDominatingEdgeTo(bb2) and bb2.dominates(bb3)` means
* that the edge `(bb1, bb2)` dominates `bb3`.
*/
private predicate hasDominatingEdgeTo(IRBlock succ) {
exists(IRBlock branchBlock | branchBlock = this.getBranchBlock() |
branchBlock.immediatelyDominates(succ) and
branchBlock.getASuccessor() = succ and
forall(IRBlock pred | pred = succ.getAPredecessor() and pred != branchBlock |
succ.dominates(pred)
)
)
}
pragma[noinline]
private IRBlock getBranchBlock() { result = branch.getBlock() }
}
private ConditionalBranchInstruction get_branch_for_condition(Instruction guard) {