It's sometimes faster but sometimes up to 2x slower to use plain
recursion here. On the other hand, plain recursion won't run out of Java
heap space, and it won't make unrelated computation slower by forcing
all RAM data out to disk.
The use of transitive closure for BB index calculation has been the
cause of an out-of-memory error. This commit switches the calculation to
use the `shortestDistances` HOP, which still has the problem that the
result needs to fit in RAM, but at least the RAM requirements are sure
to be linear in the size of the result. The `shortestDistances` HOP is
already used for BB index calculation for the C++ IR and for C#.
We could guard even better against OOM by switching the calculation to
use manual recursion, but that would undo the much-needed performance
improvements we got from #123.
This change improves performance on Wireshark, which is notorious for
having long basic blocks. When I benchmarked `shortestDistances`
for #123, it was slower than TC. With the current evaluator, it looks
like `shortestDistances` is faster. Performance before was:
PrimitiveBasicBlocks::Cached::getMemberIndex#ff ................... 9.7s (executed 8027 times)
#PrimitiveBasicBlocks::Cached::member_step#ffPlus ................. 6.6s
PrimitiveBasicBlocks::Cached::primitive_basic_block_entry_node#f .. 3.5s
PrimitiveBasicBlocks::Cached::primitive_basic_block_member#fff .... 2.3s
Performance with this commit is:
PrimitiveBasicBlocks::Cached::primitive_basic_block_entry_node#f ................................................................... 3.5s
shortestDistances@PrimitiveBasicBlocks::Cached::primitive_basic_block_entry_node#1@PrimitiveBasicBlocks::Cached::member_step#2#fff . 3s
PrimitiveBasicBlocks::Cached::primitive_basic_block_member#fff ..................................................................... 963ms
This query is supposed to look for constructors that unintentionally qualify as copy constructors due to default arguments. There are quite a few real-world projects that define such constructors intentionally. I've reduced the severity to "warning" and the precision to "low" due to the high false positive rate.
Querying for overlap type wasn't possible when this library was first
written. This change fixes FPs in `RedundantNullCheckSimple.ql` on
Wireshark and other real-world projects.
IR construction was missing support for C++ 11 range-based `for` loops. The extractor generates ASTs for the compiler-generated implementation already, so I had enough information to generate IR. I've expanded on some of the predicates in `RangeBasedForStmt` to access the desugared information.
One complication was that the `DeclStmt`s for the compiler-generated variables seem to have results for `getDeclaration()` but not for `getDeclarationEntry()`. This required handling these slightly differently than we do for other `DeclStmt`s.
The flow for range-based `for` is actually easier than for a regular `for`, because all three components (init, condition, and update) are always present.
It turns out we didn't have to move the `getName` implementation into
the mirror classes in `QualifiedName`. Doing so only made it harder for
the optimiser to specialize calls to `getName` on various kinds of
`Declaration`.
This gains some performance by not computing locations for all
expressions since we are only interested in calls and variable accesses.
The `Top::hasLocationInfo` predicate goes from 2m28s to 1m32s on
Chromium.
This predicate was slow, mostly because it's just very large. A manual
join order cuts the run time on Chromium from
definitions::Top::hasLocationInfo_dispred#ffffff ..................... 3m23s
definitions::MacroAccessWithHasLocationInfo::hasLocationInfo#ffffff .. 1m56s
to
definitions::Top::hasLocationInfo#ffffff .... 2m28s
The main slowdown was the two uses of `SCAN` to reorder columns in the
RA.