The `fieldFlow` predicate contained a fragile join that has become
ordered wrong recently, either as result of an unrelated change in the
data-flow library or as part of the stats change for the last dbscheme
change.
The minimal fix is to use `getEnclosingCallable` instead of
`getFunction` since the former uses `unique` to ensure good join
ordering in its callers. A longer-term fix should be applied to the AST
base libraries, but this will be invasive and require independent
testing.
Tuple counts on Wireshark before (cancelled after a few minutes):
(747s) Starting to evaluate predicate DataFlowUtil::localFlowStep#ff/2@bdba82
(848s) Tuple counts for DataFlowUtil::localFlowStep#ff:
1766640980 ~1% {2} r1 = JOIN DataFlowUtil::Node::getFunction_dispred#ff_10#join_rhs AS L WITH DataFlowUtil::Node::getFunction_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT L.<1>, R.<1>
1327 ~0% {2} r2 = JOIN r1 WITH project#DataFlowImplLocal::Configuration::hasFlow#fbb AS R ON FIRST 2 OUTPUT r1.<0>, r1.<1>
9691232 ~0% {2} r3 = DataFlowUtil::simpleLocalFlowStep#ff@staged_ext \/ r2
return r3
After:
(0s) Starting to evaluate predicate DataFlowUtil::localFlowStep#ff/2@a852a0
(0s) Tuple counts for DataFlowUtil::localFlowStep#ff:
49017 ~4% {3} r1 = JOIN project#DataFlowImplLocal::Configuration::hasFlow#fff AS L WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff AS R ON FIRST 1 OUTPUT L.<1>, R.<1>, R.<0>
42359 ~0% {2} r2 = JOIN r1 WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff AS R ON FIRST 2 OUTPUT r1.<2>, r1.<0>
9732264 ~0% {2} r3 = DataFlowUtil::simpleLocalFlowStep#ff@staged_ext \/ r2
return r3
This gives a slight speedup, and I think it makes the code shorter and
clearer.
On Wireshark, the time from the beginning of the `IRBlock` stage until
just before evaluation of `getInstruction` drops from 44s to 34s.
This case was added in dccc0f4db. The surrounding code has changed a lot
since then, and the case no longer seems to have an effect except to
create some dead ends and possibly cycles in the local flow graph.
Also clarify the docs on `Call` to decrease the likelyhood of such an
omission happening again.
The updated test reflects that `f1.operator()` lets the address of `f1`
escape from the caller.
For each pointer, we start tracking (starting from the allocation or an array declaration)
1) how long is the chunk of memory allocated
2) where the current pointer is in this chunk of memory.
This information might not always exist, but when it does, it is reliable.
Currently only works intraprocedurally.
This `select` clause had become very slow after we started caching
`ElementBase::toString` because the query used string concatenation to
produce alert messages, and those string concatenations were done very
early in the pipeline, producing lots of strings that would be discarded
moments later.
By using `$@` to interpolate elements into strings, the concatenation is
done outside of QL.
Testing on a Chromium snapshot, this commit takes us from
#select#ff ................ 6m2s
to
#select#cpe#134#fff ....... 15.2s