C++: Fix fieldFlow join order

The `fieldFlow` predicate contained a fragile join that has recently
started to be ordered badly, either as a result of an unrelated change in
the data-flow library or as part of the stats change for the last dbscheme
change.

The minimal fix is to use `getEnclosingCallable` instead of
`getFunction` since the former uses `unique` to ensure good join
ordering in its callers. A longer-term fix should be applied to the AST
base libraries, but this will be invasive and require independent
testing.
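
To illustrate why the order of this join matters, here is a toy Python model. It is not the QL evaluator; all names and sizes are hypothetical. It compares pairing up every two nodes in the same function first against starting from the small flow relation and checking the function afterwards. Both orders give the same result, but the intermediate tuple counts differ greatly.

```python
from collections import defaultdict


def make_nodes(num_functions, nodes_per_function):
    """Return {node_id: function_id} for a synthetic program (hypothetical data)."""
    return {
        f * nodes_per_function + i: f
        for f in range(num_functions)
        for i in range(nodes_per_function)
    }


def same_function_first(func_of, flow):
    """Bad order: build all same-function node pairs, then filter by flow."""
    by_func = defaultdict(list)
    for node, func in func_of.items():
        by_func[func].append(node)
    # Quadratic in the number of nodes per function.
    intermediate = [
        (a, b) for nodes in by_func.values() for a in nodes for b in nodes
    ]
    result = {pair for pair in intermediate if pair in flow}
    return result, len(intermediate)


def flow_first(func_of, flow):
    """Good order: start from the flow pairs, then compare functions."""
    # One intermediate tuple per flow pair.
    intermediate = [(a, b, func_of[a]) for (a, b) in flow if a in func_of]
    result = {(a, b) for (a, b, f) in intermediate if func_of.get(b) == f}
    return result, len(intermediate)


# 20 functions of 50 nodes each; node n belongs to function n // 50.
func_of = make_nodes(num_functions=20, nodes_per_function=50)
# 100 flow pairs within a function, plus 2 that cross a function boundary.
flow = {(n, n + 1) for n in range(0, 1000, 10)} | {(49, 50), (99, 100)}

bad_result, bad_tuples = same_function_first(func_of, flow)
good_result, good_tuples = flow_first(func_of, flow)

print(bad_result == good_result)  # True
print(bad_tuples, good_tuples)    # 50000 102
```

The self-join on the function column grows quadratically with the number of nodes per function, which matches the shape of the `r1` step in the "before" tuple counts.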

Tuple counts on Wireshark before (cancelled after a few minutes):

    (747s) Starting to evaluate predicate DataFlowUtil::localFlowStep#ff/2@bdba82
    (848s) Tuple counts for DataFlowUtil::localFlowStep#ff:
    1766640980 ~1%        {2} r1 = JOIN DataFlowUtil::Node::getFunction_dispred#ff_10#join_rhs AS L WITH DataFlowUtil::Node::getFunction_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT L.<1>, R.<1>
    1327       ~0%        {2} r2 = JOIN r1 WITH project#DataFlowImplLocal::Configuration::hasFlow#fbb AS R ON FIRST 2 OUTPUT r1.<0>, r1.<1>
    9691232    ~0%        {2} r3 = DataFlowUtil::simpleLocalFlowStep#ff@staged_ext \/ r2
                          return r3

After:

    (0s) Starting to evaluate predicate DataFlowUtil::localFlowStep#ff/2@a852a0
    (0s) Tuple counts for DataFlowUtil::localFlowStep#ff:
    49017    ~4%     {3} r1 = JOIN project#DataFlowImplLocal::Configuration::hasFlow#fff AS L WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff AS R ON FIRST 1 OUTPUT L.<1>, R.<1>, R.<0>
    42359    ~0%     {2} r2 = JOIN r1 WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff AS R ON FIRST 2 OUTPUT r1.<2>, r1.<0>
    9732264  ~0%     {2} r3 = DataFlowUtil::simpleLocalFlowStep#ff@staged_ext \/ r2
                     return r3
Jonas Jensen
2020-05-04 12:09:49 +02:00
parent 2b0ad2df6f
commit 50b0d426ee

@@ -655,7 +655,7 @@ private module FieldFlow {
 exists(FieldConfiguration cfg | cfg.hasFlow(node1, node2)) and
 // This configuration should not be able to cross function boundaries, but
 // we double-check here just to be sure.
-node1.getFunction() = node2.getFunction()
+node1.getEnclosingCallable() = node2.getEnclosingCallable()
 }
 }