problem is:
```
14294 ~33% {1} r23 = r21 UNION r22
13626 ~0% {2} | JOIN WITH `DataFlowPublic::Node.getEnclosingCallable/0#dispred#be95825a` ON FIRST 1 OUTPUT Rhs.1, Lhs.0
11871493 ~2% {2} | JOIN WITH `DataFlowPublic::Node.getEnclosingCallable/0#dispred#be95825a_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1
6810938 ~3% {2} | JOIN WITH num#DataFlowPublic::TCfgNode#2cd2fb22_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1
0 ~0% {4} | JOIN WITH `DataFlowDispatch::resolveMethodCall/4#3067f1f1#reorder_0_3_1_2#prev` ON FIRST 2 OUTPUT Rhs.3, Lhs.1, Lhs.0, Rhs.2
0 ~0% {4} | JOIN WITH num#DataFlowDispatch::CallTypeClassMethod#3508c3e5 ON FIRST 1 OUTPUT Lhs.3, Lhs.2, Lhs.0, Lhs.1
0 ~0% {4} | JOIN WITH `DataFlowDispatch::resolveCall/3#454c02d8#reorder_1_0_2#prev` ON FIRST 3 OUTPUT Lhs.3, Lhs.1, Lhs.0, Lhs.2
0 ~0% {5} | JOIN WITH num#DataFlowDispatch::TSelfArgumentPosition#de6d64b8 CARTESIAN PRODUCT OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.0, Rhs.0
```
that is, it does cartesian product of DataFlowPublic::Node.getEnclosingCallable
After fix
```
14294 ~33% {1} r23 = r21 UNION r22
0 ~0% {4} | JOIN WITH `DataFlowDispatch::resolveMethodCall/4#3067f1f1#reorder_3_0_1_2#prev` ON FIRST 1 OUTPUT Rhs.3, Lhs.0, Rhs.1, Rhs.2
0 ~0% {4} | JOIN WITH num#DataFlowDispatch::CallTypeClassMethod#3508c3e5 ON FIRST 1 OUTPUT Lhs.3, Lhs.2, Lhs.0, Lhs.1
0 ~0% {4} | JOIN WITH `DataFlowDispatch::resolveCall/3#454c02d8#reorder_1_0_2#prev` ON FIRST 3 OUTPUT Lhs.1, Lhs.3, Lhs.0, Lhs.2
0 ~0% {5} | JOIN WITH num#DataFlowPublic::TCfgNode#2cd2fb22 ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0, Lhs.2, Lhs.3
0 ~0% {5} | JOIN WITH `DataFlowPublic::Node.getEnclosingCallable/0#dispred#be95825a` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.2, Lhs.3, Lhs.4
0 ~0% {4} | JOIN WITH `DataFlowPublic::Node.getEnclosingCallable/0#dispred#be95825a` ON FIRST 2 OUTPUT Lhs.0, Lhs.2, Lhs.3, Lhs.4
0 ~0% {5} | JOIN WITH num#DataFlowDispatch::TSelfArgumentPosition#de6d64b8 CARTESIAN PRODUCT OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.0, Rhs.0
```
Overall stats
(old)
Pipeline standard for DataFlowDispatch::getCallArg/5#21589076@b30c7vxg was evaluated in 51 iterations totaling 54ms (delta sizes total: 38247).
==>
(new)
Pipeline standard for DataFlowDispatch::getCallArg/5#21589076@c1559vxu was evaluated in 51 iterations totaling 28ms (delta sizes total: 38247).
No major performance impact, more of a learning example for myself (had +3000 join order badness).
Initial tuple counts
```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g in 1ms on iteration 1 (delta size: 4).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g on iteration 1 running pipeline base with tuple counts:
37793 ~0% {3} r1 = JOIN `ApiGraphs::API::Node.getACall/0#dispred#312deb92_10#join_rhs` WITH DataFlowPublic::CallCfgNode#b8ddbf81 ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1
0 ~0% {2} | JOIN WITH `SqlAlchemy::SqlAlchemy::Connection::classRef/0#565fc3ad` ON FIRST 1 OUTPUT Lhs.1, Lhs.2
30 ~0% {5} r2 = JOIN DataFlowPublic::CallCfgNode#b8ddbf81 WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb` ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Rhs.1, Rhs.2, _
{4} | REWRITE WITH NOT [NOT [Tmp.4 := "begin", TEST InOut.3 = Tmp.4], NOT [Tmp.4 := "connect", TEST InOut.3 = Tmp.4]] KEEPING 4
21 ~0% {3} | SCAN OUTPUT In.2, In.0, In.1
4 ~0% {2} | JOIN WITH `SqlAlchemy::SqlAlchemy::Engine::instance/0#1828baef` ON FIRST 1 OUTPUT Lhs.1, Lhs.2
4 ~0% {2} r3 = r1 UNION r2
return r3
```
which is fixed by the only_bind_out
```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@49effxtg in 0ms on iteration 1 (delta size: 0).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@49effxtg on iteration 1 running pipeline base with tuple counts:
0 ~0% {1} r1 = JOIN `SqlAlchemy::SqlAlchemy::Connection::classRef/0#565fc3ad` WITH `ApiGraphs::API::Node.getACall/0#dispred#312deb92` ON FIRST 1 OUTPUT Rhs.1
0 ~0% {2} | JOIN WITH DataFlowPublic::CallCfgNode#b8ddbf81 ON FIRST 1 OUTPUT Lhs.0, Rhs.1
return r1
```
We also had this initial problem
```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g in 1ms on iteration 4 (delta size: 0).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g on iteration 4 running pipeline standard with tuple counts:
48722 ~6% {2} r1 = DataFlowPublic::CallCfgNode#b8ddbf81 AND NOT SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0#prev(FIRST 2)
48722 ~3% {3} r2 = SCAN r1 OUTPUT In.0, _, In.1
48722 ~1% {3} | REWRITE WITH Out.1 := "connect"
16 ~0% {3} | JOIN WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb_021#join_rhs` ON FIRST 2 OUTPUT Rhs.2, Lhs.0, Lhs.2
0 ~0% {2} | JOIN WITH `SqlAlchemy::SqlAlchemy::Connection::instance/0#5ed87c17#prev_delta` ON FIRST 1 OUTPUT Lhs.1, Lhs.2
48722 ~3% {3} r3 = SCAN r1 OUTPUT In.0, _, In.1
48722 ~2% {3} | REWRITE WITH Out.1 := "execution_options"
9 ~0% {3} | JOIN WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb_021#join_rhs` ON FIRST 2 OUTPUT Rhs.2, Lhs.0, Lhs.2
0 ~0% {2} | JOIN WITH `SqlAlchemy::SqlAlchemy::Connection::instance/0#5ed87c17#prev_delta` ON FIRST 1 OUTPUT Lhs.1, Lhs.2
0 ~0% {2} r4 = r2 UNION r3
return r4
```
which is fixed by `connectionConstruction_helper`
```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::helper/0#62cfc178#b@4f295yef in 1ms on iteration 4 (delta size: 0).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::helper/0#62cfc178#b@4f295yef on iteration 4 running pipeline standard with tuple counts:
4 ~0% {1} r1 = JOIN `SqlAlchemy::SqlAlchemy::Connection::instance/1#029b4c87#prev_delta` WITH `TypeTrackingImpl::TypeTracker::end/0#2ac2cfd4` ON FIRST 1 OUTPUT Lhs.1
16 ~0% {1} | JOIN WITH `LocalSources::Cached::hasLocalSource/2#8b3ee0ec_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1
0 ~0% {3} | JOIN WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb_102#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Rhs.2, _
0 ~0% {2} | REWRITE WITH NOT [NOT [Tmp.2 := "connect", TEST InOut.1 = Tmp.2], NOT [Tmp.2 := "execution_options", TEST InOut.1 = Tmp.2]] KEEPING 2
0 ~0% {1} | JOIN WITH DataFlowPublic::CallCfgNode#b8ddbf81 ON FIRST 1 OUTPUT Lhs.0
0 ~0% {1} | AND NOT `SqlAlchemy::SqlAlchemy::Connection::helper/0#62cfc178#b#prev`(FIRST 1)
return r1
```
Two issues:
- Tests relying on existing query machinery (i.e. `import python`) were not resolving
correctly due to a bad `qlpack.yml` file.
- The diagnostics output tests needed an updated import to account for their new location.
This is essentially the contents of `language-packs/python/tools` with some minor
modifications to account for the changed location.
Of note: we explicitly exclude the `recorded-call-graph-metrics` director that
was already present in `python/tools`. When we revisit this directory for some
cleanup (e.g. to get rid of the `lgtm` references), we'll probably want to switch
to an explicit list of sources to include.
We noticed that when
- a function has more than one summary (with different charpred)
- one summary is subsumed by a subpath (or something happens around the function being extracted)
- the function is called multiple times(we needed at least three)
one of the summaries would no longer lead to flow.
We do this to remove the inconsistencies, and to be ready for a future
where type-tracking support content tracker of depth > 1.
It works because targets of loadSteps needs to be LocalSourceNodes
predicate loadStep(Node nodeFrom, LocalSourceNode nodeTo, Content content) {