The predicate `AlwaysTrueUponEntryLoop.getARelevantVariable` was very
sensitive to join ordering, and with the 1.19 QL engine it got an
unfortunate join order that made it explode on certain snapshots. With
this change, it goes from taking minutes to taking less than a second on
a libretro-uae snapshot.
Made `Node::getType()`, `Node::asParameter()`, and `Node::asUninitialized()` operate directly on the IR. This actually fixed several diffs compared to the AST dataflow, because `getType()` wasn't holding for nodes that weren't `Exprs`.
Made `Uninitialized` a `VariableInstruction`. This makes it consistent with `InitializeParameter`.
This commit adds a new model interface that describes the known side effects (or lack thereof) of a library function. Does it read memory, does it write memory, and do any of its parameters escape? Initially, we have models for just two Standard Library functions: `std::move` and `std::forward`, which neither read nor write memory, and do not escape their parameter.
IR construction has been updated to insert the correct side effect instruction (or no side effect instruction) based on the model.
This fixes a subtle bug in the construction of aliased SSA. `getResultMemoryAccess` was failing to return a `MemoryAccess` for a store to a variable whose address escaped. This is because no `VirtualIRVariable` was being created for such variables. The code was assuming that any access to such a variable would be via `UnknownMemoryAccess`. The result is that accesses to such variables were not being modeled in SSA at all.
Instead, the way to handle this is to have a `VariableMemoryAccess` even when the variable being accessed has escaped, and to have `VariableMemoryAccess::getVirtualVariable()` return the `UnknownVirtualVariable` for escaped variables. In the future, this will also let us be less conservative about inserting `Chi` nodes, because we'll be able to determine that there's an exact overlap between two accesses to the same escaped variable in some cases.
The AST dataflow library essentially ignores conversions, which is probably the right behavior. Converting an `int` to a `long` preserves the value, even if the bit pattern might be different. It's arguable whether narrowing conversions should be treated as dataflow, but we'll do so for now. We can revisit that if we see it cause problems.
For example, in
```
void M(object x)
{
var y = x == null ? 1 : 2;
if (y == 2)
x.ToString();
}
```
the guard `y == 2` implies that the guard `x == null` must be false,
as the assignment of `2` to `y` is unique.
For example, in
```
void M(object x)
{
var y = x != null ? "" : null;
if (y != null)
x.ToString();
}
```
the guard `y != null` implies that the guard `x != null` must be true.
All the filtering is now done in `getALikelyCallee`, to which I have also added an additional parameter that improves the join in the `select` clause.
I've also simplified the alert message to no longer use `toString`, which isn't meant for alert messages anyway. (This is an old query.)
I noticed that queries using the data flow library spent significant
time in `#Dominance::bbIDominates#fbPlus`, which is the body of the
`bbStrictlyDominates` predicate. That predicate took 28 seconds to
compute on Wireshark.
The `b` in the predicate name means that magic was applied, and the
application of magic meant that it could not be evaluated with the
built-in `fastTC` HOP but became an explicit recursion instead. Applying
`pragma[nomagic]` to this predicate means that we will always get it
evaluated with `fastTC`, and that takes less than a second in my test
case.
The internal pre-SSA library was extended on 3e78c2671f
to include fields/properties that are local-scope-like. The CFG splitting logic
uses ranking of SSA definitions to define an (arbitrary) order of splits, but for
fields/properties the implicit entry definition all have the same line and column.
In effect, such SSA definitions incorrectly get the same rank. Adding the name
of the field/property to the lexicographic ordering resolves the issue.