Commit a1e44041e made `XMLFile` no longer extend `File`. I'm guessing
this was necessary in the branch where `File` was an IPA-typed `Element`
and `XMLFile` was not, but it broke compilation of some of our internal
queries.
I introduced some unnecessary base classes in the `TranslatedExpr` hierarchy with a previous commit. This commit refactors the hierarchy a bit to align with the following high-level description:
`TranslatedExpr` represents a translated piece of an `Expr`. Each `Expr` has exactly one `TranslatedCoreExpr`, which produces the result of that `Expr` ignoring any lvalue-to-rvalue conversion on its result. If an lvalue-to-rvalue conversion is present, there is an additional `TranslatedLoad` for that `Expr` to do the conversion. For higher-level `Expr`s like `NewExpr`, there can also be additional `TranslatedExpr`s to represent the sub-operations within the overall `Expr`, such as the allocator call.
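For readers less familiar with where lvalue-to-rvalue conversions show up, here is a small C++ sketch of source code (not of the QL library). The function names are made up for this example; the comments mark which accesses would need a load in addition to the core expression.

```cpp
#include <cassert>

// Hypothetical functions, used only to show where the conversion occurs.
int readThenAdd(int x) {
  // `x` is an lvalue; using its value in `x + 1` requires an
  // lvalue-to-rvalue conversion (a load) on the access to `x`.
  return x + 1;
}

int *addressOf(int &x) {
  // Here `x` is only used as an lvalue: taking its address does not
  // read its value, so no lvalue-to-rvalue conversion is involved.
  return &x;
}

void checkLoads() {
  int value = 41;
  assert(readThenAdd(value) == 42);
  assert(addressOf(value) == &value);
}
```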
`new` and `new[]` expressions are a little trickier than most because they include an implicit call to an allocator function. The database tells us which function to call, but we have to synthesize the allocation size and alignment arguments ourselves. The alignment argument, if it exists, is always a constant, but the size argument requires multiplication by the element count for most `NewArrayExpr`s. I introduced the new `TranslatedAllocationSize` class to handle this.
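To make the implicit size argument visible, here is a minimal C++ sketch with a class-specific allocator that records what it was asked for. The type and function names are hypothetical. The compiler is allowed to request more than `count * sizeof(Tracked)` (array overhead), so the check only asserts a lower bound.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Size most recently passed to Tracked::operator new[] (illustration only).
std::size_t lastRequestedSize = 0;

// Hypothetical type whose array allocator records its size argument.
struct Tracked {
  int payload[4];

  static void *operator new[](std::size_t size) {
    lastRequestedSize = size;  // the size argument synthesized by the compiler
    return ::operator new[](size);
  }

  static void operator delete[](void *p) noexcept { ::operator delete[](p); }
};

// Allocates `count` elements and returns the size the allocator received.
std::size_t requestedSizeFor(std::size_t count) {
  Tracked *p = new Tracked[count];  // implicit call to Tracked::operator new[]
  std::size_t size = lastRequestedSize;
  delete[] p;
  return size;
}

void checkAllocationSize() {
  // At least element size times element count; may include extra overhead.
  assert(requestedSizeFor(5) >= 5 * sizeof(Tracked));
}
```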
The IR avoids having non-trivially-copyable and non-trivially-assignable types in register results, because objects of those types need to exist at a particular memory location. The `InitializeParameter` and `Uninitialized` instructions were violating this restriction because they returned register results, which were then stored into the destination location via a `Store`.
This change makes those two instructions take the destination address as an operand, and return a memory result representing the (un-)initialized memory, removing the need for a separate `Store` instruction.
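As an illustration of why such types need a memory location, here is a small C++ sketch. Both types are hypothetical: `SelfReferential` stores its own address, so a copy of it is only valid at the address where it was constructed, which is what a by-value parameter of that type relies on.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trivially copyable type: fine to treat as a plain value.
struct PlainPoint {
  int x;
  int y;
};

// Hypothetical type that remembers where it lives, so it is neither
// trivially copyable nor trivially assignable.
struct SelfReferential {
  SelfReferential() : self(this) {}
  SelfReferential(const SelfReferential &) : self(this) {}
  SelfReferential &operator=(const SelfReferential &) { return *this; }
  const SelfReferential *self;
};

static_assert(std::is_trivially_copyable<PlainPoint>::value,
              "PlainPoint can be copied bytewise");
static_assert(!std::is_trivially_copyable<SelfReferential>::value,
              "SelfReferential has a user-provided copy constructor");
static_assert(!std::is_trivially_copy_assignable<SelfReferential>::value,
              "SelfReferential has a user-provided copy assignment");

// The parameter object is constructed at its own address, so the stored
// pointer matches the address seen inside the function.
bool parameterKnowsItsAddress(SelfReferential param) {
  return param.self == &param;
}

void checkParameterAddress() {
  SelfReferential original;
  assert(parameterKnowsItsAddress(original));
}
```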
For example, if you have 3 types called T, where t1 and t2 are defined
but t3 isn't, then you will have
unspecifiedtype(t1, t1)
unspecifiedtype(t2, t2)
unspecifiedtype(t3, t3)
t1 = resolve(t1)
t1 = resolve(t3)
t2 = resolve(t2)
t2 = resolve(t3)
so given
    Type getUnspecifiedType() {
      unspecifiedtype(unresolve(this), unresolve(result))
    }
you get t1.getUnspecifiedType() = t2.
I think that in general the best thing to do is to not unresolve 'this',
but to just take the underlying value.
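The join in the example can be replayed with a small C++ model of the relations. This is only a sketch: it treats `resolve` as a set of (raw, resolved) pairs, uses strings for ids, and assumes "take the underlying value" means using the id of `this` directly. The function names are made up.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

using Id = std::string;
using Relation = std::set<std::pair<Id, Id>>;

// unspecifiedtype(type, unspecified) rows from the example.
const Relation kUnspecifiedType = {{"t1", "t1"}, {"t2", "t2"}, {"t3", "t3"}};
// resolve as (raw, resolved) pairs: the undefined t3 resolves to t1 and t2.
const Relation kResolve = {
    {"t1", "t1"}, {"t3", "t1"}, {"t2", "t2"}, {"t3", "t2"}};

// Core join: raw `this` ids -> unspecifiedtype -> resolved results.
std::set<Id> unspecifiedTypesOf(const std::set<Id> &rawThis) {
  std::set<Id> results;
  for (const auto &row : kUnspecifiedType)
    if (rawThis.count(row.first))
      for (const auto &r : kResolve)
        if (r.first == row.second) results.insert(r.second);
  return results;
}

// Original formulation: unresolve(this) yields every raw id resolving to it.
std::set<Id> withUnresolvedThis(const Id &self) {
  std::set<Id> raws;
  for (const auto &r : kResolve)
    if (r.second == self) raws.insert(r.first);
  return unspecifiedTypesOf(raws);
}

// Suggested formulation (as modelled here): use the id of `this` as-is.
std::set<Id> withUnderlyingThis(const Id &self) {
  return unspecifiedTypesOf({self});
}

void checkUnspecifiedType() {
  assert(withUnresolvedThis("t1").count("t2") == 1);  // the unwanted result
  assert(withUnderlyingThis("t1").count("t2") == 0);
  assert(withUnderlyingThis("t1").count("t1") == 1);
}
```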
Casts to `void` did not have a semantic conversion type in the AST, so they also weren't getting generated correctly in the IR. I've added a `VoidConversion` class to the AST, along with tests. I've also added IR translation for such conversions, using a new `ConvertToVoid` opcode. I'm not sure if it's really necessary to generate an instruction to represent this, but it may be useful for detecting values that are explicitly unused (e.g. return value from a call).
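For reference, these are the kinds of source constructs involved, as a minimal C++ sketch. The function names are hypothetical; both statements convert the call result to `void`, explicitly marking it as unused.

```cpp
#include <cassert>

int callCount = 0;

// Hypothetical function whose result a caller may deliberately ignore.
int nextTicket() { return ++callCount; }

void discardResults() {
  (void)nextTicket();               // C-style cast to void
  static_cast<void>(nextTicket());  // same conversion, spelled as static_cast
}

void checkDiscardResults() {
  int before = callCount;
  discardResults();
  // Both calls still run; only their results are discarded.
  assert(callCount == before + 2);
}
```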
I added two new sanity queries for the IR to detect the following:
- IR blocks with no successors, which usually indicates bad IR translation
- Phi instructions without an operand for one of the predecessor blocks
These sanity queries found another subtle IR translation bug. If an expression that is normally translated as a condition (e.g. `&&`, `||`, or parens in certain contexts) has a constant value, we were not creating a `TranslatedExpr` for the expression at all. I changed it to always treat a constant condition as a non-condition expression.
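A small C++ sketch of the kind of input that would hit this case, with hypothetical function names. In each function, an operator that is usually translated as control flow has a value known at compile time.

```cpp
#include <cassert>

// Hypothetical examples of condition-like expressions with constant values.
int pickBranch() {
  // `1 && 0` is a constant expression used as an `if` condition.
  if (1 && 0) {
    return 1;
  }
  return 2;
}

bool constantOr() {
  // `0 || 1` is a constant expression used as an ordinary value.
  bool flag = (0 || 1);
  return flag;
}

void checkConstantConditions() {
  assert(pickBranch() == 2);
  assert(constantOr());
}
```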
All three `IRBlock.qll` files are now identical again, and they are just
a thin object-oriented layer on top of the three
`IRBlockConstruction.qll` files, two of which are identical.
As the queries live here, it makes sense for the suites to be versioned
together with them. The LGTM suite has already been moved. This commit
moves the actively-maintained non-LGTM suites.
Previously, we would try to find an element enclosing each macro
access. This is not in general well-defined, especially in the
context of template instantiations -- macros are a lexing-time
concept, and don't map cleanly onto AST elements.
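A minimal C++ sketch of the problem, with a made-up macro and template. The source text contains one macro invocation, but each instantiation of the template has its own expressions generated from it, so there is no single obvious enclosing element.

```cpp
#include <cassert>

// Hypothetical macro, used once inside a template body.
#define TWICE(e) ((e) + (e))

template <typename T>
T doubled(T value) {
  return TWICE(value);  // one macro invocation in the source text
}

// Two instantiations, each with its own copy of the expanded expression.
int doubledInt(int v) { return doubled<int>(v); }
double doubledDouble(double v) { return doubled<double>(v); }

void checkDoubled() {
  assert(doubledInt(3) == 6);
  assert(doubledDouble(1.5) == 3.0);
}
```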
`IRBlock` contains a few expensive predicates, mostly `getInstruction`
and `immediatelyDominates`. These were previously recomputed for each of
the three SSA layers even though they essentially produce the same
result in each layer. The only difference between the three types of
`IRBlock` is the phi nodes.
This commit changes the representation of `IRBlock` for `ssa` and
`aliased_ssa` so they become just wrappers around the `IRBlock` of their
previous layer. Most predicates in later layers are then computed from
the corresponding predicate of the preceding layer.
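The delegation idea can be sketched in C++ as follows. This is only a model under my own assumptions, not the QL code: all names are hypothetical, and the later-layer block just holds a pointer to the block of the previous layer plus its own phi instructions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical first-layer block with the expensive relation stored once.
struct RawBlock {
  int id;
  const RawBlock *immediateDominator;  // nullptr for the entry block
};

bool immediatelyDominates(const RawBlock &dominator, const RawBlock &block) {
  return block.immediateDominator == &dominator;
}

// Hypothetical later-layer block: a wrapper that only adds phi instructions.
struct SsaBlock {
  const RawBlock *previous;          // block of the preceding layer
  std::vector<int> phiInstructions;  // the part that differs between layers
};

// Computed from the preceding layer rather than recomputed from scratch.
bool immediatelyDominates(const SsaBlock &dominator, const SsaBlock &block) {
  return immediatelyDominates(*dominator.previous, *block.previous);
}

void checkLayeredBlocks() {
  RawBlock entry{0, nullptr};
  RawBlock body{1, &entry};
  SsaBlock ssaEntry{&entry, {}};
  SsaBlock ssaBody{&body, {42}};
  assert(immediatelyDominates(ssaEntry, ssaBody));
  assert(!immediatelyDominates(ssaBody, ssaEntry));
}
```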
The `SSAConstruction::Cached::getInstructionOperand` predicate took
1m27s on a postgres snapshot before this change and was the slowest
predicate in SSAIR. It now takes 4.5s.
The slowdown was caused by its use of
`getUnmodeledDefinitionInstruction`, which got inlined into a place
where the join orderer had little choice but to join the `MkInstruction`
relation with itself, creating a large intermediate relation.
I've added `pragma[noinline]` to `getUnmodeledDefinitionInstruction` and
also to similar predicates that are likely to cause the same problem in
the future.