This PR adds new predicates to `Declaration` and `Type` to get a fully-qualified canonical name for the element, suitable for debugging and dumps. It includes template parameters, cv qualifiers, function parameter and return types, and fully-qualified names for all symbols. These strings are too large to compute in productions queries, so they should be used only for dumps and debugging. Feel free to suggest better names for these predicates.
I've updated PrintAST and PrintIR to use these instead of `Function.getFullSignature()`. The biggest advantage of the new predicates is that they handle lambdas and local classes, which `getQualifiedName` and `getFullSignature` do not. This makes IR and AST dumps much more usable for real-world snapshots.
Along the way, I cleaned up some of our handling of `IntegralType` to use a single table for tracking the signed, unsigned, and canonical versions of each type. The canonical part is new, and was necessary for `getTypeIdentityString` so that `signed int` and `int` both appear as `int`.
These predicates currently take a pair of `IRBlock`s - as it stands, at
most one edge can exist from one `IRBlock` to a given other `IRBlock`.
We may need to revisit that assumption and create an `IREdge` IPA type
at some future date
There are a few IR APIs that we've found to be confusingly named. This PR renames them to be more consistent within the IR and with the AST API:
`Instruction.getFunction` -> `Instruction.getEnclosingFunction`: This was especially confusing when you'd call `FunctionAddressInstruction.getFunction` to get the function whose address was taken, and wound up with the enclosing function instead.
`Instruction.getXXXOperand` -> `Instruction.getXXX`. Now that `Operand` is an exposed type, we want a way to get a specific `Operand` of an `Instruction`, but more often we want to get the definition instruction of that operand. Now, the pattern is that `getXXXOperand` returns the `Operand`, and `getXXX` is equivalent to `getXXXOperand().getDefinitionInstruction()`.
`Operand.getInstruction` -> `Operand.getUseInstruction`: More consistent with the existing `Operand.getDefinitionInstruction` predicate.
This adds `IntegerPartial.qll`, which is similar to
`IntegerConstant.qll` except that it contains partial functions on
integers instead of total functions on optional integers. This speeds up
the constant analysis so it takes 6.5s instead of 10.3s on comdb2.
This does to `SSAConstruction` what the previous commit did to
`IRConstruction`. An instruction in `SSAConstruction` is now defined in
terms of how it was created rather than what it can be queried for.
Effectively, this defines `TInstruction` as `TInstructionTag` was
defined before and then removes `TInstructionTag` from
`SSAConstruction`. This also has the benefit of removing the concept of
an instruction tag from the public predicates on `Instruction`.
This definition was denormalized to the extent that an instruction was
defined in terms of the six main attributes it could be queried for.
This made it possible to do multi-column joins on those six attributes,
but it doesn't appear that this feature was useful in practice. The main
multi-column join that was in use was on the pair of
(`TranslatedElement, InstructionTag`), but the `TranslatedElement` was
not part of the `TInstruction`.
This commit changes `TInstruction` to be defined in terms of what it's
_built from_ (`TranslatedElement, InstructionTag`) instead. This makes
it possible to do multi-column joins on those two components, and then
there are separate predicates (usually with two columns) to query
instruction attributes, replacing the many uncached projections from
`MkInstruction` that were generated before.
An immediate advantage is that an `Expr` with multiple types will no
longer give rise to multiple `Instruction`s, fixing most of the errors
from the sanity query `ambiguousSuccessors`. The code inside
`IRConstruction.qll` becomes simpler and hopefully faster as there is no
longer a translation from `TranslatedElement` to `Locatable` and back
again.
The previous commit had the side effect that `IRVariable`s were created
for all `Functions`, including those that did not have IR. This commit
restricts all `TIRVariable` constructors to functions that have IR.
This doesn't make it much faster, but it reduces the debug output
volume. It also simplifies the code.
I've found this change necessary when I compute the full IR on a
Wireshark snapshot in QL4E. Without it, Eclipse runs out of memory
because the console log is too large.
The new predicate `isOrphan` gets inlined into
`ignoreExprAndDescendants`, whose performance improves from
TranslatedElement::ignoreExprAndDescendants#f .. 23.4s (executed 9 times)
to
TranslatedElement::ignoreExprAndDescendants#f ... 4.3s (executed 9 times)
This dramatic improvement is not only due to eliminating a type check in
the recursive case. Removing the type check from the other base cases
also enabled them to get better join orders.