I had included `InitializeNonLocal` in the recursion because it made
everything look better in the presence of a bug that's since been fixed.
Taking it out means the sanity test is again aligned with the old
`isChiForAllAliasedMemory`.
`Instruction.getDefinitionOverlap()` depends on `SSAConstruction::getMemoryOperandDefinition()`, which in turn depends on `SSAConstruction::hasMemoryOperandDefinition()`. When the definition in question came from a `Chi` instruction, `hasMemoryOperandDefinition()` incorrectly bound `overlap` to the overlap relationship between the original (non-`Chi`) instruction and the use. The fix is to make use of the `actualDefLocation` parameter to `getDefinitionOrChiInstruction()`, which specifies the location for the result of the `Chi` in that case.
The result of `getDefinitionOverlap()` should never be `MayPartiallyOverlap`, because if that were the case, we should have inserted as `Chi` instruction and hooked the definition up to that instead.
There are quite a few existing failures.
This predicate replaces `isChiForAllAliasedMemory`, which was always
intended to be temporary. A test is added to `IRSanity.qll` to verify
that the new predicate corresponds exactly with (a fixed version of) the
old one.
The implementation of the new predicate,
`Cached::hasConflatedMemoryResult` in `SSAConstruction.qll`, is faster
to compute than the old `isChiForAllAliasedMemory` because it uses
information that's readily available during SSA construction.
In the Unix ABI, `std::va_list` is defined as `typedef struct __va_list_tag { ... } va_list[1];`, which means that any `std::va_list` used as a function parameter decays to `struct __va_list_tag*`. Handling this actually made the QL code slightly cleaner. The only tricky bit is that we have to determine what type to use as the actual `va_list` type when loading, storing, or modifying a `std::va_list`. To do this, we look at the type of the argument to the `va_*` macro. A detailed QLDoc comment explains the details.
I added a test case for passing a `va_list` as an argument, and then manipulating that `va_list` in the callee.
This PR changes the IR we generate for functions that accept a variable argument list. Rather than simply using `BuiltInOperationInstruction` to model the various `va_*` macros as mysterious function-like operations, we now model them in more detail. The intent is to enable better alias analysis and taint flow through varargs.
The `va_start` macro now generates a unary `VarArgsStart` instruction that takes the address of the ellipsis pseudo-parameter as its operand, and returns a value of type `std::va_list`. This value is then stored into the actual `std::va_list` variable via a regular `Store`.
The `va_arg` macro now loads the `std::va_list` argument, then emits a `VarArg` instruction on the result. This returns the address of the vararg argument to be loaded. That address is later used as the address operand of a regular `Load` to return the value of the argument. To model the side effect of moving to the next argument, we emit a `NextVarArg` instruction that takes the previous `std::va_list` value and returns an updated one, which is then stored back into the `std::va_list` variable.
The `va_end` macro just emits a `VarArgsEnd` unary instruction that takes the address of the `std::va_list` argument and does nothing, since `va_end` doesn't really do anything on most compiler implementations anyway.
The `va_copy` macro is just modeled as a plain copy.
Since `runtimeExprInStaticInitializer` only looks at expressions at the
top level of an initializer or directly below some number of top-level
aggregate literals, there is no need for `inStaticInitializer` to
include expressions strictly below those in the AST.
I tested this on Wireshark, which has very large static initializers,
but found no measureable difference in run time. There are some
differences in tuple counts and iteration counts, though:
- `inStaticInitializer` changes from 6,241,153 rows (86 iterations) to
5,031,617 rows (7 iterations).
- `runtimeExprInStaticInitializer` changes from 386,350 rows to 4,705
rows.
- `hasDynamicInitialization` has 410 rows both before and after, which
suggests that this change does not affect results.
Even though there is no impact on this snapshot at this time, things
might look different if/when the restriction on aggregate literals to
100 children is removed in the extractor.