Flow from a definition by reference of a field into its object was
working inconsistently and in a very syntax-dependent way. For a
function `f` receiving a reference, `f(a->x)` could propagate data back
to `a` via the _reverse read_ mechanism in the shared data-flow library,
but for a function `g` receiving a pointer, `g(&a->x)` would not work.
And `f((*a).x)` would not work either.
In all cases, the issue was that the shared data-flow library propagates
data backwards between `PostUpdateNode`s only, but there is no
`PostUpdateNode` for `a->x` in `g(&a->x)`. This pull request inserts
such post-update nodes where appropriate and links them to their
neighbors. In this exapmle, flow back from the output parameter of `g`
passes first to the `PostUpdateNode` of `&`, then to the (new)
`PostUpdateNode` of `a->x`, and finally, as a _reverse read_ with the
appropriate field projection, to `a`.
This case was added in dccc0f4db. The surrounding code has changed a lot
since then, and the case no longer seems to have an effect except to
create some dead ends and possibly cycles in the local flow graph.
Also clarify the docs on `Call` to decrease the likelyhood of such an
omission happening again.
The updated test reflects that `f1.operator()` lets the address of `f1`
escape from the caller.
This yields more precise size information in a lot of the common cases of C allocation code,
as the common pattern malloc(count * sizeof(type)) is now understood.
IR generation was not handling the special two-operand flavor of the `?:` operator that GCC supports as an extension. The extractor doesn't quite give us enough information to do this correctly (see github/codeql-c-extractor-team#67), but we can get pretty close.
About half of the code could be shared between the two-operand and three-operand flavors. The main differences for the two-operand flavor are:
1. The "then" operand isn't a child of the `ConditionalExpr`. Instead, we just reuse the original value of the "condition" operand, skipping any implicit cast to `bool` (see comment for rationale).
2. For the three-operand flavor, we generate the condition as control flow rather than the computation of a `bool` value, to avoid creating unnecessarily complicated branching. For the two-operand version, we just compute the value, since we have to reuse that value in the "then" branch anyway.
I've added IR tests for these new cases. I've also updated the expectations for `SignAnalysis.ql` based on the fix. @rdmarsh2, can you please double-check that these diffs look correct? I believe they do, but you're the range/sign analysis expert.
Includes a fairly exhaustive test case for arithmetic operations involving `_Complex` and/or `_Imaginary` types. Thanks to these new tests, I discovered that the extractor treats certain arithmetic operations on `_Imaginary` types as separate expression kinds, so I added support for those kinds in IR construction.
`getVariableType()` is used to compute the actual semantic type of a variable from its declared type. That's where we handle pointer and function decay for parameters, and it's also where we handle arrays of unknown bound initialized with an initializer of known bound.
Previously, even if neither of the above situations applied, the type that we returned was the `getUnspecifiedType()` of the variable. This meant that, for example, `const char* p` would be treated as `char *`. This is inconsistent with how we handle types elsewhere in IR construction, where we preserve typedefs and cv-qualifiers when creating the `CppType` of an `IRVariable`, `Instruction`, or `Operand`.
The only visible effect this fix has is to fix the inferred result type for `Phi` instructions for variables affect by this change in `getVariableType()` behavior. Previously, we would see the variable accessed as both `const char*` and as `char*`, so we'd fall back to the canonical pointer type, which is `decltype(nullptr)`. Now, we see the same type for all accesses to the variable, so we use that type as the type of the SSA memory location and as the result type of the `Phi` instruction.