I didn't add this support in `AddressConstantExpression.qll` since I
think it would require extra work and testing to get the constexprness
right. My long-term plan for `AddressConstantExpression.qll` is to move
its functionality to the extractor.
Flow from a definition by reference of a field into its object was
working inconsistently and in a very syntax-dependent way. For a
function `f` receiving a reference, `f(a->x)` could propagate data back
to `a` via the _reverse read_ mechanism in the shared data-flow library,
but for a function `g` receiving a pointer, `g(&a->x)` would not work.
And `f((*a).x)` would not work either.
In all cases, the issue was that the shared data-flow library propagates
data backwards between `PostUpdateNode`s only, but there is no
`PostUpdateNode` for `a->x` in `g(&a->x)`. This pull request inserts
such post-update nodes where appropriate and links them to their
neighbors. In this exapmle, flow back from the output parameter of `g`
passes first to the `PostUpdateNode` of `&`, then to the (new)
`PostUpdateNode` of `a->x`, and finally, as a _reverse read_ with the
appropriate field projection, to `a`.
Logic is hard, and I made a mistake inverting the formula for the second case, so the
predicate never held for a sizeExpr like sizeof(int)*sizeof(void).
Now, this case is correctly handled by the fallback.
Based on PR feedback. This will avoid a syntactic wart and make the
invocation do the right thing both with and without
`language[monotonicAggregates]`.
This yields more precise size information in a lot of the common cases of C allocation code,
as the common pattern malloc(count * sizeof(type)) is now understood.
IR generation was not handling the special two-operand flavor of the `?:` operator that GCC supports as an extension. The extractor doesn't quite give us enough information to do this correctly (see github/codeql-c-extractor-team#67), but we can get pretty close.
About half of the code could be shared between the two-operand and three-operand flavors. The main differences for the two-operand flavor are:
1. The "then" operand isn't a child of the `ConditionalExpr`. Instead, we just reuse the original value of the "condition" operand, skipping any implicit cast to `bool` (see comment for rationale).
2. For the three-operand flavor, we generate the condition as control flow rather than the computation of a `bool` value, to avoid creating unnecessarily complicated branching. For the two-operand version, we just compute the value, since we have to reuse that value in the "then" branch anyway.
I've added IR tests for these new cases. I've also updated the expectations for `SignAnalysis.ql` based on the fix. @rdmarsh2, can you please double-check that these diffs look correct? I believe they do, but you're the range/sign analysis expert.