Initial implementation of data flow through fields, using the algorithm of the
shared data flow implementation. Fields (and field-like properties) are covered,
and stores can be either
- ordinary assignments, `Foo = x`,
- object initializers, `new C() { Foo = x }`, or
- field initializers, `int Foo = x`.
For field initializers, we need to synthesize calls (`SynthesizedCall`),
callables (`SynthesizedCallable`), parameters (`InstanceParameterNode`), and
arguments (`SynthesizedThisArgumentNode`), as the C# extractor does not (yet)
extract such entities. For example, in
```
class C
{
int Field1 = 1;
int Field2 = 2;
C() { }
}
```
there is a synthesized call from the constructor `C`, with a synthesized `this`
argument, and the targets of that call are two synthesized callables with bodies
`this.Field1 = 1` and `this.Field2 = 2`, respectively.
A consequence of this is that `DataFlowCallable` is no longer an alias for
`DotNet::Callable`, but instead an IPA type.
- General refactoring to fit with the shared data flow implementation.
- Move CFG splitting logic into `ControlFlowReachability.qll`.
- Replace `isAdditionalFlowStepIntoCall()` with `TaintedParameterNode`.
- Redefine `ReturnNode` to be the actual values that are returned, which should
yield better path information.
- No longer consider overrides in CIL calls.
Before this change,
```
flowOutOfCallableStep(CallNode call, ReturnNode ret, OutNode out, CallContext cc)
```
would compute all combinations of call sites `call` and returned expressions `ret`
up front.
Now, we instead introduce explicit return nodes, so each callable has exactly
one return node (as well as one for each `out`/`ref` parameter). There is then
local flow from a returned expression to the relevant return node, and
`flowOutOfCallableStep()` computes combinations of call sites and return nodes.
Not only does this result in better performance, it also makes `flowOutOfCallableStep()`
symmetric to `flowIntoCallableStep()`, where each argument is mapped to a parameter,
and not to all reads of that parameter.
Data flow nodes for expressions do not take CFG splitting into account. Example:
```
if (b)
x = tainted;
x = x.ToLower();
if (!b)
Use(x);
```
Flow is incorrectly reported from `tainted` to `x` in `Use(x)`, because the step
from `tainted` to `x.ToLower()` throws away the information that `b = true`.
The solution is to remember the splitting in data flow expression nodes, that is,
to represent the exact control flow node instead of just the expression. With that
we get flow from `tainted` to `[b = true] x.ToLower()`, but not from `tainted` to
`[b = false] x.ToLower()`.
The data flow API remains unchanged, but in order for analyses to fully benefit from
CFG splitting, sanitizers in particular should be CFG-based instead of expression-based:
```
if (b)
x = tainted;
if (IsInvalid(x))
return;
Use(x);
```
If the call to `IsInvalid()` is a sanitizer, then defining an expression node to be
a sanitizer using `GuardedExpr` will be too conservative (`x` in `Use(x)` is in fact
not guarded). However, `[b = true] x` in `[b = true] Use(x)` is guarded, and to help
defining guard-based sanitizers, the class `GuardedDataFlowNode` has been introduced.