In theory this bug could associated CaptureOutNodes with the wrong transitively called
callable. However, in practice I could not create a test case that revealed incorrect
behaviour. I've included one such test case in the commit.
I believe that the cause of this is that OutNode::getACall() is not actually used in the
data flow libraries. Instead, DataFlowDispatch::Cached::getAnOutNode is the predicate
which is used to associated OutNode's with DataFlowCall's in practice, and that is always
used in a context that correctly binds the runtime target of the call.
TTransitiveCaptureCall represents a control flow node that may
transitively call many different callables which capture a variable from
the current scope. Captured variables are represented as synthetic
parameters to the callable, at negative indices. However, each of the
different targets may capture a different subset of variables from the
enclosing scope, so we must include the target along side the CFN in
order to prevent incorrect capture flow.
This commit adds field initializers to the CFG for non-static constructors. For
example, in
```
class C
{
int Field1 = 0;
int Field2 = Field1 + 1;
int Field3;
public C()
{
Field3 = 2;
}
public C(int i)
{
Field3 = 3;
}
}
```
the initializer expressions `Field1 = 0` and `Field2 = Field1 + 1` are added
to the two constructors, mimicking
```
public C()
{
Field1 = 0;
Field2 = Field1 + 1;
Field3 = 2;
}
```
and
```
public C()
{
Field1 = 0;
Field2 = Field1 + 1;
Field3 = 3;
}
```
respectively. This means that we no longer have to synthesize calls, callables,
parameters, and arguments in the data flow library, so much of the work from
d1755500e4 can be simplified.
This file is now identical in all languages. Unifying this file led to
the following changes:
- The documentation spelling fixes and example from the C++ version
were copied to the other versions and updated.
- The steps through `NonLocalJumpNode` from C# were abstracted into a
`globalAdditionalTaintStep` predicate that's empty for C++ and Java.
- The `defaultTaintBarrier` predicate from Java is now present but empty
on C++ and C#.
- The C++ `isAdditionalFlowStep` predicate on
`TaintTracking::Configuration` no longer includes `localFlowStep`.
That should avoid some unnecessary tuple copying.
This class was copy-pasted in all `DataFlowN.qll` files without using
the identical-files system to keep the copies in sync. The class is now
moved to the `DataFlowImplN.qll` files.
This also has the effect of preventing recursion through first data flow
library copy for C/C++. Such recursion has been deprecated for over a
year, and some forms of recursions are already ruled out by the library
implementation.
There's now a `localFlowStep` predicate for use directly in queries and
other libraries and a `simpleLocalFlowStep` for use only by the global
data flow library. The former predicate is intended to include field
flow, but the latter may not.
This will let Java and C# (and possibly C++ IR) avoid getting two kinds
of field flow at the same time, both from SSA and from the global data
flow library. It should let C++ AST add some form of field flow to
`localFlowStep` without making it an input to the global data flow
library.
To keep the code changes minimal, and to keep the implementation similar
to C++ and Java, the `TaintTracking{Public,Private}` files are now
imported together through `TaintTrackingUtil`. This has the side effect
of exposing `localAdditionalTaintStep`. The corresponding predicate for
Java was already exposed.
Initial implementation of data flow through fields, using the algorithm of the
shared data flow implementation. Fields (and field-like properties) are covered,
and stores can be either
- ordinary assignments, `Foo = x`,
- object initializers, `new C() { Foo = x }`, or
- field initializers, `int Foo = x`.
For field initializers, we need to synthesize calls (`SynthesizedCall`),
callables (`SynthesizedCallable`), parameters (`InstanceParameterNode`), and
arguments (`SynthesizedThisArgumentNode`), as the C# extractor does not (yet)
extract such entities. For example, in
```
class C
{
int Field1 = 1;
int Field2 = 2;
C() { }
}
```
there is a synthesized call from the constructor `C`, with a synthesized `this`
argument, and the targets of that call are two synthesized callables with bodies
`this.Field1 = 1` and `this.Field2 = 2`, respectively.
A consequence of this is that `DataFlowCallable` is no longer an alias for
`DotNet::Callable`, but instead an IPA type.
- Speedup the `varBlockReaches()` predicate, by restricting to basic blocks
in which a given SSA definition may still be live, in constrast to just
being able to reach *any* access (read or write) to the underlying source
variable.
- Account for some missing cases in the `lastRead()` predicate.