- Extract names of properties in a propery match, using the `exprorstmt_name` relation.
- Simplify extraction of properties by not distinguishing between top-level patterns
and nested patterns.
- Introduce `PatternExpr` to capture patterns in `is` expressions, `case` statements,
and `switch` expression arms.
- Generalize `IsTypeExpr`, `IsPatternExpr`, `IsRecursivePatternExpr`, and `IsConstantExpr`
to just `IsExpr` with a member predicate `PatternExpr getPattern()`.
- Generalize `TypeCase`, `RecursivePatternCase`, and `ConstCase` to just `CaseStmt` with
a member predicate `PatternExpr getPattern()`.
- Introduce classes `Switch` and `Case` as base classes of switch statements/expressions
and case statements/switch expression arms, respectively.
- Simplify CFG logic using the generalized classes.
- Generalize guards library to cover `switch` expressions tests.
- Generalize data flow library to cover `switch` expression assignments.
- General refactoring to fit with the shared data flow implementation.
- Move CFG splitting logic into `ControlFlowReachability.qll`.
- Replace `isAdditionalFlowStepIntoCall()` with `TaintedParameterNode`.
- Redefine `ReturnNode` to be the actual values that are returned, which should
yield better path information.
- No longer consider overrides in CIL calls.
- Cache predicates in the same stage using a cached module.
- Introduce `DefUse::defUseVariableUpdate()` and use in `CallableReturns.qll`.
The updated file `csharp/ql/test/library-tests/cil/dataflow/Nullness.expected`
demonstrates why this is needed.
- Utilize CIL analysis in `Guards::nonNullValue()`.
- Analyze SSA definitions in `AlwaysNullExpr`, similar to `NonNullExpr`.
Please let em know if this pattern works, and I can add other mechanisms to start new threads with a shared object.
Please also let me know what other mechanisms would you like me to add, I would like to focus on the most commonly used ones first. Thanks
Before this change,
```
flowOutOfCallableStep(CallNode call, ReturnNode ret, OutNode out, CallContext cc)
```
would compute all combinations of call sites `call` and returned expressions `ret`
up front.
Now, we instead introduce explicit return nodes, so each callable has exactly
one return node (as well as one for each `out`/`ref` parameter). There is then
local flow from a returned expression to the relevant return node, and
`flowOutOfCallableStep()` computes combinations of call sites and return nodes.
Not only does this result in better performance, it also makes `flowOutOfCallableStep()`
symmetric to `flowIntoCallableStep()`, where each argument is mapped to a parameter,
and not to all reads of that parameter.
Write accesses in assignments, such as the access to `x` in `x = 0` are not
evaluated, so they should not have entries in the control flow graph. However,
qualifiers (and indexer arguments) should still be evaluated, for example in
```
x.Foo.Bar = 0;
```
the CFG should be `x --> x.Foo --> 0 --> x.Foo.Bar = 0` (as opposed to
`x --> x.Foo --> x.Foo.Bar --> 0 --> x.Foo.Bar = 0`, prior to this change).
A special case is assignments via acessors (properties, indexers, and event
adders), where we do want to include the access in the control flow graph,
as it represents the accessor call:
```
x.Prop = 0;
```
But instead of `x --> x.set_Prop --> 0 --> x.Prop = 0` the CFG should be
`x --> 0 --> x.set_Prop --> x.Prop = 0`, as the setter is called *after* the
assigned value has been evaluated.
An even more special case is tuple assignments via accessors:
```
(x.Prop1, y.Prop2) = (0, 1);
```
Here the CFG should be
`x --> y --> 0 --> 1 --> x.set_Prop1 --> y.set_Prop2 --> (x.Prop1, y.Prop2) = (0, 1)`.
The recent change to `AccessorCall` on dd99525566 resulted
in some bad join-orders, so I have (partly) reverted them. This means that the issues
orignally addressed by that change are now reintroduced, and I plan to instead apply a
fix to the CFG, which--unlike the original fix--should be able to handle multi-property-tuple
assignments.
The syntactic node assiociated with accessor calls was previously always the
underlying member access. For example, in
```
x.Prop = y.Prop;
```
the implicit call to `x.set_Prop()` was at the syntactic node `x.Prop`, while the
implicit call to `y.get_Prop()` was at the syntactic node `y.Prop`.
However, this breaks the invariant that arguments to calls dominate the call itself,
as the argument `y.Prop` for the implicit `value` parameter in `x.set_Prop()` will
be evaluated after the call (the left-hand side in an assignment is evaluated before
the right-hand side).
The solution is to redefine the access call to `x.set_Prop()` to point to the whole
assignment `x.Prop = y.Prop`, instead of the access `x.Prop`. For reads, we still want
to associate the accessor call with the member access.
A corner case arises when multiple setters are called in a tuple assignment:
```
(x.Prop1, x.Prop2) = (0, 1)
```
In this case, we cannot associate the assignment with both `x.set_Prop1()` and
`x.set_Prop2()`, so we instead revert to using the underlying member accesses as
before.
Data flow nodes for expressions do not take CFG splitting into account. Example:
```
if (b)
x = tainted;
x = x.ToLower();
if (!b)
Use(x);
```
Flow is incorrectly reported from `tainted` to `x` in `Use(x)`, because the step
from `tainted` to `x.ToLower()` throws away the information that `b = true`.
The solution is to remember the splitting in data flow expression nodes, that is,
to represent the exact control flow node instead of just the expression. With that
we get flow from `tainted` to `[b = true] x.ToLower()`, but not from `tainted` to
`[b = false] x.ToLower()`.
The data flow API remains unchanged, but in order for analyses to fully benefit from
CFG splitting, sanitizers in particular should be CFG-based instead of expression-based:
```
if (b)
x = tainted;
if (IsInvalid(x))
return;
Use(x);
```
If the call to `IsInvalid()` is a sanitizer, then defining an expression node to be
a sanitizer using `GuardedExpr` will be too conservative (`x` in `Use(x)` is in fact
not guarded). However, `[b = true] x` in `[b = true] Use(x)` is guarded, and to help
defining guard-based sanitizers, the class `GuardedDataFlowNode` has been introduced.