Commit Graph

56 Commits

Author SHA1 Message Date
Mathias Vorreiter Pedersen
7b456d6162 Merge branch 'main' into mathiasvp/array-field-flow 2020-09-16 10:45:31 +02:00
Jonas Jensen
bdce24735c C++: Add flow through arrays
This works by adding data-flow edges to skip over array expressions when
reading from arrays. On the post-update side, there was already code to
skip over array expressions when storing to arrays. That happens in
`valueToUpdate` in `AddressFlow.qll`, which needed just a small tweak to
support assignments with non-field expressions at the top-level LHS,
like `*a = ...` or `a[0] = ...`.

The new code in `AddressFlow.qll` is copy-pasted from `EscapesTree.qll`,
and there is already a note in these files saying that they share a lot
of code and must be maintained in sync.
2020-09-15 14:46:11 +02:00
Mathias Vorreiter Pedersen
41147d245d C++: Accept test changes 2020-09-08 14:35:22 +02:00
Jonas Jensen
5f0d283212 Merge remote-tracking branch 'upstream/master' into dataflow-indirect-args
The conflicts came from how `this` is now a parameter but not a
`Parameter` on `master`.

Conflicts:
	cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowUtil.qll
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/defaulttainttracking.cpp
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/tainted.expected
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/test_diff.expected
	cpp/ql/test/library-tests/dataflow/dataflow-tests/dataflow-ir-consistency.expected
	cpp/ql/test/library-tests/dataflow/fields/ir-flow.expected
	cpp/ql/test/library-tests/syntax-zoo/dataflow-ir-consistency.expected
2020-06-02 15:35:02 +02:00
Robert Marsh
fb46002332 C++: Fix ThisParameterNode after IR changes 2020-05-26 13:35:08 -07:00
Jonas Jensen
3a89f43cd6 Merge remote-tracking branch 'upstream/master' into dataflow-indirect-args
Conflicts:
	cpp/ql/src/semmle/code/cpp/ir/dataflow/DefaultTaintTracking.qll
	cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowUtil.qll
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/defaulttainttracking.cpp
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/tainted.expected
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/test_diff.expected
	cpp/ql/test/library-tests/dataflow/dataflow-tests/test_ir.expected
2020-05-11 14:44:17 +02:00
Mathias Vorreiter Pedersen
8c03423f3e C++: Accept test output 2020-04-17 12:03:16 +02:00
Mathias Vorreiter Pedersen
d56284fe8f C++: Move added flow from simpleLocalFlowStep to simpleInstructionLocalFlowStep and remove flow that could cause field conflation 2020-04-07 16:00:40 +02:00
Mathias Vorreiter Pedersen
c577541850 C++: Fix reverse read dataflow consistency failure and accept tests 2020-04-06 15:50:08 +02:00
Mathias Vorreiter Pedersen
af9e05b9cd C++: Accept test 2020-04-02 10:57:11 +02:00
Jonas Jensen
b622d62d3c C++: Wire up param/arg indirections in data flow 2020-03-25 15:23:43 +01:00
Jonas Jensen
a0e2d59c01 C++: Add tests for global-var support 2020-02-05 16:31:13 +01:00
Jonas Jensen
b4385c6e60 C++: Don't use GVN in AST DataFlow BarrierNode
It turns out that the evaluator will evaluate the GVN stage even when no
predicate from it is needed after optimization of the subsequent stages.
The GVN library is expensive to evaluate, and it'll become even more
expensive when we switch its implementation to IR.

This PR disables the use of GVN in `DataFlow::BarrierNode` for the AST
data-flow library, which should improve performance when evaluating a
single data-flow query on a snapshot with no cache. Precision decreases
slightly, leading to a new FP in the qltests.

There is no corresponding change for the IR data-flow library since IR
GVN is not very expensive.
2020-02-04 08:40:36 +01:00
Dave Bartolomeo
542579de7f C++: Accept dataflow test changes due to new alias analysis 2020-01-28 10:58:27 -07:00
Dave Bartolomeo
6988241b09 Merge from master 2020-01-26 16:38:48 -07:00
Mathias Vorreiter Pedersen
77531294bf C++: Accepted output on tests 2020-01-23 10:20:10 +01:00
Jonas Jensen
66914e52c6 C++: accept test changes 2020-01-22 14:08:05 +01:00
Jonas Jensen
f4d0c5e905 C++ IR: Support for global virtual dispatch
The IR data flow library now supports virtual dispatch with a library
that's similar to `security.TaintTracking`. In particular, it should
have the same performance characteristics. The main difference is that
non-recursive callers of `flowsFrom` now pass `_` instead of `true` for
`boolean allowFromArg`. This change allows flow through `return` to
actually work.
2020-01-16 14:51:28 +01:00
Jonas Jensen
64c79bf9e1 C++: Deprecate UninitializedNode in IR data flow
It's not used outside of tests, and it's not useful. It will break the
tests when we start allowing flow through chi nodes.
2019-12-27 11:21:33 +01:00
Robert Marsh
facbd32062 Merge branch 'master' into rdmarsh/cpp/ir-callee-side-effects 2019-11-14 11:09:13 -08:00
Robert Marsh
2fb1d4d1b1 C++: fix IR return block successors 2019-11-14 10:29:48 -08:00
Jonas Jensen
ec9ef33486 C++: IR data flow through inheritance conversions
This makes IR data flow behave more like AST data flow, and it makes IR
virtual dispatch work without further changes.
2019-11-06 14:04:07 +01:00
Jonas Jensen
49008c9ff5 C++: IR data flow local virtual dispatch
This is just good enough to cause no performance regressions and pass
the virtual-dispatch tests we have for `security.TaintTracking`. In
particular, it fixes the tests for `UncontrolledProcessOperation.ql`
when enabling `DefaultTaintTracking.qll`.
2019-11-06 14:04:02 +01:00
Robert Marsh
8076156cb1 Merge branch 'master' into rdmarsh/cpp/ir-callee-side-effects 2019-10-28 16:50:34 -07:00
Jonas Jensen
b13535ac7d C++: Implement DataFlow::BarrierGuard for AST+IR
The change note is copied from the Java change note.
2019-10-28 16:22:23 +01:00
Robert Marsh
e8dd0227ae C++: accept test changes 2019-10-22 14:27:43 -07:00
Jonas Jensen
19f642fc8d Merge commit '7434702' into dataflow-ref-parameter
This merges #1735 into this branch to resolve the semantic merge
conflicts between them.
2019-10-08 12:55:47 +02:00
Jonas Jensen
7c319efb8b C++: Data flow through reference parameters 2019-10-01 10:43:49 +02:00
Jonas Jensen
fd6d06fe6f C++: Data flow through address-of operator (&)
The data flow library conflates pointers and their objects in some
places but not others. For example, a member function call `x.f()` will
cause flow from `x` of type `T` to `this` of type `T*` inside `f`. It
might be ideal to avoid that conflation, but that's not realistic
without using the IR.

We've had good experience in the taint tracking library with conflating
pointers and objects, and it improves results for field flow, so perhaps
it's time to try it out for all data flow.
2019-09-17 13:16:34 +02:00
Ian Lynagh
1d56407c72 C++: Pull some of library-tests/dataflow/dataflow-tests into clang.cpp
g++ doesn't support this code:

    sorry, unimplemented: non-trivial designated initializers not supported
       twoIntFields sSwapped = { .m2 = source(), .m1 = 0 };

so we need to build it in clang mode.
2019-09-05 15:12:17 +01:00
Jonas Jensen
e9a029cba3 C++: Local field flow using global library
This commit removes fields from the responsibilities of `FlowVar.qll`.
The treatment of fields in that file was slow and imprecise.

It then adds another copy of the shared global data flow library, used
only to find local field flow, and it exposes that local field flow
through `localFlow` and `localFlowStep`.

This has a performance cost. It adds two cached stages to any query that
uses `localFlow`: the stage from `DataFlowImplCommon`, which is shared
with all queries that use global data flow, and a new stage just for
`localFlowStep`.
2019-09-02 11:17:27 +02:00
Dave Bartolomeo
8a9528b1a8 C++: Accept test output after fixes for PointerAdd element sizes 2019-08-22 10:43:31 -07:00
Jonas Jensen
d38dbf0f63 C++: Workaround for lambda expression locations
See CPP-427.
2019-08-22 11:52:56 +02:00
Jonas Jensen
84adeda167 C++: Support flow through LambdaExpression
I've checked with a temporary workaround for the locations problem that
my annotations in the test cpp files are on the correct lines.
2019-08-16 16:20:22 +02:00
Geoffrey White
eb39346d85 Merge pull request #1744 from jbj/ast-field-flow-aggregate-init
C++: Field flow through ClassAggregateLiteral
2019-08-16 09:56:11 +01:00
Geoffrey White
a6902bdb37 CPP: Test dataflow through lambdas. 2019-08-15 19:43:24 +01:00
Jonas Jensen
1b4b352316 C++: Field flow through ClassAggregateLiteral 2019-08-15 12:01:42 +02:00
Jonas Jensen
6a3f5efc1b C++: Accept AST field flow test output 2019-08-08 14:05:03 +02:00
Jonas Jensen
984405be2e C++ IR: Change many uses of getAnyDef to getDef
This changes all the getters on `Instruction` to use `getDef` instead of
`getAnyDef`, with the result that these getters now only have a result
if the definition is exact.

This is a backwards-INCOMPATIBLE change.
2019-07-03 11:04:57 +02:00
Dave Bartolomeo
95a62beb7a C++: Update test expectations due to better dataflow analysis 2019-05-02 11:18:09 -07:00
Robert Marsh
c9fbbfe7d8 Merge pull request #984 from rdmarsh2/rdmarsh/cpp/ir-stmtexpr
C++: add support for GNU StmtExpr in IR
2019-04-09 12:54:35 -04:00
Jonas Jensen
fedd652de8 Merge remote-tracking branch 'upstream/rc/1.20' into mergeback-20190408 2019-04-08 08:39:44 +02:00
Robert Marsh
8087cb5040 C++: add CopyValueInstruction for StmtExpr result 2019-04-05 11:27:19 -07:00
Jonas Jensen
71659594c8 C++: Let data flow past definition by reference
This commit changes how data flow works in the following code.

    MyType x = source();
    defineByReference(&x);
    sink(x);

The question here is whether there should be flow from `source` to
`sink`. Such flow is desirable if `defineByReference` doesn't write to
all of `x`, but it's undesirable if `defineByReference` is a typical
init function in `C` that writes to every field or if
`defineByReference` is `memcpy` or `memset` on the full range.

Before 1.20.0, there would be flow from `source` to `sink` in case `x`
happened to be modeled with `BlockVar` but not in case `x` happened to
be modelled with SSA. The choice of modelling depends on an analysis of
how `x` is used elsewhere in the function, and it's supposed to be an
internal implementation detail that there are two ways to model
variables. In 1.20.0, I changed the `BlockVar` behavior so it worked the
same as SSA, never allowing that flow. It turns out that this change
broke a customer's query.

This commit reverts `BlockVar` to its old behavior of letting flow
propagate past the `defineByReference` call and then regains consistency
by changing all variables that are ever defined by reference to be
modelled with `BlockVar` instead of SSA. This means we now get too much
flow in certain cases, but that appears to be better overall than
getting too little flow. See also the discussion in CPP-336.
2019-04-01 14:13:47 +02:00
Tom Hvitved
c5450128be Merge branch 'rc/1.20' into merge-rc 2019-03-12 09:14:38 +01:00
Robert Marsh
8a2a4678d8 C++: accept dataflow test change 2019-03-07 13:14:57 -08:00
Jonas Jensen
0a57767cc6 C++: Data flow through StmtExpr 2019-03-05 14:36:40 +01:00
Jonas Jensen
8e6daafd7c C++: Add DefinitionByReferenceNode.getParameter
This commits also adds a test that uses `getParameter`. The new tests
demonstrate that support for array-to-pointer decay works, but we get
data flow to the array rather than its contents.
2019-02-28 09:39:51 +01:00
Jonas Jensen
80183464d9 C++: Define DefinitionByReferenceNode
This enables data flow through `memcpy` and similar functions modeled in
`semmle.code.cpp.model`.
2019-02-27 15:53:00 +01:00
Jonas Jensen
5647a1a658 C++: BlockVar value stops at def by ref (partial) 2019-02-27 15:05:53 +01:00