31 Commits

Author SHA1 Message Date
Dave Bartolomeo
95a62beb7a C++: Update test expectations due to better dataflow analysis 2019-05-02 11:18:09 -07:00
Robert Marsh
919f5c616f C++: comment and test for taint flow via memcpy 2019-04-23 11:17:18 -07:00
Robert Marsh
262f724235 C++: add taint edges to DefinitionByReferenceNode 2019-04-22 10:39:02 -07:00
Robert Marsh
c9fbbfe7d8 Merge pull request #984 from rdmarsh2/rdmarsh/cpp/ir-stmtexpr
C++: add support for GNU StmtExpr in IR
2019-04-09 12:54:35 -04:00
Jonas Jensen
fedd652de8 Merge remote-tracking branch 'upstream/rc/1.20' into mergeback-20190408 2019-04-08 08:39:44 +02:00
Robert Marsh
8087cb5040 C++: add CopyValueInstruction for StmtExpr result 2019-04-05 11:27:19 -07:00
Jonas Jensen
71659594c8 C++: Let data flow past definition by reference
This commit changes how data flow works in the following code.

    MyType x = source();
    defineByReference(&x);
    sink(x);

The question here is whether there should be flow from `source` to
`sink`. Such flow is desirable if `defineByReference` doesn't write to
all of `x`, but it's undesirable if `defineByReference` is a typical
init function in `C` that writes to every field or if
`defineByReference` is `memcpy` or `memset` on the full range.

Before 1.20.0, there would be flow from `source` to `sink` in case `x`
happened to be modeled with `BlockVar` but not in case `x` happened to
be modeled with SSA. The choice of modeling depends on an analysis of
how `x` is used elsewhere in the function, and it's supposed to be an
internal implementation detail that there are two ways to model
variables. In 1.20.0, I changed the `BlockVar` behavior so it worked the
same as SSA, never allowing that flow. It turns out that this change
broke a customer's query.

This commit reverts `BlockVar` to its old behavior of letting flow
propagate past the `defineByReference` call and then regains consistency
by changing all variables that are ever defined by reference to be
modeled with `BlockVar` instead of SSA. This means we now get too much
flow in certain cases, but that appears to be better overall than
getting too little flow. See also the discussion in CPP-336.
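
The trade-off described above can be illustrated with a small, self-contained sketch. Every name in it (`MyType`'s fields, `definePartially`, `defineFully`, and the helper functions) is a hypothetical stand-in chosen for illustration and is not taken from the commit or the test suite. It only shows why flow past a definition by reference is correct in one case and spurious in the other; it does not model how the data flow library itself decides.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the `MyType` used in the commit message.
struct MyType {
    int a;
    int b;
};

// Hypothetical source: the value 42 plays the role of "tainted" data.
MyType source() { return MyType{42, 42}; }

// Writes only one field, so `b` still holds the source value afterwards.
void definePartially(MyType *x) { x->a = 0; }

// Typical C-style init function: overwrites every byte of `*x`.
void defineFully(MyType *x) { std::memset(x, 0, sizeof *x); }

// Value of `x.b` that would reach a sink after a partial definition.
int bAfterPartialDefinition() {
    MyType x = source();
    definePartially(&x);
    return x.b;  // source data survives: flow to the sink is real
}

// Value of `x.b` that would reach a sink after a full definition.
int bAfterFullDefinition() {
    MyType x = source();
    defineFully(&x);
    return x.b;  // source data is gone: reported flow would be spurious
}

// Small usage check covering both cases.
void checkDefinitionByReferenceExamples() {
    assert(bAfterPartialDefinition() == 42);
    assert(bAfterFullDefinition() == 0);
}
```

With `BlockVar` modeling as described in this commit, flow would be reported in both cases; the second one is the "too much flow" that the commit accepts as the lesser problem.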
2019-04-01 14:13:47 +02:00
Kevin Backhouse
08d852fa94 Merge pull request #1048 from jbj/dataflow-link-targets
C++: Data flow dispatch across link targets
2019-03-13 12:39:59 +00:00
Tom Hvitved
c5450128be Merge branch 'rc/1.20' into merge-rc 2019-03-12 09:14:38 +01:00
Robert Marsh
8a2a4678d8 C++: accept dataflow test change 2019-03-07 13:14:57 -08:00
Jonas Jensen
80b0765618 C++: Make IR DataFlow dispatch use non-IR version
This removes code duplication and ensures that the IR version also gets
the support for flow across link targets.
2019-03-06 10:08:14 +01:00
Jonas Jensen
10ce13d1e9 C++: Tests for cross-target dispatch 2019-03-06 10:08:13 +01:00
Jonas Jensen
0a57767cc6 C++: Data flow through StmtExpr 2019-03-05 14:36:40 +01:00
Jonas Jensen
a2de057c26 C++: Test for StmtExpr data flow 2019-03-05 14:34:19 +01:00
Jonas Jensen
8e6daafd7c C++: Add DefinitionByReferenceNode.getParameter
This commit also adds a test that uses `getParameter`. The new tests
demonstrate that support for array-to-pointer decay works, but we get
data flow to the array rather than its contents.
2019-02-28 09:39:51 +01:00
Jonas Jensen
972d00822c C++: Generalize std::move data flow 2019-02-27 15:53:00 +01:00
Jonas Jensen
80183464d9 C++: Define DefinitionByReferenceNode
This enables data flow through `memcpy` and similar functions modeled in
`semmle.code.cpp.model`.
2019-02-27 15:53:00 +01:00
Jonas Jensen
5647a1a658 C++: BlockVar value stops at def by ref (partial) 2019-02-27 15:05:53 +01:00
Jonas Jensen
20f3df0d09 C++: Add tests to demo lack of dataflow by reference 2019-02-27 13:19:16 +01:00
Robert Marsh
07cbbdaf9a C++: accept test output 2019-02-21 17:18:06 -08:00
Robert Marsh
9a9ec7bb17 C++: add IR-based taint tracking library 2019-02-21 17:09:09 -08:00
Dave Bartolomeo
b40fd95b8e C++: Better tracking of SSA memory accesses
This change fixes a few key problems with the existing SSA implementations:

For unaliased SSA, we were incorrectly choosing to model a local variable that had accesses that did not cover the entire variable. This has been changed to ensure that all accesses to the variable are at offset zero and have the same type as the variable itself. This was only possible to fix now that every `MemoryOperand` has its own type.

For aliased SSA, we now correctly track the offset and size of each memory access using an interval of bit offsets covered by the access. The offset interval makes the overlap computation more straightforward. Again, this is only possible now that operands have types.
The `getXXXMemoryAccess` predicates are now driven by the `MemoryAccessKind` on the operands and results, instead of by specific opcodes.

This change does fix an existing false negative in the IR dataflow tests.

I added a few simple test cases to the SSA IR tests, covering the various kinds of overlap (MustExactly, MustTotally, and MayPartially).

I added "PrintSSA.qll", which can dump the SSA memory accesses as part of an IR dump.
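
As a rough illustration of why bit-offset intervals make the overlap computation straightforward, here is a minimal sketch. It assumes half-open, non-empty intervals `[start, end)` measured in bits, and it assumes the three overlap kinds mean "same bits", "definition covers all bits of the use", and "definition covers some bits of the use". The `BitInterval` and `Overlap` types and the `classify` function are hypothetical names for this example, not the IR library's actual predicates, and the real implementation may also take operand types into account.

```cpp
#include <cassert>

// Hypothetical half-open interval [start, end) of bit offsets within a variable.
struct BitInterval {
    int start;
    int end;
};

// Hypothetical enum mirroring the overlap kinds named in the commit message,
// plus `None` for accesses that share no bits.
enum class Overlap { None, MayPartially, MustTotally, MustExactly };

// Classifies how a definition's interval relates to a use's interval.
// Assumes both intervals are non-empty.
Overlap classify(BitInterval def, BitInterval use) {
    if (def.start == use.start && def.end == use.end) {
        return Overlap::MustExactly;   // identical bit ranges
    }
    if (def.start <= use.start && def.end >= use.end) {
        return Overlap::MustTotally;   // def covers the whole use, and more
    }
    if (def.start < use.end && use.start < def.end) {
        return Overlap::MayPartially;  // ranges intersect, but not fully
    }
    return Overlap::None;              // disjoint ranges
}

// Small usage check: a 64-bit variable accessed at different offsets.
void checkOverlapExamples() {
    assert(classify({0, 32}, {0, 32}) == Overlap::MustExactly);
    assert(classify({0, 64}, {32, 64}) == Overlap::MustTotally);
    assert(classify({0, 32}, {16, 48}) == Overlap::MayPartially);
    assert(classify({0, 32}, {32, 64}) == Overlap::None);
}
```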
2019-02-13 10:44:39 -08:00
Jonas Jensen
dcb24e07c3 C++: Remove getFullyConverted call in sink def
With this change, the `IRDataflowTestCommon.qll` and
`DataflowTestCommon.qll` files use the same definitions of sources and
sinks. Since the IR data flow library is meant to be compatible with the
AST data flow library, this is what we ought to be testing.

Two alerts change but not necessarily for the right reasons.
2019-01-16 13:56:52 +01:00
Jonas Jensen
502b7cfe33 C++: Don't use C-style varargs in test.cpp sink
As we prepare to clarify how conversions are treated, we don't want a
`sink(...)` declaration where it's non-obvious which conversions are
applied to arguments.
2019-01-16 09:47:58 +01:00
Dave Bartolomeo
a81ba84c0e C++: Update test expectations after unreachable IR removal 2018-12-10 21:22:55 -08:00
Dave Bartolomeo
2b80aee557 C++: Use getConvertedResultExpr in IR-based dataflow
This sort of fixes one FP and causes a new FN, but for the wrong reasons. The IR dataflow is tracking the reference itself, rather than the referred-to object. Once we can better model indirections, we can make this work correctly.

This change is still the right thing to do, because it ensures that the dataflow is looking at the actual expression being computed by the instruction.
2018-12-05 12:34:44 -08:00
Dave Bartolomeo
e11b4b6c40 C++: Fix IR Dataflow PR feedback 2018-12-04 07:31:13 -08:00
Dave Bartolomeo
2822d14588 C++: Add missing changes to test_ir.expected 2018-12-02 22:22:34 -08:00
Dave Bartolomeo
58f7596519 C++: IR-based dataflow 2018-11-30 12:15:11 -08:00
Ian Lynagh
a1e44041ec C++: Use mkElement/unresolveElement consistently 2018-08-20 16:12:26 +01:00
Pavel Avgustinov
b55526aa58 QL code and tests for C#/C++/JavaScript. 2018-08-02 17:53:23 +01:00