Commit Graph

114 Commits

Author SHA1 Message Date
Jonas Jensen
b4385c6e60 C++: Don't use GVN in AST DataFlow BarrierNode
It turns out that the evaluator will evaluate the GVN stage even when no
predicate from it is needed after optimization of the subsequent stages.
The GVN library is expensive to evaluate, and it'll become even more
expensive when we switch its implementation to IR.

This PR disables the use of GVN in `DataFlow::BarrierNode` for the AST
data-flow library, which should improve performance when evaluating a
single data-flow query on a snapshot with no cache. Precision decreases
slightly, leading to a new FP in the qltests.

There is no corresponding change for the IR data-flow library since IR
GVN is not very expensive.
2020-02-04 08:40:36 +01:00
Robert Marsh
71d87be773 C++: add flow through partial loads in DTT 2020-01-29 17:51:42 -08:00
Dave Bartolomeo
542579de7f C++: Accept dataflow test changes due to new alias analysis 2020-01-28 10:58:27 -07:00
Dave Bartolomeo
6988241b09 Merge from master 2020-01-26 16:38:48 -07:00
Mathias Vorreiter Pedersen
77531294bf C++: Accepted output on tests 2020-01-23 10:20:10 +01:00
Mathias Vorreiter Pedersen
256ae2fda6 C++: Add test demonstrating a flow not detected 2020-01-23 10:16:24 +01:00
Jonas Jensen
66914e52c6 C++: accept test changes 2020-01-22 14:08:05 +01:00
Jonas Jensen
391b80eac4 C++: Show virtual inheritance problem in vdispatch 2020-01-20 11:17:44 +01:00
Jonas Jensen
f4d0c5e905 C++ IR: Support for global virtual dispatch
The IR data flow library now supports virtual dispatch with a library
that's similar to `security.TaintTracking`. In particular, it should
have the same performance characteristics. The main difference is that
non-recursive callers of `flowsFrom` now pass `_` instead of `true` for
`boolean allowFromArg`. This change allows flow through `return` to
actually work.
2020-01-16 14:51:28 +01:00
Jonas Jensen
64c79bf9e1 C++: Deprecate UninitializedNode in IR data flow
It's not used outside of tests, and it's not useful. It will break the
tests when we start allowing flow through chi nodes.
2019-12-27 11:21:33 +01:00
Robert Marsh
e209ed961a Merge branch 'master' into rdmarsh/cpp/ir-callee-side-effects 2019-12-17 15:11:02 -08:00
Jonas Jensen
763b18cd11 Merge remote-tracking branch 'upstream/master' into StackVariable
Conflicts:
      change-notes/1.24/analysis-cpp.md
      cpp/ql/src/Security/CWE/CWE-131/NoSpaceForZeroTerminator.ql
2019-11-28 17:51:20 +01:00
Robert Marsh
05aebeff79 Merge branch 'master' into rdmarsh/cpp/ir-callee-side-effects 2019-11-21 13:45:31 -08:00
Jonas Jensen
c41114334f Merge remote-tracking branch 'upstream/master' into ir-dataflow-toString
Solved conflicts in `*.expected` by re-running the tests.
2019-11-19 14:27:27 +01:00
Jonas Jensen
1498499994 C++: Relax type in two tests 2019-11-19 11:31:34 +01:00
Robert Marsh
facbd32062 Merge branch 'master' into rdmarsh/cpp/ir-callee-side-effects 2019-11-14 11:09:13 -08:00
Robert Marsh
2fb1d4d1b1 C++: fix IR return block successors 2019-11-14 10:29:48 -08:00
Jonas Jensen
ec9ef33486 C++: IR data flow through inheritance conversions
This makes IR data flow behave more like AST data flow, and it makes IR
virtual dispatch work without further changes.
2019-11-06 14:04:07 +01:00
Jonas Jensen
49008c9ff5 C++: IR data flow local virtual dispatch
This is just good enough to cause no performance regressions and pass
the virtual-dispatch tests we have for `security.TaintTracking`. In
particular, it fixes the tests for `UncontrolledProcessOperation.ql`
when enabling `DefaultTaintTracking.qll`.
2019-11-06 14:04:02 +01:00
Robert Marsh
8076156cb1 Merge branch 'master' into rdmarsh/cpp/ir-callee-side-effects 2019-10-28 16:50:34 -07:00
Jonas Jensen
b13535ac7d C++: Implement DataFlow::BarrierGuard for AST+IR
The change note is copied from the Java change note.
2019-10-28 16:22:23 +01:00
Robert Marsh
e8dd0227ae C++: accept test changes 2019-10-22 14:27:43 -07:00
Jonas Jensen
19f642fc8d Merge commit '7434702' into dataflow-ref-parameter
This merges #1735 into this branch to resolve the semantic merge
conflicts between them.
2019-10-08 12:55:47 +02:00
Jonas Jensen
7c319efb8b C++: Data flow through reference parameters 2019-10-01 10:43:49 +02:00
Jonas Jensen
fd6d06fe6f C++: Data flow through address-of operator (&)
The data flow library conflates pointers and their objects in some
places but not others. For example, a member function call `x.f()` will
cause flow from `x` of type `T` to `this` of type `T*` inside `f`. It
might be ideal to avoid that conflation, but that's not realistic
without using the IR.

We've had good experience in the taint tracking library with conflating
pointers and objects, and it improves results for field flow, so perhaps
it's time to try it out for all data flow.
2019-09-17 13:16:34 +02:00
Jonas Jensen
7cfbe88e7b C++: IR DataFlow::Node.toString consistency
The `toString` for IR data-flow nodes are now similar to AST data-flow
nodes. This should make it easier to use the IR as a drop-in replacement
in the future. There are still differences because the IR data flow
library takes conversions into account.

I did not attempt to align the new nodes we use for field flow. That can
come later, when we add field flow to IR data flow.
2019-09-13 14:33:31 +02:00
Jonas Jensen
4ef5c9af62 C++: Autoformat everything
Some files that will change in #1736 have been spared.

    ./build -j4 target/jars/qlformat
    find ql/cpp/ql -name "*.ql"  -print0 | xargs -0 target/jars/qlformat --input
    find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input
    (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll')
    buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
2019-09-09 11:25:53 +02:00
Ian Lynagh
1d56407c72 C++: Pull some of library-tests/dataflow/dataflow-tests into clang.cpp
g++ doesn't support this code:

    sorry, unimplemented: non-trivial designated initializers not supported
       twoIntFields sSwapped = { .m2 = source(), .m1 = 0 };

so we need to build it in clang mode.
2019-09-05 15:12:17 +01:00
Jonas Jensen
e9a029cba3 C++: Local field flow using global library
This commit removes fields from the responsibilities of `FlowVar.qll`.
The treatment of fields in that file was slow and imprecise.

It then adds another copy of the shared global data flow library, used
only to find local field flow, and it exposes that local field flow
through `localFlow` and `localFlowStep`.

This has a performance cost. It adds two cached stages to any query that
uses `localFlow`: the stage from `DataFlowImplCommon`, which is shared
with all queries that use global data flow, and a new stage just for
`localFlowStep`.
2019-09-02 11:17:27 +02:00
Dave Bartolomeo
8a9528b1a8 C++: Accept test output after fixes for PointerAdd element sizes 2019-08-22 10:43:31 -07:00
Jonas Jensen
d38dbf0f63 C++: Workaround for lambda expression locations
See CPP-427.
2019-08-22 11:52:56 +02:00
Jonas Jensen
84adeda167 C++: Support flow through LambdaExpression
I've checked with a temporary workaround for the locations problem that
my annotations in the test cpp files are on the correct lines.
2019-08-16 16:20:22 +02:00
Geoffrey White
eb39346d85 Merge pull request #1744 from jbj/ast-field-flow-aggregate-init
C++: Field flow through ClassAggregateLiteral
2019-08-16 09:56:11 +01:00
Geoffrey White
a6902bdb37 CPP: Test dataflow through lambdas. 2019-08-15 19:43:24 +01:00
Jonas Jensen
1b4b352316 C++: Field flow through ClassAggregateLiteral 2019-08-15 12:01:42 +02:00
Jonas Jensen
98d6f3cada C++: Unify partial def and def-by-ref
This removes a lot of flow steps, but it all seems to be flow that was
present twice: both exiting a `PartialDefNode` and a
`DefinitionByReferenceNode`. All `DefinitionByReferenceNode`s are now
`PartialDefNode`s.
2019-08-08 14:05:03 +02:00
Jonas Jensen
6a3f5efc1b C++: Accept AST field flow test output 2019-08-08 14:05:03 +02:00
Jonas Jensen
984405be2e C++ IR: Change many uses of getAnyDef to getDef
This changes all the getters on `Instruction` to use `getDef` instead of
`getAnyDef`, with the result that these getters now only have a result
if the definition is exact.

This is a backwards-INCOMPATIBLE change.
2019-07-03 11:04:57 +02:00
Dave Bartolomeo
95a62beb7a C++: Update test expectations due to better dataflow analysis 2019-05-02 11:18:09 -07:00
Robert Marsh
c9fbbfe7d8 Merge pull request #984 from rdmarsh2/rdmarsh/cpp/ir-stmtexpr
C++: add support for GNU StmtExpr in IR
2019-04-09 12:54:35 -04:00
Jonas Jensen
fedd652de8 Merge remote-tracking branch 'upstream/rc/1.20' into mergeback-20190408 2019-04-08 08:39:44 +02:00
Robert Marsh
8087cb5040 C++: add CopyValueInstruction for StmtExpr result 2019-04-05 11:27:19 -07:00
Jonas Jensen
71659594c8 C++: Let data flow past definition by reference
This commit changes how data flow works in the following code.

    MyType x = source();
    defineByReference(&x);
    sink(x);

The question here is whether there should be flow from `source` to
`sink`. Such flow is desirable if `defineByReference` doesn't write to
all of `x`, but it's undesirable if `defineByReference` is a typical
init function in `C` that writes to every field or if
`defineByReference` is `memcpy` or `memset` on the full range.

Before 1.20.0, there would be flow from `source` to `sink` in case `x`
happened to be modeled with `BlockVar` but not in case `x` happened to
be modelled with SSA. The choice of modelling depends on an analysis of
how `x` is used elsewhere in the function, and it's supposed to be an
internal implementation detail that there are two ways to model
variables. In 1.20.0, I changed the `BlockVar` behavior so it worked the
same as SSA, never allowing that flow. It turns out that this change
broke a customer's query.

This commit reverts `BlockVar` to its old behavior of letting flow
propagate past the `defineByReference` call and then regains consistency
by changing all variables that are ever defined by reference to be
modelled with `BlockVar` instead of SSA. This means we now get too much
flow in certain cases, but that appears to be better overall than
getting too little flow. See also the discussion in CPP-336.
2019-04-01 14:13:47 +02:00
Kevin Backhouse
08d852fa94 Merge pull request #1048 from jbj/dataflow-link-targets
C++: Data flow dispatch across link targets
2019-03-13 12:39:59 +00:00
Tom Hvitved
c5450128be Merge branch 'rc/1.20' into merge-rc 2019-03-12 09:14:38 +01:00
Robert Marsh
8a2a4678d8 C++: accept dataflow test change 2019-03-07 13:14:57 -08:00
Jonas Jensen
80b0765618 C++: Make IR DataFlow dispatch use non-IR version
This removes code duplication and ensures that the IR version also gets
the support for flow across link targets.
2019-03-06 10:08:14 +01:00
Jonas Jensen
10ce13d1e9 C++: Tests for cross-target dispatch 2019-03-06 10:08:13 +01:00
Jonas Jensen
0a57767cc6 C++: Data flow through StmtExpr 2019-03-05 14:36:40 +01:00
Jonas Jensen
a2de057c26 C++: Test for StmtExpr data flow 2019-03-05 14:34:19 +01:00