Commit Graph

391 Commits

Author SHA1 Message Date
Geoffrey White
3c41ed56a1 CPP: Support taint to return value derefs instead. 2020-01-16 18:15:21 +00:00
Geoffrey White
ef47563139 CPP: Support flow of pointed-to things through function calls. 2020-01-16 11:08:19 +00:00
Geoffrey White
ce389ca791 CPP: Add tests for strdup. 2020-01-15 18:26:24 +00:00
Jonas Jensen
618bf2e29e C++: IR data flow through total chi operands 2019-12-27 11:44:41 +01:00
Jonas Jensen
c41114334f Merge remote-tracking branch 'upstream/master' into ir-dataflow-toString
Solved conflicts in `*.expected` by re-running the tests.
2019-11-19 14:27:27 +01:00
Ian Lynagh
2cf714a923 C++: Follow changes in lambda locations 2019-11-15 14:42:36 +00:00
Jonas Jensen
5d7a0b8dd5 Merge remote-tracking branch 'upstream/master' into dataflow-ref-parameter
I've accepted the new test output, which shows that this branch fixes
two false negatives in the test cases from #2088.
2019-10-08 13:09:20 +02:00
Geoffrey White
050d99fa87 CPP: Add test cases. 2019-10-04 17:44:27 +01:00
Jonas Jensen
7c319efb8b C++: Data flow through reference parameters 2019-10-01 10:43:49 +02:00
Jonas Jensen
fd6d06fe6f C++: Data flow through address-of operator (&)
The data flow library conflates pointers and their objects in some
places but not others. For example, a member function call `x.f()` will
cause flow from `x` of type `T` to `this` of type `T*` inside `f`. It
might be ideal to avoid that conflation, but that's not realistic
without using the IR.

We've had good experience in the taint tracking library with conflating
pointers and objects, and it improves results for field flow, so perhaps
it's time to try it out for all data flow.
2019-09-17 13:16:34 +02:00
Jonas Jensen
7cfbe88e7b C++: IR DataFlow::Node.toString consistency
The `toString` for IR data-flow nodes are now similar to AST data-flow
nodes. This should make it easier to use the IR as a drop-in replacement
in the future. There are still differences because the IR data flow
library takes conversions into account.

I did not attempt to align the new nodes we use for field flow. That can
come later, when we add field flow to IR data flow.
2019-09-13 14:33:31 +02:00
Jonas Jensen
562bffe710 C++: Simplify toString of ImplicitParameterNode
This string looked out of place compared to `ExplicitParameterNode`,
whose string is simply the name of the parameter and therefore
indistinguishable from an access to the parameter without looking at the
location also. This has not been a problem so far, and if we want to
distinguish more clearly between initial values and accesses at some
point, we should do it for `ExplicitParameterNode` and
`UninitializedNode` too.
2019-09-13 14:33:26 +02:00
Jonas Jensen
4ef5c9af62 C++: Autoformat everything
Some files that will change in #1736 have been spared.

    ./build -j4 target/jars/qlformat
    find ql/cpp/ql -name "*.ql"  -print0 | xargs -0 target/jars/qlformat --input
    find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input
    (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll')
    buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
2019-09-09 11:25:53 +02:00
Jonas Jensen
114c2fe0d4 Merge remote-tracking branch 'upstream/master' into ast-field-flow-defbyref 2019-09-05 09:33:45 +02:00
Jonas Jensen
d7681bf122 C++: Don't use definitionByReference for data flow
The data flow library conflates pointers and objects enough for the
`definitionByReference` predicate to be too strict in some cases. It was
too permissive in other cases that are now (or will be) handled better
by field flow.

See also the change note entry.
2019-09-03 11:49:01 +02:00
Matthew Gretton-Dann
03eb1ff785 C++: Update taint-tests for changed lambda support 2019-09-02 15:18:27 +01:00
Dave Bartolomeo
8a9528b1a8 C++: Accept test output after fixes for PointerAdd element sizes 2019-08-22 10:43:31 -07:00
Geoffrey White
a70975f95f CPP: Update test comments. 2019-08-22 15:40:38 +01:00
Jonas Jensen
d38dbf0f63 C++: Workaround for lambda expression locations
See CPP-427.
2019-08-22 11:52:56 +02:00
Jonas Jensen
2f4ed45dac C++: No taint between field and struct
To compensate for the lack of field flow, the taint tracking library has
previously considered taint to flow from fields to their containing
structs and back again from the structs to any of their fields. This
leads to false flow between unrelated fields and is not needed now that
we have proper flow through fields.
2019-08-21 11:57:12 +02:00
Geoffrey White
4ea999872b Merge pull request #1746 from jbj/ast-field-flow-ctor
C++: Field flow through ConstructorFieldInit
2019-08-19 09:14:02 +01:00
Jonas Jensen
84adeda167 C++: Support flow through LambdaExpression
I've checked with a temporary workaround for the locations problem that
my annotations in the test cpp files are on the correct lines.
2019-08-16 16:20:22 +02:00
Jonas Jensen
45eefdb218 C++: Field flow through ConstructorFieldInit
This allows a member initializer list to be seen as a sequence of field
assignments. For example, the constructor

    C() : a(taint()) { }

now has data flow similar to

    C() { this.a = taint(); }
2019-08-16 09:10:17 +02:00
Geoffrey White
1bd4aeebad CPP: Effects of #1715. 2019-08-15 14:05:09 +01:00
Geoffrey White
02e1edd640 CPP: Test taint through lambdas. 2019-08-15 14:00:45 +01:00
Jonas Jensen
fdd8de79da C++: Remove redundant toString override
This time I left a comment to prevent myself from getting confused again
and adding the override in the future.
2019-08-15 11:32:11 +02:00
Jonas Jensen
e94dbe926b C++: Add forgotten toString override
This makes `PostConstructorCallNode`s show up in the test output.
2019-08-14 16:26:49 +02:00
Jonas Jensen
38ec693ead C++: Improved ConstructorCall field flow
This commit changes C++ `ConstructorCall` to behave like
`new`-expressions in Java: they are both `ExprNode`s and
`PostUpdateNodes`, and there's a "pre-update node" (here called
`PreConstructorCallNode`) to play the role of the qualifier argument
when calling a constructor.
2019-08-13 11:05:13 +02:00
Jonas Jensen
98d6f3cada C++: Unify partial def and def-by-ref
This removes a lot of flow steps, but it all seems to be flow that was
present twice: both exiting a `PartialDefNode` and a
`DefinitionByReferenceNode`. All `DefinitionByReferenceNode`s are now
`PartialDefNode`s.
2019-08-08 14:05:03 +02:00
Jonas Jensen
6a3f5efc1b C++: Accept AST field flow test output 2019-08-08 14:05:03 +02:00
Geoffrey White
a9b953f89a CPP: Flip test output for consistency and easy comparison with the other tests. 2019-07-12 18:18:08 +01:00
Geoffrey White
c2fd2e273e CPP: Model taint flow through std::swap. 2019-07-12 18:00:39 +01:00
Geoffrey White
f132bca06e CPP: Add a taint flow test of 'std::swap'. 2019-07-12 16:37:01 +01:00
Robert Marsh
919f5c616f C++: comment and test for taint flow via memcpy 2019-04-23 11:17:18 -07:00
Robert Marsh
262f724235 C++: add taint edges to DefinitionByReferenceNode 2019-04-22 10:39:02 -07:00
Jonas Jensen
71659594c8 C++: Let data flow past definition by reference
This commit changes how data flow works in the following code.

    MyType x = source();
    defineByReference(&x);
    sink(x);

The question here is whether there should be flow from `source` to
`sink`. Such flow is desirable if `defineByReference` doesn't write to
all of `x`, but it's undesirable if `defineByReference` is a typical
init function in `C` that writes to every field or if
`defineByReference` is `memcpy` or `memset` on the full range.

Before 1.20.0, there would be flow from `source` to `sink` in case `x`
happened to be modeled with `BlockVar` but not in case `x` happened to
be modelled with SSA. The choice of modelling depends on an analysis of
how `x` is used elsewhere in the function, and it's supposed to be an
internal implementation detail that there are two ways to model
variables. In 1.20.0, I changed the `BlockVar` behavior so it worked the
same as SSA, never allowing that flow. It turns out that this change
broke a customer's query.

This commit reverts `BlockVar` to its old behavior of letting flow
propagate past the `defineByReference` call and then regains consistency
by changing all variables that are ever defined by reference to be
modelled with `BlockVar` instead of SSA. This means we now get too much
flow in certain cases, but that appears to be better overall than
getting too little flow. See also the discussion in CPP-336.
2019-04-01 14:13:47 +02:00
Jonas Jensen
972d00822c C++: Generalize std::move data flow 2019-02-27 15:53:00 +01:00
Jonas Jensen
80183464d9 C++: Define DefinitionByReferenceNode
This enables data flow through `memcpy` and similar functions modeled in
`semmle.code.cpp.model`.
2019-02-27 15:53:00 +01:00
Robert Marsh
07cbbdaf9a C++: accept test output 2019-02-21 17:18:06 -08:00
Robert Marsh
9a9ec7bb17 C++: add IR-based taint tracking library 2019-02-21 17:09:09 -08:00
Pavel Avgustinov
b55526aa58 QL code and tests for C#/C++/JavaScript. 2018-08-02 17:53:23 +01:00