Commit Graph

5006 Commits

Author SHA1 Message Date
Geoffrey White
8059d69bbd C++: Model calls to operator new / delete for NewFreeMismatch.ql. 2020-04-06 14:27:05 +01:00
Geoffrey White
3e9f9645ae C++: Exclude calls to operator new / delete from NewFreeMismatch.ql. 2020-04-06 14:08:00 +01:00
Geoffrey White
97cdcbee63 C++: Test for NewFreeMismatch.ql with operator new / delete. 2020-04-06 13:57:28 +01:00
Jonas Jensen
bf7614a4c9 C++: Move Expr location workaround to Expr.qll
This workaround from `DataFlowUtil.qll` should be useful for any query
that selects an `Expr`. In particular, it's useful for IR data flow.

This commit does not include test changes.
2020-04-06 14:13:22 +02:00
Jonas Jensen
d4338473b0 C++: Enforce unique enclosing callable
Every data-flow node should have a unique enclosing function (_callable_
in the terminology of the data-flow library), but this was not evident
for the optimizer, and it led to a bad join order in `pathStep`. This
commit fixes the join order for C++ AST data flow. All other copies of
data flow seem to be fine.

These are the tuple counts for OpenJDK before this commit:

    (231s) Tuple counts for DataFlowImplLocal::pathStep#fffff#cur_delta:
    5882       ~0%       {6} r1 = SCAN DataFlowImplLocal::PathNodeMid#class#ffffff#prev_delta AS I OUTPUT I.<2>, I.<0>, I.<1>, I.<3>, I.<4>, I.<5>
    1063406780 ~0%       {7} r2 = JOIN r1 WITH DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS R ON FIRST 1 OUTPUT r1.<2>, R.<1>, r1.<1>, r1.<0>, r1.<3>, r1.<4>, r1.<5>
    5882       ~1%       {6} r3 = JOIN r2 WITH DataFlowUtil::Node::getFunction_dispred#ff AS R ON FIRST 2 OUTPUT r2.<0>, r2.<6>, r2.<2>, r2.<3>, r2.<4>, r2.<5>
    105        ~0%       {5} r4 = JOIN r3 WITH project#DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_021#join_rhs AS R ON FIRST 2 OUTPUT r3.<2>, r3.<3>, r3.<4>, r3.<5>, R.<2>
    5882       ~1%       {6} r5 = JOIN r2 WITH DataFlowUtil::Node::getFunction_dispred#ff AS R ON FIRST 2 OUTPUT r2.<5>, r2.<2>, r2.<0>, r2.<3>, r2.<4>, r2.<6>
    5882       ~0%       {6} r6 = JOIN r5 WITH DataFlowImplLocal::TNil#ff_1#join_rhs AS R ON FIRST 1 OUTPUT r5.<2>, false, r5.<5>, r5.<1>, r5.<3>, r5.<4>
    0          ~0%       {5} r7 = JOIN r6 WITH DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_02413#join_rhs AS R ON FIRST 3 OUTPUT R.<4>, r6.<3>, r6.<4>, r6.<5>, R.<3>
    0          ~0%       {5} r8 = JOIN r7 WITH DataFlowImplLocal::TNil#ff AS R ON FIRST 1 OUTPUT r7.<1>, r7.<2>, r7.<3>, R.<1>, r7.<4>
    105        ~0%       {5} r9 = r4 \/ r8

The problem is that `DataFlowUtil::Node::getFunction_dispred#ff`
(`getEnclosingCallable`) is joined too late.

After this commit, the tuple counts look like this:

    (13s) Tuple counts for DataFlowImplLocal::pathStep#fffff#cur_delta:
    5882    ~1%       {6} r1 = SCAN DataFlowImplLocal::PathNodeMid#class#ffffff#prev_delta AS I OUTPUT I.<1>, I.<0>, I.<2>, I.<3>, I.<4>, I.<5>
    5882    ~3%       {7} r2 = JOIN r1 WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff AS R ON FIRST 1 OUTPUT r1.<2>, R.<1>, r1.<1>, r1.<0>, r1.<3>, r1.<4>, r1.<5>
    5882    ~1%       {6} r3 = JOIN r2 WITH DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS R ON FIRST 2 OUTPUT r2.<3>, r2.<6>, r2.<2>, r2.<0>, r2.<4>, r2.<5>
    105     ~0%       {5} r4 = JOIN r3 WITH project#DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_021#join_rhs AS R ON FIRST 2 OUTPUT r3.<2>, r3.<3>, r3.<4>, r3.<5>, R.<2>
    5882    ~1%       {6} r5 = JOIN r2 WITH DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS R ON FIRST 2 OUTPUT r2.<5>, r2.<2>, r2.<3>, r2.<0>, r2.<4>, r2.<6>
    5882    ~0%       {6} r6 = JOIN r5 WITH DataFlowImplLocal::TNil#ff_1#join_rhs AS R ON FIRST 1 OUTPUT r5.<2>, false, r5.<5>, r5.<1>, r5.<3>, r5.<4>
    0       ~0%       {5} r7 = JOIN r6 WITH DataFlowImplLocal::LocalFlowBigStep::localFlowBigStep#ffffff_02413#join_rhs AS R ON FIRST 3 OUTPUT R.<4>, r6.<3>, r6.<4>, r6.<5>, R.<3>
    0       ~0%       {5} r8 = JOIN r7 WITH DataFlowImplLocal::TNil#ff AS R ON FIRST 1 OUTPUT r7.<1>, r7.<2>, r7.<3>, R.<1>, r7.<4>
    105     ~0%       {5} r9 = r4 \/ r8

There is a slight slowdown coming from the introduction of a new
predicate `DataFlowImplLocal::pathStep#fffff#join_rhs`, which is used
only in the standard order:

    (12s) Tuple counts for DataFlowImplLocal::pathStep#fffff#join_rhs:
    282057  ~0%     {2} r1 = SCAN DataFlowImplCommon::CallContext::relevantFor_dispred#ff AS I OUTPUT I.<1>, I.<0>
    9159890 ~1%     {2} r2 = JOIN r1 WITH DataFlowUtil::Node::getEnclosingCallable_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r1.<1>
                    return r2

The evaluation of `unique` is cheap but not free:

    DataFlowUtil::Node::getEnclosingCallable_dispred#ff .............. 3.9s
    DataFlowUtil::Node::getEnclosingCallable_dispred#ff_10#join_rhs .. 3.5s

The first of these two predicates evaluates `unique`, and the second
simply reorders columns. They take about the same time, which suggests
that `unique` is about as fast as it can be, given the number of tuples
it needs to push around. Note that the column reordering predicate is
only needed because of the standard order.
2020-04-06 12:04:39 +02:00
Mathias Vorreiter Pedersen
3aa293210d C++: Ensure that only non-conflated chi instructions are used everywhere 2020-04-06 12:02:56 +02:00
Jonas Jensen
46fc91315b Java/C++/C#: Revert the join order fix from #2872
This revert brings back the performance problems in
`DataFlowImplLocal.qll` so they can be fixed in a different way. The fix
in #2872 was asymptotically good but had undesired overhead because it
introduced another predicate in the SCC that existed purely for join
ordering.

I did the revert by inlining the helper predicate, eliminating the
`enclosing` variable, and re-ordering the resulting lines to what they
were before #2872.
2020-04-06 10:04:50 +02:00
Robert
1096e5d947 Merge pull request #3163 from robertbrignull/code_scanning_suites
Add code-scanning suites
2020-04-06 08:45:40 +01:00
Mathias Vorreiter Pedersen
317734f41e C++: Attach PostUpdateNodes to Chi nodes following aschackmull's suggestion 2020-04-05 22:35:26 +02:00
Jonas Jensen
530d4294b0 Merge remote-tracking branch 'upstream/master' into DefaultTaintTracking-Configuration 2020-04-05 07:27:07 +02:00
Jonas Jensen
58366b19e9 C++: Path explanations in the last two queries
For some reason I thought that these two queries were special because
they manipulate `SecurityOptions` to change the taint-tracking sources.
It turns out it was just the opposite: the queries used to be special
because they invalidated the cache for the `tainted` predicate, but that
predicate is no longer used, so these queries are no longer special.
2020-04-04 16:47:06 +02:00
Jonas Jensen
54a23a486a C++: Accept test changes for 108d5177b8 2020-04-04 16:46:59 +02:00
Jonas Jensen
d7332644f0 C++: Fix DefinitionByReferenceNode.toString
This predicate now has a value also for calls to function pointers.
2020-04-04 15:31:01 +02:00
Jonas Jensen
108d5177b8 C++: Fix two bugs found by @rdmarsh2
Co-Authored-By: Robert Marsh <rdmarsh2@gmail.com>
2020-04-04 15:24:44 +02:00
Robert Marsh
316d932829 Merge pull request #3198 from MathiasVP/valuenumbering-provider-new-file
C++/C#: Prevent accidental import of ValueNumberPropertyProvider
2020-04-03 13:31:11 -07:00
Jonas Jensen
bb3616e4c4 C++: Add example for globalVarFromId 2020-04-03 17:51:35 +02:00
Jonas Jensen
5822cd7b84 C++: Put paths in the remaining LGTM-suite queries 2020-04-03 17:10:47 +02:00
Jonas Jensen
3ec1f691c2 C++: First query with flow-paths through globals 2020-04-03 16:45:00 +02:00
Jonas Jensen
aaebe3687e C++: Fix copy-paste error in convertedExprNode 2020-04-03 16:37:23 +02:00
Jonas Jensen
469bdae9b2 C++: More helpful toString for def. by ref. node 2020-04-03 16:37:23 +02:00
Jonas Jensen
36da2d1dae C++: Manipulate the source end of paths too
Without this, we get duplicate alerts in some cases and
unnatural-looking source nodes in other cases. The source nodes were
often `Conversion`s.
2020-04-03 16:37:23 +02:00
Jonas Jensen
e916f07a8e C++: Formatting fixups 2020-04-03 15:52:13 +02:00
Jonas Jensen
427815d3d1 C++: taintedWithPath QLDoc + simplification 2020-04-03 15:52:13 +02:00
Jonas Jensen
3653627650 C++: Let configuration class extend singleton 2020-04-03 15:52:13 +02:00
Mathias Vorreiter Pedersen
c54cddead1 C++: Include PrintValueNumbering in testcase 2020-04-03 12:42:06 +02:00
Jonas Jensen
16c7a35b1c Merge pull request #3195 from geoffw0/taintstring
C++: Model taint flow through std::string constructor and c_str()
2020-04-03 12:05:07 +02:00
Geoffrey White
73bfd819d9 C++: Rename classes. 2020-04-03 09:23:31 +01:00
Geoffrey White
1bcf187c3e C++: Rename Strings.qll -> StdString.qll. 2020-04-03 09:17:33 +01:00
Mathias Vorreiter Pedersen
0b12c1519b C++/C#: Sync identical files 2020-04-03 10:06:37 +02:00
Mathias Vorreiter Pedersen
0f70944a5b C++: Move ValueNumberPropertyProvider into its own file to prevent accidental imports 2020-04-03 09:55:41 +02:00
Robert Marsh
a8e191248e Merge branch 'master' into rdmarsh/cpp/ir-flow-through-outparams
Merge IR SSA test additions
2020-04-02 15:30:20 -07:00
Robert Marsh
fd915bb5b1 C++: fix join order in IR virtual dispatch 2020-04-02 14:56:11 -07:00
Geoffrey White
c9ec30fa2a C++: Update use of deprecated methods. 2020-04-02 19:49:42 +01:00
Geoffrey White
e9132d833c C++: Autoformat. 2020-04-02 19:49:42 +01:00
Geoffrey White
73171682b7 C++: Switch to taint flow as suggested in the old PR. 2020-04-02 19:49:41 +01:00
Geoffrey White
b14b52d0ac C++: Add models for std::string (as in old PR). 2020-04-02 19:49:41 +01:00
Geoffrey White
69f6790c83 C++: Add a test of taint through std::strings, based on the one in the old PR. 2020-04-02 19:49:31 +01:00
Mathias Vorreiter Pedersen
ce5d8d516f Merge branch 'master' into ir-flow-fields 2020-04-02 15:23:00 +02:00
Mathias Vorreiter Pedersen
e2908eaf63 C++: Add comment explaining why we can split call and allocation side effects 2020-04-02 15:11:13 +02:00
Mathias Vorreiter Pedersen
a273917e51 Merge branch 'master' into init-dynamic-alloc-newexpr 2020-04-02 14:11:03 +02:00
Jonas Jensen
4825774ce2 Merge pull request #3173 from geoffw0/opnew
C++: Support operator new and operator delete in models library
2020-04-02 14:01:10 +02:00
Mathias Vorreiter Pedersen
dda3aaa8aa C++: Add QLDoc to public classes and predicates 2020-04-02 14:00:33 +02:00
Mathias Vorreiter Pedersen
af9e05b9cd C++: Accept test 2020-04-02 10:57:11 +02:00
Geoffrey White
ead5feb921 C++: Autoformat. 2020-04-02 09:50:14 +01:00
Mathias Vorreiter Pedersen
020c273dc9 Merge branch 'master' into ir-flow-fields 2020-04-02 10:45:58 +02:00
Mathias Vorreiter Pedersen
8fdc4b037a C++: Ensure that no call side effect is an allocation side effect 2020-04-02 07:30:56 +02:00
Robert Marsh
a061811939 Merge branch 'master' into rdmarsh/cpp/ir-flow-through-outparams
Pick up new test for user-defined swap functions
2020-04-01 17:32:55 -07:00
Robert Marsh
bbb69d524e Merge pull request #3188 from geoffw0/swap4
C++: Add a user defined swap function to taint tests.
2020-04-01 17:28:46 -07:00
Jonas Jensen
207c76b855 C++: Path explanations in DefaultTaintTracking
The first three queries are migrated to use path explanations.
2020-04-01 20:51:05 +02:00
Jonas Jensen
b07380d2eb C++: Update ppReprType for C++ IR dataflow
I forgot to do this in b1be123e31. Without this change, we suffix
` : void` on very step of an IR path explanation.
2020-04-01 20:19:00 +02:00