Commit Graph

3581 Commits

Author SHA1 Message Date
Jonas Jensen
81b1bd4177 Merge pull request #2769 from aschackmull/java/perf-regression
Java: Improve performance.
2020-02-05 20:15:18 +01:00
Dave Bartolomeo
4c31c038b8 Merge from master 2020-02-05 11:23:14 -07:00
Dave Bartolomeo
4362bdb626 C++: Accept new test output 2020-02-05 10:56:40 -07:00
Dave Bartolomeo
1b6de4b32f C++: Fix formatting 2020-02-05 10:55:49 -07:00
Jonas Jensen
a0e2d59c01 C++: Add tests for global-var support 2020-02-05 16:31:13 +01:00
Jonas Jensen
f40acc19d2 C++: Use VariableNode in DefaultTaintTracking 2020-02-05 16:29:13 +01:00
Jonas Jensen
6d081a997a C++: Add VariableNode 2020-02-05 16:29:13 +01:00
Jonas Jensen
73e34f1447 C++: Refactor to separate out InstructionNode
This commit prepares the IR data-flow library for having more than one
type of data-flow node.
2020-02-05 16:29:13 +01:00
Jonas Jensen
cdfcee3ae9 Merge remote-tracking branch 'upstream/master' into ir-crement-load
Conflicts:
	cpp/ql/test/library-tests/ir/ssa/aliased_ssa_ir.expected
	cpp/ql/test/library-tests/ir/ssa/aliased_ssa_ir_unsound.expected
2020-02-05 16:13:21 +01:00
Anders Schack-Mulligen
07482abed7 Java/C++/C#: Sync. 2020-02-05 15:17:20 +01:00
Ian Lynagh
67d7e83c17 Merge pull request #2727 from matt-gretton-dann/codeql-c-extractor/7-edg-60-upgrade
Update expected results for changes in Extractor FE
2020-02-05 12:23:02 +00:00
Jonas Jensen
2928f9e5b2 Merge pull request #2703 from rdmarsh2/connect-ir-dataflow-models
C++: IR dataflow through modeled functions
2020-02-05 11:28:48 +01:00
Matthew Gretton-Dann
b601908577 CPP: Update for changes in EDG IL. 2020-02-05 09:11:23 +00:00
Matthew Gretton-Dann
1b67f47918 C++: Update with improved location information
EDG 6.0 gives better location in some circumstances changing the results
of these tests for the better.
2020-02-05 09:11:23 +00:00
Matthew Gretton-Dann
cec6646846 C++: Update for EDG 6.0 behaviour change
EDG 6.0 has changed how much information it gives about invalid
expressions.  Changing the output of this test.
2020-02-05 09:11:23 +00:00
Dave Bartolomeo
73ad2e9658 Merge from master 2020-02-04 18:33:10 -07:00
Dave Bartolomeo
a23d5afc6c C++: Add test case to demonstrate string literl aliasing change
Also fixed a minor bug where we should have been treating `AllNonLocalMemory` as _totally_ overlapping an access to a non-local variable, rather than _partially_ overlapping it. This fix is exhibited both in the new test case and in a couple existing test functions in `ssa.cpp`.
2020-02-04 18:24:08 -07:00
Robert Marsh
1576bcfa3f C++: remove unused predicates 2020-02-04 12:08:03 -08:00
Jonas Jensen
c77a921b06 Merge pull request #2695 from rdmarsh2/default-taint-tracking-diff-test
C++: add diff tests for DefaultTaintTracking
2020-02-04 20:57:55 +01:00
Robert Marsh
ac2e89317b C++: autoformat 2020-02-04 10:41:30 -08:00
Robert Marsh
861d5eb86b C++: update tests after merge 2020-02-04 10:29:52 -08:00
Robert Marsh
785d54ac67 Merge branch 'master' into default-taint-tracking-diff-test 2020-02-04 09:50:05 -08:00
Tom Hvitved
6e14ba4e56 C++: Follow-up changes 2020-02-04 14:09:12 +01:00
Tom Hvitved
c591719df2 Data flow: Sync files 2020-02-04 14:09:12 +01:00
Mathias Vorreiter Pedersen
0276c97b9c Merge pull request #2755 from jbj/BarrierGuard-SSA
C++: Don't use GVN in AST DataFlow BarrierNode
2020-02-04 12:00:12 +01:00
Jonas Jensen
b4385c6e60 C++: Don't use GVN in AST DataFlow BarrierNode
It turns out that the evaluator will evaluate the GVN stage even when no
predicate from it is needed after optimization of the subsequent stages.
The GVN library is expensive to evaluate, and it'll become even more
expensive when we switch its implementation to IR.

This PR disables the use of GVN in `DataFlow::BarrierNode` for the AST
data-flow library, which should improve performance when evaluating a
single data-flow query on a snapshot with no cache. Precision decreases
slightly, leading to a new FP in the qltests.

There is no corresponding change for the IR data-flow library since IR
GVN is not very expensive.
2020-02-04 08:40:36 +01:00
Robert Marsh
eafd7b6045 C++: accept test output 2020-02-03 15:27:34 -08:00
Robert Marsh
677f0f090a Merge branch 'master' into rdmarsh/cpp/ir-flow-through-outparams 2020-02-03 13:06:35 -08:00
Robert Marsh
931c0e982e Merge pull request #2748 from MathiasVP/value-numbering-indirection
C++: Indirection for ValueNumbering
2020-02-03 14:41:58 -05:00
Robert Marsh
f51841ac37 Merge pull request #2736 from jbj/buffer-type-size
C++: Workaround for problem with memcpy flow
2020-02-03 14:31:28 -05:00
Robert Marsh
3bfcf0bf46 Merge branch 'master' into connect-ir-dataflow-models 2020-02-03 11:06:45 -08:00
Cornelius Riemenschneider
36479d3fd6 Support to keep bounds derived on implicit integer casts. 2020-02-03 17:33:06 +01:00
Cornelius Riemenschneider
cf8efbb5a0 Add testcase. 2020-02-03 17:23:24 +01:00
Robert Marsh
2b10cd6228 Merge pull request #2737 from jbj/DefaultTaintTracking-indirect-parameters
C++: Interprocedural indirections in DefaultTaintTracking.qll
2020-02-03 11:12:38 -05:00
Mathias Vorreiter Pedersen
8aae2990d0 C++: Formatting 2020-02-03 16:15:49 +01:00
Mathias Vorreiter Pedersen
a8b3bcb87d C++: Indirection for value numbering 2020-02-03 16:13:32 +01:00
Cornelius Riemenschneider
1b68f86d5b Fix bug in CPP range analysis. 2020-02-03 14:16:48 +01:00
Dave Bartolomeo
fd2cafa95f C++: Accept GVN test output 2020-01-31 13:36:14 -07:00
Jonas Jensen
e2da98ae24 C++: Accept autoformat and test changes 2020-01-31 20:58:53 +01:00
Robert Marsh
3e2b0328b7 C++: update test expectations post-merge 2020-01-31 11:48:51 -08:00
Robert Marsh
089dda9090 Merge branch 'buffer-type-size-test' into jbj/buffer-type-size 2020-01-31 11:31:55 -08:00
Robert Marsh
2dd368fd1f C++: add SSA test for void* buffer parameters 2020-01-31 11:31:28 -08:00
Dave Bartolomeo
e27a0fe504 C++: Prevent AliasedVirtualVariable from overlapping string literals
We were hitting a combinatorial explosion in `hasDefinitionAtRank` for functions that contain a large number of string literals. The problem was that every `Chi` instruction for `AliasedVirtualVariable` was treated as a definition of every string literal. We already mark string literals as `isReadOnly()`, but we were allowing `AliasedVirtualVariable` to define read-only locations so that the `AliasedDefinition` instruction would provide the initial definition for all string literals.

To fix this, I've introduced the new `InitializeNonLocal` instruction, which is inserted in the prologue of every function right after `AliasedDefinition`. It provides the initial definition for every non-stack memory location, including read-only locations, but is never written to anywhere else. It is the conterpart of the `AliasedUse` instruction in the function epilogue, which represents the use of all non-stack memory after the function returns. I considered renaming `AliasedUse` to `ReturnNonLocal`, to match the `InitializeXXX`/`ReturnXXX` pattern we already use for parameters and indirections, but held off to avoid unnecessary churn. Any thoughts on whether I should make this name change?

This change has a significant speedup in evaluation time for a few of our troublesome databases:
`attnam/ivan`: 13%
`awslabs/s2n`: 26%
`SinaMostafanejad/OpenRDM`: 7%
`zcoinofficial/zcoin`: 8%
2020-01-31 11:33:46 -07:00
Jonas Jensen
83f807f182 C++: Interprocedural indirection taint tracking
As a temporary workaround in the `DefaultTaintTracking` library, we
funnel flow across calls by conflating pointer and object both at the
caller and the callee.

The three cases in `adjustedSink` were deleted because they are now
covered by the one case for `ReadSideEffectInstruction` in
`instructionTaintStep`.

When enabling `DefaultTaintTracking`, this commit on top of #2736 has
the effect effect of recovering two lost results:

    --- a/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    +++ b/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    @@ -1,2 +1,4 @@
     | overflowdestination.cpp:30:2:30:8 | call to strncpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
     | overflowdestination.cpp:46:2:46:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
    +| overflowdestination.cpp:53:2:53:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
    +| overflowdestination.cpp:64:2:64:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |

In the internal repo, we recover one lost result. Additionally, there
are two queries that gain an extra source for an existing sink. I'll
classify that as noise. The new results look like this:

    foo(argv); // this `argv` is a new source for the sink in `bar`
    bar(argv); // this `argv` is the existing source for the sink in `bar`
2020-01-31 16:28:45 +01:00
Jonas Jensen
a1aed1ad93 C++: Workaround for problem with memcpy flow
The type of the source argument to `memcpy` is `void *`, and somehow
that meant that the copied object itself got type `void`. Since that has
size 0, the SSA construction did not model it as reading from the last
write.

This is probably not the right fix, but maybe it's good enough for now.
The right fix would ensure that the type reported by
`hasOperandMemoryAccess` is `UnknownType`.

When `DefaultTaintTracking.qll` is enabled, this commit has the effect
of restoring a lost results:

    --- a/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    +++ b/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    @@ -1 +1,2 @@
     | overflowdestination.cpp:30:2:30:8 | call to strncpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
    +| overflowdestination.cpp:46:2:46:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
2020-01-31 16:04:43 +01:00
alexet
cd688367c7 CPP: Avoid uncessary recursion 2020-01-31 12:47:03 +00:00
Robert Marsh
83d611de11 C++: don't conflate pointers in data flow 2020-01-30 16:18:24 -08:00
Robert Marsh
209a30688a Merge pull request #2718 from jbj/DefaultTaintTracking-isUserInput
C++: Fix mapping of sources from Expr to Node
2020-01-30 16:22:48 -05:00
Robert Marsh
4617940eee Merge branch 'master' into connect-ir-dataflow-models 2020-01-30 08:49:42 -08:00
Jonas Jensen
148e87c61d C++: Put AliasedSSA.qll in new qlformat style 2020-01-30 11:38:16 +01:00