Commit Graph

3597 Commits

Author SHA1 Message Date
Mathias Vorreiter Pedersen
8aae2990d0 C++: Formatting 2020-02-03 16:15:49 +01:00
Mathias Vorreiter Pedersen
a8b3bcb87d C++: Indirection for value numbering 2020-02-03 16:13:32 +01:00
Cornelius Riemenschneider
1b68f86d5b Fix bug in CPP range analysis. 2020-02-03 14:16:48 +01:00
Dave Bartolomeo
fd2cafa95f C++: Accept GVN test output 2020-01-31 13:36:14 -07:00
Jonas Jensen
e2da98ae24 C++: Accept autoformat and test changes 2020-01-31 20:58:53 +01:00
Robert Marsh
3e2b0328b7 C++: update test expectations post-merge 2020-01-31 11:48:51 -08:00
Robert Marsh
089dda9090 Merge branch 'buffer-type-size-test' into jbj/buffer-type-size 2020-01-31 11:31:55 -08:00
Robert Marsh
2dd368fd1f C++: add SSA test for void* buffer parameters 2020-01-31 11:31:28 -08:00
Dave Bartolomeo
e27a0fe504 C++: Prevent AliasedVirtualVariable from overlapping string literals
We were hitting a combinatorial explosion in `hasDefinitionAtRank` for functions that contain a large number of string literals. The problem was that every `Chi` instruction for `AliasedVirtualVariable` was treated as a definition of every string literal. We already mark string literals as `isReadOnly()`, but we were allowing `AliasedVirtualVariable` to define read-only locations so that the `AliasedDefinition` instruction would provide the initial definition for all string literals.

To fix this, I've introduced the new `InitializeNonLocal` instruction, which is inserted in the prologue of every function right after `AliasedDefinition`. It provides the initial definition for every non-stack memory location, including read-only locations, but is never written to anywhere else. It is the conterpart of the `AliasedUse` instruction in the function epilogue, which represents the use of all non-stack memory after the function returns. I considered renaming `AliasedUse` to `ReturnNonLocal`, to match the `InitializeXXX`/`ReturnXXX` pattern we already use for parameters and indirections, but held off to avoid unnecessary churn. Any thoughts on whether I should make this name change?

This change has a significant speedup in evaluation time for a few of our troublesome databases:
`attnam/ivan`: 13%
`awslabs/s2n`: 26%
`SinaMostafanejad/OpenRDM`: 7%
`zcoinofficial/zcoin`: 8%
2020-01-31 11:33:46 -07:00
Jonas Jensen
83f807f182 C++: Interprocedural indirection taint tracking
As a temporary workaround in the `DefaultTaintTracking` library, we
funnel flow across calls by conflating pointer and object both at the
caller and the callee.

The three cases in `adjustedSink` were deleted because they are now
covered by the one case for `ReadSideEffectInstruction` in
`instructionTaintStep`.

When enabling `DefaultTaintTracking`, this commit on top of #2736 has
the effect effect of recovering two lost results:

    --- a/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    +++ b/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    @@ -1,2 +1,4 @@
     | overflowdestination.cpp:30:2:30:8 | call to strncpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
     | overflowdestination.cpp:46:2:46:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
    +| overflowdestination.cpp:53:2:53:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
    +| overflowdestination.cpp:64:2:64:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |

In the internal repo, we recover one lost result. Additionally, there
are two queries that gain an extra source for an existing sink. I'll
classify that as noise. The new results look like this:

    foo(argv); // this `argv` is a new source for the sink in `bar`
    bar(argv); // this `argv` is the existing source for the sink in `bar`
2020-01-31 16:28:45 +01:00
Jonas Jensen
a1aed1ad93 C++: Workaround for problem with memcpy flow
The type of the source argument to `memcpy` is `void *`, and somehow
that meant that the copied object itself got type `void`. Since that has
size 0, the SSA construction did not model it as reading from the last
write.

This is probably not the right fix, but maybe it's good enough for now.
The right fix would ensure that the type reported by
`hasOperandMemoryAccess` is `UnknownType`.

When `DefaultTaintTracking.qll` is enabled, this commit has the effect
of restoring a lost results:

    --- a/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    +++ b/cpp/ql/test/query-tests/Security/CWE/CWE-119/semmle/tests/OverflowDestination.expected
    @@ -1 +1,2 @@
     | overflowdestination.cpp:30:2:30:8 | call to strncpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
    +| overflowdestination.cpp:46:2:46:7 | call to memcpy | To avoid overflow, this operation should be bounded by destination-buffer size, not source-buffer size. |
2020-01-31 16:04:43 +01:00
alexet
cd688367c7 CPP: Avoid uncessary recursion 2020-01-31 12:47:03 +00:00
Robert Marsh
83d611de11 C++: don't conflate pointers in data flow 2020-01-30 16:18:24 -08:00
Robert Marsh
209a30688a Merge pull request #2718 from jbj/DefaultTaintTracking-isUserInput
C++: Fix mapping of sources from Expr to Node
2020-01-30 16:22:48 -05:00
Robert Marsh
4617940eee Merge branch 'master' into connect-ir-dataflow-models 2020-01-30 08:49:42 -08:00
Jonas Jensen
148e87c61d C++: Put AliasedSSA.qll in new qlformat style 2020-01-30 11:38:16 +01:00
Jonas Jensen
f0f752844e Merge remote-tracking branch 'upstream/master' into dbartol/Indirections
Conflicts:
	cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasedSSA.qll
	csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/AliasAnalysis.qll
2020-01-30 10:26:44 +01:00
Jonas Jensen
036e16af8b Merge remote-tracking branch 'upstream/master' into ir-crement-load
Conflicts:
	cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/TranslatedExpr.qll
2020-01-30 09:07:30 +01:00
Jonas Jensen
c4d2163321 Merge pull request #2673 from aschackmull/ql/autoformat-comparisonterm
Java/C++/C#: Autoformat comparison terms
2020-01-30 08:47:50 +01:00
Robert Marsh
71d87be773 C++: add flow through partial loads in DTT 2020-01-29 17:51:42 -08:00
Dave Bartolomeo
6249446ba0 Merge remote-tracking branch 'upstream/master' into dbartol/Indirections 2020-01-29 17:29:44 -07:00
Robert Marsh
1472101613 Merge branch 'master' into rdmarsh/cpp/ir-flow-through-outparams 2020-01-29 14:44:29 -08:00
Robert Marsh
74ea9bcdf4 C++: fix merge issue 2020-01-29 14:37:41 -08:00
Robert Marsh
1a458aa450 C++: IR dataflow edges through outparams 2020-01-29 14:37:41 -08:00
Dave Bartolomeo
46c414b53f C++: Document regular expressions in InlineExpectationsTest 2020-01-29 13:24:55 -07:00
Dave Bartolomeo
1277881294 C++: Document InlineExpectationsTest 2020-01-29 13:07:34 -07:00
Robert Marsh
37570c7750 Merge pull request #2676 from jbj/dataflow-partial-chi
C++: data flow through partial chi operands where type is known
2020-01-29 13:44:06 -05:00
Jonas Jensen
52d2bebd1c C++: Taint through most partial chi operands
This changes the flow to be taint rather than data flow, and it extends
it to include chi instructions with unknown type as long as they're not
for the `AliasedVirtualVariable`.

We're losing three good test results because these tests are not
affected by `DefaultTaintTracking.qll`. The taint step added here can
later be ported to `TaintTrackingUtil.qll` to recover these results, but
we probably want a better API than transitive-closure search through
instructions before doing that.
2020-01-29 18:02:03 +01:00
Geoffrey White
f673791fe8 Merge pull request #2717 from jbj/DefaultTaintTracking-memcpy
C++: Add taint from gets through memcpy
2020-01-29 16:28:45 +00:00
Jonas Jensen
0436caecdc C++: Always use the old library for the diff test
This change ensures that the diff test will show the difference between
the old and the new library even after we switch the default
implementation of `security.TaintTracking` to be the new one.
2020-01-29 16:03:35 +01:00
Jonas Jensen
4a77f2b53c Merge remote-tracking branch 'upstream/master' into ir-crement-load
Update test output to fix semantic merge conflict.
2020-01-29 15:56:05 +01:00
Jonas Jensen
9b651ea92c C++: Fix mapping of sources from Expr to Node
The code contained the remains of how `isUserInput` in `Security.qll`
used to be ported to IR. It's wrong to use that port since many queries
call `userInput` directly to get the "cause" string.
2020-01-29 15:50:08 +01:00
Jonas Jensen
7bed6ad63b C++: Add taint from gets through memcpy 2020-01-29 15:42:43 +01:00
Jonas Jensen
d7e8ea7cc5 Merge pull request #2641 from marcrepo/master
Documentation update for Issue #2623
2020-01-29 13:37:00 +01:00
Jonas Jensen
386e8e87d1 Merge pull request #2645 from geoffw0/typo
CPP: Fix typo.
2020-01-29 13:35:55 +01:00
Anders Schack-Mulligen
0d4b2e4bf7 C#/C++: Autoformat post rebase. 2020-01-29 13:16:46 +01:00
Anders Schack-Mulligen
96e4a57edd C++: Autoformat. 2020-01-29 13:11:50 +01:00
Jonas Jensen
02cb8e9cc7 Merge remote-tracking branch 'upstream/master' into dataflow-partial-chi
Conflicts:
	cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowUtil.qll
	cpp/ql/test/library-tests/dataflow/DefaultTaintTracking/tainted.expected
2020-01-29 13:03:40 +01:00
Jonas Jensen
27b5902258 Merge pull request #2707 from geoffw0/taint-format
C++: Add TaintFunction model to FormattingFunction
2020-01-29 08:20:34 +01:00
Dave Bartolomeo
60a0eff4d7 Merge remote-tracking branch 'upstream/master' into dbartol/Indirections 2020-01-28 12:06:43 -07:00
Dave Bartolomeo
542579de7f C++: Accept dataflow test changes due to new alias analysis 2020-01-28 10:58:27 -07:00
Dave Bartolomeo
dda32359fa C++: Accept IR dump test results changes due to new alias analysis 2020-01-28 10:58:05 -07:00
Dave Bartolomeo
7013bc6bf4 C++: Update escape analysis tests to new API 2020-01-28 10:57:07 -07:00
Dave Bartolomeo
bb9485d548 C++: Update points_to tests to use new framework 2020-01-28 10:56:49 -07:00
Dave Bartolomeo
af9d90cf46 C++: New test framework that allows expected results as comments in source code 2020-01-28 10:56:13 -07:00
Dave Bartolomeo
976b564b68 C++: Update AliasedSSA to use Allocation instead of IRVariable
This introduces a new type of `MemoryLocation`: `EntireAllocationMemoryLocation`, representing an entire contiguous allocation whose size is not known. This is used to model the memory accesses on `InitializeIndirection` and `ReturnIndirection`.
2020-01-28 10:55:24 -07:00
Dave Bartolomeo
165a45d9b5 C++/C#: Update SimpleSSA to use Allocation instead of IRVariable 2020-01-28 10:53:18 -07:00
Dave Bartolomeo
1bbc875442 C++/C#: Parameterize alias analysis based on AliasConfiguration
Instead of tracking `IRVariable`s directly, alias analysis now tracks instances of the `Allocation` type provided by its `Configuration` parameter. For unaliased SSA, an `Allocation` is just an `IRAutomaticVariable`. For aliased SSA, an `Allocation` is either an `IRVariable` or the memory pointed to by an indirect parameter.
2020-01-28 10:51:21 -07:00
Dave Bartolomeo
b15dd82732 C++/C#: Share alias analysis between C++ and C# 2020-01-28 10:47:37 -07:00
Dave Bartolomeo
1b1fded535 C++/C#: Add new MemoryAccessKind to represent entire allocation 2020-01-28 10:41:53 -07:00