Commit Graph

4510 Commits

Author SHA1 Message Date
Robert Marsh
173ade1336 C++: add arithmetic/bitwise instruction classes 2019-02-21 17:09:08 -08:00
Jonas Jensen
7649e8758b Merge pull request #846 from geoffw0/returnstack
CPP: Improve  ReturnStackAllocatedMemory.ql
2019-02-21 22:04:53 +01:00
Robert Marsh
a0c12c46e5 Merge pull request #962 from jbj/IRGuards-cached
C++: Reduce the IRGuards to two cached stages
2019-02-21 10:50:19 -08:00
Evgeny Vereshchagin
e9401fca0d CPP: add a query for catching alloca in a loop
Thanks to Sam Lanning (@samlanning) and Robert Marsh for taking the time to help
to make it possible. In fact, it was Robert Marsh who effectively
wrote the query and figured out that __builtin_alloca should be
used to also take functions like strdupa into account. I just
filled out the metadata :-)
2019-02-21 18:09:56 +01:00
Jonas Jensen
1bc967c1d1 Merge pull request #819 from geoffw0/newdelete
CPP: Improve dataflow in newdelete.qll
2019-02-21 15:09:49 +01:00
Geoffrey White
cd13e5877f CPP: Performance improvement. 2019-02-21 11:31:44 +00:00
Jonas Jensen
d200bda2ad C++: Reduce the IRGuards to two cached stages
Before this change, all the cached predicates in `IRGuards.qll` were in
separate cached stages, resulting in recomputation of most of the
library for each stage. This change groups the cached predicates in two
cached classes. A better grouping may be possible, but this grouping was
easy to do and seems to solve the problem.

Before this change, the `IRGuards` library accounted for five cached
stages when using the `RangeAnalysis` library. After this change, it
only accounts for one.
2019-02-21 12:03:35 +01:00
Jonas Jensen
1e0a385d41 C++: Put ReturnStackAllocatedMemory.ql on LGTM 2019-02-21 11:39:05 +01:00
Jonas Jensen
b9236d216f C++: Improve ReturnStackAllocatedMemory alert msg 2019-02-21 11:20:25 +01:00
Jonas Jensen
dcf910f20c C++: Use EscapesTree to find pointers to stack
This simplifies the query and is a strict improvement on the tests. I
also found it to be an overall improvement on real projects.
2019-02-21 11:20:25 +01:00
Jonas Jensen
9ac8d60636 C++: IR query for redundant null check
This new query is not written because it's the most interesting query we
could write but because it's an IR-based query whose results are easy to
verify.
2019-02-21 10:13:25 +01:00
Geoffrey White
d30bcb6fcf CPP: Widen allocReachedVariable slightly. 2019-02-20 10:19:57 +00:00
Jonas Jensen
2dea0b4270 Merge pull request #879 from rdmarsh2/rdmarsh/cpp/ir-guards-edges
C++: Add edge-based predicates to IRGuards
2019-02-19 16:54:52 +01:00
Jonas Jensen
9dc3b93164 Merge pull request #916 from geoffw0/largeparam
CPP: Update severity/precision of LargeParameter.ql.
2019-02-18 12:23:00 +01:00
Robert Marsh
26a0f4b100 Merge pull request #938 from dave-bartolomeo/dave/AliasedSSA
C++: Better tracking of SSA memory accesses
2019-02-14 08:10:31 -08:00
Anders Schack-Mulligen
980a690b8b CPP/Java: Sync Dataflow 2019-02-14 09:59:08 +01:00
Dave Bartolomeo
b40fd95b8e C++: Better tracking of SSA memory accesses
This change fixes a few key problems with the existing SSA implementations:

For unaliased SSA, we were incorrectly choosing to model a local variable that had accesses that did not cover the entire variable. This has been changed to ensure that all accesses to the variable are at offset zero and have the same type as the variable itself. This was only possible to fix now that every `MemoryOperand` has its own type.

For aliased SSA, we now correctly track the offset and size of each memory access using an interval of bit offsets covered by the access. The offset interval makes the overlap computation more straightforward. Again, this is only possible now that operands have types.
The `getXXXMemoryAccess` predicates are now driven by the `MemoryAccessKind` on the operands and results, instead of by specific opcodes.

This change does fix an existing false negative in the IR dataflow tests.

I added a few simple test cases to the SSA IR tests, covering the various kinds of overlap (MustExcactly, MustTotally, and MayPartially).

I added "PrintSSA.qll", which can dump the SSA memory accesses as part of an IR dump.
2019-02-13 10:44:39 -08:00
Dave Bartolomeo
055485d9eb C++: Work around lack of size for enum type 2019-02-13 10:44:39 -08:00
Dave Bartolomeo
aff2ea3316 C++: Handle pointer decay and inferred array sizes
For function parameters that are subject to "pointer decay", the database contains the type as originally declared (e.g. `T[]` instead of `T*`). The IR needs the actual type. Similarly, for variable declared as an array of unknown size, the actual size needs to be inferred from the initializer (e.g. `char a[] = "blah";` needs to have the type `char[5]`).

I've opened a ticket to have the extractor emit the actual type alongside the declared type, but for now, this workaround is enough to unblock progress for typical code.
2019-02-12 12:41:21 -08:00
Dave Bartolomeo
f5121d71bc C++: Fix range analysis for new API 2019-02-12 09:38:11 -08:00
Dave Bartolomeo
c224bbd767 C++: Fix Operand.getSize() 2019-02-11 17:48:59 -08:00
Dave Bartolomeo
bd46c43067 C++: Add sanity test for missing operand type 2019-02-11 09:47:00 -08:00
Dave Bartolomeo
a54d86423a C++: Add Operand.getType() 2019-02-11 09:47:00 -08:00
Dave Bartolomeo
fa2ef620ac C++: Rationalize RegisterOperand vs. MemoryOperand
This change does some shuffling to make the distinction between memory operands and register operands more clear in the IR API. First, any given type that extends `Operand` is now either always a `MemoryOperand` or always a `RegisterOperand`. This required getting rid of `CopySourceOperand`, which was used for both the `CopyValue` instruction (as a `RegisterOperand`) and for the `Load` instruction (as a `MemoryOperand`). `CopyValue` is now just a `UnaryInstruction`, `Store` has a `StoreValueOperand` (`RegisterOperand`), and all of the instructions that read a value from memory indirectly (`Load`, `ReturnValue`, and `ThrowValue`) all now have a `LoadOperand` (`MemoryOperand`).

There are no diffs in the IR output for this commit, but this change is required for a subsequent commit that will make each `MemoryOperand` have a `Type`, which in turn is needed to fix a critical bug in aliased SSA construction.
2019-02-11 09:47:00 -08:00
Geoffrey White
8b2405b267 CPP: Update severity/precision of LargeParameter.ql. 2019-02-08 15:23:57 +00:00
Dave Bartolomeo
283991d520 C++: Handle ProxyClass in getIdentityString() 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
3414c105c6 C++: Hoist getTemplateArgument() and friends into Declaration 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
1c6b14e505 C++: Remove deprecation of getFullSignature() until we can fix internal tests to use getIdentityString() 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
dbe12e7d02 C++: More PR feedback 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
eb7016620b C++: Fix PR feedback 2019-02-07 14:26:00 -08:00
Dave Bartolomeo
7b54db8ca9 C++: Fix getIdentityString for TemplateParameter 2019-02-07 14:26:00 -08:00
Dave Bartolomeo
bd4ecc3e91 C++: Declaration.getIdentityString and Type.getTypeIdentityString
This PR adds new predicates to `Declaration` and `Type` to get a fully-qualified canonical name for the element, suitable for debugging and dumps. It includes template parameters, cv qualifiers, function parameter and return types, and fully-qualified names for all symbols. These strings are too large to compute in productions queries, so they should be used only for dumps and debugging. Feel free to suggest better names for these predicates.

I've updated PrintAST and PrintIR to use these instead of `Function.getFullSignature()`. The biggest advantage of the new predicates is that they handle lambdas and local classes, which `getQualifiedName` and `getFullSignature` do not. This makes IR and AST dumps much more usable for real-world snapshots.

Along the way, I cleaned up some of our handling of `IntegralType` to use a single table for tracking the signed, unsigned, and canonical versions of each type. The canonical part is new, and was necessary for `getTypeIdentityString` so that `signed int` and `int` both appear as `int`.
2019-02-07 14:26:00 -08:00
Robert Marsh
3c638b5966 C++: add edge-based predicates to IRGuards
These predicates currently take a pair of `IRBlock`s - as it stands, at
most one edge can exist from one `IRBlock` to a given other `IRBlock`.
We may need to revisit that assumption and create an `IREdge` IPA type
at some future date
2019-02-07 09:38:54 -08:00
Robert Marsh
b85b7744ef C++: refactor branch instruction handling 2019-02-07 09:36:34 -08:00
Robert Marsh
92ba0919cc Merge pull request #899 from Semmle/rdmarsh/cpp/IRRename-rebased
C++: Rename a few problematic IR APIs
2019-02-07 09:28:59 -08:00
Jonas Jensen
47ad280e34 Merge pull request #842 from geoffw0/gets
CPP: Clean up PotentialBufferOverflow.ql, PotentiallyDangerousFunction.ql
2019-02-07 09:27:00 +01:00
Dave Bartolomeo
f6d392089e C++: Replace getAnOperand().(XXXOperand) with getXXXOperand() 2019-02-06 22:44:53 -08:00
Dave Bartolomeo
4c23ad100e C++: Rename a few IR APIs
There are a few IR APIs that we've found to be confusingly named. This PR renames them to be more consistent within the IR and with the AST API:

`Instruction.getFunction` -> `Instruction.getEnclosingFunction`: This was especially confusing when you'd call `FunctionAddressInstruction.getFunction` to get the function whose address was taken, and wound up with the enclosing function instead.

`Instruction.getXXXOperand` -> `Instruction.getXXX`. Now that `Operand` is an exposed type, we want a way to get a specific `Operand` of an `Instruction`, but more often we want to get the definition instruction of that operand. Now, the pattern is that `getXXXOperand` returns the `Operand`, and `getXXX` is equivalent to `getXXXOperand().getDefinitionInstruction()`.

`Operand.getInstruction` -> `Operand.getUseInstruction`: More consistent with the existing `Operand.getDefinitionInstruction` predicate.
2019-02-06 22:43:49 -08:00
Robert Marsh
97c5b8ee44 Merge pull request #882 from jbj/ir-ConstantAnalysis-perf
C++: Speed up IR ConstantAnalysis
2019-02-06 22:29:09 -08:00
Dave Bartolomeo
1f873d0c9c Merge pull request #890 from aeyerstaylor/more-field-overriding
C++: Use more field overriding in IR construction
2019-02-06 17:04:43 -08:00
Geoffrey White
018450500d CPP: Fix closing tag. 2019-02-05 17:58:30 +00:00
Geoffrey White
c05df6ea4c CPP: Add reference. 2019-02-05 17:58:30 +00:00
Geoffrey White
f73a3a6a24 CPP: Explain the danger of gets a bit more in qhelp. 2019-02-05 17:58:30 +00:00
Geoffrey White
0541950c44 CPP: Clean up PotentialBufferOverflow.ql a bit. 2019-02-05 17:58:30 +00:00
Geoffrey White
c32e1b8000 CPP: Change the @name of PotentialBufferOverflow.ql to be in line with everything else. 2019-02-05 17:58:30 +00:00
Geoffrey White
f7e7737789 CPP: Update qhelp. 2019-02-05 17:58:30 +00:00
Geoffrey White
87a25f0cbe CPP: Update CWE tags. 2019-02-05 17:58:30 +00:00
Geoffrey White
429f53ed74 CPP: Move the 'gets' case. 2019-02-05 17:58:30 +00:00
Geoffrey White
bbc8e7886b CPP: Rearrange PotentiallyDangerousFunction.ql. 2019-02-05 17:58:30 +00:00
alexet
59a5bec769 CPP: Use more field overriding 2019-02-05 13:07:41 +00:00