Commit Graph

6845 Commits

Author SHA1 Message Date
Dave Bartolomeo
aff2ea3316 C++: Handle pointer decay and inferred array sizes
For function parameters that are subject to "pointer decay", the database contains the type as originally declared (e.g. `T[]` instead of `T*`). The IR needs the actual type. Similarly, for variable declared as an array of unknown size, the actual size needs to be inferred from the initializer (e.g. `char a[] = "blah";` needs to have the type `char[5]`).

I've opened a ticket to have the extractor emit the actual type alongside the declared type, but for now, this workaround is enough to unblock progress for typical code.
2019-02-12 12:41:21 -08:00
Dave Bartolomeo
f5121d71bc C++: Fix range analysis for new API 2019-02-12 09:38:11 -08:00
Dave Bartolomeo
c224bbd767 C++: Fix Operand.getSize() 2019-02-11 17:48:59 -08:00
Dave Bartolomeo
bd46c43067 C++: Add sanity test for missing operand type 2019-02-11 09:47:00 -08:00
Dave Bartolomeo
a54d86423a C++: Add Operand.getType() 2019-02-11 09:47:00 -08:00
Dave Bartolomeo
fa2ef620ac C++: Rationalize RegisterOperand vs. MemoryOperand
This change does some shuffling to make the distinction between memory operands and register operands more clear in the IR API. First, any given type that extends `Operand` is now either always a `MemoryOperand` or always a `RegisterOperand`. This required getting rid of `CopySourceOperand`, which was used for both the `CopyValue` instruction (as a `RegisterOperand`) and for the `Load` instruction (as a `MemoryOperand`). `CopyValue` is now just a `UnaryInstruction`, `Store` has a `StoreValueOperand` (`RegisterOperand`), and all of the instructions that read a value from memory indirectly (`Load`, `ReturnValue`, and `ThrowValue`) all now have a `LoadOperand` (`MemoryOperand`).

There are no diffs in the IR output for this commit, but this change is required for a subsequent commit that will make each `MemoryOperand` have a `Type`, which in turn is needed to fix a critical bug in aliased SSA construction.
2019-02-11 09:47:00 -08:00
Dave Bartolomeo
bda00bbff2 C++: Split out SSA IR tests
The IR tests were getting kind of unwieldy. We were using "ir.cpp" to contain test cases that covered both IR construction (every language construct imaginable) and SSA construction. We would then build and dump all three flavors of IR. For IR construction tests, examining the SSA dumps when you add a new test case is tedious.

To make this easier to manage, I've split the SSA-specific test cases out into a separate directory. "ir.cpp" should now contain only IR construction test cases, and "ssa.cpp" should contain only SSA construction test cases. We dump just the raw IR for "ir.cpp", and just the two SSA flavors for "ssa.cpp". We still run all three flavors of the IR sanity tests for "ir.cpp", though.

I also removed the "ssa_block_count.ql" test, which wasn't really adding any coverage, because any change to the block count would be reflected in the dump as well.
2019-02-08 15:28:06 -08:00
Geoffrey White
8b2405b267 CPP: Update severity/precision of LargeParameter.ql. 2019-02-08 15:23:57 +00:00
Jonas Jensen
566eafc706 Merge pull request #823 from dave-bartolomeo/dave/IdentityString
C++: Declaration.getIdentityString and Type.getTypeIdentityString
2019-02-08 13:16:02 +01:00
Dave Bartolomeo
1e7dcedcdf C++: Fix semantic merge conflict 2019-02-07 14:32:26 -08:00
Dave Bartolomeo
283991d520 C++: Handle ProxyClass in getIdentityString() 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
3414c105c6 C++: Hoist getTemplateArgument() and friends into Declaration 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
1c6b14e505 C++: Remove deprecation of getFullSignature() until we can fix internal tests to use getIdentityString() 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
dbe12e7d02 C++: More PR feedback 2019-02-07 14:26:01 -08:00
Dave Bartolomeo
eb7016620b C++: Fix PR feedback 2019-02-07 14:26:00 -08:00
Dave Bartolomeo
7b54db8ca9 C++: Fix getIdentityString for TemplateParameter 2019-02-07 14:26:00 -08:00
Dave Bartolomeo
5d71d06dbc C++: Fix test expectation 2019-02-07 14:26:00 -08:00
Dave Bartolomeo
bd4ecc3e91 C++: Declaration.getIdentityString and Type.getTypeIdentityString
This PR adds new predicates to `Declaration` and `Type` to get a fully-qualified canonical name for the element, suitable for debugging and dumps. It includes template parameters, cv qualifiers, function parameter and return types, and fully-qualified names for all symbols. These strings are too large to compute in productions queries, so they should be used only for dumps and debugging. Feel free to suggest better names for these predicates.

I've updated PrintAST and PrintIR to use these instead of `Function.getFullSignature()`. The biggest advantage of the new predicates is that they handle lambdas and local classes, which `getQualifiedName` and `getFullSignature` do not. This makes IR and AST dumps much more usable for real-world snapshots.

Along the way, I cleaned up some of our handling of `IntegralType` to use a single table for tracking the signed, unsigned, and canonical versions of each type. The canonical part is new, and was necessary for `getTypeIdentityString` so that `signed int` and `int` both appear as `int`.
2019-02-07 14:26:00 -08:00
Dave Bartolomeo
f460d2c1c3 C++: Fix another test expectation 2019-02-07 09:56:56 -08:00
Dave Bartolomeo
f2a0a86c6d C++: Update captures test for closure fields extractor fix 2019-02-07 09:56:56 -08:00
Robert Marsh
3c638b5966 C++: add edge-based predicates to IRGuards
These predicates currently take a pair of `IRBlock`s - as it stands, at
most one edge can exist from one `IRBlock` to a given other `IRBlock`.
We may need to revisit that assumption and create an `IREdge` IPA type
at some future date
2019-02-07 09:38:54 -08:00
Robert Marsh
b85b7744ef C++: refactor branch instruction handling 2019-02-07 09:36:34 -08:00
Robert Marsh
92ba0919cc Merge pull request #899 from Semmle/rdmarsh/cpp/IRRename-rebased
C++: Rename a few problematic IR APIs
2019-02-07 09:28:59 -08:00
Jonas Jensen
ce31b14f21 C++: Add a queries.xml to the test dir
This makes compilation caching work with `*.ql` files in the test dir
when using `odasa qltest --optimize`.
2019-02-07 11:04:20 +01:00
Jonas Jensen
47ad280e34 Merge pull request #842 from geoffw0/gets
CPP: Clean up PotentialBufferOverflow.ql, PotentiallyDangerousFunction.ql
2019-02-07 09:27:00 +01:00
Dave Bartolomeo
f6d392089e C++: Replace getAnOperand().(XXXOperand) with getXXXOperand() 2019-02-06 22:44:53 -08:00
Dave Bartolomeo
4c23ad100e C++: Rename a few IR APIs
There are a few IR APIs that we've found to be confusingly named. This PR renames them to be more consistent within the IR and with the AST API:

`Instruction.getFunction` -> `Instruction.getEnclosingFunction`: This was especially confusing when you'd call `FunctionAddressInstruction.getFunction` to get the function whose address was taken, and wound up with the enclosing function instead.

`Instruction.getXXXOperand` -> `Instruction.getXXX`. Now that `Operand` is an exposed type, we want a way to get a specific `Operand` of an `Instruction`, but more often we want to get the definition instruction of that operand. Now, the pattern is that `getXXXOperand` returns the `Operand`, and `getXXX` is equivalent to `getXXXOperand().getDefinitionInstruction()`.

`Operand.getInstruction` -> `Operand.getUseInstruction`: More consistent with the existing `Operand.getDefinitionInstruction` predicate.
2019-02-06 22:43:49 -08:00
Robert Marsh
97c5b8ee44 Merge pull request #882 from jbj/ir-ConstantAnalysis-perf
C++: Speed up IR ConstantAnalysis
2019-02-06 22:29:09 -08:00
Dave Bartolomeo
1f873d0c9c Merge pull request #890 from aeyerstaylor/more-field-overriding
C++: Use more field overriding in IR construction
2019-02-06 17:04:43 -08:00
Geoffrey White
2321ae911e CPP: Fix the test by adding PotentiallyDangerousFunction. 2019-02-05 17:58:30 +00:00
Geoffrey White
018450500d CPP: Fix closing tag. 2019-02-05 17:58:30 +00:00
Geoffrey White
c05df6ea4c CPP: Add reference. 2019-02-05 17:58:30 +00:00
Geoffrey White
f73a3a6a24 CPP: Explain the danger of gets a bit more in qhelp. 2019-02-05 17:58:30 +00:00
Geoffrey White
0541950c44 CPP: Clean up PotentialBufferOverflow.ql a bit. 2019-02-05 17:58:30 +00:00
Geoffrey White
c32e1b8000 CPP: Change the @name of PotentialBufferOverflow.ql to be in line with everything else. 2019-02-05 17:58:30 +00:00
Geoffrey White
f7e7737789 CPP: Update qhelp. 2019-02-05 17:58:30 +00:00
Geoffrey White
87a25f0cbe CPP: Update CWE tags. 2019-02-05 17:58:30 +00:00
Geoffrey White
429f53ed74 CPP: Move the 'gets' case. 2019-02-05 17:58:30 +00:00
Geoffrey White
a82832e779 CPP: Add a test that uses 'gets'. 2019-02-05 17:58:30 +00:00
Geoffrey White
bbc8e7886b CPP: Rearrange PotentiallyDangerousFunction.ql. 2019-02-05 17:58:30 +00:00
alexet
59a5bec769 CPP: Use more field overriding 2019-02-05 13:07:41 +00:00
Jonas Jensen
cad4bac548 C++: Concretize ConstantAnalysis NegateInstruction
This is just to make the QL shorter. It generates the same DIL.
2019-02-05 11:05:47 +01:00
Jonas Jensen
be35c674a7 C++: Factor out getConstantValueToPhi
This speeds up `getConstantValue`, the main predicate in
`ConstantAnalysis`, from 2.4s to 1.6s on comdb2.
2019-02-05 11:05:47 +01:00
Jonas Jensen
283bb2f6d0 C++: Factor out ConstantAnalysis BinaryInstruction
This speeds up comdb2 constant analysis from 6.5s to 4.5s.
2019-02-05 11:05:47 +01:00
Jonas Jensen
d66578eaa8 C++: Add IntegerPartial, use in ConstantAnalysis
This adds `IntegerPartial.qll`, which is similar to
`IntegerConstant.qll` except that it contains partial functions on
integers instead of total functions on optional integers. This speeds up
the constant analysis so it takes 6.5s instead of 10.3s on comdb2.
2019-02-05 11:05:47 +01:00
semmle-qlci
06ae0c421a Merge pull request #864 from jbj/ir-TIRVariable-shared
Approved by dave-bartolomeo
2019-02-05 07:55:28 +00:00
Dave Bartolomeo
dc209246aa Merge pull request #866 from jbj/ir-TInstruction-normalize
C++: Normalize TInstruction
2019-02-04 12:14:45 -08:00
Dave Bartolomeo
aadd5cf202 Merge pull request #863 from jbj/ir-variableLiveOnEntryToBlock-rhs
C++: Speed up variableLiveOnEntryToBlock in IR
2019-02-04 10:47:29 -08:00
Jonas Jensen
3735cb69ce C++: No InstructionTag in SSAConstruction
This does to `SSAConstruction` what the previous commit did to
`IRConstruction`. An instruction in `SSAConstruction` is now defined in
terms of how it was created rather than what it can be queried for.
Effectively, this defines `TInstruction` as `TInstructionTag` was
defined before and then removes `TInstructionTag` from
`SSAConstruction`. This also has the benefit of removing the concept of
an instruction tag from the public predicates on `Instruction`.
2019-02-04 19:43:17 +01:00
Jonas Jensen
8ae3551ec1 C++: Normalize TInstruction in raw IR
This definition was denormalized to the extent that an instruction was
defined in terms of the six main attributes it could be queried for.
This made it possible to do multi-column joins on those six attributes,
but it doesn't appear that this feature was useful in practice. The main
multi-column join that was in use was on the pair of
(`TranslatedElement, InstructionTag`), but the `TranslatedElement` was
not part of the `TInstruction`.

This commit changes `TInstruction` to be defined in terms of what it's
_built from_ (`TranslatedElement, InstructionTag`) instead. This makes
it possible to do multi-column joins on those two components, and then
there are separate predicates (usually with two columns) to query
instruction attributes, replacing the many uncached projections from
`MkInstruction` that were generated before.

An immediate advantage is that an `Expr` with multiple types will no
longer give rise to multiple `Instruction`s, fixing most of the errors
from the sanity query `ambiguousSuccessors`. The code inside
`IRConstruction.qll` becomes simpler and hopefully faster as there is no
longer a translation from `TranslatedElement` to `Locatable` and back
again.
2019-02-04 19:43:17 +01:00