Commit Graph

548 Commits

Author SHA1 Message Date
Geoffrey White
3e8b28a0a8 Merge pull request #2213 from jbj/BarrierGuard
C++: Implement DataFlow::BarrierGuard for AST+IR
2019-11-04 11:08:36 +00:00
Robert Marsh
9477bd5698 Merge branch 'master' of github.com:Semmle/ql into rdmarsh/cpp/ir-buffer-read-call-se 2019-10-31 11:00:01 -07:00
Jonas Jensen
b6038f3caa C++: Remove best-bound logic from test
This logic, in an improved form, is now part of the library itself.
2019-10-29 11:54:32 +01:00
Jonas Jensen
311963906b C++: Only give the best delta in range analysis
This mirrors Java's 6b85fe087a.
2019-10-29 11:49:49 +01:00
Jonas Jensen
b13535ac7d C++: Implement DataFlow::BarrierGuard for AST+IR
The change note is copied from the Java change note.
2019-10-28 16:22:23 +01:00
Dave Bartolomeo
cc5a689293 C++/C#: Fix up after merge from master 2019-10-25 14:11:34 -07:00
Dave Bartolomeo
f5e320e988 Merge from master 2019-10-25 13:24:19 -07:00
Dave Bartolomeo
56cbd0c152 C++/C#: Make AliasedUse access only non-local memory
The `AliasedUse` instruction is supposed to represent future uses of aliased memory after the function returns. Since local variables from that function are no longer allocated after the function returns, the `AliasedUse` instruction should access only the set of aliased locations that does not include locals from the current stack frame.
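
A minimal C++ sketch of the situation described above, assuming a hypothetical helper `fill` and function `usesAliasedLocal` (neither name comes from the repository). The local is aliased because its address escapes, but its storage ends when the function returns, so a final store to it cannot be observed by any later use of aliased memory.

```cpp
// Hypothetical helper: writes through whatever pointer it is handed.
void fill(int *out) { *out = 7; }

int usesAliasedLocal() {
    int local = 0;
    fill(&local);        // `local` is aliased: its address escapes to `fill`
    int result = local;  // last read of `local`
    local = 99;          // no later read; `local` is deallocated on return,
                         // so nothing after the function can observe this store
    return result;
}
```

Under the commit's reasoning, the store of `99` is not kept alive by `AliasedUse`, whereas a store through a pointer to non-local memory would be.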
2019-10-25 13:10:39 -07:00
Jonas Jensen
22de0efc58 Merge pull request #2008 from dave-bartolomeo/dave/IRType2
C++: Implement language-neutral IR type system
2019-10-25 09:42:23 +02:00
Dave Bartolomeo
1223388ab6 C++: Fix test expectations 2019-10-24 13:54:21 -07:00
Dave Bartolomeo
d03a4f86e5 C++/C#: Add AliasedUse instruction to all functions
This new instruction is the dual of the existing `AliasedDefinition` instruction. Whereas that instruction defines the contents of aliased memory before the function was called, `AliasedUse` represents the potential use of all aliased memory after the function returns. This ensures that writes to aliased memory do not appear "dead", even if there are no further reads from aliased memory within the function itself.
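
As an illustration only, here is the kind of source code the message has in mind; `g_total`, `recordTotal`, and `readBack` are hypothetical names. The store in `recordTotal` has no later read inside that function, so an analysis that only looks within the function could treat it as dead, even though the caller observes it.

```cpp
int g_total = 0;  // non-local memory that outlives any single call

void recordTotal(int *dest, int value) {
    // This write is never read again inside recordTotal itself.
    // It is still observable after the function returns.
    *dest = value;
}

int readBack() {
    recordTotal(&g_total, 42);
    return g_total;  // the caller reads the value written above
}
```

Modeling a use of all aliased memory at function exit is one way to keep such writes from looking dead.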
2019-10-23 11:59:05 -07:00
Robert Marsh
9f0499cce9 Merge pull request #2063 from jbj/dataflow-ref-parameter
C++: Data flow through reference parameters
2019-10-22 09:40:15 -07:00
Dave Bartolomeo
63038896f4 C++: Accept test output after changes 2019-10-21 17:06:32 -07:00
Dave Bartolomeo
2cd694756b C++: Remove mistakenly-added file 2019-10-21 15:58:38 -07:00
Dave Bartolomeo
7241c1aae6 C++/C#: More sanity checks for IRType 2019-10-21 14:22:46 -07:00
Dave Bartolomeo
71a6b5dffe C++/C#: Fix some duplicate IRType problems, and add a sanity test 2019-10-21 10:46:30 -07:00
Robert Marsh
e57fef093b C++: accept syntax-zoo changes 2019-10-18 10:08:53 -07:00
Robert Marsh
b29f88450b C++: buffer read side effects on unmodeled funcs 2019-10-17 12:10:23 -07:00
Robert Marsh
30d7238921 C++: fix missing getPrimaryInstruction 2019-10-16 17:05:37 -07:00
Robert Marsh
fffe3c2432 C++: add sanity test for side effect primaries 2019-10-16 16:53:55 -07:00
Dave Bartolomeo
6e61b1dcd0 C++: Fix up after merge from master
The one interesting piece that needed to be fixed up was the type of an `Indirect[Read|Write]SideEffect` operand/result. If the parameter type is a pointer or reference to an incomplete type, we need to set the type of the side effect memory access to `Unknown`, because we don't model incomplete types in the IR type system.

I also added minimal support for `__assume` (generated as a `NoOp`), because lack of `__assume` support got in the way of debugging the other issue above.
2019-10-16 15:55:56 -07:00
Dave Bartolomeo
167d2289c4 Merge from master 2019-10-16 10:10:10 -07:00
Geoffrey White
6f96d1759f Merge pull request #2077 from jbj/cfg-enable-pr
C++: enable the QL-based CFG code
2019-10-16 14:06:22 +01:00
Matthew Gretton-Dann
692c29d095 C++: Test fun_decl for INVALID_KEYs 2019-10-15 14:47:32 +01:00
Nick Rolfe
6c83c76268 C++: add a test for __builtin_complex 2019-10-14 11:31:59 +01:00
Geoffrey White
1c0fdef0a8 CPP: Add a simplified test case for ImplicitThisFieldAccess. 2019-10-10 10:04:32 +01:00
Geoffrey White
bc4363bc22 CPP: Add a test of FunctionAccess and cases for FieldAccess. 2019-10-10 10:04:31 +01:00
Jonas Jensen
5d7a0b8dd5 Merge remote-tracking branch 'upstream/master' into dataflow-ref-parameter
I've accepted the new test output, which shows that this branch fixes
two false negatives in the test cases from #2088.
2019-10-08 13:09:20 +02:00
Jonas Jensen
19f642fc8d Merge commit '7434702' into dataflow-ref-parameter
This merges #1735 into this branch to resolve the semantic merge
conflicts between them.
2019-10-08 12:55:47 +02:00
Geoffrey White
050d99fa87 CPP: Add test cases. 2019-10-04 17:44:27 +01:00
Jonas Jensen
01a3a037bc C++: Make complex_numbers/expr.ql less brittle
This test used `getAQlClass`, which caused it to break when new classes
were added anywhere in the libraries. That's now avoided by switching to
`getCanonicalQLClass`. It turns out that `getCanonicalQLClass` didn't
support arithmetic expressions on complex numbers, so that support had
to be added.
2019-10-03 13:19:16 +02:00
Jonas Jensen
2eed38e2d4 C++: Accept slight CFG regression in static init
Hopefully it does not make a difference in practice whether
uninstantiated template functions are considered to have control flow
through initializers of their static variables.
2019-10-03 11:48:03 +02:00
Jonas Jensen
7c319efb8b C++: Data flow through reference parameters 2019-10-01 10:43:49 +02:00
Robert Marsh
a45a6e48f8 C++: remove side effect operands from non-reads 2019-09-30 12:00:55 -07:00
Robert Marsh
8649978a43 C++: add indexes for specific side effects 2019-09-30 12:00:53 -07:00
Robert Marsh
24574be007 C++: add SizedBuffer side effect instructions 2019-09-30 12:00:53 -07:00
Robert Marsh
3d562243e4 C++: add side effects for outparams 2019-09-30 12:00:52 -07:00
Dave Bartolomeo
043e5f716b C++, C#: Autoformat 2019-09-29 22:39:09 -07:00
Matthew Gretton-Dann
cc016d583d C++: Add further vector_size attribute tests 2019-09-27 11:28:31 +01:00
Matthew Gretton-Dann
c10ed5e114 C++: Update results for vector_size attr changes 2019-09-27 11:28:31 +01:00
Dave Bartolomeo
9b8b364c8f Merge from master 2019-09-26 22:15:02 -07:00
Geoffrey White
18b28b1b57 Merge pull request #1959 from jbj/const-pmf
C++: Classify more expressions as constant
2019-09-26 17:13:27 +01:00
semmle-qlci
24240177c5 Merge pull request #2023 from ian-semmle/agglit
Approved by jbj
2019-09-25 11:35:33 +01:00
Jonas Jensen
0aafa0b0e2 C++: Accept test changes in IR sanity queries
These look harmless.
2019-09-25 08:55:55 +02:00
Ian Lynagh
49276e09c5 C++: Add aggregate literals to sideEffects test 2019-09-24 11:28:57 +01:00
Dave Bartolomeo
300e580874 C++: Implement language-neutral IR type system
The C++ IR currently has a very clunky way of specifying the type of an IR entity (`Instruction`, `Operand`, `IRVariable`, etc.). There are three separate predicates: `getType()`, `isGLValue()`, and `getSize()`. All three are necessary, rather than just having a `getType()` predicate, because some IR entities have types that are not represented via an existing `Type` object in the AST. Examples include the type for an lvalue returned from a `VariableAddress` instruction, the type for an array slice being zero-initialized in a variable initializer, and several others. It is very easy for QL code to just check the `getType()` predicate, while forgetting to use `isGLValue()` to determine if that type is the actual type of the entity (the prvalue case) or the type referred to by a glvalue entity. Furthermore, the C++ type system creates potentially many different `Type` objects for the same underlying type (e.g. typedefs, using declarations, `const`/`volatile` qualifiers, etc.), making it more difficult to tell when two entities have semantically equivalent types.
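
The glvalue/prvalue pitfall described above has a direct analogue at the C++ source level, sketched here with a hypothetical variable `counter` and a hypothetical macro `IS_GLVALUE`. Both expressions below have the underlying type `int`, so looking at the type alone cannot tell them apart; only the value category does.

```cpp
#include <type_traits>

int counter = 0;

// decltype((expr)) yields T& for an lvalue, T&& for an xvalue, and T for a prvalue,
// so a reference type here means the expression is a glvalue.
#define IS_GLVALUE(expr) (std::is_reference<decltype((expr))>::value)

// `counter` names an object: a glvalue whose referred-to type is int.
static_assert(std::is_same<decltype((counter)), int &>::value,
              "counter is an lvalue of type int");

// `counter + 1` computes a temporary value: a prvalue of type int.
static_assert(std::is_same<decltype((counter + 1)), int>::value,
              "counter + 1 is a prvalue of type int");

// Stripping the reference leaves the same type in both cases.
static_assert(std::is_same<std::remove_reference<decltype((counter))>::type,
                           decltype((counter + 1))>::value,
              "same underlying type, different value category");
```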

In addition, other languages for which we want to enable the IR have somewhat different type systems. The various language type systems differ in their structure, although they tend to share the basic building blocks necessary for the IR.

To address all of the above problems, I've introduced a new class hierarchy, rooted at the class `IRType`, that represents a bare-bones type system that is independent of source language (at least across C/C++/C#/Java). A type's identity is based on its kind (signed integer, unsigned integer, floating-point, Boolean, blob, etc.), its size, and, in the case of blob types, a "tag" to differentiate between different classes and structs. No distinction is made between, say, `signed int` and plain `int`, or between different language integer types that have the same signedness and size (e.g. `unsigned int` vs. `wchar_t` on Linux). `IRType` is intended for use by language-agnostic IR-based analyses, including range analysis, dataflow, SSA construction, and alias analysis. The set of available `IRType`s is determined by predicates provided by the language library implementation (e.g. `hasSignedIntegerType(int byteSize)`).
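
The identity rule above (kind, size, and a tag for blobs) can be sketched in C++ as follows. This is a rough model for illustration, not the QL library's actual definitions: `IRTypeSketch`, `IRTypeKind`, `irTypeOfInteger`, and `blobType` are all hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>

// Hypothetical kinds, mirroring the list in the commit message.
enum class IRTypeKind { SignedInteger, UnsignedInteger, FloatingPoint, Boolean, Blob };

// Hypothetical value type: two sketches are equal exactly when
// kind, size, and tag all match.
struct IRTypeSketch {
    IRTypeKind kind;
    std::size_t byteSize;
    std::string tag;  // only meaningful for Blob; empty otherwise

    bool operator==(const IRTypeSketch &other) const {
        return std::tie(kind, byteSize, tag) ==
               std::tie(other.kind, other.byteSize, other.tag);
    }
};

// Assumed mapping from a C++ integer type to its language-neutral sketch:
// only signedness and size survive.
template <typename T>
IRTypeSketch irTypeOfInteger() {
    static_assert(std::is_integral<T>::value, "integer types only");
    return IRTypeSketch{std::is_signed<T>::value ? IRTypeKind::SignedInteger
                                                 : IRTypeKind::UnsignedInteger,
                        sizeof(T), std::string()};
}

// Blob types of the same size are told apart by their tag.
IRTypeSketch blobType(std::size_t byteSize, const std::string &tag) {
    return IRTypeSketch{IRTypeKind::Blob, byteSize, tag};
}
```

In this model, `irTypeOfInteger<int>()` and `irTypeOfInteger<signed int>()` compare equal, as do any two integer types that happen to share signedness and size on the target platform, while two blobs of equal size with different tags remain distinct.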

In addition to `IRType`, each language now defines a type alias named `LanguageType`, representing the type of an IR entity in more language-specific terms. The only predicate required on `LanguageType` is `getIRType()`, which returns the single `IRType` object for the language-neutral representation of that `LanguageType`. All other predicates on and subclasses of `LanguageType` are language-specific. There may be many instances of `LanguageType` that map to a given `IRType`, to allow for typedefs, etc.

Most of the changes are mechanical changes in the IR construction code, to return the correct type for each IR entity. SSA construction has also been updated to avoid dependencies on language-specific types.

I have not yet removed the original `getType()` predicates that just return `Type`. These can be removed once we move the remaining existing libraries to use `IRType`.

Test results are, by design, pretty much unchanged. One case changed for inline asm, because the previous IR generation for it played a little fast and loose with the input/output expressions. The test case now includes both input and output variables. The generated IR for `Conditional_LValue` is now more correct, because we now have a way to represent an lvalue of an lvalue. `syntax-zoo` is still a hot mess. Most of the changed outputs are due to wobble from having multiple functions with the same name, but with a slightly different order of evaluation due to the type changes. Others are wobble from already-invalid IR. A couple of non-wobbly places have improved slightly, though.

The C# part of this change is waiting for #2005 to be merged, since that has some of the necessary C# implementation.
2019-09-23 16:14:00 -07:00
Matthew Gretton-Dann
6b28f33713 C++: Update test for fix to namespace members
Generation of IDs for namespace members has been fixed to generate
unique IDs for variables of the same name but in different namespaces.

Update the same_name test to validate this.
2019-09-23 16:04:59 +01:00
Matthew Gretton-Dann
fc75a6af5a C++: Add tests for using aliases 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
9ff38ebeee C++: Update tests for new CTypedefType. 2019-09-23 13:57:50 +01:00
Robert Marsh
fd88f7a3ce Merge pull request #1884 from jbj/dataflow-addressof
C++: Data flow through address-of operator (&)
2019-09-19 09:15:43 -07:00