Commit Graph

1683 Commits

Author SHA1 Message Date
Robert Marsh
b60e7c204d C++: autoformat and accept test output 2019-10-07 14:07:25 -07:00
Robert Marsh
057c634fe4 C++: fix identical chi node operands 2019-10-04 13:05:47 -07:00
Robert Marsh
17e14348d5 C++: sanity test for identical Chi node operands 2019-10-04 12:57:30 -07:00
Robert Marsh
3377f88494 C++: generate Chi nodes on total IndirectMayWrites 2019-10-04 11:59:22 -07:00
Robert Marsh
5f8a3054d1 C++: add UninitializedInstructions for direct init 2019-10-04 11:34:14 -07:00
Geoffrey White
050d99fa87 CPP: Add test cases. 2019-10-04 17:44:27 +01:00
Robert Marsh
bc973973df C++: accept test changes 2019-10-03 14:43:54 -07:00
Robert Marsh
a76c4d9b3b C++: index for constructor qualifier side effects 2019-10-03 12:39:32 -07:00
Robert Marsh
47b9c497fa C++: IR SSA tests for explicit constructor calls 2019-10-03 12:25:41 -07:00
Jonas Jensen
01a3a037bc C++: Make complex_numbers/expr.ql less brittle
This test used `getAQlClass`, which caused it to break when new classes
were added anywhere in the libraries. That's now avoided by switching to
`getCanonicalQLClass`. It turns out that `getCanonicalQLClass` didn't
support arithmetic expressions on complex numbers, so that support had
to be added.
2019-10-03 13:19:16 +02:00
Jonas Jensen
2eed38e2d4 C++: Accept slight CFG regression in static init
Hopefully it does not make a difference in practice whether
uninstantiated template functions are considered to have control flow
through initializers of their static variables.
2019-10-03 11:48:03 +02:00
Robert Marsh
53f522c7f6 C++: respond to PR comments and autoformat 2019-10-02 10:11:58 -07:00
Robert Marsh
bace8c723d C++: side effect instrs for constructor qualifiers
This adds IndirectMustWriteSideEffects for constructor qualifiers. The
introduced sanity failures result from constructor calls without qualifier
operands in the IR
2019-10-01 14:53:37 -07:00
Jonas Jensen
7c319efb8b C++: Data flow through reference parameters 2019-10-01 10:43:49 +02:00
Robert Marsh
a45a6e48f8 C++: remove side effect operands from non-reads 2019-09-30 12:00:55 -07:00
Robert Marsh
8649978a43 C++: add indexes for specific side effects 2019-09-30 12:00:53 -07:00
Robert Marsh
24574be007 C++: add SizedBuffer side effect instructions 2019-09-30 12:00:53 -07:00
Robert Marsh
3d562243e4 C++: add side effects for outparams 2019-09-30 12:00:52 -07:00
Dave Bartolomeo
043e5f716b C++, C#: Autoformat 2019-09-29 22:39:09 -07:00
Matthew Gretton-Dann
cc016d583d C++: Add further vector_size attribute tests 2019-09-27 11:28:31 +01:00
Matthew Gretton-Dann
c10ed5e114 C++: Update results for vector_size atrr changes 2019-09-27 11:28:31 +01:00
Dave Bartolomeo
9b8b364c8f Merge from master 2019-09-26 22:15:02 -07:00
Geoffrey White
18b28b1b57 Merge pull request #1959 from jbj/const-pmf
C++: Classify more expressions as constant
2019-09-26 17:13:27 +01:00
semmle-qlci
24240177c5 Merge pull request #2023 from ian-semmle/agglit
Approved by jbj
2019-09-25 11:35:33 +01:00
Jonas Jensen
0aafa0b0e2 C++: Accept test changes in IR sanity queries
These looks harmless.
2019-09-25 08:55:55 +02:00
Jonas Jensen
b75bf06649 C++: Accept test changes in other IR tests 2019-09-24 13:00:21 +02:00
Ian Lynagh
49276e09c5 C++: Add aggregate literals to sideEffects test 2019-09-24 11:28:57 +01:00
Dave Bartolomeo
300e580874 C++: Implement language-neutral IR type system
The C++ IR currently has a very clunky way of specifying the type of an IR entity (`Instruction`, `Operand`, `IRVariable`, etc.). There are three separate predicates: `getType()`, `isGLValue()`, and `getSize()`. All three are necessary, rather than just having a `getType()` predicate, because some IR entities have types that are not represented via an existing `Type` object in the AST. Examples include the type for an lvalue returned from a `VariableAddress` instruction, the type for an array slice being zero-initialized in a variable initializer, and several others. It is very easy for QL code to just check the `getType()` predicate, while forgetting to use `isGLValue()` to determine if that type is the actual type of the entity (the prvalue case) or the type referred to by a glvalue entity. Furthermore, the C++ type system creates potentially many different `Type` objects for the same underlying type (e.g. typedefs, using declarations, `const`/`volatile` qualifiers, etc.), making it more difficult to tell when two entities have semantically equivalent types.

In addition, other languages for which we want to enable the IR have somewhat different type systems. The various language type systems differ in their structure, although they tend to share the basic building blocks necessary for the IR.

To address all of the above problems, I've introduced a new class hierarchy, rooted at the class `IRType`, that represents a bare-bones type system that is independent of source language (at least across C/C++/C#/Java). A type's identity is based on its kind (signed integer, unsigned integer, floating-point, Boolean, blob, etc.), size and in the case of blob types, a "tag" to differentiate between different classes and structs. No distinction is made between, say `signed int` and plain `int`, or between different language integer types that have the same signedness and size (e.g. `unsigned int` vs. `wchar_t` on Linux). `IRType` is intended for use by language-agnostic IR-based analyses, including range analysis, dataflow, SSA construction, and alias analysis. The set of available `IRType`s is determined by predicate provided by the language library implementation (e.g. `hasSignedIntegerType(int byteSize)`.

In addition to `IRType`, each language now defines a type alias named `LanguageType`, representing the type of an IR entity in more language-specific terms. The only predicate requried on `LanguageType` is `getIRType()`, which returns the single `IRType` object for the language-neutral representation of that `LanguageType`. All other predicates on and subclasses of `LanguageType` are language-specific. There may be many instances of `LanguageType` that map to a given `IRType`, to allow for typedefs, etc.

Most of the changes are mechanical changes in the IR construction code, to return the correct type for each IR entity. SSA construction has also been updated to avoid dependencies on language-specific types.

I have not yet removed the original `getType()` predicates that just return `Type`. These can be removed once we move the remaining existing libraries to use `IRType`.

Test results are, by design, pretty much unchanged. Once case changed for inline asm, because the previously IR generation for it played a little fast and loose with the input/output expressions. The test case now includes both input and output variables. The generated IR for `Conditional_LValue` is now more correct, because we now have a way to represent an lvalue of an lvalue. `syntax-zoo` is still a hot mess. Most of the changed outputs are due to wobble from having multiple functions with the same name, but with a slightly different order of evaluation due to the type changes. Others are wobble from already-invalid IR. A couple non-wobbly places have improved slightly, though.

The C# part of this change is waiting for #2005 to be merged, since that has some of the necessary C# implementation.
2019-09-23 16:14:00 -07:00
Matthew Gretton-Dann
6b28f33713 C++: Update test for fix to namespace members
Generation of IDs for namespace members has been fixed to generate
unique IDs for variables of the same name but in different namespaces.

Update the same_name test to validate this.
2019-09-23 16:04:59 +01:00
Jonas Jensen
cd5f3b84a8 C++: Make sure there's a Instruction for each Expr
This change ensures that all `Expr`s (except parentheses) have a
`TranslatedExpr` with a `getResult` that's one of its own instructions,
not an instruction from one of its operands. This means that when we
translate back and forth between `Expr` and `Instruction`, like in
`DataFlow::exprNode`, we will not conflate `e` with `&e` or `... = e`.
2019-09-23 15:23:31 +02:00
Matthew Gretton-Dann
fc75a6af5a C++: Add tests for using aliases 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
9ff38ebeee C++: Update tests for new CTypedefType. 2019-09-23 13:57:50 +01:00
Robert Marsh
fd88f7a3ce Merge pull request #1884 from jbj/dataflow-addressof
C++: Data flow through address-of operator (&)
2019-09-19 09:15:43 -07:00
Jonas Jensen
307b92feed C++: Unknown template literals are constant 2019-09-19 10:23:26 +02:00
Jonas Jensen
e0d1da3b67 C++: Test for template enum constant CFG 2019-09-18 15:17:24 +02:00
Jonas Jensen
7d8396fa65 C++: Constant template pointer-to-member literals 2019-09-18 14:44:25 +02:00
Jonas Jensen
d644150ead C++: Test for template pointer-to-member CFG 2019-09-18 14:30:18 +02:00
Jonas Jensen
0f2731064d C++: Annotate tellDifferent with template status
This is helpful for turning real-world cases into test cases.
2019-09-18 14:23:52 +02:00
Jonas Jensen
c90fd32a78 C++: Pointer-to-member-function is constant 2019-09-18 13:55:56 +02:00
Jonas Jensen
55edfe4224 C++: Test for pointer-to-member-function CFG 2019-09-18 13:37:52 +02:00
Jonas Jensen
e7d8fa4251 Merge pull request #1945 from geoffw0/more-tests
CPP: Add a test of ConditionalDeclExpr.
2019-09-18 11:11:16 +02:00
Geoffrey White
07e29bb627 CPP: Add a test of ConditionalDeclExpr. 2019-09-17 17:38:54 +01:00
Jonas Jensen
b2df18ab78 C++: Document tests better
This addresses PR comments by @rdmarsh2.
2019-09-17 13:17:25 +02:00
Jonas Jensen
ef601cf78e C++: Annotate changes in struct_init.c test 2019-09-17 13:16:36 +02:00
Jonas Jensen
fd6d06fe6f C++: Data flow through address-of operator (&)
The data flow library conflates pointers and their objects in some
places but not others. For example, a member function call `x.f()` will
cause flow from `x` of type `T` to `this` of type `T*` inside `f`. It
might be ideal to avoid that conflation, but that's not realistic
without using the IR.

We've had good experience in the taint tracking library with conflating
pointers and objects, and it improves results for field flow, so perhaps
it's time to try it out for all data flow.
2019-09-17 13:16:34 +02:00
Dave Bartolomeo
553238a9e8 Merge pull request #1922 from jbj/qlcfg-const-pointer-to-member
C++: Add PointerToFieldLiteral class
2019-09-13 10:44:52 -07:00
Jonas Jensen
7cfbe88e7b C++: IR DataFlow::Node.toString consistency
The `toString` for IR data-flow nodes are now similar to AST data-flow
nodes. This should make it easier to use the IR as a drop-in replacement
in the future. There are still differences because the IR data flow
library takes conversions into account.

I did not attempt to align the new nodes we use for field flow. That can
come later, when we add field flow to IR data flow.
2019-09-13 14:33:31 +02:00
Jonas Jensen
562bffe710 C++: Simplify toString of ImplicitParameterNode
This string looked out of place compared to `ExplicitParameterNode`,
whose string is simply the name of the parameter and therefore
indistinguishable from an access to the parameter without looking at the
location also. This has not been a problem so far, and if we want to
distinguish more clearly between initial values and accesses at some
point, we should do it for `ExplicitParameterNode` and
`UninitializedNode` too.
2019-09-13 14:33:26 +02:00
Tom Hvitved
f5cae9b6ea Merge pull request #1881 from aschackmull/java/pathgraph-nodes
Java/C++/C#: Add nodes predicate to PathGraph.
2019-09-13 10:32:47 +02:00
Anders Schack-Mulligen
61e4e61087 C++: Adjust qltest expected output. 2019-09-12 11:00:49 +02:00