Commit Graph

2357 Commits

Author SHA1 Message Date
Jonas Jensen
34b625900a C++: Avoid extends Operation in LeapYear.qll
The `Operation` class is abstract, and extending it caused cached stages
to be recomputed all the way down to the AST. This meant that the leap
year queries evaluated their own copy of SSA and data flow.
2019-10-01 11:50:33 +02:00
Jonas Jensen
7c319efb8b C++: Data flow through reference parameters 2019-10-01 10:43:49 +02:00
Robert Marsh
a45a6e48f8 C++: remove side effect operands from non-reads 2019-09-30 12:00:55 -07:00
Robert Marsh
9f20cb83c3 C++/C#: Autoformat 2019-09-30 12:00:55 -07:00
Robert Marsh
fcfc11052a C++: add QLDoc to side effect functions 2019-09-30 12:00:54 -07:00
Robert Marsh
8649978a43 C++: add indexes for specific side effects 2019-09-30 12:00:53 -07:00
Robert Marsh
24574be007 C++: add SizedBuffer side effect instructions 2019-09-30 12:00:53 -07:00
Robert Marsh
554d6390f7 C++: clean up after rebase 2019-09-30 12:00:53 -07:00
Robert Marsh
49088e7f09 C++: Fix formatting and dropped line 2019-09-30 12:00:53 -07:00
Robert Marsh
3d562243e4 C++: add side effects for outparams 2019-09-30 12:00:52 -07:00
Ziemowit Laski
a0cbd87d1f [zlaski/memset-model] Rename predicate usage as per PR/1938. 2019-09-30 10:47:59 -07:00
Ziemowit Laski
ae169e9c33 [zlaski/memset-model] Add AliasFunction as base class of MemsetFunction; override predicates parameterNeverEscapes, parameterEscapesOnlyViaReturn and parameterIsAlwaysReturned. 2019-09-30 10:44:12 -07:00
Ziemowit Laski
aaa2a60b93 [zlaski/memset-model] Remove taint tracking from Memset.qll. Add Memset.qll to Models.qll. 2019-09-30 10:44:12 -07:00
Ziemowit Laski
144aacb09d [zlaski/memset-model] New Memset.qll file. 2019-09-30 10:44:12 -07:00
Matthew Gretton-Dann
b76f66e83b C++: Add test cases for constant initializers
Adds test cases for initialisation of constants which aren't simple
zeros.  Example: int x = int();
2019-09-30 14:57:26 +01:00
Jonas Jensen
f417640da4 Merge pull request #1938 from dave-bartolomeo/dave/InNOut
C++: Rename predicates in `FunctionInputsAndOutputs.qll` and add QLDoc
2019-09-30 13:30:19 +02:00
Dave Bartolomeo
420713204a C++, C#: Fix typo 2019-09-29 22:44:17 -07:00
Dave Bartolomeo
043e5f716b C++, C#: Autoformat 2019-09-29 22:39:09 -07:00
Dave Bartolomeo
c1e5db0b96 C++ More PR feedback 2019-09-27 17:54:18 -07:00
Dave Bartolomeo
bcd987cdf1 Merge from master and share value numbering 2019-09-27 17:40:43 -07:00
Dave Bartolomeo
f76334c24a C++, C#: Share unaliased SSA files between languages
Most of the C# diffs are from bringing those files in sync with the latest C++ files.
2019-09-27 13:46:42 -07:00
Matthew Gretton-Dann
cc016d583d C++: Add further vector_size attribute tests 2019-09-27 11:28:31 +01:00
Matthew Gretton-Dann
c10ed5e114 C++: Update results for vector_size atrr changes 2019-09-27 11:28:31 +01:00
Dave Bartolomeo
9b8b364c8f Merge from master 2019-09-26 22:15:02 -07:00
Dave Bartolomeo
e30e163081 C#: Implement IRType
This commit implements the language-neutral IR type system for C#. It mostly follows the same pattern as C++, modified to fit the C# type system. All object references, pointers, and lvalues are represented as `IRAddress` types. All structs and generic parameters are implemented as `IRBlobType`. Function addresses get a single `IRFunctionAddressType`.

I had to fix a couple places in the original IR type system where I didn't realize I was still depending on language-specific types. As part of this, `CSharpType` and `CppType` now have a `hasUnspecifiedType()` predicate, which is equivalent to `hasType()`, except that it holds only for the unspecified version of the type. This predicate can go away once we remove the IR's references to the underlying `Type` objects.

All C# IR tests pass without modification, but only because this commit continues to print the name of `IRUnknownType` as `null`, and `IRFunctionAddressType` as `glval<null>`. These will be fixed separately in a subsequent commit in this PR.
2019-09-26 15:47:52 -07:00
Dave Bartolomeo
28aa7dcae2 C++: Fix PR feedback 2019-09-26 13:56:43 -07:00
Geoffrey White
18b28b1b57 Merge pull request #1959 from jbj/const-pmf
C++: Classify more expressions as constant
2019-09-26 17:13:27 +01:00
Anders Schack-Mulligen
f97958296d Java/C++/C#: Sync. 2019-09-26 17:12:08 +02:00
semmle-qlci
24240177c5 Merge pull request #2023 from ian-semmle/agglit
Approved by jbj
2019-09-25 11:35:33 +01:00
Ian Lynagh
142e1cb9fb C++: Implement AggregateLiteral.mayBeImpure() 2019-09-25 10:34:30 +01:00
Jonas Jensen
0aafa0b0e2 C++: Accept test changes in IR sanity queries
These looks harmless.
2019-09-25 08:55:55 +02:00
Ziemowit Laski
a6d619cfe1 [zlaski/what-buffer-function] Rename CustomModels to Models 2019-09-24 18:17:34 -07:00
Ziemowit Laski
7e14e2a950 [zlaski/what-buffer-function] Rename references to BufferFunction to ArrayFunction. 2019-09-24 18:02:14 -07:00
Ian Lynagh
49276e09c5 C++: Add aggregate literals to sideEffects test 2019-09-24 11:28:57 +01:00
Dave Bartolomeo
300e580874 C++: Implement language-neutral IR type system
The C++ IR currently has a very clunky way of specifying the type of an IR entity (`Instruction`, `Operand`, `IRVariable`, etc.). There are three separate predicates: `getType()`, `isGLValue()`, and `getSize()`. All three are necessary, rather than just having a `getType()` predicate, because some IR entities have types that are not represented via an existing `Type` object in the AST. Examples include the type for an lvalue returned from a `VariableAddress` instruction, the type for an array slice being zero-initialized in a variable initializer, and several others. It is very easy for QL code to just check the `getType()` predicate, while forgetting to use `isGLValue()` to determine if that type is the actual type of the entity (the prvalue case) or the type referred to by a glvalue entity. Furthermore, the C++ type system creates potentially many different `Type` objects for the same underlying type (e.g. typedefs, using declarations, `const`/`volatile` qualifiers, etc.), making it more difficult to tell when two entities have semantically equivalent types.

In addition, other languages for which we want to enable the IR have somewhat different type systems. The various language type systems differ in their structure, although they tend to share the basic building blocks necessary for the IR.

To address all of the above problems, I've introduced a new class hierarchy, rooted at the class `IRType`, that represents a bare-bones type system that is independent of source language (at least across C/C++/C#/Java). A type's identity is based on its kind (signed integer, unsigned integer, floating-point, Boolean, blob, etc.), size and in the case of blob types, a "tag" to differentiate between different classes and structs. No distinction is made between, say `signed int` and plain `int`, or between different language integer types that have the same signedness and size (e.g. `unsigned int` vs. `wchar_t` on Linux). `IRType` is intended for use by language-agnostic IR-based analyses, including range analysis, dataflow, SSA construction, and alias analysis. The set of available `IRType`s is determined by predicate provided by the language library implementation (e.g. `hasSignedIntegerType(int byteSize)`.

In addition to `IRType`, each language now defines a type alias named `LanguageType`, representing the type of an IR entity in more language-specific terms. The only predicate requried on `LanguageType` is `getIRType()`, which returns the single `IRType` object for the language-neutral representation of that `LanguageType`. All other predicates on and subclasses of `LanguageType` are language-specific. There may be many instances of `LanguageType` that map to a given `IRType`, to allow for typedefs, etc.

Most of the changes are mechanical changes in the IR construction code, to return the correct type for each IR entity. SSA construction has also been updated to avoid dependencies on language-specific types.

I have not yet removed the original `getType()` predicates that just return `Type`. These can be removed once we move the remaining existing libraries to use `IRType`.

Test results are, by design, pretty much unchanged. Once case changed for inline asm, because the previously IR generation for it played a little fast and loose with the input/output expressions. The test case now includes both input and output variables. The generated IR for `Conditional_LValue` is now more correct, because we now have a way to represent an lvalue of an lvalue. `syntax-zoo` is still a hot mess. Most of the changed outputs are due to wobble from having multiple functions with the same name, but with a slightly different order of evaluation due to the type changes. Others are wobble from already-invalid IR. A couple non-wobbly places have improved slightly, though.

The C# part of this change is waiting for #2005 to be merged, since that has some of the necessary C# implementation.
2019-09-23 16:14:00 -07:00
Matthew Gretton-Dann
6b28f33713 C++: Update test for fix to namespace members
Generation of IDs for namespace members has been fixed to generate
unique IDs for variables of the same name but in different namespaces.

Update the same_name test to validate this.
2019-09-23 16:04:59 +01:00
Jonas Jensen
22e57a6559 Merge pull request #1860 from matt-gretton-dann/add-using-aliases
Add support for using aliases
2019-09-23 16:53:51 +02:00
Jonas Jensen
898976121b Merge pull request #1987 from geoffw0/toomanyformat
CPP: WrongNumberOfFormatArguments.ql Fix
2019-09-23 16:05:11 +02:00
Robert Marsh
90c91a78f8 Merge pull request #1976 from pavgust/fix/hashcons-perf
C++: HashCons: Further performance improvements
2019-09-23 06:37:03 -07:00
Matthew Gretton-Dann
4606587fe8 C++: Apply style guide to TypedefType.qll 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
af3b0d9e73 C++: Update stats. 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
c8dfa46c63 C++: Add upgrade script for using aliases. 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
fc75a6af5a C++: Add tests for using aliases 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
9ff38ebeee C++: Update tests for new CTypedefType. 2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann
5468b8def7 C++: Add support for C++ using aliases
Previously these were identified as typedefs.
2019-09-23 13:57:50 +01:00
Geoffrey White
b3df289a80 CPP: Fix test. 2019-09-23 13:56:24 +01:00
Geoffrey White
2d8e4b3176 CPP: Additional cases resembling the ticket. 2019-09-23 13:04:14 +01:00
Geoffrey White
040bd89163 CPP: Correct expected results. 2019-09-23 11:02:36 +01:00
Geoffrey White
9100ab9360 CPP: Autoformat. 2019-09-20 15:30:59 +01:00
Geoffrey White
f7607313e7 CPP: Fix FPs. 2019-09-20 15:12:55 +01:00