codeql

mirror of https://github.com/github/codeql.git synced 2026-04-02 13:48:20 +02:00

Author	SHA1	Message	Date
Dave Bartolomeo	cc5a689293	C++/C#: Fix up after merge from master	2019-10-25 14:11:34 -07:00
Jonas Jensen	22de0efc58	Merge pull request #2008 from dave-bartolomeo/dave/IRType2 C++: Implement language-neutral IR type system	2019-10-25 09:42:23 +02:00
Dave Bartolomeo	2cd694756b	C++: Remove mistakenly-added file	2019-10-21 15:58:38 -07:00
Dave Bartolomeo	7241c1aae6	C++/C#: More sanity checks for `IRType`	2019-10-21 14:22:46 -07:00
Dave Bartolomeo	71a6b5dffe	C++/C#: Fix some duplicate IRType problems, and add a sanity test	2019-10-21 10:46:30 -07:00
Robert Marsh	30d7238921	C++: fix missing getPrimaryInstruction	2019-10-16 17:05:37 -07:00
Robert Marsh	fffe3c2432	C++: add sanity test for side effect primaries	2019-10-16 16:53:55 -07:00
Dave Bartolomeo	167d2289c4	Merge from master	2019-10-16 10:10:10 -07:00
Robert Marsh	a45a6e48f8	C++: remove side effect operands from non-reads	2019-09-30 12:00:55 -07:00
Robert Marsh	8649978a43	C++: add indexes for specific side effects	2019-09-30 12:00:53 -07:00
Robert Marsh	24574be007	C++: add SizedBuffer side effect instructions	2019-09-30 12:00:53 -07:00
Robert Marsh	3d562243e4	C++: add side effects for outparams	2019-09-30 12:00:52 -07:00
Dave Bartolomeo	043e5f716b	C++, C#: Autoformat	2019-09-29 22:39:09 -07:00
Matthew Gretton-Dann	c10ed5e114	C++: Update results for vector_size attr changes	2019-09-27 11:28:31 +01:00
Dave Bartolomeo	300e580874	C++: Implement language-neutral IR type system The C++ IR currently has a very clunky way of specifying the type of an IR entity (`Instruction`, `Operand`, `IRVariable`, etc.). There are three separate predicates: `getType()`, `isGLValue()`, and `getSize()`. All three are necessary, rather than just having a `getType()` predicate, because some IR entities have types that are not represented via an existing `Type` object in the AST. Examples include the type for an lvalue returned from a `VariableAddress` instruction, the type for an array slice being zero-initialized in a variable initializer, and several others. It is very easy for QL code to just check the `getType()` predicate, while forgetting to use `isGLValue()` to determine if that type is the actual type of the entity (the prvalue case) or the type referred to by a glvalue entity. Furthermore, the C++ type system creates potentially many different `Type` objects for the same underlying type (e.g. typedefs, using declarations, `const`/`volatile` qualifiers, etc.), making it more difficult to tell when two entities have semantically equivalent types. In addition, other languages for which we want to enable the IR have somewhat different type systems. The various language type systems differ in their structure, although they tend to share the basic building blocks necessary for the IR. To address all of the above problems, I've introduced a new class hierarchy, rooted at the class `IRType`, that represents a bare-bones type system that is independent of source language (at least across C/C++/C#/Java). A type's identity is based on its kind (signed integer, unsigned integer, floating-point, Boolean, blob, etc.), size and in the case of blob types, a "tag" to differentiate between different classes and structs. No distinction is made between, say `signed int` and plain `int`, or between different language integer types that have the same signedness and size (e.g. `unsigned int` vs. 
`wchar_t` on Linux). `IRType` is intended for use by language-agnostic IR-based analyses, including range analysis, dataflow, SSA construction, and alias analysis. The set of available `IRType`s is determined by predicates provided by the language library implementation (e.g. `hasSignedIntegerType(int byteSize)`). In addition to `IRType`, each language now defines a type alias named `LanguageType`, representing the type of an IR entity in more language-specific terms. The only predicate required on `LanguageType` is `getIRType()`, which returns the single `IRType` object for the language-neutral representation of that `LanguageType`. All other predicates on and subclasses of `LanguageType` are language-specific. There may be many instances of `LanguageType` that map to a given `IRType`, to allow for typedefs, etc. Most of the changes are mechanical changes in the IR construction code, to return the correct type for each IR entity. SSA construction has also been updated to avoid dependencies on language-specific types. I have not yet removed the original `getType()` predicates that just return `Type`. These can be removed once we move the remaining existing libraries to use `IRType`. Test results are, by design, pretty much unchanged. One case changed for inline asm, because the previous IR generation for it played a little fast and loose with the input/output expressions. The test case now includes both input and output variables. The generated IR for `Conditional_LValue` is now more correct, because we now have a way to represent an lvalue of an lvalue. `syntax-zoo` is still a hot mess. Most of the changed outputs are due to wobble from having multiple functions with the same name, but with a slightly different order of evaluation due to the type changes. Others are wobble from already-invalid IR. A couple non-wobbly places have improved slightly, though.
The C# part of this change is waiting for #2005 to be merged, since that has some of the necessary C# implementation.	2019-09-23 16:14:00 -07:00
Matthew Gretton-Dann	9ff38ebeee	C++: Update tests for new CTypedefType.	2019-09-23 13:57:50 +01:00
Jonas Jensen	4ef5c9af62	C++: Autoformat everything Some files that will change in #1736 have been spared. `./build -j4 target/jars/qlformat` `find ql/cpp/ql -name "*.ql" -print0 | xargs -0 target/jars/qlformat --input` `find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input` `(cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/*/SSA*.qll')` `buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest`	2019-09-09 11:25:53 +02:00
Dave Bartolomeo	a84a7e8c8a	C++: Fixup after rebase	2019-08-22 11:36:15 -07:00
Dave Bartolomeo	3108d97ea5	C++: Minimal IR support for `GNUVectorType` Lack of support for the GCC vector extensions was causing a bunch of sanity failures in the syntax zoo. This PR adds minimal IR generation support for these types. Added `VectorAggregateLiteral`, and factored most of `ArrayAggregateLiteral` out into the common base class `ArrayOrVectorAggregateLiteral`. I'd be happy to merge these all into `ArrayAggregateLiteral` if we don't care about the distinction. Made a few tweaks to `TranslatedArrayExpr` to compute the element type by looking at the result type of the `ArrayExpr`, not the type of the base operand. Note that this means that for `T a[10]; a[i] = foo;`, the result of the `PointerAdd` for `a[i]` will now be `glvalue<T>`, not `T*`. This is actually more faithful to the source language, and has no semantic difference on the IR. Added some missing `getInstructionElementSize()` overrides. Added the new `BuiltIn` opcode, renamed the existing `BuiltInInstruction` to `BuiltInOperationInstruction`, and made any `BuiltInOperation` that we don't specifically handle translate to `BuiltIn`. `BuiltInOperationInstruction` now has a way to get the specific `BuiltInOperation`. Added `getCanonicalQLClass()` overrides for `GNUVectorType` and `BuiltInOperation`. Added a simple IR test for vector types.	2019-08-22 10:43:30 -07:00
zlaski-semmle	ce71b45649	Zlaski/cpp386a (#1753) * [CPP-386] Cumulative patch. * Restore dataflow libraries clobbered by my last commit.	2019-08-19 10:03:18 +02:00
Jonas Jensen	d378da33e8	C++ IR: Fix performance of large array value init There were two problems here. 1. The inline predicates `isInitialized` and `isValueInitialized` on `ArrayAggregateLiteral` caused their callers to materialize every `int` that was a valid index into the array. This was slow on huge value-initialized arrays. 2. The `isInitialized` predicate was used in the `TInstructionTag` IPA type, creating a numbered tuple for each integer in it. This seemed to be entirely unnecessary since the `TranslatedElement`s using those tags were already indexed appropriately.	2019-08-06 14:50:57 +02:00
Dave Bartolomeo	6370391dbd	C++: Add sanity test for definitions that don't dominate their uses.	2019-08-01 15:01:42 -07:00
Dave Bartolomeo	912679ef8c	C++: Two IR fixes My original fix in https://github.com/Semmle/ql/pull/1661 fixed my minimal test case, but did not fix the original failure in a Linux snapshot. The real fix is to simply not create a `TranslatedDeclarationEntry` for an extern declaration, and have `TranslatedDeclStmt` skip any such declarations. I've added a regression test for that case (multiple extern declarations with the same location in a macro expansion, with control flow between them). I did verify that it generates correct IR, and that it fixes all of the "use not dominated by definition" failures in Linux. The underlying extractor bug that caused the above issue also caused PrintAST to print garbage. I've worked around the bug in PrintAST.qll. I've also fixed a bug in the control flow for `try`/`catch`, where there was missing flow from the `CatchByType` of the last handler of a `try` to the enclosing handler (or `Unwind`). Hat tip to @AndreiDiaconu1 for spotting this bug.	2019-08-01 14:38:19 -07:00
Dave Bartolomeo	972f0d97d3	C++: Stop generating `NoOp` instructions for declarations of externs Previously, where we had a function-scoped `DeclarationEntry` for an extern variable or function, we would generate a `NoOp` instruction for it. There's nothing wrong with this by itself, although it was unnecessary. However, I've hit an extractor issue (Jira ticket already opened) that commonly causes multiple `DeclStmt`s to share a single `DeclarationEntry` child on extern declarations, so removing the `NoOp` instructions is an easy way to work around the extractor issue.	2019-07-30 16:49:24 -07:00
Ziemowit Laski	a0570213d7	[CPP-386] Separate printing of casts and conversion, per Dave's request.	2019-07-19 16:56:22 -07:00
Ziemowit Laski	45d944411f	[CPP-386] Fix Local{Class,Struct,Union}, macro invocations, printing of member functions and operators.	2019-07-18 16:09:04 -07:00
Ziemowit Laski	926742561b	[CPP-340] Eliminate superfluous print-outs of `NestedStruct`, `NestedUnion` and `MemberFunction`	2019-07-17 13:39:43 -07:00
Ziemowit Laski	f0982791e3	[CPP-340] Remove colons and extraneous QLDoc comments; add a few more classes.	2019-07-16 17:58:39 -07:00
Ziemowit Laski	c906560edd	Fix up expected IR output after rebase.	2019-07-13 12:57:25 -07:00
Ziemowit Laski	960a41be85	Handle `__builtin_addressof`.	2019-07-13 12:23:40 -07:00
Ziemowit Laski	175ba7b3b0	Fix up .expected on the IR side.	2019-07-13 12:23:40 -07:00
Ziemowit Laski	e5fc07660d	[CPP-386] Print QL AST classes next to elements in PrintAST trees.	2019-07-13 12:23:09 -07:00
Dave Bartolomeo	00ff2bb6c4	Merge pull request #1554 from jbj/ir-ErrorExpr C++ IR: support for translating ErrorExpr	2019-07-11 13:05:04 -07:00
Jonas Jensen	23001d5471	Merge pull request #1566 from rdmarsh2/rdmarsh/cpp/pure-functions-effect-model C++: alias and side effect info for pure functions	2019-07-11 21:21:54 +02:00
Robert Marsh	c195420ba1	C++: respond to PR comments	2019-07-11 11:00:52 -07:00
Jonas Jensen	0889d5d27a	C++ IR: Improve ErrorExpr test The previous version of the test used `0 = 1;` to test an lvalue-typed `ErrorExpr`, but the extractor replaced the whole assignment expression with `ErrorExpr` instead of just the LHS. This variation of the test only leads to an `ErrorExpr` for the part of the syntax that's supposed to be an lvalue-typed expression, so that's an improvement. Unfortunately it still doesn't demonstrate that we can `Store` into an address computed by an `ErrorExpr`.	2019-07-09 13:35:20 +02:00
Jonas Jensen	4324c97d39	C++: Use Opcode::Error for ErrorExpr translation	2019-07-09 13:26:00 +02:00
Jonas Jensen	a86ddd50de	C++ IR: Translate ErrorExpr to NoOp	2019-07-09 13:18:11 +02:00
Jonas Jensen	e2a43eeed6	C++ IR: Tests with ErrorExpr	2019-07-09 13:18:09 +02:00
Dave Bartolomeo	7bbfffec4d	Merge pull request #1552 from jbj/ir-builtin_addressof C++ IR: Support __builtin_addressof	2019-07-08 17:08:38 -07:00
Robert Marsh	41e4d920e3	C++: alias and side effect info for pure functions	2019-07-08 12:26:58 -07:00
Robert Marsh	ea7602b571	C++: add test for Alias and SideEffect models	2019-07-08 11:41:46 -07:00
Jonas Jensen	4b4e7caf9f	C++ IR: Support __builtin_addressof	2019-07-05 11:05:00 +02:00
Jonas Jensen	6fe9945c04	C++: Placeholder translation of delete expressions Before this change, `delete` and `delete[]` expressions had no control flow after them, which caused the reachability analysis to remove all code after a delete expression. This commit adds placeholder support for delete expression by translating them to `NoOp` instructions so their presence doesn't cause large chunks of the program to be removed.	2019-07-05 10:54:35 +02:00
Robert Marsh	5dd8c9cd4e	C++: revert InlineAsm subclassing SideEffectOpcode	2019-05-31 13:28:26 -07:00
Robert Marsh	2770b2a9b9	C++: respond to PR comments	2019-05-31 13:19:40 -07:00
Robert Marsh	98d6f5919f	C++: Treat asmStmt operands as input/output in IR	2019-05-31 12:51:44 -07:00
Robert Marsh	66d1efdb97	C++: respond to PR comments	2019-05-31 12:42:04 -07:00
Robert Marsh	23560436a7	C++: add minimal AsmStmt support to IR	2019-05-31 12:29:19 -07:00
Dave Bartolomeo	aff85c5b24	C++: IR support for range-based `for` loops IR construction was missing support for C++ 11 range-based `for` loops. The extractor generates ASTs for the compiler-generated implementation already, so I had enough information to generate IR. I've expanded on some of the predicates in `RangeBasedForStmt` to access the desugared information. One complication was that the `DeclStmt`s for the compiler-generated variables seem to have results for `getDeclaration()` but not for `getDeclarationEntry()`. This required handling these slightly differently than we do for other `DeclStmt`s. The flow for range-based `for` is actually easier than for a regular `for`, because all three components (init, condition, and update) are always present.	2019-05-29 14:40:29 -07:00
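
The range-based `for` commit above relies on the compiler-generated desugaring of the loop. As a rough illustration only, the sketch below writes that desugaring out by hand in plain C++. The function names and the local names `range`, `begin`, and `end` are hypothetical, chosen for this example; they are not the names used by the extractor or by the IR.

```cpp
#include <vector>

// Sums a vector using a C++11 range-based `for` loop.
int sum_range_for(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}

// Hand-written approximation of what the compiler generates for the loop
// above. The variable names are illustrative assumptions.
int sum_desugared(const std::vector<int>& v) {
    int total = 0;
    {
        auto&& range = v;            // compiler-generated range variable
        auto begin = range.begin();  // compiler-generated begin iterator
        auto end = range.end();      // compiler-generated end iterator
        // Init (above), condition, and update are all always present,
        // unlike a regular `for`, where each of them is optional.
        for (; begin != end; ++begin) {
            int x = *begin;          // the user-declared loop variable
            total += x;
        }
    }
    return total;
}
```

Both functions compute the same result for any input; the second simply makes the hidden variables and the three loop components explicit.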


139 Commits
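
The `IRType` commit message in the log above (300e580874) describes type identity as a combination of kind, size, and, for blob types, a tag, with many `LanguageType` instances mapping to one `IRType`. The following is a minimal C++ model of that rule, not the QL implementation: `Kind`, `IRTypeKey`, the `LanguageType` struct, and `sameIRType` are hypothetical names for this sketch, and the 4-byte size used for `int` in the usage note is an assumption about the target platform.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical kinds, following the list in the commit message.
enum class Kind { SignedInt, UnsignedInt, FloatingPoint, Boolean, Blob };

// Hypothetical identity key: two keys denote the same language-neutral type
// exactly when kind, size, and tag all match.
struct IRTypeKey {
    Kind kind;
    std::size_t byteSize;
    std::string tag;  // only meaningful for Blob; left empty otherwise

    bool operator==(const IRTypeKey& other) const {
        return std::tie(kind, byteSize, tag) ==
               std::tie(other.kind, other.byteSize, other.tag);
    }
};

// Hypothetical stand-in for a language-specific type. The only operation
// modeled here is the mapping to its single language-neutral key.
struct LanguageType {
    std::string spelling;  // how the type is written in the source language
    IRTypeKey irType;

    const IRTypeKey& getIRType() const { return irType; }
};

// Two language types are equivalent for language-agnostic analyses when
// they map to the same key, regardless of spelling.
bool sameIRType(const LanguageType& a, const LanguageType& b) {
    return a.getIRType() == b.getIRType();
}
```

Under this model, `LanguageType{"int", {Kind::SignedInt, 4, ""}}` and `LanguageType{"signed int", {Kind::SignedInt, 4, ""}}` compare as the same type, while two blob types of equal size with different tags do not.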