codeql

mirror of https://github.com/github/codeql.git synced 2025-12-17 01:03:14 +01:00

Author	SHA1	Message	Date
Dave Bartolomeo	f76334c24a	C++, C#: Share unaliased SSA files between languages Most of the C# diffs are from bringing those files in sync with the latest C++ files.	2019-09-27 13:46:42 -07:00
Matthew Gretton-Dann	cc016d583d	C++: Add further vector_size attribute tests	2019-09-27 11:28:31 +01:00
Matthew Gretton-Dann	c10ed5e114	C++: Update results for vector_size attr changes	2019-09-27 11:28:31 +01:00
Dave Bartolomeo	9b8b364c8f	Merge from master	2019-09-26 22:15:02 -07:00
Dave Bartolomeo	e30e163081	C#: Implement `IRType` This commit implements the language-neutral IR type system for C#. It mostly follows the same pattern as C++, modified to fit the C# type system. All object references, pointers, and lvalues are represented as `IRAddress` types. All structs and generic parameters are implemented as `IRBlobType`. Function addresses get a single `IRFunctionAddressType`. I had to fix a couple places in the original IR type system where I didn't realize I was still depending on language-specific types. As part of this, `CSharpType` and `CppType` now have a `hasUnspecifiedType()` predicate, which is equivalent to `hasType()`, except that it holds only for the unspecified version of the type. This predicate can go away once we remove the IR's references to the underlying `Type` objects. All C# IR tests pass without modification, but only because this commit continues to print the name of `IRUnknownType` as `null`, and `IRFunctionAddressType` as `glval<null>`. These will be fixed separately in a subsequent commit in this PR.	2019-09-26 15:47:52 -07:00
Dave Bartolomeo	28aa7dcae2	C++: Fix PR feedback	2019-09-26 13:56:43 -07:00
Geoffrey White	18b28b1b57	Merge pull request #1959 from jbj/const-pmf C++: Classify more expressions as constant	2019-09-26 17:13:27 +01:00
Anders Schack-Mulligen	f97958296d	Java/C++/C#: Sync.	2019-09-26 17:12:08 +02:00
semmle-qlci	24240177c5	Merge pull request #2023 from ian-semmle/agglit Approved by jbj	2019-09-25 11:35:33 +01:00
Ian Lynagh	142e1cb9fb	C++: Implement AggregateLiteral.mayBeImpure()	2019-09-25 10:34:30 +01:00
Jonas Jensen	0aafa0b0e2	C++: Accept test changes in IR sanity queries These look harmless.	2019-09-25 08:55:55 +02:00
Ziemowit Laski	a6d619cfe1	[zlaski/what-buffer-function] Rename `CustomModels` to `Models`	2019-09-24 18:17:34 -07:00
Ziemowit Laski	7e14e2a950	[zlaski/what-buffer-function] Rename references to `BufferFunction` to `ArrayFunction`.	2019-09-24 18:02:14 -07:00
Ian Lynagh	49276e09c5	C++: Add aggregate literals to sideEffects test	2019-09-24 11:28:57 +01:00
Dave Bartolomeo	300e580874	C++: Implement language-neutral IR type system The C++ IR currently has a very clunky way of specifying the type of an IR entity (`Instruction`, `Operand`, `IRVariable`, etc.). There are three separate predicates: `getType()`, `isGLValue()`, and `getSize()`. All three are necessary, rather than just having a `getType()` predicate, because some IR entities have types that are not represented via an existing `Type` object in the AST. Examples include the type for an lvalue returned from a `VariableAddress` instruction, the type for an array slice being zero-initialized in a variable initializer, and several others. It is very easy for QL code to just check the `getType()` predicate, while forgetting to use `isGLValue()` to determine if that type is the actual type of the entity (the prvalue case) or the type referred to by a glvalue entity. Furthermore, the C++ type system creates potentially many different `Type` objects for the same underlying type (e.g. typedefs, using declarations, `const`/`volatile` qualifiers, etc.), making it more difficult to tell when two entities have semantically equivalent types. In addition, other languages for which we want to enable the IR have somewhat different type systems. The various language type systems differ in their structure, although they tend to share the basic building blocks necessary for the IR. To address all of the above problems, I've introduced a new class hierarchy, rooted at the class `IRType`, that represents a bare-bones type system that is independent of source language (at least across C/C++/C#/Java). A type's identity is based on its kind (signed integer, unsigned integer, floating-point, Boolean, blob, etc.), size and in the case of blob types, a "tag" to differentiate between different classes and structs. No distinction is made between, say `signed int` and plain `int`, or between different language integer types that have the same signedness and size (e.g. `unsigned int` vs. 
`wchar_t` on Linux). `IRType` is intended for use by language-agnostic IR-based analyses, including range analysis, dataflow, SSA construction, and alias analysis. The set of available `IRType`s is determined by predicates provided by the language library implementation (e.g. `hasSignedIntegerType(int byteSize)`). In addition to `IRType`, each language now defines a type alias named `LanguageType`, representing the type of an IR entity in more language-specific terms. The only predicate required on `LanguageType` is `getIRType()`, which returns the single `IRType` object for the language-neutral representation of that `LanguageType`. All other predicates on and subclasses of `LanguageType` are language-specific. There may be many instances of `LanguageType` that map to a given `IRType`, to allow for typedefs, etc. Most of the changes are mechanical changes in the IR construction code, to return the correct type for each IR entity. SSA construction has also been updated to avoid dependencies on language-specific types. I have not yet removed the original `getType()` predicates that just return `Type`. These can be removed once we move the remaining existing libraries to use `IRType`. Test results are, by design, pretty much unchanged. One case changed for inline asm, because the previous IR generation for it played a little fast and loose with the input/output expressions. The test case now includes both input and output variables. The generated IR for `Conditional_LValue` is now more correct, because we now have a way to represent an lvalue of an lvalue. `syntax-zoo` is still a hot mess. Most of the changed outputs are due to wobble from having multiple functions with the same name, but with a slightly different order of evaluation due to the type changes. Others are wobble from already-invalid IR. A couple of non-wobbly places have improved slightly, though. 
The C# part of this change is waiting for #2005 to be merged, since that has some of the necessary C# implementation.	2019-09-23 16:14:00 -07:00
Matthew Gretton-Dann	6b28f33713	C++: Update test for fix to namespace members Generation of IDs for namespace members has been fixed to generate unique IDs for variables of the same name but in different namespaces. Update the same_name test to validate this.	2019-09-23 16:04:59 +01:00
Jonas Jensen	22e57a6559	Merge pull request #1860 from matt-gretton-dann/add-using-aliases Add support for using aliases	2019-09-23 16:53:51 +02:00
Jonas Jensen	898976121b	Merge pull request #1987 from geoffw0/toomanyformat CPP: WrongNumberOfFormatArguments.ql Fix	2019-09-23 16:05:11 +02:00
Robert Marsh	90c91a78f8	Merge pull request #1976 from pavgust/fix/hashcons-perf C++: HashCons: Further performance improvements	2019-09-23 06:37:03 -07:00
Matthew Gretton-Dann	4606587fe8	C++: Apply style guide to TypedefType.qll	2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann	af3b0d9e73	C++: Update stats.	2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann	c8dfa46c63	C++: Add upgrade script for using aliases.	2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann	fc75a6af5a	C++: Add tests for using aliases	2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann	9ff38ebeee	C++: Update tests for new CTypedefType.	2019-09-23 13:57:50 +01:00
Matthew Gretton-Dann	5468b8def7	C++: Add support for C++ using aliases Previously these were identified as typedefs.	2019-09-23 13:57:50 +01:00
Geoffrey White	b3df289a80	CPP: Fix test.	2019-09-23 13:56:24 +01:00
Geoffrey White	2d8e4b3176	CPP: Additional cases resembling the ticket.	2019-09-23 13:04:14 +01:00
Geoffrey White	040bd89163	CPP: Correct expected results.	2019-09-23 11:02:36 +01:00
Geoffrey White	9100ab9360	CPP: Autoformat.	2019-09-20 15:30:59 +01:00
Geoffrey White	f7607313e7	CPP: Fix FPs.	2019-09-20 15:12:55 +01:00
Geoffrey White	9a407eb43c	CPP: Test format args with mismatching declarations.	2019-09-20 14:54:44 +01:00
Pavel Avgustinov	1c971d3f88	HashCons: Further performance improvements The key insight here is that `HC_FieldCons` and `HC_Array` are functionally determined by the things that arise in another recursive call. Lifting them to their own predicate, therefore, reduces nonlinearity and constrains the join order in a way that cannot be asymptotically bad -- and, indeed, makes quite a big difference in practice.	2019-09-20 12:00:33 +01:00
Robert Marsh	d3f2d8169e	Merge pull request #1967 from jbj/tainttracking-ir-2 C++: DefaultTaintTracking flow from a to a[i]	2019-09-19 15:00:29 -07:00
Robert Marsh	9c6a0ffc48	Merge pull request #1979 from nickrolfe/wrong_type_uninstantiated C++: ignore uninstantiated templates in WrongTypeFormatArguments.ql	2019-09-19 14:51:45 -07:00
Nick Rolfe	56f4f86921	C++: ignore uninstantiated templates in WrongTypeFormatArguments.ql	2019-09-19 21:18:47 +01:00
Robert Marsh	fd88f7a3ce	Merge pull request #1884 from jbj/dataflow-addressof C++: Data flow through address-of operator (&)	2019-09-19 09:15:43 -07:00
Jonas Jensen	29c93488bc	C++: DefaultTaintTracking flow from a to a[i] Switching `security.TaintTracking` to use `DefaultTaintTracking` causes us to lose a result from `UnboundedWrite.ql`, while this commit restores it: diff --git a/semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/CWE-120/CERT/STR35-C/UnboundedWrite.expected b/semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/CWE-120/CERT/STR35-C/UnboundedWrite.expected index 1eba0e52f0e..d947b33b9d9 100644 --- a/semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/CWE-120/CERT/STR35-C/UnboundedWrite.expected +++ b/semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/CWE-120/CERT/STR35-C/UnboundedWrite.expected @@ -1,2 +1,3 @@ +\| main.c:54:7:54:12 \| call to strcat \| This 'call to strcat' with input from $@ may overflow the destination. \| main.c:93:15:93:18 \| argv \| argv \| \| main.c:99:9:99:12 \| call to gets \| This 'call to gets' with input from $@ may overflow the destination. \| main.c:99:9:99:12 \| call to gets \| call to gets \| \| main.c:213:17:213:19 \| buf \| This 'scanf string argument' with input from $@ may overflow the destination. \| main.c:213:17:213:19 \| buf \| buf \|	2019-09-19 14:52:40 +02:00
Jonas Jensen	34a5368101	C++: Ignore templates in AmbiguouslySignedBitField If it's possible that the type is not fully resolved, it's better to avoid giving an alert. This fixes a FP in https://github.com/heremaps/flatdata.	2019-09-19 14:21:53 +02:00
Jonas Jensen	0ed0951d43	C++: Demonstrate AmbiguouslySignedBitField FP	2019-09-19 14:19:34 +02:00
Jonas Jensen	30d1c327cf	C++: Implement predictableInstruction without Expr This is one step toward implementing the taint-tracking wrapper in terms of `Instruction` rather than `Expr`. This leads to a few duplicate results in `TaintedAllocationSize.ql` because the library now considers `sizeof(int)` to be just as predictable as `4`, whereas the `security.TaintTracking` library does not consider `sizeof` to be predictable. I think it's simpler to accept the duplicate results since they are ultimately a quirk of the query, not the library. The following is the diff between (a) replacing `TaintTracking.qll` with a link to `DefaultTaintTracking.qll` and (b) additionally applying this commit. diff --git a b --- a/cpp/ql/test/query-tests/Security/CWE/CWE-190/semmle/TaintedAllocationSize/TaintedAllocationSize.expected +++ b/cpp/ql/test/query-tests/Security/CWE/CWE-190/semmle/TaintedAllocationSize/TaintedAllocationSize.expected @@ -1,5 +1,8 @@ \| test.cpp:42:31:42:36 \| call to malloc \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| +\| test.cpp:43:31:43:36 \| call to malloc \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| \| test.cpp:43:38:43:63 \| ... * ... 
\| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| +\| test.cpp:45:31:45:36 \| call to malloc \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| \| test.cpp:48:25:48:30 \| call to malloc \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| \| test.cpp:49:17:49:30 \| new[] \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| +\| test.cpp:52:21:52:27 \| call to realloc \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| \| test.cpp:52:35:52:60 \| ... * ... \| This allocation size is derived from $@ and might overflow \| test.cpp:39:21:39:24 \| argv \| user input (argv) \| --- a/semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/CWE-190/CERT/INT04-C/int04.expected +++ b/semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/CWE-190/CERT/INT04-C/int04.expected @@ -1 +1,2 @@ \| int04c.c:21:29:21:51 \| ... * ... \| This allocation size is derived from $@ and might overflow \| int04c.c:14:30:14:35 \| call to getenv \| user input (getenv) \| +\| int04c.c:22:33:22:38 \| call to malloc \| This allocation size is derived from $@ and might overflow \| int04c.c:14:30:14:35 \| call to getenv \| user input (getenv) \|	2019-09-19 13:11:27 +02:00
Jonas Jensen	307b92feed	C++: Unknown template literals are constant	2019-09-19 10:23:26 +02:00
Jonas Jensen	9b805c01cc	Merge pull request #1951 from pavgust/fix/hashcons-perf C++: Fix HashCons library performance	2019-09-19 08:10:34 +02:00
Jonas Jensen	e0d1da3b67	C++: Test for template enum constant CFG	2019-09-18 15:17:24 +02:00
Jonas Jensen	7d8396fa65	C++: Constant template pointer-to-member literals	2019-09-18 14:44:25 +02:00
Jonas Jensen	d644150ead	C++: Test for template pointer-to-member CFG	2019-09-18 14:30:18 +02:00
Jonas Jensen	0f2731064d	C++: Annotate `tellDifferent` with template status This is helpful for turning real-world cases into test cases.	2019-09-18 14:23:52 +02:00
Jonas Jensen	c90fd32a78	C++: Pointer-to-member-function is constant	2019-09-18 13:55:56 +02:00
Pavel Avgustinov	eca31908ab	HashCons: Make some functionality apparent. The user knows that an expression functionally determines its hashCons value, and that an expression functionally determines its number of children, but this is not provable from the definitions, and so not usable by the optimiser. By storing the result of those known-functional calls in a variable, rather than repeating the call, we enable better join orders.	2019-09-18 12:54:48 +01:00
Pavel Avgustinov	03502863cf	Distribute a recursive call into a recursive disjunction. As the linearity of the disjuncts is different, this enables us to pick better join orders for each disjunct separately.	2019-09-18 12:54:48 +01:00
Jonas Jensen	55edfe4224	C++: Test for pointer-to-member-function CFG	2019-09-18 13:37:52 +02:00
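
The type-identity rule described in commit 300e580874 can be illustrated with a small sketch. This is not the QL implementation from the repository: it is a Python model under stated assumptions, and the names `Kind`, `IRTypeSketch`, and `LanguageTypeSketch` are hypothetical. It only shows the idea that a language-neutral type is identified by kind, size, and (for blobs) a tag, while several language-level types can map to the same language-neutral type.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    """Type kinds named in the commit message (hypothetical enum)."""
    SIGNED_INT = "signed integer"
    UNSIGNED_INT = "unsigned integer"
    FLOAT = "floating-point"
    BOOL = "Boolean"
    BLOB = "blob"


@dataclass(frozen=True)
class IRTypeSketch:
    """Hypothetical stand-in for `IRType`: identity is (kind, size, tag)."""
    kind: Kind
    byte_size: int
    tag: Optional[str] = None  # only used to tell blob types apart


@dataclass(frozen=True)
class LanguageTypeSketch:
    """Hypothetical stand-in for `LanguageType`: many may share one IR type."""
    spelling: str
    ir_type: IRTypeSketch

    def get_ir_type(self) -> IRTypeSketch:
        # Models the single required predicate, `getIRType()`.
        return self.ir_type


# Assumption for illustration: a 4-byte `int`.
int4 = IRTypeSketch(Kind.SIGNED_INT, 4)
plain_int = LanguageTypeSketch("int", int4)
signed_int = LanguageTypeSketch("signed int", int4)

# Two struct types of equal size stay distinct because their tags differ.
blob_a = IRTypeSketch(Kind.BLOB, 8, tag="struct A")
blob_b = IRTypeSketch(Kind.BLOB, 8, tag="struct B")

same_ir_type = plain_int.get_ir_type() == signed_int.get_ir_type()
distinct_language_types = plain_int != signed_int
distinct_blobs = blob_a != blob_b
```

In this model, `int` and `signed int` remain different language-level types but compare equal once mapped through `get_ir_type()`, which is the property the commit message relies on for language-agnostic analyses.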

2336 Commits