Some of our IR consistency failure query predicates already produced results in the same schema as an `@kind problem` query, including `$@` replacements for the enclosing `IRFunction` to make it easier to figure out which function to dump when debugging. This change converts the remaining query predicates in `IRConsistency.qll` to do the same. In addition, it wraps each call to `getEnclosingIRFunction()` so that it returns an `OptionalIRFunction`, which is either a real `IRFunction` or a placeholder for the case where `getEnclosingIRFunction()` has no result. This exposes a couple of new consistency failures in `syntax-zoo`, which will be fixed in a subsequent commit.
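A minimal sketch of how such an optional value can be modeled in QL (the names and exact shape here are illustrative; the real code in `IRConsistency.qll` may differ):

```ql
import semmle.code.cpp.ir.IR

private newtype TOptionalIRFunction =
  TPresentIRFunction(IRFunction irFunc) or
  TMissingIRFunction()

/** Either a real `IRFunction` or a placeholder for a missing one. */
class OptionalIRFunction extends TOptionalIRFunction {
  string toString() {
    exists(IRFunction irFunc | this = TPresentIRFunction(irFunc) | result = irFunc.toString())
    or
    this = TMissingIRFunction() and result = "<missing IRFunction>"
  }
}

/** Gets the enclosing `IRFunction` of `instr`, or the placeholder if there is none. */
OptionalIRFunction getOptionalIRFunction(Instruction instr) {
  result = TPresentIRFunction(instr.getEnclosingIRFunction())
  or
  not exists(instr.getEnclosingIRFunction()) and result = TMissingIRFunction()
}
```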
This change also deals with consistency failures where the enclosing `IRFunction` has more than one `Function` or `Location`. For multiple `Function`s, we concatenate the function names. For multiple `Location`s, we pick the first one in lexicographical order. This changes the number of results produced in the existing tests, but doesn't change the actual number of problems.
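Illustrative sketches of the two aggregations, assuming `IRFunction` exposes `getFunction()` and `getLocation()` (the actual helpers in `IRConsistency.qll` may be named and structured differently):

```ql
import cpp
import semmle.code.cpp.ir.IR

/** Gets a single name string even if `irFunc` maps to multiple `Function`s. */
string getIRFunctionName(IRFunction irFunc) {
  result = strictconcat(Function f | f = irFunc.getFunction() | f.getName(), ", ")
}

/** Gets the lexicographically first location if `irFunc` has several. */
Location getIRFunctionLocation(IRFunction irFunc) {
  result = min(Location loc | loc = irFunc.getLocation() | loc order by loc.toString())
}
```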
Instead of a vague reference to a code comment for another language, the
`controlsBlock` predicate now carries the full comment directly.
I've adjusted the wording so it should be reasonably correct for C/C++.
As with the other comments in this file, I don't distinguish between the
condition and its block. I think that makes the explanation clearer
without losing any detail we care about.
To make the code fit the wording of the comment, I changed the
`hasBranchEdge/2` predicate into `getBranchSuccessor/1`.
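Shown as it would appear as a member of `IRGuardCondition`, a sketch of the new accessor (assuming the body still matches on the conditional branch instruction; the exact code in `IRGuards.qll` may differ):

```ql
/** Gets the block control flows to when this condition evaluates to `testIsTrue`. */
IRBlock getBranchSuccessor(boolean testIsTrue) {
  exists(ConditionalBranchInstruction branch |
    branch.getCondition() = this and
    (
      testIsTrue = true and
      result.getFirstInstruction() = branch.getTrueSuccessor()
      or
      testIsTrue = false and
      result.getFirstInstruction() = branch.getFalseSuccessor()
    )
  )
}
```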
On kamailio/kamailio the `DataFlowUtil::simpleInstructionLocalFlowStep`
predicate was slow because of the case for single-field structs, where
there was a large tuple-count bulge when joining with
`getFieldSizeOfClass`:
3552902 ~2% {2} r1 = SCAN Instruction::CopyInstruction::getSourceValueOperand_dispred#3#ff AS I OUTPUT I.<1>, I.<0>
2065347 ~2% {2} r35 = JOIN r1 WITH Operand::NonPhiMemoryOperand::getAnyDef_dispred#3#ff AS R ON FIRST 1 OUTPUT r1.<1>, R.<1>
2065827 ~2% {3} r36 = JOIN r35 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r35.<1>, r35.<0>
2065825 ~3% {3} r37 = JOIN r36 WITH Type::Type::getSize_dispred#ff AS R ON FIRST 1 OUTPUT r36.<1>, r36.<2>, R.<1>
2068334 ~2% {4} r38 = JOIN r37 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r37.<2>, r37.<0>, r37.<1>
314603817 ~0% {3} r39 = JOIN r38 WITH DataFlowUtil::getFieldSizeOfClass#fff_120#join_rhs AS R ON FIRST 2 OUTPUT r38.<3>, R.<2>, r38.<2>
8 ~0% {2} r40 = JOIN r39 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 2 OUTPUT r39.<2>, r39.<0>
That's 314M tuples.
Strangely, there is no such bulge on better-behaved snapshots like
mysql/mysql-server.
With this commit the explosion is gone:
...
2065825 ~0% {4} r37 = JOIN r36 WITH Type::Type::getSize_dispred#ff AS R ON FIRST 1 OUTPUT r36.<0>, R.<1>, r36.<1>, r36.<2>
1521 ~1% {3} r38 = JOIN r37 WITH DataFlowUtil::getFieldSizeOfClass#fff_021#join_rhs AS R ON FIRST 2 OUTPUT r37.<2>, R.<2>, r37.<3>
8 ~0% {2} r39 = JOIN r38 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 2 OUTPUT r38.<0>, r38.<2>
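For readers following the plans: `getFieldSizeOfClass` is a three-column helper relating a class, a field type, and that type's size; the two plans differ in which of its columns are bound when it is joined in (the `_120` versus `_021` column orders). A sketch of what such a helper looks like (an assumed shape, not a quote of the library code):

```ql
import cpp

// Relates a class `c` to the type and byte size of one of its fields.
pragma[noinline]
private predicate getFieldSizeOfClass(Class c, Type type, int size) {
  exists(Field f |
    f.getDeclaringType() = c and
    f.getUnderlyingType() = type and
    type.getSize() = size
  )
}
```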
The `controlsBlock` predicate had some dramatic bulges in its tuple
counts. To make matters worse, those bulges were in materialized
intermediate predicates like `#shared` and `#antijoin_rhs`, not just in
the middle of a pipeline.
The problem was particularly evident on kamailio/kamailio, where
`controlsBlock` was the slowest predicate in the IR libraries:
IRGuards::IRGuardCondition::controlsBlock_dispred#fff#shared#4 ........ 58.8s
IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs .... 33.4s
IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs#1 .. 26.7s
The first of the above relations had 201M rows, and the others
had intermediate bulges of similar size.
The bulges could be observed even on small projects, although they did
not cause measurable performance issues there. The
`controlsBlock_dispred#fff#shared#4` relation had 3M rows on git/git,
which is a lot for a project with only 1.5M IR instructions.
This commit borrows an efficient implementation from Java's
`Guards.qll`, tweaking it slightly to fit into `IRGuards`. Performance
is now much better:
IRGuards::IRGuardCondition::controlsBlock_dispred#fff ................... 6.1s
IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff .............. 616ms
IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff#antijoin_rhs . 540ms
After this commit, the biggest bulge in `controlsBlock` is the size of
`IRBlock::dominates`. On kamailio/kamailio this is an intermediate tuple
count of 18M rows in the calculation of `controlsBlock`, which in the
end produces 11M rows.
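For reference, the core of the borrowed approach is the "dominating edge" idea from the Java library. A sketch of that idea in IR terms, assuming `IRBlock` exposes the usual dominance predicates and reusing the `getBranchSuccessor` accessor mentioned earlier (the actual `IRGuards.qll` code differs in detail):

```ql
// The edge `src -> succ` dominates `succ` if every path from the entry block
// to `succ` goes through that edge: `src` is the immediate dominator of `succ`,
// and any other predecessor of `succ` is itself dominated by `succ` (i.e. it
// can only be reached via a back edge).
private predicate dominatingEdge(IRBlock src, IRBlock succ) {
  src.immediatelyDominates(succ) and
  src.getASuccessor() = succ and
  forall(IRBlock pred | pred = succ.getAPredecessor() and pred != src | succ.dominates(pred))
}

// `test` controls `controlled` with value `testIsTrue` if the successor taken
// when `test` evaluates to `testIsTrue` is entered through a dominating edge
// and itself dominates `controlled`.
private predicate controlsBlock(IRGuardCondition test, IRBlock controlled, boolean testIsTrue) {
  exists(IRBlock succ |
    succ = test.getBranchSuccessor(testIsTrue) and
    dominatingEdge(test.getBlock(), succ) and
    succ.dominates(controlled)
  )
}
```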
The optimizer picked a terrible join order in `VirtualDispatch::DataSensitiveCall::flowsFrom()`. Telling it that `getAnOutNode()` has a unique result convinces it to join first on the `Callable`, rather than on the `ReturnKind`.
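One way to express that kind of uniqueness hint in QL is the `unique` aggregate. A hypothetical illustration (the names `getAnOutNode`, `DataFlowCall`, `ReturnKind`, and `Node` are taken from the C++ data-flow internals, and the actual change in `flowsFrom` may use a different mechanism):

```ql
// Wrapping the lookup in `unique` tells the optimizer that each (call, kind)
// pair yields at most one out node, so the join can start from the callable.
Node uniqueOutNode(DataFlowCall call, ReturnKind kind) {
  result = unique(Node n | n = getAnOutNode(call, kind) | n)
}
```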
There wasn't a good join order for the "store to global var" case in the
virtual dispatch library. When a global variable had millions of accesses
but only a few stores to it, the `flowsFrom` predicate would enumerate all
those millions of accesses before filtering them down to just the stores.
The solution is to pull out a `storeIntoGlobal` helper predicate that
pre-computes which accesses are stores.
To make the code clearer, I've also pulled out a repeated chunk of code
into a new `addressOfGlobal` helper predicate.
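Sketches of the two helpers (the signatures are assumptions based on the predicate names in the plans below; the actual code may differ):

```ql
import cpp
import semmle.code.cpp.ir.IR

// Relates a variable-address instruction to the global (or namespace-scope)
// variable whose address it computes.
pragma[noinline]
private predicate addressOfGlobal(VariableAddressInstruction vai, GlobalOrNamespaceVariable global) {
  vai.getASTVariable() = global
}

// Pre-computes which address-of-global instructions feed a store, so the join
// never has to enumerate every access to the variable.
pragma[noinline]
private predicate storeIntoGlobal(StoreInstruction store, GlobalOrNamespaceVariable global) {
  exists(VariableAddressInstruction vai |
    addressOfGlobal(vai, global) and
    store.getDestinationAddress() = vai
  )
}
```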
For the kamailio/kamailio project, these are the tuple counts before:
Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@21a1df (iteration 3)
Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
...
59002 ~0% {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
58260 ~1% {3} r31 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#fb AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
2536187389 ~6% {3} r32 = JOIN r31 WITH Instruction::VariableInstruction::getASTVariable_dispred#fb_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r31.<2>
2536187389 ~6% {3} r33 = JOIN r32 WITH project#Instruction::VariableAddressInstruction#class#3#ff AS R ON FIRST 1 OUTPUT r32.<0>, true, r32.<2>
58208 ~0% {3} r34 = JOIN r33 WITH Instruction::StoreInstruction::getDestinationAddress_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r33.<2>
Tuple counts after:
Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@6073c5 (iteration 3)
Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
...
59002 ~0% {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
58260 ~1% {3} r23 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
58208 ~0% {3} r24 = JOIN r23 WITH DataFlowDispatch::VirtualDispatch::storeIntoGlobal#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r23.<2>
58208 ~0% {3} r25 = JOIN r24 WITH DataFlowUtil::InstructionNode#ff_10#join_rhs AS R ON FIRST 1 OUTPUT true, r24.<2>, R.<1>
Notice that the final tuple count, 58208, is the same before and after.
The kamailio/kamailio project seems to have been affected by this issue
because it has global variables related to its logging policy, and these
variables are loaded in every place where their logging macro is used.