Commit Graph

41418 Commits

Author SHA1 Message Date
Erik Krogh Kristensen
90596167b1 add taint-step for Array.reduce 2020-06-09 17:15:00 +02:00
Mathias Vorreiter Pedersen
06066f0c5b Merge pull request #3659 from jbj/getFieldSizeOfClass-perf
C++: Performance tweak for 1-field struct loads
2020-06-09 15:53:19 +02:00
Erik Krogh Kristensen
be71ddf7bb introduce basic BuildArtifactLeak query 2020-06-09 15:27:55 +02:00
Erik Krogh Kristensen
896a9b05f6 refactor CleartextLogging to allow for reuse 2020-06-09 15:03:07 +02:00
Jonas Jensen
a341912da9 C++: Performance tweak for 1-field struct loads
On kamailio/kamailio the `DataFlowUtil::simpleInstructionLocalFlowStep`
predicate was slow because of the case for single-field structs, where
there was a large tuple-count bulge when joining with
`getFieldSizeOfClass`:

    3552902   ~2%       {2} r1 = SCAN Instruction::CopyInstruction::getSourceValueOperand_dispred#3#ff AS I OUTPUT I.<1>, I.<0>
    2065347   ~2%       {2} r35 = JOIN r1 WITH Operand::NonPhiMemoryOperand::getAnyDef_dispred#3#ff AS R ON FIRST 1 OUTPUT r1.<1>, R.<1>
    2065827   ~2%       {3} r36 = JOIN r35 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r35.<1>, r35.<0>
    2065825   ~3%       {3} r37 = JOIN r36 WITH Type::Type::getSize_dispred#ff AS R ON FIRST 1 OUTPUT r36.<1>, r36.<2>, R.<1>
    2068334   ~2%       {4} r38 = JOIN r37 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r37.<2>, r37.<0>, r37.<1>
    314603817 ~0%       {3} r39 = JOIN r38 WITH DataFlowUtil::getFieldSizeOfClass#fff_120#join_rhs AS R ON FIRST 2 OUTPUT r38.<3>, R.<2>, r38.<2>
    8         ~0%       {2} r40 = JOIN r39 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 2 OUTPUT r39.<2>, r39.<0>

That's 314M tuples.

Strangely, there is no such bulge on more well-behaved snapshots like
mysql/mysql-server.

With this commit the explosion is gone:

    ...
    2065825  ~0%       {4} r37 = JOIN r36 WITH Type::Type::getSize_dispred#ff AS R ON FIRST 1 OUTPUT r36.<0>, R.<1>, r36.<1>, r36.<2>
    1521     ~1%       {3} r38 = JOIN r37 WITH DataFlowUtil::getFieldSizeOfClass#fff_021#join_rhs AS R ON FIRST 2 OUTPUT r37.<2>, R.<2>, r37.<3>
    8        ~0%       {2} r39 = JOIN r38 WITH Instruction::Instruction::getResultType_dispred#3#ff AS R ON FIRST 2 OUTPUT r38.<0>, r38.<2>
2020-06-09 14:50:02 +02:00
Rasmus Wriedt Larsen
bacd491875 Python: Fix isSequence() and isMapping() 2020-06-09 14:21:02 +02:00
Anders Schack-Mulligen
f77f486c6b Merge pull request #3438 from artem-smotrakov/unsafe-tls
Java: Added a query for unsafe TLS versions
2020-06-09 14:07:17 +02:00
Rasmus Wriedt Larsen
846101d295 Python: Extend isSequence/isMapping test with custom classes 2020-06-09 14:04:14 +02:00
Tom Hvitved
a371205db1 Data flow: Sync files 2020-06-09 13:55:12 +02:00
Tom Hvitved
8c9f85d04f Data flow: Allow nodes to be hidden from path explanations 2020-06-09 13:53:19 +02:00
Erik Krogh Kristensen
b510e470b1 support rest-patterns inside property patterns 2020-06-09 13:28:56 +02:00
Erik Krogh Kristensen
c580ada527 Merge pull request #3643 from erik-krogh/yargs
JS: extend support for yargs for js/indirect-command-line-injection
2020-06-09 13:17:28 +02:00
Jonas Jensen
4642037dce C++: Speed up IRGuardCondition::controlsBlock
The `controlsBlock` predicate had some dramatic bulges in its tuple
counts. To make matters worse, those bulges were in materialized
intermediate predicates like `#shared` and `#antijoin_rhs`, not just in
the middle of a pipeline.

The problem was particularly evident on kamailio/kamailio, where
`controlsBlock` was the slowest predicate in the IR libraries:

    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#shared#4 ........ 58.8s
    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs .... 33.4s
    IRGuards::IRGuardCondition::controlsBlock_dispred#fff#antijoin_rhs#1 .. 26.7s

The first of the above relations had 201M rows, and the others
had intermediate bulges of similar size.

The bulges could be observed even on small projects although they did
not cause measurable performance issues there. The
`controlsBlock_dispred#fff#shared#4` relation had 3M rows on git/git,
which is a lot for a project with only 1.5M IR instructions.

This commit borrows an efficient implementation from Java's
`Guards.qll`, tweaking it slightly to fit into `IRGuards`. Performance
is now much better:

    IRGuards::IRGuardCondition::controlsBlock_dispred#fff ................... 6.1s
    IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff .............. 616ms
    IRGuards::IRGuardCondition::hasDominatingEdgeTo_dispred#ff#antijoin_rhs . 540ms

After this commit, the biggest bulge in `controlsBlock` is the size of
`IRBlock::dominates`. On kamailio/kamailio this is an intermediate tuple
count of 18M rows in the calculation of `controlsBlock`, which in the
end produces 11M rows.
2020-06-09 12:15:45 +02:00
Rasmus Wriedt Larsen
65ce6d27ff Python: Update isSequence() and isMapping() for Python 3.8 2020-06-09 11:57:00 +02:00
Rasmus Wriedt Larsen
958763edc2 Python: Add test for ClassValue.isSequence() and isMapping()
For Python 3.6
2020-06-09 11:55:22 +02:00
Tom Hvitved
8006866370 C#: Refactor data-flow predicates defined by dispatch 2020-06-09 11:25:07 +02:00
Erik Krogh Kristensen
b04d7015ae fix test 2020-06-09 11:23:46 +02:00
Asger Feldthaus
0345036420 JS: Fix 'match' call in StringOps::RegExpTest 2020-06-09 10:07:36 +01:00
Jonas Jensen
cade3a3e23 C++: Use the hasBranchEdge helper predicate
This tidies up the code, removing unnecessary repetition.
2020-06-09 10:33:03 +02:00
Erik Krogh Kristensen
c2fbcea96f base the chaining on yargs on the methods that are NOT chained 2020-06-09 10:22:25 +02:00
Esben Sparre Andreasen
2d2468463b JS: initial version of IncompleteMultiCharacterSanitization.ql 2020-06-09 08:59:59 +02:00
Erik Krogh Kristensen
167239e745 add query to detect accidential leak of private files 2020-06-08 23:41:14 +02:00
Dave Bartolomeo
3fc02ce24e C++: Fix join order in virtual dispatch with unique
The optimizer picked a terrible join order in `VirtualDispatch::DataSensitiveCall::flowsFrom()`. Telling it that `getAnOutNode()` has a unique result convinces it to join first on the `Callable`, rather than on the `ReturnKind`.
2020-06-08 17:15:43 -04:00
Robert Marsh
2a96856ca5 C++/C#: Document IRPositionalParameter 2020-06-08 12:41:26 -07:00
Dave Bartolomeo
c511cc3444 C++: Better caching for getPrimaryInstructionForSideEffect() 2020-06-08 15:37:36 -04:00
ubuntu
ab65ec40c0 Add Codeql to detect missing 'Message.origin' validation when using postMessage API 2020-06-08 20:18:34 +02:00
luchua-bc
5acfc52087 Add dependent stub classes for the test case 2020-06-08 16:17:40 +00:00
luchua-bc
1e4addb20d Add dependent stub classes for the test case 2020-06-08 16:17:01 +00:00
Dave Bartolomeo
0ae98e78a2 Merge remote-tracking branch 'github/master' into github/codeql-c-analysis-team/69_union 2020-06-08 11:20:14 -04:00
Dave Bartolomeo
398678a28b Merge pull request #3637 from jbj/dispatch-global-perf
C++: Fix data-flow dispatch perf with globals
2020-06-08 11:19:37 -04:00
semmle-qlci
1a7570ebbe Merge pull request #3563 from RasmusWL/python-fabric-execute
Approved by tausbn
2020-06-08 16:00:49 +01:00
Erik Krogh Kristensen
0f06f04e32 extend support for yargs for js/indirect-command-line-injection 2020-06-08 16:45:09 +02:00
Asger Feldthaus
53280a6b11 JS: Add test demonstrating new flow 2020-06-08 14:25:21 +01:00
Rasmus Wriedt Larsen
baa415fec8 Python: Add points-to regression for metaclass 2020-06-08 15:03:46 +02:00
Rasmus Wriedt Larsen
7c037cd2ab Python: Handle Enum._convert in Python 3.8 2020-06-08 14:49:58 +02:00
Asger Feldthaus
2d9b9fa584 JS: Use PreCallGraphStep in select array steps 2020-06-08 13:45:28 +01:00
Asger Feldthaus
3d2bbbd3db JS: Add PreCallGraphStep extension point 2020-06-08 13:45:28 +01:00
Asger Feldthaus
1f2ab605bd JS: Add store/load steps to AdditionalTypeTrackingStep 2020-06-08 13:45:28 +01:00
Henning Makholm
5daf1db5e5 Merge pull request #3615 from github/fix-root-defintion
QL Specification: Fix mistake in dispatch computation
2020-06-08 14:34:58 +02:00
Bt2018
99aa559ef2 Fix auto-formatting issue 2020-06-08 06:43:00 -04:00
Mathias Vorreiter Pedersen
b48168fc03 C++: Accept tests 2020-06-08 12:26:25 +02:00
Jonas Jensen
c62220e0dc C++: Fix data-flow dispatch perf with globals
There wasn't a good join order for the "store to global var" case in the
virtual dispatch library. When a global variable had millions of
accesses but few stores to it, the `flowsFrom` predicate would join to
see all those millions of accesses before filtering down to stores only.
The solution is to pull out a `storeIntoGlobal` helper predicate that
pre-computes which accesses are stores.

To make the code clearer, I've also pulled out a repeated chunk of code
into a new `addressOfGlobal` helper predicate.

For the kamailio/kamailio project, these are the tuple counts before:

    Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@21a1df (iteration 3)
    Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
    ...
    59002      ~0%     {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
    58260      ~1%     {3} r31 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#fb AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
    2536187389 ~6%     {3} r32 = JOIN r31 WITH Instruction::VariableInstruction::getASTVariable_dispred#fb_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r31.<2>
    2536187389 ~6%     {3} r33 = JOIN r32 WITH project#Instruction::VariableAddressInstruction#class#3#ff AS R ON FIRST 1 OUTPUT r32.<0>, true, r32.<2>
    58208      ~0%     {3} r34 = JOIN r33 WITH Instruction::StoreInstruction::getDestinationAddress_dispred#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r33.<2>

Tuple counts after:

    Starting to evaluate predicate DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta/3[3]@6073c5 (iteration 3)
    Tuple counts for DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#cur_delta:
    ...
    59002    ~0%     {3} r17 = SCAN DataFlowDispatch::VirtualDispatch::DataSensitiveCall::flowsFrom#fff#prev_delta AS I OUTPUT I.<1>, true, I.<0>
    58260    ~1%     {3} r23 = JOIN r17 WITH DataFlowUtil::Node::asVariable_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, true, r17.<2>
    58208    ~0%     {3} r24 = JOIN r23 WITH DataFlowDispatch::VirtualDispatch::storeIntoGlobal#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, true, r23.<2>
    58208    ~0%     {3} r25 = JOIN r24 WITH DataFlowUtil::InstructionNode#ff_10#join_rhs AS R ON FIRST 1 OUTPUT true, r24.<2>, R.<1>

Notice that the final tuple count, 58208, is the same before and after.

The kamailio/kamailio project seems to have been affected by this issue
because it has global variables to do with logging policy, and these
variables are loaded from in every place where their logging macro is
used.
2020-06-08 11:48:40 +02:00
Anders Schack-Mulligen
8513c6981c Merge pull request #3329 from artem-smotrakov/mvel-injection
Java: Add a query for MVEL injections
2020-06-08 11:48:00 +02:00
Mathias Vorreiter Pedersen
431cc5c926 C++: Fix inconsistent class name 2020-06-08 11:27:09 +02:00
Calum Grant
00078d14b9 Merge pull request #3601 from hvitved/csharp/overlapping-configs
C#: Avoid multiple taint-tracking configurations
2020-06-08 10:21:40 +01:00
Mathias Vorreiter Pedersen
01f3793159 C++: Add ReadSideEffect as a possible end instruction for load chains 2020-06-08 11:05:30 +02:00
Mathias Vorreiter Pedersen
a4388e9258 C++: Add example demonstrating missing flow 2020-06-08 11:03:36 +02:00
Esben Sparre Andreasen
872ee13ba6 JS: formatting 2020-06-08 10:04:37 +02:00
Anders Schack-Mulligen
ad8647f345 Merge pull request #3547 from pwntester/issue_3139
add support for java.io.StringWriter
2020-06-08 10:02:23 +02:00
Pavel Avgustinov
7c0b8f5587 Merge pull request #3622 from aschackmull/mergeback-124
Mergeback rc/1.24 -> master
2020-06-08 08:38:12 +01:00