Commit Graph

2860 Commits

Author SHA1 Message Date
Jonas Jensen
b8b1f0c617 C++: pragma[noinline] parameter index predicates
A performance regression in `definitionByReferenceNodeFromArgument#ff`
was ultimately caused by a join on parameter indexes in
`DefinitionByReferenceNode.getArgument`. Joining on numbers in QL is
always fragile, and somehow the changes in #4432 had caused the join
order here to break.

Instead of tweaking the join order in the slow predicate itself, I added
`pragma[noinline]` to one of the predicates involved in the join on
parameter indexes. This should prevent us from getting similar
performance problems in the future when we write code that joins on
parameter numbers. Joining on indexes is always risky, but it's even
more risky when one of the predicates in the join is inlined by the
compiler and expands to further joins.

I tested performance by running `CgiXss.ql` on a ChakraCore snapshot.
Tuple counts before (I interrupted execution after five minutes or so):

    (626s) Tuple counts for DataFlowUtil::definitionByReferenceNodeFromArgument#ff:
    58162      ~0%     {3} r1 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, -1, I.<0>
    26934      ~0%     {2} r2 = JOIN r1 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 2 OUTPUT r1.<0>, r1.<2>
    26934      ~1%     {2} r3 = JOIN r2 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r2.<1>
    26850      ~1%     {2} r4 = JOIN r3 WITH Instruction::CallInstruction::getThisArgumentOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r3.<1>
    26850      ~0%     {2} r5 = JOIN r4 WITH Operand::Operand::getDef_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r4.<1>
    26850      ~1%     {2} r6 = JOIN r5 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r5.<1>
    58162      ~0%     {2} r7 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, I.<0>
    58162      ~4%     {3} r8 = JOIN r7 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 1 OUTPUT R.<1>, r7.<1>, r7.<0>
    4026581120 ~0%     {4} r9 = JOIN r8 WITH Instruction::CallInstruction::getPositionalArgumentOperand_dispred#fff_102#join_rhs AS R ON FIRST 1 OUTPUT r8.<2>, R.<1>, r8.<1>, R.<2>
    31154      ~4%     {2} r10 = JOIN r9 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 2 OUTPUT r9.<3>, r9.<2>
    31154      ~8%     {2} r11 = JOIN r10 WITH Operand::Operand::getDef_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r10.<1>
    31154      ~0%     {2} r12 = JOIN r11 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r11.<1>
    58004      ~0%     {2} r13 = r6 \/ r12
                       return r13

Tuple counts after:

    (0s) Tuple counts for DataFlowUtil::definitionByReferenceNodeFromArgument#ff:
    385785  ~6%     {2} r1 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, I.<0>
    385785  ~0%     {3} r2 = JOIN r1 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 1 OUTPUT r1.<0>, r1.<1>, R.<1>
    385785  ~1%     {3} r3 = JOIN r2 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r2.<2>, r2.<1>
    198736  ~4%     {2} r4 = JOIN r3 WITH Instruction::CallInstruction::getPositionalArgument#fff AS R ON FIRST 2 OUTPUT R.<2>, r3.<2>
    198736  ~0%     {2} r5 = JOIN r4 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r4.<1>
    385785  ~1%     {3} r6 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, -1, I.<0>
    186891  ~1%     {2} r7 = JOIN r6 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 2 OUTPUT r6.<0>, r6.<2>
    186891  ~2%     {2} r8 = JOIN r7 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r7.<1>
    183201  ~3%     {2} r9 = JOIN r8 WITH Instruction::CallInstruction::getThisArgumentOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r8.<1>
    183201  ~0%     {2} r10 = JOIN r9 WITH Operand::Operand::getDef_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r9.<1>
    175449  ~8%     {2} r11 = JOIN r10 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r10.<1>
    374185  ~3%     {2} r12 = r5 \/ r11
                    return r12
2020-11-09 09:01:22 +01:00
Robert Marsh
2f204869e7 Merge pull request #4604 from criemen/ir-block-sort-order
C++, C# IR: Stabilize sort order for basic blocks.
2020-11-04 18:22:23 -05:00
Cornelius Riemenschneider
44d6584fa2 C++, C#: Auto-format. 2020-11-04 16:26:56 +01:00
Cornelius Riemenschneider
a13947424a C++, C# IR: Stabilize sort order for basic blocks. 2020-11-04 16:26:56 +01:00
Cornelius Riemenschneider
e7e5754270 C++: Add taint model for std::vector::emplace/_back. 2020-11-04 16:20:01 +01:00
Dave Bartolomeo
f0b9794907 Merge remote-tracking branch 'upstream/main' into work 2020-11-03 11:33:44 -05:00
Anders Schack-Mulligen
2971784f9c Dataflow: Add missing qldoc and sync. 2020-11-03 09:21:48 +01:00
Anders Schack-Mulligen
7eb64aa998 Dataflow: Code review fixes. 2020-11-03 09:16:20 +01:00
Anders Schack-Mulligen
1ae76a80aa Dataflow: Fix qldoc. 2020-11-03 09:16:20 +01:00
Anders Schack-Mulligen
d5be4d7b92 Dataflow: Add support reverse partial flow exploration. 2020-11-03 09:16:19 +01:00
Dave Bartolomeo
ec398b2a67 Merge remote-tracking branch 'upstream/main' into work 2020-10-30 12:36:33 -04:00
Dave Bartolomeo
42373417e2 Merge from main 2020-10-30 12:02:56 -04:00
Cornelius Riemenschneider
e7d995313e C++: Address review. 2020-10-30 16:30:57 +01:00
Cornelius Riemenschneider
84fe7ba199 C++: Add support for StmtExpr to Print AST. 2020-10-30 15:53:54 +01:00
Cornelius Riemenschneider
d3631d8f2e Merge pull request #4562 from criemen/printast-labels
C++: Change PrintAST to provide the predicates that can be used to traverse the AST.
2020-10-30 15:48:46 +01:00
Dave Bartolomeo
36b27add24 Simplify ordering of children with conversions using rank
In `getChild(int childIndex)`, the actual values of `childIndex` don't matter, as long as they are in the correct order. Rather than doing complicated math to compute the indices for the synthesized `.getFullyConverted()` children, just use the `rank` aggregate to order all children first by whether or not the child is a conversion, then by the original child index.
2020-10-30 10:00:23 -04:00
Cornelius Riemenschneider
cf8f802310 C++: Rename predicate. 2020-10-30 12:51:19 +01:00
Cornelius Riemenschneider
ab42ddb0dc C++: Adjust code for the conversions PR, provide correct childIndexes for the new nodes. 2020-10-30 12:48:53 +01:00
Jonas Jensen
ba41417d61 Merge pull request #4553 from geoffw0/samateregtests
C++: Additional pointer tests for DefaultTaintTracking.
2020-10-30 10:02:11 +01:00
Cornelius Riemenschneider
4276d1f3e5 C++: Add missing comment and update test results. 2020-10-29 14:49:06 +01:00
Cornelius Riemenschneider
7e667b9bec C++: Add comment to FunctionNode. 2020-10-29 14:49:06 +01:00
Cornelius Riemenschneider
668764ce40 C++: Make new predicates private. 2020-10-29 14:49:06 +01:00
Cornelius Riemenschneider
8c925a20a7 C++: Provide the predicates that can be used to traverse the AST as metadata. 2020-10-29 14:48:47 +01:00
Jonas Jensen
fa344d216f Merge pull request #4493 from criemen/fix-4278-printast-conversions
Fix C++ Print AST handling of Conversions
2020-10-29 13:48:15 +01:00
Cornelius Riemenschneider
59dd892748 C++: Address review, fix bug related to Conversions. 2020-10-29 11:40:31 +01:00
Jonas Jensen
0af62b8431 Merge pull request #4515 from geoffw0/modelchanges1
C++: Changes to models library.
2020-10-29 11:21:56 +01:00
Dave Bartolomeo
7a2c59c194 Merge from main 2020-10-28 15:35:46 -04:00
Geoffrey White
ae84d1383e Merge pull request #4565 from MathiasVP/instruction-tag-for-this-addr-and-load-fix
C++: Fix spelling in getInstructionTagId
2020-10-28 16:53:55 +00:00
Mathias Vorreiter Pedersen
614e2ba851 C++: Fix spelling 2020-10-28 13:05:37 +01:00
Cornelius Riemenschneider
f1f64fb7df C++: Make BuiltInVarArgs* classes subclasses of VarArgsExpr. 2020-10-28 10:48:00 +01:00
Geoffrey White
09372f5c81 C++: Remove misleading comment. 2020-10-28 09:04:10 +00:00
Mathias Vorreiter Pedersen
ad9e7b7343 C++: Give getInstructionTagId a result when tag is ThisAddressTag or ThisLoadTag 2020-10-27 22:16:01 +01:00
Geoffrey White
c8783b5ea3 Revert "C++: Create a module for models of things in Std."
This reverts commit ddc5150080.
2020-10-27 13:31:16 +00:00
Cornelius Riemenschneider
1b88ca1e81 C++: Simplify code, add comment explaining the logic. 2020-10-26 14:39:12 +01:00
Cornelius Riemenschneider
447ba205b4 C++: Move Conversions in PrintAST to the side. 2020-10-26 13:49:02 +01:00
Dave Bartolomeo
3fce971f2d Fix taint propagation to qualifier objects and update test expectations 2020-10-23 17:48:37 -04:00
Dave Bartolomeo
4d2f658ece Don't treat allocator argument as a string input 2020-10-23 17:44:07 -04:00
Robert Marsh
aab9797c2f Merge branch 'main' into rdmarsh2/cpp/output-iterators-2
Resolve merge conflict in tests
2020-10-23 13:50:15 -07:00
Dave Bartolomeo
35abcae5d3 Fix formatting 2020-10-23 13:43:29 -04:00
Dave Bartolomeo
bace0dca6d Handle more cases that require synthesizing temporary objects
- Parens around qualifier expressions
- Inheritance conversions involving class prvalues
2020-10-23 12:04:09 -04:00
Dave Bartolomeo
99072483b8 Fix PR feedback 2020-10-22 12:55:40 -04:00
Dave Bartolomeo
b62bda6c3a Fix regression due to primary instructions for side effects not being computed correctly in the presence of synthetic temporary objects. 2020-10-22 12:55:30 -04:00
Cornelius Riemenschneider
6b072686ab C++: Improve PrintAST performance.
This improves the performance of the printAst.ql query by excluding a lot of string concatenations that happen in files unrelated to the one the user is interested in printing.
This is supposed to help the performance of the AST Viewer on bigger databases.
2020-10-22 16:38:52 +02:00
Dave Bartolomeo
ee18db7b36 Fix IR for member accesses on prvalues
This fixes the IR generation for member accesses where the qualifier is a prvalue that is _not_ the load of a `TemporaryObjectExpr`. We synthesize a temporary variable during IR generation instead. It fits into the IR construction code at the same spot as `TranslatedLoad`, since it's basically the opposite of `TranslatedLoad` (prvalue->glvalue instead of vice versa). Note that array prvalues require special treatment.

This fixes some consistency errors in the `syntax-zoo`. It introduces three new ones in `dataflow-ir-consistency.expected`, but those are along the same lines as tons of existing failures.
2020-10-21 13:32:15 -04:00
Robert Marsh
413c845e97 Merge branch 'main' into rdmarsh2/cpp/output-iterators-2
Accept test changes for unnamed elements
2020-10-20 15:22:08 -07:00
Dave Bartolomeo
08af0803ff Add examples to QLDoc comment 2020-10-20 17:34:46 -04:00
Dave Bartolomeo
4ba281731c Fix IR generation for member access with a prvalue on the RHS
For historical reasons, the extractor marks the temporary object expression used as the qualifier of a member access as a prvalue(load), even though the current C++ standard says that the temporary object materialization results in a glvalue. Added some special handling to ignore the load for both field accesses and member function calls.

This fixes all of the consistency failures in our regular tests, and all of the related failures in `syntax-zoo` other than the ones that deal with pointers-to-member, which aren't really supported yet anyway.
2020-10-20 12:53:47 -04:00
Dave Bartolomeo
735c657326 IR consistency checks for FieldAddress and this arguments that are not actually addresses.
Exposes failures in existing tests. Also added a small test case for `FieldAddress` on a prvalue.
2020-10-20 10:32:28 -04:00
Dave Bartolomeo
ade6d10e58 Merge remote-tracking branch 'upstream/main' into work 2020-10-20 07:24:42 -04:00
Mathias Vorreiter Pedersen
528afc55ab Merge pull request #3788 from geoffw0/callderef
C++: Add bcopy to models and use it.
2020-10-20 12:15:23 +02:00