Commit Graph

1468 Commits

Author SHA1 Message Date
Geoffrey White
200545d88c CPP: Add detail to the model. 2020-01-17 18:56:21 +00:00
Geoffrey White
77a3778eef CPP: Add some strlen variants to the PureStrFunction model. 2020-01-17 18:56:21 +00:00
Jonas Jensen
3632d51abc Merge pull request #2635 from geoffw0/modelstrdup
CPP: Model strdup
2020-01-17 19:26:26 +01:00
Dave Bartolomeo
c7e62b4a35 Merge pull request #2613 from rdmarsh2/getPhiOperandDefinition-perf-2
C++: performance fixes for getPhiOperandDefinition
2020-01-17 09:01:33 -07:00
Jonas Jensen
53e10e4c7f Merge pull request #2634 from MathiasVP/overrideable-taint-sources
C++: Overrideable taint sources in DefaultTaintTracking
2020-01-17 13:01:03 +01:00
Jonas Jensen
5d08a0e338 Merge pull request #2558 from MathiasVP/ast-classes-should-not-be-abstract
C++: Ast classes should not be abstract
2020-01-17 08:47:55 +01:00
Geoffrey White
3c41ed56a1 CPP: Support taint to return value derefs instead. 2020-01-16 18:15:21 +00:00
Robert Marsh
e0406190a1 Merge branch 'master' into getPhiOperandDefinition-perf-2 2020-01-16 07:23:59 -08:00
Robert Marsh
c942da524c C++/C#: Sync 2020-01-16 07:16:57 -08:00
Robert Marsh
1b5d33023e C++: actually fix Chi total operands 2020-01-16 07:15:08 -08:00
Geoffrey White
ef47563139 CPP: Support flow of pointed-to things through function calls. 2020-01-16 11:08:19 +00:00
Mathias Vorreiter Pedersen
87c59e0017 C++: Overrideable taint sources in DefaultTaintTracking 2020-01-16 11:10:43 +01:00
Mathias Vorreiter Pedersen
603b1c26a7 Merge branch 'master' into ast-classes-should-not-be-abstract 2020-01-16 10:16:03 +01:00
Dave Bartolomeo
48301e1187 Merge pull request #2594 from rdmarsh2/ir-overlappingVariableMemoryLocations
C++: compute overlap on irvars with vvar indexes
2020-01-15 13:06:33 -07:00
Geoffrey White
04af2ace94 CPP: Add DataFlow to strdup. 2020-01-15 19:18:37 +00:00
Geoffrey White
9b5be995d2 CPP: Split Strdup model into its own class and file. 2020-01-15 18:38:33 +00:00
Robert Marsh
a91f10fe40 Merge pull request #2629 from dbartol/dbartol/missing-vvars
C++/C#: Fix missing virtual variables
2020-01-15 08:32:43 -08:00
Tom Hvitved
f7278d36e1 Merge pull request #2498 from aschackmull/java/taint-getter
Java/C++/C#: Add support for taint-getter/setter summaries in data flow.
2020-01-15 09:55:19 +01:00
Dave Bartolomeo
e60f902c36 C++/C#: Fix missing virtual variables
The aliased SSA code was assuming that, for every automatic variable, there would be at least one memory access that reads or writes the entire variable. We've encountered a couple cases where that isn't true due to extractor issues. As a workaround, we now always create the `VariableMemoryLocation` for every local variable.

I've also added a sanity test to detect this condition in the future.

Along the way, I had to fix a perf issue in the PrintIR code. When determining the ID of a result based on line number, we were considering all `Instruction`s generated for a particular line, regardless of whether they were all in the same `IRFunction`. In addition, the predicate had what appeared to be a bad join order that made it take forever on large snapshots. I've scoped it down to just consider `Instruction`s in the same function, and outlined that predicate to fix the join order issue. This causes some numbering changes, but they're for the better. I don't think there was actually any nondeterminism there before, but now the numbering won't depend on the number of instantiations of a template, either.
2020-01-14 17:57:15 -07:00
Robert Marsh
42be28b211 C++: autoformat 2020-01-14 13:17:57 -08:00
Robert Marsh
5a5832b7de Merge pull request #2569 from jbj/ir-total-chi-flow
C++: IR data flow through total chi operands
2020-01-14 12:47:58 -08:00
Anders Schack-Mulligen
241b8a05e4 Java/C++/C#: Address review comment. 2020-01-14 11:59:55 +01:00
Anders Schack-Mulligen
041bcc5812 Java/C++/C#: Small perf improvement and simplification. 2020-01-13 17:00:56 +01:00
Robert Marsh
d2b225790a C++: fix chi instr operands to chi instrs 2020-01-09 11:48:18 -08:00
Robert Marsh
5007fd2aa8 C++: Autoformat and sync 2020-01-08 12:49:51 -08:00
Robert Marsh
e416d75f6f C++: add noopt on getPhiOperandDefinition 2020-01-08 11:36:57 -08:00
Jonas Jensen
8acbb3bfb9 C++: Further simplify a bit
This changes tuple counts!?
2020-01-08 11:36:50 -08:00
Jonas Jensen
5072201b7e C++: Fix join order 2020-01-08 11:36:40 -08:00
Jonas Jensen
838720bef0 C++: de-inline getDefinitionOrChiInstruction
Still has bad join order
2020-01-08 11:36:34 -08:00
Jonas Jensen
3d2cc7bbce C++: make hasPhiOperandDefinition feasible 2020-01-08 11:36:14 -08:00
Jonas Jensen
55f157e06d C++: Fix overlappingVariableMemoryLocations perf
The `overlappingVariableMemoryLocations` predicate was a helper
predicate introduced to fix a join-order issue in
`overlappingIRVariableMemoryLocations`. Unfortunately it caused a
performance issue of its own because it could grow too large. On the
small project (38MB zip) awslabs/s2n there were 181M rows in
`overlappingVariableMemoryLocations`, and it took 134s to evaluate.

The fix is to collapse the two predicates into one and fix join ordering
by including an extra column in the predicates being joined.

In addition, some parameters were reordered to avoid the overhead of
auto-generated `join_rhs` predicates.

Tuple counts of `overlappingVariableMemoryLocations` before:

    623285    ~176%     {2} r1 = JOIN AliasedSSA::isCoveredOffset#fff_120#join_rhs AS L WITH AliasedSSA::isCoveredOffset#fff_120#join_rhs AS R ON FIRST 2 OUTPUT L.<2>, R.<2>
    119138    ~3%       {2} r2 = SCAN AliasedSSA::VariableMemoryLocation::getVirtualVariable_dispred#ff AS I OUTPUT I.<1>, I.<0>
    172192346 ~0%       {2} r3 = JOIN r2 WITH AliasedSSA::hasUnknownOffset#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r2.<1>
    172815631 ~0%       {2} r4 = r1 \/ r3
    172192346 ~0%       {2} r5 = JOIN r2 WITH AliasedSSA::hasUnknownOffset#ff_10#join_rhs AS R ON FIRST 1 OUTPUT r2.<1>, R.<1>
    345007977 ~87%      {2} r6 = r4 \/ r5
                        return r6

Tuple counts of `overlappingIRVariableMemoryLocations` after:

    117021 ~134%     {2} r1 = JOIN AliasedSSA::isCoveredOffset#ffff AS L WITH AliasedSSA::isCoveredOffset#ffff AS R ON FIRST 3 OUTPUT L.<3>, R.<3>
    201486 ~1%       {2} r2 = JOIN AliasedSSA::hasUnknownOffset#fff AS L WITH AliasedSSA::hasVariableAndVirtualVariable#fff AS R ON FIRST 2 OUTPUT L.<2>, R.<2>
    318507 ~26%      {2} r3 = r1 \/ r2
    201486 ~3%       {2} r4 = JOIN AliasedSSA::hasUnknownOffset#fff AS L WITH AliasedSSA::hasVariableAndVirtualVariable#fff AS R ON FIRST 2 OUTPUT R.<2>, L.<2>
    519993 ~92%      {2} r5 = r3 \/ r4
                     return r5
2020-01-08 11:07:20 -08:00
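The effect of joining on an extra column, as described in the commit message above, can be illustrated with a small Python sketch. The relation, its column layout, and its sizes are made up for illustration and are not taken from the AliasedSSA code; the sketch only counts how many rows an equi-join produces when it matches on one column versus two.

```python
from collections import defaultdict


def join_count(left, right, key_len):
    """Count the rows produced by an equi-join on the first key_len columns."""
    index = defaultdict(int)
    for row in right:
        index[row[:key_len]] += 1
    return sum(index[row[:key_len]] for row in left)


# Toy relation (hypothetical): (virtual_variable, variable, location).
# 2 virtual variables x 50 variables x 3 locations = 300 rows.
rows = [
    (vvar, var, f"loc{vvar}_{var}_{i}")
    for vvar in range(2)
    for var in range(50)
    for i in range(3)
]

# Joining on the virtual variable alone pairs every location with every
# other location under the same virtual variable.
one_col = join_count(rows, rows, 1)

# Adding the variable as a second join column restricts the pairs to
# locations of the same variable.
two_col = join_count(rows, rows, 2)

print(one_col, two_col)
```

On this toy data the one-column join yields 45000 rows and the two-column join 900, which mirrors in miniature the drop in tuple counts shown above.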
Robert Marsh
9b361f1701 Merge pull request #2601 from dbartol/dbartol/OpcodeProperties
C++: Consolidate opcode properties onto `Opcode` class
2020-01-08 11:05:41 -08:00
Dave Bartolomeo
690d23d15e C++: Fix formatting 2020-01-07 13:23:36 -07:00
Dave Bartolomeo
9df37399f8 C++: Consolidate opcode properties onto Opcode class
Previously, we had several predicates on `Instruction` and `Operand` whose values were determined solely by the opcode of the instruction. For large snapshots, this meant that we would populate large tables mapping each of the millions of `Instruction`s to the appropriate value, times three (once for each IR flavor).

This change moves all of these opcode properties onto `Opcode` itself, with inline wrapper predicates on `Instruction` and `Operand` where necessary. On smaller snapshots, like ChakraCore, performance is a wash, but this did speed up Wireshark by about 4%.

Even ignoring the modest performance benefit, having these properties defined on `Opcode` seems like a better organization than having them on `Instruction` and `Operand`.
2020-01-07 13:17:27 -07:00
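A minimal Python sketch of the reorganization described above, assuming a hypothetical three-opcode IR: the property is stored once per opcode, and the instruction exposes only a thin wrapper, so no per-instruction table is needed. The `reads_memory` property and the opcode set are illustrative and do not reproduce the QL library's actual predicates.

```python
from dataclasses import dataclass
from enum import Enum


class Opcode(Enum):
    LOAD = "Load"
    STORE = "Store"
    ADD = "Add"

    def reads_memory(self) -> bool:
        # Hypothetical property: decided by the opcode alone.
        return self is Opcode.LOAD


@dataclass(frozen=True)
class Instruction:
    ident: int
    opcode: Opcode

    def reads_memory(self) -> bool:
        # Thin wrapper: delegates to the opcode instead of storing
        # a separate value for each instruction.
        return self.opcode.reads_memory()


instructions = [Instruction(i, op) for i, op in enumerate(Opcode)]
readers = [ins.ident for ins in instructions if ins.reads_memory()]
print(readers)
```

The property table has one entry per opcode regardless of how many instructions exist, which is the space saving the commit message points to.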
Mathias Vorreiter Pedersen
633c42ced0 C++: Removed comment 2020-01-07 14:41:37 +01:00
Robert Marsh
ba9741f552 C++: compute overlap on irvars with vvar indexes 2020-01-06 09:14:03 -08:00
Anders Schack-Mulligen
9ba169b346 Java: Fix bad join-order. 2020-01-06 16:52:06 +01:00
Jonas Jensen
4830e43b3e C++: Fix overlappingVariableMemoryLocations perf
The `overlappingVariableMemoryLocations` predicate was a helper
predicate introduced to fix a join-order issue in
`overlappingIRVariableMemoryLocations`. Unfortunately it caused a
performance issue of its own because it could grow too large. On the
small project (38MB zip) awslabs/s2n there were 181M rows in
`overlappingVariableMemoryLocations`, and it took 134s to evaluate.

The fix is to collapse the two predicates into one and fix join ordering
by including an extra column in the predicates being joined.

In addition, some parameters were reordered to avoid the overhead of
auto-generated `join_rhs` predicates.

Tuple counts of `overlappingVariableMemoryLocations` before:

    623285    ~176%     {2} r1 = JOIN AliasedSSA::isCoveredOffset#fff_120#join_rhs AS L WITH AliasedSSA::isCoveredOffset#fff_120#join_rhs AS R ON FIRST 2 OUTPUT L.<2>, R.<2>
    119138    ~3%       {2} r2 = SCAN AliasedSSA::VariableMemoryLocation::getVirtualVariable_dispred#ff AS I OUTPUT I.<1>, I.<0>
    172192346 ~0%       {2} r3 = JOIN r2 WITH AliasedSSA::hasUnknownOffset#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, r2.<1>
    172815631 ~0%       {2} r4 = r1 \/ r3
    172192346 ~0%       {2} r5 = JOIN r2 WITH AliasedSSA::hasUnknownOffset#ff_10#join_rhs AS R ON FIRST 1 OUTPUT r2.<1>, R.<1>
    345007977 ~87%      {2} r6 = r4 \/ r5
                        return r6

Tuple counts of `overlappingIRVariableMemoryLocations` after:

    117021 ~134%     {2} r1 = JOIN AliasedSSA::isCoveredOffset#ffff AS L WITH AliasedSSA::isCoveredOffset#ffff AS R ON FIRST 3 OUTPUT L.<3>, R.<3>
    201486 ~1%       {2} r2 = JOIN AliasedSSA::hasUnknownOffset#fff AS L WITH AliasedSSA::hasVariableAndVirtualVariable#fff AS R ON FIRST 2 OUTPUT L.<2>, R.<2>
    318507 ~26%      {2} r3 = r1 \/ r2
    201486 ~3%       {2} r4 = JOIN AliasedSSA::hasUnknownOffset#fff AS L WITH AliasedSSA::hasVariableAndVirtualVariable#fff AS R ON FIRST 2 OUTPUT R.<2>, L.<2>
    519993 ~92%      {2} r5 = r3 \/ r4
                     return r5
2019-12-27 16:06:24 +01:00
Jonas Jensen
618bf2e29e C++: IR data flow through total chi operands 2019-12-27 11:44:41 +01:00
Jonas Jensen
64c79bf9e1 C++: Deprecate UninitializedNode in IR data flow
It's not used outside of tests, and it's not useful. It will break the
tests when we start allowing flow through chi nodes.
2019-12-27 11:21:33 +01:00
Mathias Vorreiter Pedersen
bb282f403e Fix comments
Co-Authored-By: Jonas Jensen <jbj@github.com>
2019-12-23 12:37:18 +01:00
Mathias Vorreiter Pedersen
11a545e08e C++: Removed abstract classes from binary and assignment operations 2019-12-23 11:52:12 +01:00
Mathias Vorreiter Pedersen
46421efcef C++: Rename crement operations 2019-12-23 10:41:14 +01:00
Jonas Jensen
7e84453ec9 Merge pull request #2542 from geoffw0/datetime
C++: Sort through the leap year and Japanese era queries
2019-12-23 10:13:12 +01:00
Dave Bartolomeo
5b5d2f2b67 Merge pull request #2154 from rdmarsh2/rdmarsh/cpp/ir-callee-side-effects
C++: add InitializeIndirection for pointer params
2019-12-20 13:13:54 -07:00
Mathias Vorreiter Pedersen
006c8bb0cd C++: Remove abstract classes from unary operations 2019-12-20 18:38:09 +01:00
yo-h
cc7f98e0f6 Merge pull request #2555 from hvitved/csharp/xml-sync
C#: Sync `XML.qll` with other languages
2019-12-20 09:03:55 -05:00
Jonas Jensen
de55a6846f Merge pull request #2204 from alexet/cache-to-string
Cache the computation of core toString predicates for cpp c# and java.
2019-12-20 14:54:46 +01:00
Jonas Jensen
939979ddef Merge branch 'master' into overflowcalc 2019-12-19 14:12:00 +01:00
Jonas Jensen
a13748f484 Merge pull request #2259 from rdmarsh2/rdmarsh/cpp/default-taint-tracking-sources
C++: move sources into DefaultTaintTracking.qll
2019-12-19 14:09:41 +01:00