Commit Graph

3579 Commits

Author SHA1 Message Date
Mathias Vorreiter Pedersen
b20afa6370 Merge pull request #2979 from jbj/GVN-noinline
C++: pragma[noinline] on GVN charpred
2020-03-04 12:19:27 +01:00
Jonas Jensen
60bcbf477a C++: pragma[noinline] on GVN charpred
The charpred of class `GVN` in `ASTValueNumbering.qll` got inlined into
the member predicate `getAnInstruction` and caused a tuple explosion on
Wireshark in the query `StrncpyFlippedArgs.ql`.

I interrupted the predicate after 10 minutes and got these intermediate
tuple counts:

    (5208s) Tuple counts for ASTValueNumbering::GVN::getAnInstruction_dispred#ff:
    8754900909 ~5%          {3} r1 = JOIN ValueNumberingInternal::tvalueNumber#ff_10#join_rhs AS L WITH ValueNumberingInternal::tvalueNumber#ff_10#join_rhs AS R ON FIRST 1 OUTPUT R.<1>, L.<1>, L.<0>
    4390274632 ~150085%     {2} r2 = JOIN r1 WITH project#SSAConstruction::Cached::getInstructionUnconvertedResultExpression AS R ON FIRST 1 OUTPUT r1.<2>, r1.<1>
                            return r2

After this change, the `getAnInstruction` predicate is itself inlined,
like it should be. The new non-inlined charpred takes 2.1s and has these
tuple counts:

    (2s) Tuple counts for ASTValueNumbering::GVN#f:
    9158442  ~117%     {1} r1 = JOIN project#SSAConstruction::Cached::getInstructionUnconvertedResultExpression AS L WITH ValueNumberingInternal::tvalueNumber#ff@staged_ext AS R ON FIRST 1 OUTPUT R.<1>
                       return r1
2020-03-04 10:34:05 +01:00
Robert Marsh
1e3419fd60 C++/C#: generate IR for funcs excluded in PrintIR
Previously, functions excluded from PrintIR would not have IR
generated. This sometimes affected escacpe analysis of functions that
were printed.
2020-03-03 14:34:08 -08:00
Nick Rolfe
c2db3d7984 Merge pull request #2968 from igfoo/unused_types
C++: Update tests following extractor no longer extracting some unused types
2020-03-03 16:03:40 +00:00
Jonas Jensen
30b43b9322 C++: Tests for variables with ambiguous types 2020-03-03 14:45:04 +01:00
Jonas Jensen
88c74b2a4b Merge pull request #2917 from MathiasVP/inexact-is-chi-for-all-aliased-memory
C++: `isChiForAllAliasedMemory` recursion through inexact Phi operands
2020-03-03 14:25:49 +01:00
Jonas Jensen
4f23acf080 Merge pull request #2957 from MathiasVP/dataflow-dispatch-same-num-args
C++: Only return functions that match arguments in DataFlowDispatch::viableCallable
2020-03-03 14:19:26 +01:00
Ian Lynagh
5b0cb10f9b C++: Update tests following extractor no longer extracting some unused types 2020-03-03 01:30:18 +00:00
Mathias Vorreiter Pedersen
0b082a4089 C++: Only do argument check for 2020-03-02 16:22:05 +01:00
Jonas Jensen
76066afe6a C++: Add getCanonicalQLClass overrides in Variable 2020-03-02 13:49:12 +01:00
Mathias Vorreiter Pedersen
9df7a7a87e Merge branch 'master' into inexact-is-chi-for-all-aliased-memory 2020-03-02 12:34:24 +01:00
Mathias Vorreiter Pedersen
20529b4436 C++/C#: Sync identical files 2020-03-02 12:15:54 +01:00
Mathias Vorreiter Pedersen
14d836ba59 C++: should only match those functions that has the same number of parameters as the call has arguments. 2020-03-02 12:15:28 +01:00
Mathias Vorreiter Pedersen
3a3aa75121 Merge pull request #2935 from jbj/MissingEnumCaseInSwitch-perf
C++: Optimize EnumSwitch.getAMissingCase
2020-03-02 10:32:44 +01:00
Jonas Jensen
dab6691eb0 Merge pull request #2900 from dbartol/dbartol/void-buffer
C++: Better fix for `void` type on buffer access
2020-03-02 09:00:15 +01:00
Jonas Jensen
ec85f9f1a1 Merge pull request #2797 from rdmarsh2/rdmarsh/cpp/malloc-alias-locations
C++: Support dynamic memory allocations in IR alias analysis
2020-03-02 08:49:59 +01:00
Jonas Jensen
30b5db3b7f C++: autoformat fixup 2020-03-02 08:48:54 +01:00
Jonas Jensen
bbc57878dd C++: Performance fix for large basic blocks
The code is now quadratic in the number of statements in a basic block,
whereas before it was quadratic in the number of _control-flow nodes_ in
a basic block.
2020-03-02 08:46:58 +01:00
Robert Marsh
28ee756c6a Merge pull request #2934 from geoffw0/add_tests
C++: Test and typos.
2020-02-28 15:12:32 -08:00
Geoffrey White
82191102d9 Merge pull request #2930 from jbj/getUnconverted
C++: Add Expr.getUnconverted predicate
2020-02-28 14:25:36 +00:00
Jonas Jensen
dfe1a7e2f0 C++: Avoid iDominates* in Overflow.qll
The `iDominates` relation is directly on control-flow nodes, and its
transitive closure is far too large. It got compiled into a recursion
rather than `fastTC`, and I've observed that recursion to take about an
hour on a medium-size customer snapshot.

The fix is to check for dominance at the basic-block level.
2020-02-28 10:48:23 +01:00
Geoffrey White
4ca57db553 Merge pull request #2929 from Semmle/rc/1.23
Merge rc/1.23 into master
2020-02-28 09:30:20 +00:00
Jonas Jensen
0be13e45f2 Merge remote-tracking branch 'upstream/master' into MissingEnumCaseInSwitch-perf 2020-02-28 09:57:29 +01:00
semmle-qlci
ec90627a64 Merge pull request #2909 from yo-h/experimental
Approved by aschackmull, jbj, max-schaefer, tausbn
2020-02-28 03:15:58 +00:00
Dave Bartolomeo
b0fb16c068 C++/C#: Fix formatting 2020-02-27 13:44:02 -05:00
Geoffrey White
729c310eb9 C++: More typos. 2020-02-27 15:49:59 +00:00
Jonas Jensen
d686347315 C++: Optimize EnumSwitch.getAMissingCase
The `cpp/missing-case-in-switch` performed badly on some snapshots, to
the extent where it was as slow as the most expensive IR stages
(example: ChakraCore). This commit makes it faster, removing a
`pragma[noopt]` along the way.

The intermediate tuple counts on a customer codebase drop from 84M to
3M, while the content hash of `getAMissingCase` is the same.

Before:

    (124s) Tuple counts for Stmt::EnumSwitch::getAMissingCase#ff#antijoin_rhs:
    20867789 ~0%       {3} r1 = JOIN Stmt::SwitchStmt::getASwitchCase_dispred#ff AS L WITH Stmt::EnumSwitch::getAMissingCase#ff#shared AS R ON FIRST 1 OUTPUT L.<1>, R.<0>, R.<1>
    20122830 ~0%       {3} r2 = JOIN r1 WITH Stmt::SwitchCase::getExpr_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r1.<1>, r1.<2>
    20122830 ~0%       {3} r3 = JOIN r2 WITH Expr::Expr::getValue_dispred#ff AS R ON FIRST 1 OUTPUT r2.<2>, r2.<1>, R.<1>
    83961918 ~0%       {4} r4 = JOIN r3 WITH Enum::EnumConstant::getInitializer_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r3.<1>, r3.<0>, r3.<2>
    83961918 ~0%       {4} r5 = JOIN r4 WITH initialisers AS R ON FIRST 1 OUTPUT R.<2>, r4.<3>, r4.<1>, r4.<2>
    234348   ~185%     {2} r6 = JOIN r5 WITH Expr::Expr::getValue_dispred#ff AS R ON FIRST 2 OUTPUT r5.<2>, r5.<3>
                       return r6
    ...
    (124s) Tuple counts for Stmt::EnumSwitch::getAMissingCase#ff:
    663127 ~4%     {2} r1 = Stmt::EnumSwitch::getAMissingCase#ff#shared AS L AND NOT Stmt::EnumSwitch::getAMissingCase#ff#antijoin_rhs AS R(L.<0>, L.<1>)
                   return r1
    (124s) Registering Stmt::EnumSwitch::getAMissingCase#ff + [] with content 2060ff326cvhihcsvoph6k9divuv4
    (124s)  >>> Wrote relation Stmt::EnumSwitch::getAMissingCase#ff with 663127 rows and 2 columns.

After:

    (5s) Tuple counts for Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs:
    746029   ~0%       {2} r1 = JOIN Stmt::EnumSwitch::getAMissingCase_dispred#ff#shared AS L WITH Enum::Enum::getAnEnumConstant_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, L.<1>
    3116197  ~2%       {3} r2 = JOIN r1 WITH Enum::EnumConstant::getInitializer_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r1.<1>, r1.<0>
    3116197  ~0%       {3} r3 = JOIN r2 WITH initialisers AS R ON FIRST 1 OUTPUT R.<2>, r2.<1>, r2.<2>
    3116197  ~311%     {3} r4 = JOIN r3 WITH Expr::Expr::getValue_dispred#ff AS R ON FIRST 1 OUTPUT r3.<1>, R.<1>, r3.<2>
    234348   ~185%     {2} r5 = JOIN r4 WITH Stmt::EnumSwitch::matchesValue#ff AS R ON FIRST 2 OUTPUT r4.<0>, r4.<2>
                       return r5
    (5s) Registering Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs + [] with content 173483d71508vl534mvlr1g0ehi12
    (5s)  >>> Wrote relation Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs with 82902 rows and 2 columns.
    (5s) Starting to evaluate predicate Stmt::EnumSwitch::getAMissingCase_dispred#ff/2@ae4c0b
    (5s) Tuple counts for Stmt::EnumSwitch::getAMissingCase_dispred#ff:
    746029 ~2%     {2} r1 = JOIN Stmt::EnumSwitch::getAMissingCase_dispred#ff#shared AS L WITH Enum::Enum::getAnEnumConstant_dispred#ff AS R ON FIRST 1 OUTPUT L.<1>, R.<1>
    663127 ~4%     {2} r2 = r1 AND NOT Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs AS R(r1.<0>, r1.<1>)
                   return r2
    (5s) Registering Stmt::EnumSwitch::getAMissingCase_dispred#ff + [] with content 2060ff326cvhihcsvoph6k9divuv4
    (5s)  >>> Wrote relation Stmt::EnumSwitch::getAMissingCase_dispred#ff with 663127 rows and 2 columns.
2020-02-27 16:27:52 +01:00
Geoffrey White
f8a61ffc4c C++: Expand the test as described in ODASA-640. 2020-02-27 15:26:53 +00:00
Geoffrey White
0a7d9db335 C++: Add example described in ODASA-640. 2020-02-27 15:23:16 +00:00
Geoffrey White
e6d35d314d C++: Fix typo. 2020-02-27 15:23:10 +00:00
Jonas Jensen
c9e56d13f7 C++: Add Expr.getUnconverted predicate
This gets rid of the expensive predicate
`#Cast::Conversion::getExpr_dispred#ffPlus`, I've observed to cause
memory pressure on large databases.
2020-02-27 14:52:42 +01:00
Anders Schack-Mulligen
67d386b5ba C++/C#: Add synchronization. 2020-02-27 14:10:16 +01:00
Robert Marsh
95a762c987 Merge master for submodule update 2020-02-26 13:44:26 -08:00
Robert Marsh
4333fe7905 Merge branch 'master' into rdmarsh/cpp/ir-flow-through-outparams 2020-02-26 13:15:27 -08:00
Mathias Vorreiter Pedersen
1bee0ffe3b C++: Autoformat 2020-02-26 12:09:21 +01:00
Jonas Jensen
5f6d07dd57 C++: Fix performance of UnsignedGEZero.ql
This query used two fastTC operations that were already somewhat
inefficient on their own but could send the evaluator into an OOM loop
when run in parallel without enough RAM.

The fix is to recurse manually, starting just from the expressions that
are potential candidates for alerts.
2020-02-26 11:32:41 +01:00
Mathias Vorreiter Pedersen
d942a3b54a C++: Change definition of isChiForAllAliasedMemory to recurse through inexact PhiInstructions 2020-02-26 10:21:27 +01:00
Jonas Jensen
db33c360bc Merge pull request #2910 from aschackmull/dataflow/cleanup
Java/C++: Minor dataflow cleanup.
2020-02-25 12:47:10 +01:00
Mathias Vorreiter Pedersen
b9bb2ec0ac Merge pull request #2864 from jbj/DefaultTaintTracking-cached
C++: Cache DefaultTaintTracking
2020-02-25 10:15:43 +01:00
Anders Schack-Mulligen
fba8772411 Java/C++: Minor dataflow cleanup. 2020-02-25 09:40:25 +01:00
yo-h
43bcd5b26c Add guidelines for experimental CodeQL queries and libraries 2020-02-24 15:08:31 -05:00
Robert Marsh
ea4ca31fb3 Merge pull request #2907 from geoffw0/argvlocal
C++: Modify the argvlocal tests
2020-02-24 10:55:21 -08:00
Geoffrey White
4af0193c98 C++: Modify the argvlocal tests. 2020-02-24 16:51:47 +00:00
Geoffrey White
9f271949d5 C++: Adjust layout of the argvlocal test. 2020-02-24 15:52:31 +00:00
Geoffrey White
c641a31640 C++: Refine nodeIsBarrierIn using getNodeForSource. 2020-02-24 14:39:31 +00:00
Geoffrey White
843b72b11a C++: hasGlobalOrStdName(). 2020-02-24 14:12:19 +00:00
Jonas Jensen
2d9df70abc Merge pull request #2887 from MathiasVP/fix-ir-gen-switch
C++: Fix IR generation for switch statements
2020-02-24 13:29:27 +01:00
Jonas Jensen
ae68878476 C++: Cache DefaultTaintTracking
This should speed up the overall suite, where `DefaultTaintTracking` is
used in several queries.
2020-02-24 13:03:34 +01:00
Geoffrey White
a0e839d3f1 C++: Block duplicate taint results from 'gets' and other functions. 2020-02-24 11:53:22 +00:00
Geoffrey White
06e649fc30 C++: Add support for fgetws. 2020-02-24 11:47:32 +00:00