Commit Graph

4510 Commits

Author SHA1 Message Date
Ziemowit Laski
a0cfe826ee [CPP-340] Replace whitelist with f.getBlock() test. Fix doc comment. 2019-04-29 09:58:31 -07:00
Jonas Jensen
5fd425ae95 C++: fix IRBlock::backEdgeSuccessor performance
The `IRBlock::backEdgeSuccessor` predicate, in its three copies, had
become slow:

    6:IRBlock::Cached::backEdgeSuccessor#fff ...... 1m1s
    7:IRBlock::Cached::backEdgeSuccessor#2#fff .... 52.3s
    8:IRBlock::Cached::backEdgeSuccessor#3#fff .... 26.4s

The slow part was finding all the nodes involved in cycles in the
`forwardEdgeRaw` graph. This was done with `forwardEdgeRaw+(pred, pred)`,
but that got compiled into a materialization of `forwardEdgeRaw+`, which
is a huge relation with 1,816,752,107 rows on Wireshark:

    (1474s) Starting to evaluate predicate IRBlock::Cached::backEdgeSuccessor#3#fff
    (1501s) Tuple counts:
    0          ~0%     {2} r1 = SELECT #IRBlock::Cached::forwardEdgeRaw#3#ffPlus ON FIELDS #IRBlock::Cached::forwardEdgeRaw#3#ffPlus.<0>=#IRBlock::Cached::forwardEdgeRaw#3#ffPlus.<1>
    0          ~0%     {1} r2 = SCAN r1 OUTPUT FIELDS {r1.<0>}
    0          ~0%     {3} r3 = JOIN r2 WITH IRBlock::Cached::blockSuccessor#6#fff ON r2.<0>=IRBlock::Cached::blockSuccessor#6#fff.<0> OUTPUT FIELDS {r2.<0>,IRBlock::Cached::blockSuccessor#6#fff.<1>,IRBlock::Cached::blockSuccessor#6#fff.<2>}
    12411      ~7%     {3} r4 = IRBlock::Cached::backEdgeSuccessorRaw#3#fff \/ r3
                       return r4
    (1501s)  >>> Relation IRBlock::Cached::backEdgeSuccessor#3#fff: 12411 rows using 0 MB

The problem is the `SELECT`. It's fast to join on a fastTC result once
we know what we're looking for, so this fix materializes the identity
relation on `IRBlock` and joins with that so the fastTC ends up on the
RHS of a join, where it's fast. I had to introduce a helper predicate
because even with `noopt` I couldn't get `pred = pred2` to come _before_
`forwardEdgeRaw+(pred, pred2)`. The predicate now takes less than a
second to evaluate:

    (539s) Starting to evaluate predicate IRBlock::Cached::backEdgeSuccessor#fff
    (539s)  >>> Relation IRBlock::Cached::blockImmediatelyDominates#ff: 574677 rows using 0 MB
    (539s) 	 ... created with 574677 rows and 2 columns.
    (539s) Tuple counts:
    702445     ~1%     {2} r1 = SELECT IRBlock::Cached::blockIdentity#ff ON FIELDS IRBlock::Cached::blockIdentity#ff.<0>=IRBlock::Cached::blockIdentity#ff.<1>
    702445     ~1%     {2} r2 = SCAN r1 OUTPUT FIELDS {r1.<0>,r1.<0>}
    0          ~0%     {1} r3 = JOIN r2 WITH #IRBlock::Cached::forwardEdgeRaw#ffPlus ON r2.<0>=#IRBlock::Cached::forwardEdgeRaw#ffPlus.<0> AND r2.<1>=#IRBlock::Cached::forwardEdgeRaw#ffPlus.<1> OUTPUT FIELDS {r2.<0>}
    0          ~0%     {3} r4 = JOIN r3 WITH IRBlock::Cached::blockSuccessor#2#fff ON r3.<0>=IRBlock::Cached::blockSuccessor#2#fff.<0> OUTPUT FIELDS {r3.<0>,IRBlock::Cached::blockSuccessor#2#fff.<1>,IRBlock::Cached::blockSuccessor#2#fff.<2>}
    20487      ~0%     {3} r5 = IRBlock::Cached::backEdgeSuccessorRaw#fff \/ r4
                       return r5
    (539s)  >>> Relation IRBlock::Cached::backEdgeSuccessor#fff: 20487 rows using 0 MB
2019-04-29 15:44:50 +02:00
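
The idea in 5fd425ae95 can be sketched outside QL. The following is a minimal Python illustration, not the actual QL change: all function names are hypothetical, and the edge list stands in for `forwardEdgeRaw`. The first function builds the whole transitive closure and then filters its diagonal, which mirrors the slow `SELECT`. The second starts from the identity pairs and asks the reachability question only for those pairs, which mirrors putting the closure on the right-hand side of a join.

```python
from collections import defaultdict


def build_successors(edges):
    """Map each node to the set of its direct successors."""
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
    return succ


def reachable_from(succ, start):
    """Return all nodes reachable from `start` via one or more edges."""
    seen = set()
    stack = list(succ.get(start, ()))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(succ.get(n, ()))
    return seen


def cycle_nodes_materialized(edges):
    """Slow shape: materialize every (a, b) pair of the closure, then keep a == b."""
    succ = build_successors(edges)
    nodes = {n for e in edges for n in e}
    closure = {(a, b) for a in nodes for b in reachable_from(succ, a)}
    return {a for (a, b) in closure if a == b}


def cycle_nodes_by_lookup(edges):
    """Fast shape: start from the identity pairs and probe the closure per pair."""
    succ = build_successors(edges)
    nodes = {n for e in edges for n in e}
    identity = {(n, n) for n in nodes}  # plays the role of the identity relation
    return {a for (a, b) in identity if b in reachable_from(succ, a)}


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (3, 2), (3, 4)]
    print(sorted(cycle_nodes_materialized(edges)))
    print(sorted(cycle_nodes_by_lookup(edges)))
```

Both functions return the same set. In this toy version they also do similar work, so the sketch shows only the difference in what gets stored: the first holds a pair for every reachable combination, the second never holds more than one pair per node.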
Jonas Jensen
cd7ba176ab C++: iterated dominance frontier algorithm for IR
Use the iterated dominance frontier algorithm to speed up dominance
frontier calculations. The implementation is copied from d310338c9b.

Before this change, the SSA calculations for unaliased and aliased SSA
used 169.9 seconds in total on these predicates:

    7:Dominance::getDominanceFrontier#2#ff .. 49s
    7:Dominance::blockDominates#2#ff ........ 47.5s
    8:Dominance::getDominanceFrontier#ff .... 44.4s
    8:Dominance::blockDominates#ff .......... 29s

After this change, the above predicates are replaced by two copies of
`getDominanceFrontier`, each of which takes less than a second.
2019-04-29 13:01:37 +02:00
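
For readers unfamiliar with the technique named in cd7ba176ab, here is a minimal Python sketch of dominance frontiers and their iteration. It is the textbook formulation, not the QL code from the commit (which is not shown here), and it assumes the immediate dominators are already known. The function names and the example control-flow graph are hypothetical.

```python
def dominance_frontiers(preds, idom):
    """Compute the dominance frontier of every block.

    preds: block -> list of predecessor blocks
    idom:  block -> immediate dominator (None for the entry block)
    """
    df = {b: set() for b in idom}
    for b, ps in preds.items():
        if len(ps) < 2:
            continue  # only join points can appear in a frontier
        for p in ps:
            runner = p
            # Walk up the dominator tree until reaching b's immediate dominator.
            while runner != idom[b]:
                df[runner].add(b)
                runner = idom[runner]
    return df


def iterated_dominance_frontier(df, blocks):
    """Close the frontier of `blocks` under repeated application of df."""
    result = set()
    worklist = list(blocks)
    while worklist:
        b = worklist.pop()
        for f in df[b]:
            if f not in result:
                result.add(f)
                worklist.append(f)
    return result


if __name__ == "__main__":
    # Hypothetical CFG: 0 -> 1, 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4, 4 -> 1, 4 -> 5
    preds = {1: [0, 4], 2: [1], 3: [1], 4: [2, 3], 5: [4]}
    idom = {0: None, 1: 0, 2: 1, 3: 1, 4: 1, 5: 4}
    df = dominance_frontiers(preds, idom)
    print(sorted(iterated_dominance_frontier(df, {2})))
```

In SSA construction, the iterated frontier of the blocks that define a variable is the set of blocks that need a phi node for it. Starting from a few definition blocks means the worklist visits only the blocks that matter, with no all-pairs dominance relation to compute first.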
ian-semmle
5fd10b56a2 Merge pull request #1280 from jbj/noTarget-workaround
C++: Work around extractor issue CPP-383
2019-04-29 10:47:06 +01:00
Jonas Jensen
c112a4dd20 Merge pull request #1285 from geoffw0/rnperf
CPP: Improve performance of RedundantNullCheckSimple.ql
2019-04-29 08:41:43 +02:00
Ziemowit Laski
4a760b1561 [CPP-340] Delete ArgumentsToImplicit.ql and associated files.
Reduce MistypedFunctionArguments.ql precision to `medium`.
2019-04-28 13:49:46 -07:00
Jonas Jensen
bdb678a318 Merge pull request #1267 from rdmarsh2/rdmarsh/cpp/def-by-ref-taint
C++: add taint edges to DefinitionByReferenceNode
2019-04-26 08:50:20 +02:00
Robert Marsh
f5c57b77e6 C++: fix whitespace 2019-04-25 16:16:27 -07:00
Geoffrey White
63b6942d0d CPP: Improve performance of RedundantNullCheckSimple.ql. 2019-04-25 15:56:49 +01:00
Jonas Jensen
48a3385809 C++: Work around extractor issue CPP-383
This fixes `PointlessComparison.ql` on https://github.com/an-tao/drogon.
The QL is a bit obfuscated because it looks for a pattern that's
impossible according to the dbscheme. There is no accompanying test
because we haven't been able to boil this problem down to a simple test
case. If we could, we'd fix it directly in the extractor instead.
2019-04-25 15:05:27 +02:00
Ziemowit Laski
ac58bdfc58 [CPP-340] For MistypedFunctionArguments.ql, add support for pointers to pointers and pointers to arrays. 2019-04-24 14:54:01 -07:00
Jonas Jensen
1dcfd21a5c Merge pull request #1264 from geoffw0/redundantnullperf
CPP: Add qhelp for RedundantNullCheckSimple.ql.
2019-04-24 10:25:23 +02:00
Robert Marsh
919f5c616f C++: comment and test for taint flow via memcpy 2019-04-23 11:17:18 -07:00
Geoffrey White
6234b26496 CPP: Make some repairs manually. 2019-04-23 14:45:27 +01:00
Geoffrey White
e395f5215f CPP: Autoformat 'Critical'. 2019-04-23 14:45:27 +01:00
Robert Marsh
262f724235 C++: add taint edges to DefinitionByReferenceNode 2019-04-22 10:39:02 -07:00
Robert Marsh
45a35a8572 Merge pull request #1265 from rdmarsh2/rdmarsh/cpp/gvn-string-pooling
C++: string pooling in IR value numbering
2019-04-22 09:29:44 -07:00
Ziemowit Laski
36b2c14f88 [CPP-340] Minor formatting tweaks 2019-04-19 11:46:54 -07:00
Ziemowit Laski
62b030d27f [CPP-340] Add a fourth query, ArgumentsToImplicit.ql, to deal strictly with implicitly declared
functions. TooManyArguments.ql will now deal with explicitly declared/prototyped functions.
2019-04-18 17:56:41 -07:00
Robert Marsh
3907ef98a3 C++: value number string constants 2019-04-18 16:14:54 -07:00
Robert Marsh
c6f01265be Merge pull request #1263 from geoffw0/bufferoverflowqueries
CPP: Resolve overlap between OverflowCalculated.ql and NoSpaceForZeroTerminator.ql
2019-04-18 13:21:57 -04:00
Geoffrey White
eaed0004a3 CPP: Add qhelp for RedundantNullCheckSimple.ql. 2019-04-18 12:47:07 +01:00
Geoffrey White
57a4e52b47 CPP: Remove the overlap between these two queries. 2019-04-18 10:33:33 +01:00
Geoffrey White
ca6ba36d87 CPP: Unify and improve the MallocCall classes. 2019-04-18 10:30:18 +01:00
Max Schaefer
599185e125 CPP: Fix two doc comments. 2019-04-17 10:49:38 +01:00
Geoffrey White
f33b24c917 Merge pull request #1239 from jbj/qlformat-1
C++: Autoformat QL code in Architecture and Best Practices
2019-04-17 09:56:29 +01:00
Ziemowit Laski
65130c40ab [CPP-340] Add white list (for false positive suppression) to TooManyArguments.ql 2019-04-16 14:02:34 -07:00
Robert Marsh
09d0548c81 Merge pull request #1237 from geoffw0/commentedoutcode2
CPP: Fix FPs from detecting commented out preprocessor logic
2019-04-16 10:31:42 -07:00
Ziemowit Laski
61c91b67aa [CPP-340] Refactor MistypedFunctionArguments.ql further. 2019-04-14 11:31:10 -07:00
Ziemowit Laski
b58f414ede [CPP-340] Add more test cases; exclude K&R definitions of functions when looking
up ()-declarations; refactor QL code.
2019-04-12 17:25:33 -07:00
Jonas Jensen
29aa5f550c C++: Tidy up code so it looks good after qlformat 2019-04-12 10:43:24 +02:00
Geoffrey White
1e0e3192bb CPP: Restrict to #elif, #else, #endif. 2019-04-11 15:14:21 +01:00
Jonas Jensen
6049c2ccfd C++: Autoformat Architecture + Best Practices 2019-04-11 14:27:07 +02:00
Geoffrey White
4a8b4b32d5 CPP: Fix indentation. 2019-04-11 11:38:50 +01:00
Geoffrey White
2c0ccf4a85 CPP: Exclude unusual header files such as config.h. 2019-04-11 11:28:45 +01:00
Geoffrey White
f381768a1e CPP: Create HeaderFile.noTopLevelCode from existing logic. 2019-04-11 11:21:53 +01:00
Geoffrey White
9e6b178d48 CPP: Resolve #endif FPs. 2019-04-11 11:05:53 +01:00
Dave Bartolomeo
878cdf7cb6 C++: Fix false positive in PointlessComparison
We avoid putting a variable into SSA if its address is ever taken in a way that could allow mutation of the variable via indirection. We currently just look to see if the address is either "pointer to non-const" or "reference to non-const". However, if the address was cast to an integral type (e.g. `uintptr_t n = (uintptr_t)&x;`), we were treating it as unescaped. This change makes the conservative assumption that casting a pointer to an integer may result in the pointed-to value being modified later.

This fixes a customer-reported false positive (#2 from https://discuss.lgtm.com/t/2-false-positives-in-c-for-comparison-is-always-same/1943)
2019-04-11 01:56:22 -07:00
Ziemowit Laski
d76138f189 [CPP-340] Remove use of getUnderlyingType() predicate as it does
not appear necessary. Correct comment to refer to
arguments rather than parameters.
2019-04-10 10:51:22 -07:00
Ziemowit Laski
dc7497835e [CPP-340] Make the query more strict (again). 2019-04-10 09:55:37 -07:00
Tom Hvitved
813dfc6417 C++: Generalize data-flow library in preparation for C# adoption 2019-04-10 13:05:39 +02:00
Geoffrey White
5101a5bc3d Merge pull request #1056 from jbj/SimpleRangeAnalysis-use-after-cast
C++: Fix use-after-cast bug in SimpleRangeAnalysis
2019-04-10 11:04:20 +01:00
Robert Marsh
75ab311c3a Merge pull request #1223 from geoffw0/commentedoutcode
CPP: Detect commented out preprocessor logic
2019-04-09 16:16:19 -04:00
Robert Marsh
c9fbbfe7d8 Merge pull request #984 from rdmarsh2/rdmarsh/cpp/ir-stmtexpr
C++: add support for GNU StmtExpr in IR
2019-04-09 12:54:35 -04:00
Geoffrey White
13ed50f049 CPP: Improve the regexp. 2019-04-09 13:08:31 +01:00
Geoffrey White
ddb1b0ac1c CPP: Declaration -> definition. 2019-04-09 12:35:20 +01:00
Jonas Jensen
fd4967e6f1 C++: Fix SnprintfOverflow issues
Requiring strict inclusion between types turned out to cause false
positives in `SnprintfOverflow`, which relied indirectly on
`RangeAnalysisUtils::linearAccessImpl` to identify acceptable bounds
checks. This query was particularly affected because `snprintf` returns
`int` (signed) but takes `size_t` (unsigned), so conversions are bound
to happen.
2019-04-09 11:05:14 +02:00
Geoffrey White
48fff334da CPP: Detect commented preprocessor code. 2019-04-08 18:17:23 +01:00
Geoffrey White
4d67bd32dd CPP: Move comments explaining implementation details into the body of 'looksLikeCode'. 2019-04-08 18:14:54 +01:00
Geoffrey White
f432f1a03a CPP: Autoformat CommentedOutCode.qll. 2019-04-08 18:00:49 +01:00