Commit Graph

2357 Commits

Author SHA1 Message Date
Jonas Jensen
9ee8a89492 C++ IR: Make TOperand cached
Just like `TInstruction` is cached to prevent re-numbering its tuples in
every IR query, I think `TOperand` should be cached too. I tested it on
the small comdb2 snapshot, where it only saves one second of work when
running a second IR query, but the savings should grow when snapshots
are larger and when there are more IR queries in a suite. Tuple
numbering is mildly quadratic, so it should be good to avoid repeating
it.

Adding these annotations adds three cached stages to the existing four
cached stages of the IR. The new cached stages are small and do not
appear to repeat any work from the other stages, so I see no advantage
to merging them with the existing stages.
2019-07-09 16:07:55 +02:00
Jonas Jensen
83e618d49e C++: Make cpp/comparison-with-wider-type visible
The results from this query look good on real-world projects, so let's
make it visible by default.
2019-07-09 14:48:36 +02:00
Jonas Jensen
0889d5d27a C++ IR: Improve ErrorExpr test
The previous version of the test used `0 = 1;` to test an lvalue-typed
`ErrorExpr`, but the extractor replaced the whole assignment expression
with `ErrorExpr` instead of just the LHS. This variation of the test
only leads to an `ErrorExpr` for the part of the syntax that's supposed
to be an lvalue-typed expression, so that's an improvement.
Unfortunately it still doesn't demonstrate that we can `Store` into an
address computed by an `ErrorExpr`.
2019-07-09 13:35:20 +02:00
Jonas Jensen
4324c97d39 C++: Use Opcode::Error for ErrorExpr translation 2019-07-09 13:26:00 +02:00
Jonas Jensen
a86ddd50de C++ IR: Translate ErrorExpr to NoOp 2019-07-09 13:18:11 +02:00
Jonas Jensen
e2a43eeed6 C++ IR: Tests with ErrorExpr 2019-07-09 13:18:09 +02:00
Jonas Jensen
39854a3f7b C++ IR: guard against cycles in operand graph
This doesn't fix the underlying problem that for some reason there are
cycles in the operand graph on our snapshots of the Linux kernel, but it
ensures that the cycles don't lead to non-termination of
`ConstantAnalysis` and `ValueNumbering`.
2019-07-09 11:00:27 +02:00
Jonas Jensen
da13dc6442 C++ IR: Don't propagate GVN through non-exact Copy
The `ValueNumbering` library is supposed to propagate value numberings
through a `CopyInstruction` only when it's _congruent_, meaning it must
have exact overlap with its source. A `CopyInstruction` can be a
`LoadInstruction`, a `StoreInstruction`, or a `CopyValueInstruction`.
The latter is also a `UnaryInstruction`, and the value numbering rule
for `UnaryInstruction` applied to it as well.

This meant that value numbering would propagate even through a
non-congruent `CopyValueInstruction`. That's semantically wrong but
probably only an issue in very rare circumstances, and it should get
corrected when we change the definition of `getUnary` to require
congruence.

What's worse is the performance implications. It meant that the value
numbering IPA witness could take two different paths through every
`CopyValueInstruction`. If multiple `CopyValueInstruction`s were
chained, this would lead to an exponential number of variable numbers
for the same `Instruction`, and we would run out of time and space
while performing value numbering.

This fixes the performance of `ValueNumbering.qll` on
https://github.com/asterisk/asterisk, although this project might also
require a separate change for fixing an infinite loop in the IR constant
analysis.
2019-07-09 10:58:03 +02:00
Jonas Jensen
46d779248d Merge pull request #1559 from zlaski-semmle/zlaski/futile-params-fix
Reduce precision from `very-high` to `low` due to inability to handle…
2019-07-09 06:51:56 +02:00
Dave Bartolomeo
7bbfffec4d Merge pull request #1552 from jbj/ir-builtin_addressof
C++ IR: Support __builtin_addressof
2019-07-08 17:08:38 -07:00
Dave Bartolomeo
52e0f3fb62 Merge pull request #1551 from jbj/ir-DeleteExpr-placeholder
C++: Placeholder translation of delete expressions
2019-07-08 17:07:16 -07:00
Robert Marsh
41e4d920e3 C++: alias and side effect info for pure functions 2019-07-08 12:26:58 -07:00
Ziemowit Laski
ed5e2f3211 It turns out that the bminor/bash alert spewage was caused by
a bug in the extractor, which is verified fixed in the next release.
Reverting query to its original form.
2019-07-08 12:11:15 -07:00
Robert Marsh
ea7602b571 C++: add test for Alias and SideEffect models 2019-07-08 11:41:46 -07:00
Robert Marsh
11581e4720 Merge pull request #1562 from geoffw0/models
CPP: Extend StrcpyFunction and update UsingStrcpyAsBoolean.ql
2019-07-08 09:56:16 -07:00
Geoffrey White
29e3e2a5bd CPP: Fix typo. 2019-07-08 09:45:40 +01:00
Ziemowit Laski
07ee9be9b6 Set query precision to high 2019-07-06 14:33:00 -07:00
Ziemowit Laski
be0db66a55 Squelch bminor/bash alerts and set query precision to high. 2019-07-06 14:27:02 -07:00
Ziemowit Laski
9e600e3768 Reduce precision from very-high to low due to inability to handle K&R definitions correctly. 2019-07-05 18:10:03 -07:00
Jonas Jensen
8d3cb78a9d C++: Fix DeclarationHidesVariable FP
We don't want alerts about the compiler-generated variables that appear
in the desugaring of range-based `for`.
2019-07-05 20:39:43 +02:00
Jonas Jensen
443a8fbc07 C++: Test for DeclarationHidesVariable FP 2019-07-05 20:34:30 +02:00
Jonas Jensen
4b4e7caf9f C++ IR: Support __builtin_addressof 2019-07-05 11:05:00 +02:00
Jonas Jensen
6fe9945c04 C++: Placeholder translation of delete expressions
Before this change, `delete` and `delete[]` expressions had no control
flow after them, which caused the reachability analysis to remove all
code after a delete expression. This commit adds placeholder support for
delete expression by translating them to `NoOp` instructions so their
presence doesn't cause large chunks of the program to be removed.
2019-07-05 10:54:35 +02:00
Jonas Jensen
2f8787379a Merge pull request #1535 from geoffw0/nospacezero
CPP: Fix false positives from NoSpaceForZeroTerminator.ql
2019-07-04 22:36:04 +02:00
Jonas Jensen
8c733fd58d Merge pull request #1537 from geoffw0/add-tests
CPP: Add some tests
2019-07-04 21:20:55 +02:00
Geoffrey White
73c7bc1db9 CPP: Generalize a little. 2019-07-04 17:27:40 +01:00
Geoffrey White
7fc31f263a CPP: Basic fix. 2019-07-04 17:27:40 +01:00
Geoffrey White
34d307ecef CPP: Test a common false positive. 2019-07-04 17:27:40 +01:00
Geoffrey White
8ce6822d6f CPP: Fix format literal. 2019-07-04 16:31:35 +01:00
Geoffrey White
70b996f721 CPP: Speed up LeapYear.qll 'ChecksForLeapYearFunctionCall'. 2019-07-04 15:59:32 +01:00
Jonas Jensen
2111bf5387 C++ IR: getAnyDef -> getDef in RangeAnalysis 2019-07-03 11:05:06 +02:00
Jonas Jensen
c62f73e2a2 C++ IR: getAnyDef -> getDef in SignAnalysis
For signs that follow from guards, we want the guard and the guarded
access to overlap exactly.
2019-07-03 11:05:06 +02:00
Jonas Jensen
a16ed7d613 C++ IR: getAnyDef -> getDef in ValueNumbering
This change seems more in line with what users would expect.
2019-07-03 11:05:06 +02:00
Jonas Jensen
2ce8612a05 C++ IR: allow inexact defs in taint tracking 2019-07-03 11:05:06 +02:00
Jonas Jensen
984405be2e C++ IR: Change many uses of getAnyDef to getDef
This changes all the getters on `Instruction` to use `getDef` instead of
`getAnyDef`, with the result that these getters now only have a result
if the definition is exact.

This is a backwards-INCOMPATIBLE change.
2019-07-03 11:04:57 +02:00
Jonas Jensen
e082451352 C++ IR: add getDef and deprecated predicates
These are the hand-written changes that complete the automatic changes
from the previous commit.
- Add deprecated compatibility wrappers for the renamed predicates.
- Add a new `Operand.getDef` predicate.
- Clarify the QLDoc for all these predicates.
2019-07-03 10:06:48 +02:00
Jonas Jensen
206a96df94 C++ IR: Rename getters for def/use on Operand
This renames `getDefinitionInstruction` to `getAnyDef`, reflecting that
it includes definitions without exact overlap. It renames
`getUseInstruction` to `getUse` for consistency.

    perl -p -i -e 's/\bgetUseInstruction\b/getUse/g; s/\bgetDefinitionInstruction\b/getAnyDef/g' \
      cpp/ql/src/semmle/code/cpp/ir/**/*.ql* \
      cpp/ql/test/**/*.ql* \
      cpp/ql/src/semmle/code/cpp/rangeanalysis/**/*.ql*
2019-07-03 10:06:48 +02:00
Geoffrey White
e079406a5f Merge pull request #1536 from jbj/leap-year-sameBaseType-perf
C++: Fix performance of leap year queries
2019-07-02 17:04:00 +01:00
Jonas Jensen
2a6000c270 C++: getter/setter performance in StructLikeClass
The predicates `getter` and `setter` in `StructLikeClass.qll` were very
slow on some snapshots. On https://github.com/dotnet/coreclr they had
this performance:

    StructLikeClass::getter#fff#antijoin_rhs ........... 3m55s
    Variable::Variable::getAnAssignedValue_dispred#bb .. 3m36s
    StructLikeClass::setter#fff#antijoin_rhs ........... 20.5s

The `getAnAssignedValue_dispred` predicate in the middle was slow due to
magic propagated from `setter`.

With this commit, performance is instead:

   StructLikeClass::getter#fff#antijoin_rhs ........... 497ms
   Variable::Variable::getAnAssignedValue_dispred#ff .. 617ms
   StructLikeClass::setter#fff#antijoin_rhs ........... 158ms

Instead of hand-optimizing the QL for performance, I simplified `setter`
and `getter` to require slightly stronger conditions. Previously, a
function was only considered a setter if it had no writes to other
fields on the same class. That requirement is now relaxed by dropping
the "on the same class" part. I made the corresponding change for what
defines a getter. I think that still captures the spirit of what getters
and setters are.

I also changed the double-negation with `exists` into a `forall`.
2019-07-02 13:49:52 +02:00
Geoffrey White
01ce34449d Merge pull request #1530 from Semmle/getExpr-qldoc
C++: expand MacroInvocation.getExpr QLDoc
2019-07-02 11:00:57 +01:00
Jonas Jensen
5ea69601c3 Merge pull request #1525 from aibaars/drop-import-additional-libraries
Drop ImportAdditionalLibraries.ql
2019-07-02 11:26:31 +02:00
Jonas Jensen
5ad0b39f0c C++: Fix performance of leap year queries
The `sameBaseType` predicate was fundamentally quadratic, and this blew
up on large C++ code bases. Replacing it with calls to `Type.stripType`
fixes performance and does not affect the qltests. It looks like
`sameBaseType` was used purely an ad hoc heuristic, so I'm not worried
about the slight semantic difference between `sameBaseType` and
`stripType`.
2019-07-02 11:17:18 +02:00
Jonas Jensen
bf99a0ee15 C++: expand MacroInvocation.getExpr QLDoc 2019-07-01 20:22:24 +02:00
Jonas Jensen
757ec97e7a Merge pull request #1251 from zlaski-semmle/zlaski/cpp370
[CPP-370] Non-constant `format` arguments to `printf` and friends
2019-07-01 14:43:19 +02:00
Arthur Baars
9197c186e1 Drop: ImportAdditionalLibraries.ql 2019-06-28 15:53:07 +02:00
Pavel Avgustinov
da7591d1f6 Merge pull request #1519 from geoffw0/depkind
CPP: Deprecate Expr.getKind() and Stmt.getKind().
2019-06-27 19:22:57 +01:00
Jonas Jensen
c29ef904e0 Merge pull request #1498 from rdmarsh2/rdmarsh/exprHasNoEffect-defaulted-functions
C++: fix FP with ExprHasNoEffect in defaulted func
2019-06-27 20:10:37 +02:00
Geoffrey White
95ab8cc706 CPP: Add a test of More64BitWaste.ql. 2019-06-27 17:14:46 +01:00
Geoffrey White
5e328908a0 CPP: Modify violation message of NonPortablePrintf.ql for consistency with WrongTypeFormatArguments.ql. 2019-06-27 17:11:37 +01:00
Geoffrey White
5cef0e21c6 CPP: Add a test of NonPortablePrintf.ql. 2019-06-27 16:51:07 +01:00