1985 Commits

Author SHA1 Message Date
Dave Bartolomeo
c73b516862 Merge pull request #1541 from jbj/ir-operand-exact
C++ IR: Make instruction operand getters have only exact results
2019-07-11 13:13:20 -07:00
Dave Bartolomeo
00ff2bb6c4 Merge pull request #1554 from jbj/ir-ErrorExpr
C++ IR: support for translating ErrorExpr
2019-07-11 13:05:04 -07:00
Jonas Jensen
23001d5471 Merge pull request #1566 from rdmarsh2/rdmarsh/cpp/pure-functions-effect-model
C++: alias and side effect info for pure functions
2019-07-11 21:21:54 +02:00
Geoffrey White
ed069fe3cc CPP: Upgrade precision/severity. 2019-07-11 20:00:50 +01:00
Geoffrey White
62fb216102 CPP: Fix false positive. 2019-07-11 20:00:50 +01:00
Geoffrey White
629d127174 CPP: QLDoc comments. 2019-07-11 20:00:50 +01:00
Geoffrey White
e1efdd7d47 CPP: Add a test where continue is used in a switch to exit the loop. 2019-07-11 20:00:50 +01:00
Geoffrey White
3337a859aa CPP: Corrections to qhelp. 2019-07-11 20:00:50 +01:00
Geoffrey White
4c4be2d3c2 CPP: Add (basic) qhelp. 2019-07-11 20:00:50 +01:00
Geoffrey White
8a3f8c5c1d CPP: Add precision/tags and adjust severity. 2019-07-11 20:00:50 +01:00
Geoffrey White
83d4b23ae3 CPP: Fix false positives in while/for loops. 2019-07-11 20:00:50 +01:00
Geoffrey White
136ca72297 CPP: Add a test. 2019-07-11 20:00:49 +01:00
Robert Marsh
c195420ba1 C++: respond to PR comments 2019-07-11 11:00:52 -07:00
Geoffrey White
db6be05a92 Merge pull request #1580 from jbj/inconsistent-loop-direction-perf
C++: Fix inconsistent-loop-direction performance
2019-07-11 16:39:05 +01:00
Jonas Jensen
2324ce77ae C++ IR: Fix soundness of ConstantAnalysis
Now that `PhiInstruction.getAnInput` only has results for congruent
operands, a previous optimization I made to `getConstantValue` is no
longer sound. We have to check that all phi inputs give the same value,
not just the congruent ones. After this change, if there are any
non-congruent operands on a phi instruction, the whole aggregate will
have no result.
2019-07-11 15:51:09 +02:00
Jonas Jensen
7fb43a5a03 C++ IR: getAnyDef -> getDef in RangeUtils.qll
As recommended by Dave in PR review.
2019-07-11 15:35:14 +02:00
ian-semmle
463547f810 Merge pull request #1581 from jbj/revert-noTarget-workaround
Revert "C++: Work around extractor issue CPP-383"
2019-07-11 14:26:15 +01:00
Jonas Jensen
c831c4b58e C++ IR: Fix SignAnalysis after getAnyDef -> getDef
In the `SignAnalysis` abstract interpretation, "unknown sign"
corresponds to the set of _all_ `Sign`, but using `getDef` leads to the
operand having _no_ `Sign`. To fix that, we assign all signs to inexact
operands.
2019-07-11 15:17:55 +02:00
Geoffrey White
59964bd9a4 Merge pull request #1575 from jbj/UncheckedLeapYear-bb
C++: Fix performance of unchecked leap year query
2019-07-11 13:57:07 +01:00
Jonas Jensen
ee5eaef5e4 Revert "C++: Work around extractor issue CPP-383"
The issue is now fixed in the extractor, and I've confirmed that the
workaround is no longer needed for g/an-tao/drogon.

This reverts commit 48a3385809.
2019-07-11 14:18:29 +02:00
Jonas Jensen
e523f93d91 C++: Fix inconsistent-loop-direction performance
This query seems to have been de-optimized by recent optimizer or stats
changes. On libretro/libretro-uae, the query took 1 second on a warm
cache with dist 89ad5f1 but took 9979 seconds with dist a3b9b6eb9.

The slowness was due to a Cartesian product in
`illDefined{Decr,Incr}ForStmt` between all the definitions and all the
uses of `Variable v`. This would be no problem with the right join
order, but that has apparently been lost. This commit factors out a pair
of `pragma[noinline]` helper predicates to make sure the definitions
(`v.getAnAssignedValue()`) and the uses (`v.getAnAccess()`) are queried
and filtered in separate predicates.

The performance problem can be seen in the tuple counts of this pipeline
I interrupted during evaluation of
`inconsistentLoopDirection::illDefinedDecrForStmt#ffff#shared`:

    89716     ~3%     {2} r1 = SCAN Variable::Variable::getAnAssignedValue_dispred#ff OUTPUT FIELDS {Variable::Variable::getAnAssignedValue_dispred#ff.<1>,Variable::Variable::getAnAssignedValue_dispred#ff.<0>}
    89716     ~0%     {3} r2 = JOIN r1 WITH DataFlowUtil::TExprNode#ff@staged_ext ON r1.<0>=DataFlowUtil::TExprNode#ff@staged_ext.<0> OUTPUT FIELDS {r1.<1>,DataFlowUtil::TExprNode#ff@staged_ext.<0>,DataFlowUtil::TExprNode#ff@staged_ext.<1>}
    502539405 ~0%     {4} r3 = JOIN r2 WITH Variable::Variable::getAnAccess_dispred#fb ON r2.<0>=Variable::Variable::getAnAccess_dispred#fb.<0> OUTPUT FIELDS {Variable::Variable::getAnAccess_dispred#fb.<1>,r2.<1>,r2.<2>,r2.<0>}
                      return r3
2019-07-11 12:09:17 +02:00
Robert Marsh
72f9addd0b C++: move strstr back into main pure str model 2019-07-10 12:27:04 -07:00
Jonas Jensen
52cfbffb95 C++ IR: Fix calls to non-existent predicates
The last commit introduced calls to two predicates that did not exist. I
created `Instruction.getResultAddress` so it now exists and changed the
other call back to use the predicate that does exist.
2019-07-10 15:18:17 +02:00
Jonas Jensen
6d87c05155 Apply suggestions from code review
Co-Authored-By: Dave Bartolomeo <42150477+dave-bartolomeo@users.noreply.github.com>
2019-07-10 15:07:44 +02:00
Jonas Jensen
70f81badcb C++ IR: Move ErrorExpr filter to TranslatedElement
The convention in the IR translation is to handle all ignored
expressions in this central place.
2019-07-10 14:20:09 +02:00
Jonas Jensen
21c6340180 C++: Fix performance of unchecked leap year query
This query used `getASuccessor()` on the CFG, which worked in many cases
but became quadratic on certain projects including PostgreSQL and
MySQL. The problem was that there was just enough context for magic to
apply to the transitive closure, but the use of magic meant that the
fast transitive closure algorithm wasn't used. In projects where the
magic had little effect, that led to the
`#ControlFlowGraph::ControlFlowNode::getASuccessor_dispred#bfPlus`
predicate taking quadratic time and space.

This commit changes the query to use basic blocks to find successors,
which is much faster because (1) there are many more `ControlFlowNode`s
than `BasicBlocks`, and (2) the optimizer does not apply magic but uses
fast transitive closure instead.

Behavior changes slightly in the `isUsedInCorrectLeapYearCheck` case: we
now accept a `yfacheck` that comes _before_ `yfa` if they are in the
same basic block. I don't think that matters in practice.
2019-07-10 13:20:32 +02:00
Dave Bartolomeo
e087b6c82a Merge pull request #1571 from jbj/ir-operand-cached
C++ IR: Make TOperand cached
2019-07-09 16:14:58 -07:00
Robert Marsh
3804c1fbcf C++: model returns of strstr and strpbrk 2019-07-09 11:45:27 -07:00
Jonas Jensen
523fc9c1ce C++ IR: make isInCycle fast
Without this `pragma[noopt]`, `isInCycle` gets compiled into RA that
unpacks every tuple of the fast TC:

                      0          ~0%     {2} r1 = SELECT #Operand::getNonPhiOperandDef#3#ffPlus ON FIELDS #Operand::getNonPhiOperandDef#3#ffPlus.<0>=#Operand::getNonPhiOperandDef#3#ffPlus.<1>
                      0          ~0%     {1} r2 = SCAN r1 OUTPUT FIELDS {r1.<0>}
                                         return r2

With this change, it just becomes one lookup in the fast TC data
structure per instruction.
2019-07-09 16:28:55 +02:00
Jonas Jensen
9ee8a89492 C++ IR: Make TOperand cached
Just like `TInstruction` is cached to prevent re-numbering its tuples in
every IR query, I think `TOperand` should be cached too. I tested it on
the small comdb2 snapshot, where it only saves one second of work when
running a second IR query, but the savings should grow when snapshots
are larger and when there are more IR queries in a suite. Tuple
numbering is mildly quadratic, so it should be good to avoid repeating
it.

Adding these annotations adds three cached stages to the existing four
cached stages of the IR. The new cached stages are small and do not
appear to repeat any work from the other stages, so I see no advantage
to merging them with the existing stages.
2019-07-09 16:07:55 +02:00
Jonas Jensen
0889d5d27a C++ IR: Improve ErrorExpr test
The previous version of the test used `0 = 1;` to test an lvalue-typed
`ErrorExpr`, but the extractor replaced the whole assignment expression
with `ErrorExpr` instead of just the LHS. This variation of the test
only leads to an `ErrorExpr` for the part of the syntax that's supposed
to be an lvalue-typed expression, so that's an improvement.
Unfortunately it still doesn't demonstrate that we can `Store` into an
address computed by an `ErrorExpr`.
2019-07-09 13:35:20 +02:00
Jonas Jensen
4324c97d39 C++: Use Opcode::Error for ErrorExpr translation 2019-07-09 13:26:00 +02:00
Jonas Jensen
a86ddd50de C++ IR: Translate ErrorExpr to NoOp 2019-07-09 13:18:11 +02:00
Jonas Jensen
e2a43eeed6 C++ IR: Tests with ErrorExpr 2019-07-09 13:18:09 +02:00
Jonas Jensen
39854a3f7b C++ IR: guard against cycles in operand graph
This doesn't fix the underlying problem that for some reason there are
cycles in the operand graph on our snapshots of the Linux kernel, but it
ensures that the cycles don't lead to non-termination of
`ConstantAnalysis` and `ValueNumbering`.
2019-07-09 11:00:27 +02:00
Jonas Jensen
da13dc6442 C++ IR: Don't propagate GVN through non-exact Copy
The `ValueNumbering` library is supposed to propagate value numberings
through a `CopyInstruction` only when it's _congruent_, meaning it must
have exact overlap with its source. A `CopyInstruction` can be a
`LoadInstruction`, a `StoreInstruction`, or a `CopyValueInstruction`.
The latter is also a `UnaryInstruction`, and the value numbering rule
for `UnaryInstruction` applied to it as well.

This meant that value numbering would propagate even through a
non-congruent `CopyValueInstruction`. That's semantically wrong but
probably only an issue in very rare circumstances, and it should get
corrected when we change the definition of `getUnary` to require
congruence.

What's worse is the performance implications. It meant that the value
numbering IPA witness could take two different paths through every
`CopyValueInstruction`. If multiple `CopyValueInstruction`s were
chained, this would lead to an exponential number of variable numbers
for the same `Instruction`, and we would run out of time and space
while performing value numbering.

This fixes the performance of `ValueNumbering.qll` on
https://github.com/asterisk/asterisk, although this project might also
require a separate change for fixing an infinite loop in the IR constant
analysis.
2019-07-09 10:58:03 +02:00
Jonas Jensen
46d779248d Merge pull request #1559 from zlaski-semmle/zlaski/futile-params-fix
Reduce precision from `very-high` to `low` due to inability to handle…
2019-07-09 06:51:56 +02:00
Dave Bartolomeo
7bbfffec4d Merge pull request #1552 from jbj/ir-builtin_addressof
C++ IR: Support __builtin_addressof
2019-07-08 17:08:38 -07:00
Dave Bartolomeo
52e0f3fb62 Merge pull request #1551 from jbj/ir-DeleteExpr-placeholder
C++: Placeholder translation of delete expressions
2019-07-08 17:07:16 -07:00
Robert Marsh
41e4d920e3 C++: alias and side effect info for pure functions 2019-07-08 12:26:58 -07:00
Ziemowit Laski
ed5e2f3211 It turns out that the bminor/bash alert spewage was caused by
a bug in the extractor, which is verified fixed in the next release.
Reverting query to its original form.
2019-07-08 12:11:15 -07:00
Robert Marsh
ea7602b571 C++: add test for Alias and SideEffect models 2019-07-08 11:41:46 -07:00
Robert Marsh
11581e4720 Merge pull request #1562 from geoffw0/models
CPP: Extend StrcpyFunction and update UsingStrcpyAsBoolean.ql
2019-07-08 09:56:16 -07:00
Geoffrey White
29e3e2a5bd CPP: Fix typo. 2019-07-08 09:45:40 +01:00
Ziemowit Laski
07ee9be9b6 Set query precision to high 2019-07-06 14:33:00 -07:00
Ziemowit Laski
be0db66a55 Squelch bminor/bash alerts and set query precision to high. 2019-07-06 14:27:02 -07:00
Ziemowit Laski
9e600e3768 Reduce precision from very-high to low due to inability to handle K&R definitions correctly. 2019-07-05 18:10:03 -07:00
Jonas Jensen
8d3cb78a9d C++: Fix DeclarationHidesVariable FP
We don't want alerts about the compiler-generated variables that appear
in the desugaring of range-based `for`.
2019-07-05 20:39:43 +02:00
Jonas Jensen
443a8fbc07 C++: Test for DeclarationHidesVariable FP 2019-07-05 20:34:30 +02:00
Jonas Jensen
4b4e7caf9f C++ IR: Support __builtin_addressof 2019-07-05 11:05:00 +02:00