1165 Commits

Author SHA1 Message Date
Geoffrey White
018450500d CPP: Fix closing tag. 2019-02-05 17:58:30 +00:00
Geoffrey White
c05df6ea4c CPP: Add reference. 2019-02-05 17:58:30 +00:00
Geoffrey White
f73a3a6a24 CPP: Explain the danger of gets a bit more in qhelp. 2019-02-05 17:58:30 +00:00
Geoffrey White
0541950c44 CPP: Clean up PotentialBufferOverflow.ql a bit. 2019-02-05 17:58:30 +00:00
Geoffrey White
c32e1b8000 CPP: Change the @name of PotentialBufferOverflow.ql to be in line with everything else. 2019-02-05 17:58:30 +00:00
Geoffrey White
f7e7737789 CPP: Update qhelp. 2019-02-05 17:58:30 +00:00
Geoffrey White
87a25f0cbe CPP: Update CWE tags. 2019-02-05 17:58:30 +00:00
Geoffrey White
429f53ed74 CPP: Move the 'gets' case. 2019-02-05 17:58:30 +00:00
Geoffrey White
a82832e779 CPP: Add a test that uses 'gets'. 2019-02-05 17:58:30 +00:00
Geoffrey White
bbc8e7886b CPP: Rearrange PotentiallyDangerousFunction.ql. 2019-02-05 17:58:30 +00:00
alexet
59a5bec769 CPP: Use more field overriding 2019-02-05 13:07:41 +00:00
Jonas Jensen
cad4bac548 C++: Concretize ConstantAnalysis NegateInstruction
This is just to make the QL shorter. It generates the same DIL.
2019-02-05 11:05:47 +01:00
Jonas Jensen
be35c674a7 C++: Factor out getConstantValueToPhi
This speeds up `getConstantValue`, the main predicate in
`ConstantAnalysis`, from 2.4s to 1.6s on comdb2.
2019-02-05 11:05:47 +01:00
Jonas Jensen
283bb2f6d0 C++: Factor out ConstantAnalysis BinaryInstruction
This speeds up comdb2 constant analysis from 6.5s to 4.5s.
2019-02-05 11:05:47 +01:00
Jonas Jensen
d66578eaa8 C++: Add IntegerPartial, use in ConstantAnalysis
This adds `IntegerPartial.qll`, which is similar to
`IntegerConstant.qll` except that it contains partial functions on
integers instead of total functions on optional integers. This speeds up
the constant analysis so it takes 6.5s instead of 10.3s on comdb2.
2019-02-05 11:05:47 +01:00
semmle-qlci
06ae0c421a Merge pull request #864 from jbj/ir-TIRVariable-shared
Approved by dave-bartolomeo
2019-02-05 07:55:28 +00:00
Dave Bartolomeo
dc209246aa Merge pull request #866 from jbj/ir-TInstruction-normalize
C++: Normalize TInstruction
2019-02-04 12:14:45 -08:00
Dave Bartolomeo
aadd5cf202 Merge pull request #863 from jbj/ir-variableLiveOnEntryToBlock-rhs
C++: Speed up variableLiveOnEntryToBlock in IR
2019-02-04 10:47:29 -08:00
Jonas Jensen
3735cb69ce C++: No InstructionTag in SSAConstruction
This does to `SSAConstruction` what the previous commit did to
`IRConstruction`. An instruction in `SSAConstruction` is now defined in
terms of how it was created rather than what it can be queried for.
Effectively, this defines `TInstruction` as `TInstructionTag` was
defined before and then removes `TInstructionTag` from
`SSAConstruction`. This also has the benefit of removing the concept of
an instruction tag from the public predicates on `Instruction`.
2019-02-04 19:43:17 +01:00
Jonas Jensen
8ae3551ec1 C++: Normalize TInstruction in raw IR
This definition was denormalized to the extent that an instruction was
defined in terms of the six main attributes it could be queried for.
This made it possible to do multi-column joins on those six attributes,
but it doesn't appear that this feature was useful in practice. The main
multi-column join that was in use was on the pair of
(`TranslatedElement, InstructionTag`), but the `TranslatedElement` was
not part of the `TInstruction`.

This commit changes `TInstruction` to be defined in terms of what it's
_built from_ (`TranslatedElement, InstructionTag`) instead. This makes
it possible to do multi-column joins on those two components, and then
there are separate predicates (usually with two columns) to query
instruction attributes, replacing the many uncached projections from
`MkInstruction` that were generated before.

An immediate advantage is that an `Expr` with multiple types will no
longer give rise to multiple `Instruction`s, fixing most of the errors
from the sanity query `ambiguousSuccessors`. The code inside
`IRConstruction.qll` becomes simpler and hopefully faster as there is no
longer a translation from `TranslatedElement` to `Locatable` and back
again.
2019-02-04 19:43:17 +01:00
Jonas Jensen
3e03835630 C++: Only create variables in FunctionIRs
The previous commit had the side effect that `IRVariable`s were created
for all `Functions`, including those that did not have IR. This commit
restricts all `TIRVariable` constructors to functions that have IR.
2019-02-04 19:34:16 +01:00
Dave Bartolomeo
6d3d9025f7 Merge pull request #867 from jbj/ir-ignoreExprAndDescendants-perf
C++: Replace FastTC with iteration in ignoreExpr
2019-02-04 09:26:32 -08:00
Dave Bartolomeo
7345c921d9 Merge pull request #857 from jbj/ir-getInstruction
C++: Fix TranslatedElement.getInstruction perf
2019-02-04 09:24:00 -08:00
Robert Marsh
411c285aa3 Merge pull request #870 from jbj/ir-shortestDistances
C++: Use shortestDistances HOP for IR BB indexes
2019-02-04 09:19:15 -08:00
Jonas Jensen
45a995ba52 C++: Accept test changes from last commit 2019-02-04 13:00:28 +01:00
Jonas Jensen
8368c37781 C++: Use shortestDistances HOP for IR BB indexes
This doesn't make it much faster, but it reduces the debug output
volume. It also simplifies the code.

I've found this change necessary when I compute the full IR on a
Wireshark snapshot in QL4E. Without it, Eclipse runs out of memory
because the console log is too large.
2019-02-04 11:40:11 +01:00
Jonas Jensen
60141bf317 C++: ignoreExprAndDescendants QL-796 workaround
The new predicate `isOrphan` gets inlined into
`ignoreExprAndDescendants`, whose performance improves from

    TranslatedElement::ignoreExprAndDescendants#f .. 23.4s (executed 9 times)

to

    TranslatedElement::ignoreExprAndDescendants#f ... 4.3s (executed 9 times)

This dramatic improvement is not only due to eliminating a type check in
the recursive case. Removing the type check from the other base cases
also enabled them to get better join orders.
2019-02-03 16:55:12 +01:00
Jonas Jensen
66e7c26d4e C++: Replace FastTC with iteration in ignoreExpr
Before, `ignoreExprAndDescendants` and its related predicates had this
timing on Wireshark.

    #TranslatedElement::getRealParent#ffPlus#swapped ......... 25.7s
    TranslatedElement::ignoreExprAndDescendants#f ............ 16.9s
    TranslatedElement::getRealParent#ff ...................... 7.2s
    TranslatedElement::ignoreExpr#f .......................... 4.8s
    TranslatedElement::ignoreExpr#f#antijoin_rhs ............. 3.2s
    TranslatedElement::getRealParent#ff_10#higher_order_body . 2.2s

After, it looks like this

    TranslatedElement::ignoreExprAndDescendants#f ............ 23.4s (executed 9 times)
    TranslatedElement::getRealParent#ff ...................... 6.3s
    TranslatedElement::ignoreExpr#f#antijoin_rhs ............. 4.8s
    TranslatedElement::ignoreExpr#f .......................... 3.7s
    TranslatedElement::getRealParent#ff_10#join_rhs .......... 2.5s
    project#TranslatedElement::getRealParent#ff .............. 1.3s
2019-02-03 16:55:12 +01:00
Patrik Schönfeldt
ac249cdbbe Fix reccomendation for LargeParameter (C++)
The previous reccomentation changed the behaviour of the code.
A user following the advice might have broken her/his code:
With call-by-value, the original parameter is not changed.
With a call-by-reference, however, it may be changed. To be sure,
nothing breaks by blindly following the advice, suggest to pass a
const reference.
2019-02-03 15:44:13 +01:00
Jonas Jensen
f8318ef96f C++: Move TIRVariable to its own file
The `SSAConstruction.getNewIRVariable` was very slow on Wireshark. This
was probably because it couldn't join on multiple columns
simultaneously. Instead of improving the join, I observed that the
`TIRVariable` type was the same between all three IR stages except for a
few occurrences of `FunctionIR` that could easily be changed to
`Function`. By sharing `TIRVariable` between all the stages, we avoid
recomputing it and translating it between every stage, turning the slow
`getNewIRVariable` predicate into a no-op.

This change means that later stages of the IR can't introduce new
variables, but that was already the case because
`config/identical-files.json` forced all three `IRVariable.qll` files to
be identical.
2019-02-03 13:36:30 +01:00
Jonas Jensen
3afefce8ef C++: Improve order of parameters in SSA def/use
This changes the order so the parameter that's sometimes projected away
is the last one, making the projection cheap.
2019-02-03 13:34:02 +01:00
Jonas Jensen
4ac22253eb C++: Speed up variableLiveOnEntryToBlock in IR
This predicate computed a local CP between all defs and uses of the same
virtual variable in a basic block. This wasn't a problem in
`unaliased_ssa`, but it became a huge problem in `aliased_ssa`, probably
because many variables can be modelled with a single virtual variable
there.

Before this commit, evaluation of `aliased_ssa`'s
`variableLiveOnEntryToBlock#ff#antijoin_rhs` on Wireshark took 80
_minutes_. After this commit, that predicate and its immediate
dependencies take around 5 _seconds_.
2019-02-03 13:25:18 +01:00
Jonas Jensen
e81d197ebd C++: Revert doc-related changes to dbscheme
These changes to the dbscheme were made in 7cc1442ecb and a98aae0a24
without a corresponding upgrade script in the internal repo.
2019-02-01 10:01:29 +01:00
Jonas Jensen
ee4526687d Merge pull request #859 from rdmarsh2/rdmarsh/cpp/ir-performance-1
C++: use field overrides in TranslatedElement and subclasses
2019-02-01 08:43:20 +01:00
Robert Marsh
5327ca7f77 Merge pull request #812 from jbj/ir-backedge
C++: IR back-edge detection based on TranslatedStmt
2019-01-31 11:28:21 -08:00
Dave Bartolomeo
bbe8e7ebfc C++: fix typo, ThrowExpr -> ReThrowExpr
Co-Authored-By: rdmarsh2 <rdmarsh2@gmail.com>
2019-01-31 10:47:17 -08:00
Dave Bartolomeo
ab1f96fb2c Merge pull request #770 from jbj/cfg-static-init-pr
C++: Add addresses to `Expr.isConstant`
2019-01-31 10:24:48 -08:00
Dave Bartolomeo
b0b2fc80c1 Merge pull request #855 from jbj/ir-getRealParent
C++: Simplify TranslatedElement.getRealParent
2019-01-31 10:15:30 -08:00
Dave Bartolomeo
8896d3bf88 Merge pull request #856 from jbj/ir-getInstructionOperandDefinition
C++: Speed up `getInstructionOperandDefinition`
2019-01-31 10:11:59 -08:00
Robert Marsh
ffb46638b0 C++: use more field overrides in IR generation 2019-01-31 07:47:21 -08:00
Robert Marsh
fa56981bce C++: use field overrides in TranslatedExpr 2019-01-31 07:47:21 -08:00
Jonas Jensen
be2a480394 Merge pull request #843 from geoffw0/strtoul
CPP: Improve ArithmeticTainted.ql
2019-01-31 07:04:17 -08:00
Jonas Jensen
b55573ebe3 C++: Accept test changes in ir_gvn.expected 2019-01-31 10:08:16 +01:00
Jonas Jensen
35d7fb5322 C++: Fix TranslatedElement.getInstruction perf
This relation was almost 40x the size it needed to be on Wireshark
because it lacked a restriction on the `tag` parameter. To implement
that restriction efficiently, I had to split the relation in two to
dictate the join order.

With the fix, `getInstruction` now computes the same as
`getInstructionTranslatedElementAndTag`, so the latter could be
simplified.

I made a corresponding change to `TranslatedElement.getTempVariable` for
the sake of consistency.
2019-01-31 08:45:02 +01:00
Jonas Jensen
947634f66f C++: Speed up getInstructionOperandDefinition
A part of `SSAConstruction.getInstructionOperandDefinition` was more
expensive than it had to be. On a ChakraCore snapshot, this changes the
tuple counts from

    3020569 ~2%       {3} r40 = JOIN OperandTag::TUnmodeledUseOperand#f WITH Instruction::Instruction::getFunction_dispred#ff CARTESIAN PRODUCT OUTPUT FIELDS {Instruction::Instruction::getFunction_dispred#ff.<0>,OperandTag::TUnmodeledUseOperand#f.<0>,Instruction::Instruction::getFunction_dispred#ff.<1>}
    62405   ~0%       {3} r41 = JOIN r40 WITH Instruction::UnmodeledUseInstruction#class#fffffff ON r40.<0>=Instruction::UnmodeledUseInstruction#class#fffffff.<0> OUTPUT FIELDS {r40.<2>,r40.<1>,r40.<0>}
    2868421 ~1%       {3} r42 = JOIN r41 WITH Instruction::Instruction::getFunction_dispred#ff_10#join_rhs ON r41.<0>=Instruction::Instruction::getFunction_dispred#ff_10#join_rhs.<0> OUTPUT FIELDS {Instruction::Instruction::getFunction_dispred#ff_10#join_rhs.<1>,r41.<1>,r41.<2>}
    62405   ~0%       {3} r43 = JOIN r42 WITH Instruction::UnmodeledDefinitionInstruction#class#fffffff ON r42.<0>=Instruction::UnmodeledDefinitionInstruction#class#fffffff.<0> OUTPUT FIELDS {r42.<2>,r42.<1>,r42.<0>}

to

    (0s) Starting to evaluate predicate SSAConstruction::Cached::getUnmodeledUseInstruction#ff
    (0s) Tuple counts:
    62405   ~0%     {2} r1 = JOIN Instruction::UnmodeledUseInstruction#class#fffffff WITH Instruction::Instruction::getFunction_dispred#ff ON Instruction::UnmodeledUseInstruction#class#fffffff.<0>=Instruction::Instruction::getFunction_dispred#ff.<0> OUTPUT FIELDS {Instruction::Instruction::getFunction_dispred#ff.<1>,Instruction::Instruction::getFunction_dispred#ff.<0>}
                                      return r1
    ...
    75716   ~0%       {3} r40 = JOIN OperandTag::TUnmodeledUseOperand#f WITH FunctionIR::FunctionIR::getUnmodeledDefinitionInstruction#ff CARTESIAN PRODUCT OUTPUT FIELDS {FunctionIR::FunctionIR::getUnmodeledDefinitionInstruction#ff.<0>,OperandTag::TUnmodeledUseOperand#f.<0>,FunctionIR::FunctionIR::getUnmodeledDefinitionInstruction#ff.<1>}
    62405   ~0%       {3} r41 = JOIN r40 WITH FunctionIR::FunctionIR::getUnmodeledUseInstruction#ff ON r40.<0>=FunctionIR::FunctionIR::getUnmodeledUseInstruction#ff.<0> OUTPUT FIELDS {FunctionIR::FunctionIR::getUnmodeledUseInstruction#ff.<1>,r40.<1>,r40.<2>}
2019-01-31 08:43:00 +01:00
Jonas Jensen
5b685383c8 C++: Simplify TranslatedElement.getRealParent
Now that we have `Expr.getParentWithConversions`, we can implement
`TranslatedElement.getRealParent` simpler. This implementation also
avoids recursion.
2019-01-31 08:41:29 +01:00
Geoffrey White
07adf6f201 CPP: Handle array accesses. 2019-01-30 18:36:32 +00:00
Geoffrey White
4685f193f5 CPP: Widen varMaybeStackAllocated. 2019-01-30 18:36:32 +00:00
Geoffrey White
c87036f2fd CPP: Simplify. 2019-01-30 18:36:32 +00:00
Geoffrey White
276738a435 CPP: Auto-format the query. 2019-01-30 18:36:32 +00:00