1165 Commits

Author SHA1 Message Date
Robert Marsh
fa8b771944 Merge pull request #1186 from jbj/dataflow-defbyref-1.20-fixes
C++: Let data flow past definition by reference
2019-04-02 13:36:37 -07:00
Jonas Jensen
842aafc888 C++: Fix new UnsafeDaclSecurityDescriptor FP
This query uses data flow for nullness analysis, which is always going
to be a large overapproximation. The overapproximation became too big
for one of the test cases after the recent change to make data flow go
across assignment by reference.

To make this query more conservative, it will now only report that the
`pDacl` argument can be null if there isn't also evidence that it can be
non-null.
2019-04-02 11:31:12 +02:00
Arthur Baars
5eb58f3ba2 C++: fix HubClasses.ql by changing its kind to 'table' 2019-04-01 16:17:23 +02:00
Jonas Jensen
71659594c8 C++: Let data flow past definition by reference
This commit changes how data flow works in the following code.

    MyType x = source();
    defineByReference(&x);
    sink(x);

The question here is whether there should be flow from `source` to
`sink`. Such flow is desirable if `defineByReference` doesn't write to
all of `x`, but it's undesirable if `defineByReference` is a typical
init function in `C` that writes to every field or if
`defineByReference` is `memcpy` or `memset` on the full range.

Before 1.20.0, there would be flow from `source` to `sink` in case `x`
happened to be modeled with `BlockVar` but not in case `x` happened to
be modelled with SSA. The choice of modelling depends on an analysis of
how `x` is used elsewhere in the function, and it's supposed to be an
internal implementation detail that there are two ways to model
variables. In 1.20.0, I changed the `BlockVar` behavior so it worked the
same as SSA, never allowing that flow. It turns out that this change
broke a customer's query.

This commit reverts `BlockVar` to its old behavior of letting flow
propagate past the `defineByReference` call and then regains consistency
by changing all variables that are ever defined by reference to be
modelled with `BlockVar` instead of SSA. This means we now get too much
flow in certain cases, but that appears to be better overall than
getting too little flow. See also the discussion in CPP-336.
2019-04-01 14:13:47 +02:00
Arthur Baars
4b95fbbb39 C++ Fix select statements of AV 3 and 81 2019-04-01 11:20:12 +02:00
Arthur Baars
ba7fdddafb Change @kind to 'table' for test and sanity checks queries that don't select problems 2019-04-01 11:20:12 +02:00
Jonas Jensen
111a462d16 C++: Recover some of the good results we lost
My recent changes to suppress FPs in `ReturnStackAllocatedMemory.ql`
caused us to lose all results where there was a `Conversion` at the
initial address escape. We cannot handle conversions in general, but
this commit restores the good results for the trivial types of
conversion that we can handle.
2019-03-19 11:09:58 +01:00
Jonas Jensen
d864df5b7f C++: Tests for new false negatives 2019-03-19 10:30:14 +01:00
Jonas Jensen
6b1cd17009 C++: Fix FPs due to data flow Conversion handling
Since we cannot track data flow from a fully-converted expression but
only the unconverted expression, we should check whether the address
initially escapes into the unconverted expression, not the
fully-converted one.

This fixes most of the false positives observed on lgtm.com.
2019-03-16 20:50:27 +01:00
Jonas Jensen
1a7351ef6e C++: Add tests for three FPs observed on lgtm.com 2019-03-16 20:50:27 +01:00
Robert Marsh
dfb7076fae C++: fix cartesian product in IRGuards.qll 2019-03-14 13:37:35 -07:00
Dave Bartolomeo
b0ad64c3e7 C++: PhiOperand -> PhiInputOperand
Also added `PhiInstruction::getAnInputOperand()`, and renamed `PhiInstruction::getAnOperandDefinitionInstruction()` to `getAnInput()` for consistency with other `Instruction` classes.
2019-03-12 11:57:53 -07:00
Dave Bartolomeo
b5a3edfdae C++: FunctionIR -> IRFunction 2019-03-12 11:28:22 -07:00
Geoffrey White
77c983b99a Merge pull request #1070 from jbj/dataflow-defbyref-join-order
C++: Fix join order in def-by-reference data flow
2019-03-12 15:34:07 +00:00
Jonas Jensen
9758164dd6 Merge pull request #1083 from geoffw0/newdelete-perf2
CPP: Fix NewDelete.qll performance.
2019-03-12 16:08:46 +01:00
Geoffrey White
249f350cc8 Fix NewDelete.qll performance. 2019-03-12 11:32:24 +00:00
Jonas Jensen
ece122aca3 C++: Fix join order in def-by-reference data flow
The performance was adequate on most projects but degenerated on
https://github.com/Microsoft/Tocino.
2019-03-11 10:57:00 +01:00
Geoffrey White
0b21f4d59b CPP: Add an empty references section to the ReturnStackAllocatedMemory qhelp. 2019-03-08 23:21:47 +00:00
Jonas Jensen
a90e4a7bdf Merge pull request #1066 from xiemaisi/fix-qhelp-backticks
Fix qhelp backticks
2019-03-08 19:06:48 +01:00
Max Schaefer
a94f25e8fa C++: Fix erroneous backticks in query help. 2019-03-08 15:28:14 +00:00
Robert Marsh
07bc9ca26c C++: fix whitespace 2019-03-07 13:14:58 -08:00
Robert Marsh
8a2a4678d8 C++: accept dataflow test change 2019-03-07 13:14:57 -08:00
Robert Marsh
ef836c39bb C++: respond to PR comments 2019-03-07 13:14:57 -08:00
Robert Marsh
17ad124c9e C++: remove VariableAddress from points_to test 2019-03-07 13:14:56 -08:00
Robert Marsh
7e30ce0c09 C++: add phi node support to escape analysis 2019-03-07 13:14:56 -08:00
Robert Marsh
97c11a5222 C++: points-to for argument-returning calls 2019-03-07 13:14:55 -08:00
Robert Marsh
878502f82e C++: remove duplicate logic 2019-03-07 13:14:52 -08:00
Jonas Jensen
794a8954cd C++: Simplify automaticVariableAddressEscapes
The `automaticVariableAddressEscapes` predicate got join-ordered badly
in its `unaliased_ssa` version. These are the tuple counts on Wireshark,
where one pipeline step is seen to have 716 million tuples:

```
[2019-03-02 11:29:41] (42s) Starting to evaluate predicate AliasAnalysis::automaticVariableAddressEscapes#2#f
[2019-03-02 11:30:06] (67s) Tuple counts:
                      353419    ~0%      {1} r1 = JOIN project#Instruction::VariableAddressInstruction#class#2#ff WITH AliasAnalysis::resultEscapesNonReturn#2#f ON project#Instruction::VariableAddressInstruction#class#2#ff.<0>=AliasAnalysis::resultEscapesNonReturn#2#f.<0> OUTPUT FIELDS {AliasAnalysis::resultEscapesNonReturn#2#f.<0>}
                      353419    ~0%      {2} r2 = JOIN r1 WITH IRConstruction::Cached::getInstructionEnclosingFunctionIR#ff@staged_ext ON r1.<0>=IRConstruction::Cached::getInstructionEnclosingFunctionIR#ff@staged_ext.<0> OUTPUT FIELDS {IRConstruction::Cached::getInstructionEnclosingFunctionIR#ff@staged_ext.<1>,r1.<0>}
                      353419    ~0%      {2} r3 = JOIN r2 WITH FunctionIR::FunctionIR::getFunction_dispred#3#ff ON r2.<0>=FunctionIR::FunctionIR::getFunction_dispred#3#ff.<0> OUTPUT FIELDS {FunctionIR::FunctionIR::getFunction_dispred#3#ff.<1>,r2.<1>}
                      716040298 ~0%      {2} r4 = JOIN r3 WITH IRVariable::IRVariable#class#3#ff_10#join_rhs ON r3.<0>=IRVariable::IRVariable#class#3#ff_10#join_rhs.<0> OUTPUT FIELDS {IRVariable::IRVariable#class#3#ff_10#join_rhs.<1>,r3.<1>}
                      4480139   ~0%      {2} r5 = JOIN r4 WITH IRVariable::IRAutomaticVariable#class#3#ff ON r4.<0>=IRVariable::IRAutomaticVariable#class#3#ff.<0> OUTPUT FIELDS {r4.<1>,r4.<0>}
                      66760     ~91%     {1} r6 = JOIN r5 WITH Instruction::VariableInstruction::getVariable_dispred#2#ff ON r5.<0>=Instruction::VariableInstruction::getVariable_dispred#2#ff.<0> AND r5.<1>=Instruction::VariableInstruction::getVariable_dispred#2#ff.<1> OUTPUT FIELDS {r5.<1>}
                                         return r6
[2019-03-02 11:30:06] (67s)  >>> Relation AliasAnalysis::automaticVariableAddressEscapes#2#f: 35531 rows using 0 MB
```

The predicate contained a cyclic join, which is always hard to optimize.
I couldn't see a reason to join the `FunctionIR`, so I removed that
part. The predicate is now fast, and there are no changes in the tests.
2019-03-07 13:14:51 -08:00
Robert Marsh
a72cd23d1d C++: fix escape test failures 2019-03-07 13:14:51 -08:00
Robert Marsh
09321ee062 C++: refactor escape analysis for performance 2019-03-07 13:14:51 -08:00
Robert Marsh
6f76c13385 C++: fix unused variable warning 2019-03-07 13:14:50 -08:00
Robert Marsh
726f38c802 C++: refactor alias analysis for performance 2019-03-07 13:14:50 -08:00
Robert Marsh
c70bd285de C++: assume arguments to virtual functions escape 2019-03-07 13:14:49 -08:00
Robert Marsh
2c94a8887d C++: test for virtual functions in escape analysis 2019-03-07 13:14:49 -08:00
Robert Marsh
6089172554 C++: escape analysis for this parameters 2019-03-07 13:14:49 -08:00
Robert Marsh
466e110338 C++: add new interprocedural escape analysis 2019-03-07 13:14:48 -08:00
Robert Marsh
bd39698528 C++: test changes for interproc escape analysis 2019-03-07 13:14:48 -08:00
Jonas Jensen
6ef946c2b0 C++: Make IR TaintTracking available on LGTM
Because this new library is not used in a default query, it needs to be
imported here in order to be available in the LGTM query console.
2019-03-05 18:05:27 +01:00
Jonas Jensen
b3d935063f Merge pull request #815 from geoffw0/keyset
CPP: dbscheme annotations
2019-03-05 14:53:46 +00:00
Jonas Jensen
9d595aa5ea Merge pull request #1033 from geoffw0/newdelete-perf
CPP: NewDelete.qll performance
2019-03-05 12:52:59 +00:00
Geoffrey White
4e1e3131ac CPP: Revert annotation on 'externalData'. 2019-03-05 10:22:33 +00:00
Geoffrey White
56fe91d774 CPP: cached -> pragma[nomagic]. 2019-03-05 08:59:16 +00:00
Max Schaefer
7f5e2630a1 Merge pull request #1032 from xiemaisi/master-for-merge
Merge master into rc/1.20
2019-03-04 21:23:51 +00:00
Geoffrey White
eb4efc4745 Merge pull request #1023 from jbj/gets-qualified
C++: Use getQualifiedName() = "gets", not hasName
2019-03-04 18:10:15 +00:00
Geoffrey White
a9ce2f7a62 CPP: Simplify out some old optimizations (that make little difference now). 2019-03-04 13:13:04 +00:00
Geoffrey White
df73bb3468 CPP: Fix performance issue. Also has a small positive effect on correctness. 2019-03-04 12:47:55 +00:00
Geoffrey White
f0085ed25a CPP: Additional test cases. 2019-03-04 12:45:05 +00:00
Jonas Jensen
4f9ffb38e6 C++: Set cpp/command-line-injection precision=low
This query is only appropriate for setuid programs. Since such programs
are at most 0.1% of all code we analyse, I would say this query has a
precision of at most 0.1%.
2019-03-04 09:51:33 +01:00
Jonas Jensen
c49c23068a Merge pull request #923 from geoffw0/potentialbufferoverflow
CPP: Deprecate PotentialBufferOverflow.ql
2019-03-04 08:11:27 +00:00
Jonas Jensen
0ed1618824 C++: Use getQualifiedName() = "gets", not hasName
This fixes false positives on
https://lgtm.com/projects/g/brandonpelfrey/Construct caused by a member
function named `gets` -- probably short for "get s".
2019-03-04 09:01:20 +01:00