Commit Graph

13682 Commits

Author SHA1 Message Date
Jonas Jensen
fd6d06fe6f C++: Data flow through address-of operator (&)
The data flow library conflates pointers and their objects in some
places but not others. For example, a member function call `x.f()` will
cause flow from `x` of type `T` to `this` of type `T*` inside `f`. It
might be ideal to avoid that conflation, but that's not realistic
without using the IR.

We've had good experience in the taint tracking library with conflating
pointers and objects, and it improves results for field flow, so perhaps
it's time to try it out for all data flow.
2019-09-17 13:16:34 +02:00
Dave Bartolomeo
21f6ab787d C++: Rename predicates in FunctionInputsAndOutputs.qll and add QLDoc 2019-09-16 12:06:06 -07:00
Geoffrey White
3df31e6ccf CPP: Tiny qldoc fixes. 2019-09-16 16:52:48 +01:00
Dave Bartolomeo
553238a9e8 Merge pull request #1922 from jbj/qlcfg-const-pointer-to-member
C++: Add PointerToFieldLiteral class
2019-09-13 10:44:52 -07:00
Jonas Jensen
7cfbe88e7b C++: IR DataFlow::Node.toString consistency
The `toString` for IR data-flow nodes are now similar to AST data-flow
nodes. This should make it easier to use the IR as a drop-in replacement
in the future. There are still differences because the IR data flow
library takes conversions into account.

I did not attempt to align the new nodes we use for field flow. That can
come later, when we add field flow to IR data flow.
2019-09-13 14:33:31 +02:00
Jonas Jensen
562bffe710 C++: Simplify toString of ImplicitParameterNode
This string looked out of place compared to `ExplicitParameterNode`,
whose string is simply the name of the parameter and therefore
indistinguishable from an access to the parameter without looking at the
location also. This has not been a problem so far, and if we want to
distinguish more clearly between initial values and accesses at some
point, we should do it for `ExplicitParameterNode` and
`UninitializedNode` too.
2019-09-13 14:33:26 +02:00
Tom Hvitved
f5cae9b6ea Merge pull request #1881 from aschackmull/java/pathgraph-nodes
Java/C++/C#: Add nodes predicate to PathGraph.
2019-09-13 10:32:47 +02:00
Dave Bartolomeo
e8cf3f876e Merge pull request #1660 from zlaski-semmle/zlaski/builtin-va-list
Add a `__builtin_va_list` type, to complement `__builtin_va_*`
2019-09-12 14:04:55 -07:00
Dave Bartolomeo
9072f6231f Merge pull request #1928 from jbj/autoformat-ssa
C++: Autoformat IR SSA files
2019-09-12 14:03:20 -07:00
zlaski-semmle
45640395a9 Merge pull request #1803 from geoffw0/qldoceg9
CPP: Add syntax examples to QLDoc in Variable.qll
2019-09-12 12:32:58 -07:00
Jonas Jensen
0c092e21b0 C++: Autoformat IR SSA files
One autoformat omission had also slipped into
`DefaultTaintTracking.qll`.
2019-09-12 15:45:08 +02:00
Jonas Jensen
10270cb36d C++: Turn a comment into QLDoc 2019-09-12 15:44:04 +02:00
Jonas Jensen
c7e6081079 C++: Add DataFlow::instructionNode
This is for symmetry with `exprNode` etc., and it should be handy for
the same reasons. I found one caller of `asInstruction` that got simpler
by using the new predicate instead.
2019-09-12 11:44:17 +02:00
Anders Schack-Mulligen
61e4e61087 C++: Adjust qltest expected output. 2019-09-12 11:00:49 +02:00
Anders Schack-Mulligen
95e2f162d9 Java/C++/C#: Adjust toString of empty accesspath. 2019-09-12 11:00:49 +02:00
Anders Schack-Mulligen
0a4b15d40b Java/C++/C#: Add nodes predicate to PathGraph. 2019-09-12 11:00:49 +02:00
semmle-qlci
10076a6b2b Merge pull request #1886 from jbj/ir-taint-shared
Approved by rdmarsh2
2019-09-12 06:48:24 +01:00
Robert Marsh
e71a39f6b6 Merge pull request #1912 from jbj/tainttracking-ir-1
C++: Stub replacement for security.TaintTracking
2019-09-11 13:44:39 -07:00
Geoffrey White
d1cc28e253 CPP: Address review comments. 2019-09-11 17:14:05 +01:00
Geoffrey White
ee07c705a4 CPP: More review suggestions. 2019-09-11 17:14:05 +01:00
Geoffrey White
8134d80c46 CPP: Review suggestions. 2019-09-11 17:14:05 +01:00
Geoffrey White
120b0c0c2c CPP: Modernize the TemplateVariables test and have the TemplateVariables actually included in the scope of the test. 2019-09-11 17:14:05 +01:00
Geoffrey White
68196df561 CPP: Examples Variable.qll. 2019-09-11 17:11:53 +01:00
Jonas Jensen
6912cafc54 C++: Use the RelationalOperation class 2019-09-11 15:21:49 +02:00
Jonas Jensen
0d0ab9157c C++: Address review comments 2019-09-11 15:20:36 +02:00
Jonas Jensen
6021b4f04a C++: Remove local flow from additional taint step
This case was not supposed to be there -- that was the whole point of
having the `localAdditionalTaintStep` predicate.
2019-09-11 14:09:17 +02:00
Jonas Jensen
ee16b239de C++: Add PointerToFieldLiteral class
Marking these expressions as constants fixes the CFG discrepancies that
can be observed on the affected test and on snapshots of MySQL.
2019-09-11 13:40:24 +02:00
Jonas Jensen
bd59029e2b C++: Add pointer-to-member test to syntax-zoo
This test was inspired by problems observed in a MySQL snapshot. The
results show there are problems with both the QL CFG and the IR.
2019-09-10 16:23:23 +02:00
Jonas Jensen
de4e2a259e C++: Stub replacement for security.TaintTracking
This commit adds a `semmle.code.cpp.ir.dataflow.DefaultTaintTracking`
library that's API-compatible with the
`semmle.code.cpp.security.TaintTracking` library. The new library is
implemented on top of the IR data flow library.

The idea is to evolve this library until it can replace
`semmle.code.cpp.security.TaintTracking` without decreasing our SAMATE
score. Then we'll have the IR in production use, and we will have one
less taint-tracking library in production.
2019-09-10 13:40:45 +02:00
Jonas Jensen
d6fba0ef46 C++: Don't create partial defs for calls to const
These partial defs don't do any harm, but they could hurt performance.
In typical C++ snapshots, between 5% and 20% of all calls are to `const`
functions.
2019-09-10 09:49:16 +02:00
Jonas Jensen
fd3615d120 C++: Show that there are too many partial defs 2019-09-10 09:44:07 +02:00
Jonas Jensen
7b09e4177e C++: Add localExprTaint for IR
This is for ODASA-8053.
2019-09-10 09:40:31 +02:00
Jonas Jensen
80a0027808 C++: Shared TaintTrackingImpl for IR TaintTracking 2019-09-10 09:40:27 +02:00
Jonas Jensen
770212567f C++: Fix up IR data flow QLDoc 2019-09-10 09:34:54 +02:00
Robert Marsh
2806a52ec5 Merge pull request #1888 from jbj/ir-dataflow-node-ipa
C++: Hide that IR DataFlow::Node is Instruction
2019-09-09 11:00:37 -07:00
Geoffrey White
4283a1508d Merge pull request #1870 from jbj/autoformat-all
C++: Autoformat everything
2019-09-09 16:05:32 +01:00
Jonas Jensen
79f456e8bd Merge pull request #1905 from ian-semmle/mangling_more
C++: Resolve all classes
2019-09-09 16:48:30 +02:00
Geoffrey White
22e1715368 Merge pull request #1900 from jbj/dataflow-this-by-ref
C++: Fix flow out of `this` by reference
2019-09-09 11:15:32 +01:00
Jonas Jensen
4ef5c9af62 C++: Autoformat everything
Some files that will change in #1736 have been spared.

    ./build -j4 target/jars/qlformat
    find ql/cpp/ql -name "*.ql"  -print0 | xargs -0 target/jars/qlformat --input
    find ql/cpp/ql -name "*.qll" -print0 | xargs -0 target/jars/qlformat --input
    (cd ql && git checkout 'cpp/ql/src/semmle/code/cpp/ir/implementation/**/*SSA*.qll')
    buildutils-internal/scripts/pr-checks/sync-identical-files.py --latest
2019-09-09 11:25:53 +02:00
Jonas Jensen
1784122929 C++: Fixes from Geoffrey's review round 4 2019-09-09 11:21:55 +02:00
Jonas Jensen
969d76671e C++: Tidy up long comments that attach to items 2019-09-09 11:04:05 +02:00
Jonas Jensen
4769d00c50 C++: Fix autoformat of //-comments after +
The autoformatter would associate these comments to the following term
instead of the preceding term.
2019-09-09 11:04:05 +02:00
Jonas Jensen
3324bfb198 C++: Fix long comments without * on each line
Comments like these will make the autoformatter produce bad indentation.

For the record (not for explainability), these issues were found with

    git grep -P -A1 '^( */\*| +\*( |$))(.(?!\*/))*$' cpp/ql/src/'**/*.ql*' |grep -B10 'qll\?- [^*]*$'
2019-09-09 11:04:04 +02:00
Jonas Jensen
44aca8a0f4 C++: Prepare BufferWrite.qll for autoformat
The autoformatter cannot process these long end-of-line comments
properly when the line starts with `or`.
2019-09-09 11:04:04 +02:00
Jonas Jensen
29c83537b4 C++: Fixes from Geoffrey's review round 3 2019-09-09 11:04:04 +02:00
Jonas Jensen
c8725766bd C++: Fixes from Geoffrey's review round 2 2019-09-09 11:04:04 +02:00
Jonas Jensen
64e2277904 C++: Don't use @param in QLDoc
It superficially looks like `@param` is supported in QLDoc, but this is
mostly an accident of how its parser works. Attributes starting with `@`
are only intended to be used in the top-level QLDoc of a query, and
there can only be one of each attribute. If there are multiple `@param`
entries, the QLDoc parser will only keep the first one.

Even though `parseConvSpec` in `Scanf.qll` documented multiple
parameters, only the first one would be shown in an IDE. The
corresponding predicate in `Print.qll` documented only its first
parameter, perhaps because of an autoformatting accident earlier in
time. I've attempted to reconstruct documentation for its other
parameters based on its sibling in `Scanf.qll`.
2019-09-09 11:04:04 +02:00
Jonas Jensen
8524b95baa C++: Simplify has{Copy,Move}Signature
These functions were overly complicated, and the comments explaining the
complications did not auto-format well. A reference type cannot have
specifiers on it, so it's fine to call `getUnspecifiedType` before
checking if it's a reference type.
2019-09-09 11:04:04 +02:00
Jonas Jensen
8e98d42504 C++: Turn more "short" comments into "long"
The autoformatter is opinionated about comment styles and assumes that
"short" comments attach to the following item while "long" comments are
items themselves. I found top-level short comments with the following
two commands and then searched the output for empty lines that came
after the comment.

    git grep -A1 '^/\* .*\*/' cpp/ql/src
    git grep -A1 '^//' 'cpp/ql/src/**/*.ql*'
2019-09-09 11:04:04 +02:00
Jonas Jensen
95f53639b1 C++: Fixes to avoid confusing autoformat
These issues were found by Geoffrey in PR review.
2019-09-09 11:04:04 +02:00