From 3782d1b6e4fd52d9db187d98bd6b512c2bab5f79 Mon Sep 17 00:00:00 2001 From: james Date: Wed, 27 Nov 2019 12:28:57 +0000 Subject: [PATCH 01/17] docs: update links on opening slide --- docs/language/ql-training/cpp/bad-overflow-guard.rst | 2 +- docs/language/ql-training/cpp/control-flow-cpp.rst | 2 +- docs/language/ql-training/cpp/data-flow-cpp.rst | 2 +- docs/language/ql-training/cpp/global-data-flow-cpp.rst | 2 +- docs/language/ql-training/cpp/intro-ql-cpp.rst | 2 +- docs/language/ql-training/cpp/snprintf.rst | 2 +- docs/language/ql-training/java/apache-struts-java.rst | 2 +- docs/language/ql-training/java/data-flow-java.rst | 2 +- docs/language/ql-training/java/global-data-flow-java.rst | 2 +- docs/language/ql-training/java/intro-ql-java.rst | 2 +- docs/language/ql-training/java/query-injection-java.rst | 2 +- 11 files changed, 11 insertions(+), 11 deletions(-) diff --git a/docs/language/ql-training/cpp/bad-overflow-guard.rst b/docs/language/ql-training/cpp/bad-overflow-guard.rst index dceeec3320f..1ae193b10f0 100644 --- a/docs/language/ql-training/cpp/bad-overflow-guard.rst +++ b/docs/language/ql-training/cpp/bad-overflow-guard.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `ChakraCore database `__ .. note:: diff --git a/docs/language/ql-training/cpp/control-flow-cpp.rst b/docs/language/ql-training/cpp/control-flow-cpp.rst index f72633b714f..64c34b46328 100644 --- a/docs/language/ql-training/cpp/control-flow-cpp.rst +++ b/docs/language/ql-training/cpp/control-flow-cpp.rst @@ -13,7 +13,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `ChakraCore database `__ .. note:: diff --git a/docs/language/ql-training/cpp/data-flow-cpp.rst b/docs/language/ql-training/cpp/data-flow-cpp.rst index 32a3dfa233b..9d2f29863c9 100644 --- a/docs/language/ql-training/cpp/data-flow-cpp.rst +++ b/docs/language/ql-training/cpp/data-flow-cpp.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `dotnet/coreclr database `__ .. note:: diff --git a/docs/language/ql-training/cpp/global-data-flow-cpp.rst b/docs/language/ql-training/cpp/global-data-flow-cpp.rst index aaa567e7a8e..8fb3f64b534 100644 --- a/docs/language/ql-training/cpp/global-data-flow-cpp.rst +++ b/docs/language/ql-training/cpp/global-data-flow-cpp.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `dotnet/coreclr database `__ .. note:: diff --git a/docs/language/ql-training/cpp/intro-ql-cpp.rst b/docs/language/ql-training/cpp/intro-ql-cpp.rst index 4e2cd4ca0b4..72fa57308f4 100644 --- a/docs/language/ql-training/cpp/intro-ql-cpp.rst +++ b/docs/language/ql-training/cpp/intro-ql-cpp.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `exiv2 database `__ .. note:: diff --git a/docs/language/ql-training/cpp/snprintf.rst b/docs/language/ql-training/cpp/snprintf.rst index 3b76f1a4ce2..1591531460e 100644 --- a/docs/language/ql-training/cpp/snprintf.rst +++ b/docs/language/ql-training/cpp/snprintf.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `rsyslog database `__ .. note:: diff --git a/docs/language/ql-training/java/apache-struts-java.rst b/docs/language/ql-training/java/apache-struts-java.rst index c21ad4608ca..67864699673 100644 --- a/docs/language/ql-training/java/apache-struts-java.rst +++ b/docs/language/ql-training/java/apache-struts-java.rst @@ -15,7 +15,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `Apache Struts database `__ .. note:: diff --git a/docs/language/ql-training/java/data-flow-java.rst b/docs/language/ql-training/java/data-flow-java.rst index 93ccc87b7d1..85ff8688758 100644 --- a/docs/language/ql-training/java/data-flow-java.rst +++ b/docs/language/ql-training/java/data-flow-java.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `VIVO Vitro database `__ .. note:: diff --git a/docs/language/ql-training/java/global-data-flow-java.rst b/docs/language/ql-training/java/global-data-flow-java.rst index d65fd6133c5..80d13fbadac 100644 --- a/docs/language/ql-training/java/global-data-flow-java.rst +++ b/docs/language/ql-training/java/global-data-flow-java.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `Apache Struts database `__ .. note:: diff --git a/docs/language/ql-training/java/intro-ql-java.rst b/docs/language/ql-training/java/intro-ql-java.rst index 1666d9caca8..2d5a98fc944 100644 --- a/docs/language/ql-training/java/intro-ql-java.rst +++ b/docs/language/ql-training/java/intro-ql-java.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `Apache Struts database `__ .. note:: diff --git a/docs/language/ql-training/java/query-injection-java.rst b/docs/language/ql-training/java/query-injection-java.rst index 799c99e5209..f9911de137b 100644 --- a/docs/language/ql-training/java/query-injection-java.rst +++ b/docs/language/ql-training/java/query-injection-java.rst @@ -11,7 +11,7 @@ Setup For this example you should download: -- `QL for Eclipse `__ +- `CodeQL for Visual Studio Code `__ - `VIVO Vitro database `__ .. note:: From 24857e5616d1cb2ba5557bff11af9e304102efc0 Mon Sep 17 00:00:00 2001 From: james Date: Wed, 27 Nov 2019 15:05:08 +0000 Subject: [PATCH 02/17] docs: update or remove other uses of QL4E --- docs/language/ql-training/java/intro-ql-java.rst | 4 ++-- docs/language/ql-training/slide-snippets/intro-ql-general.rst | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/language/ql-training/java/intro-ql-java.rst b/docs/language/ql-training/java/intro-ql-java.rst index 2d5a98fc944..f93a619c142 100644 --- a/docs/language/ql-training/java/intro-ql-java.rst +++ b/docs/language/ql-training/java/intro-ql-java.rst @@ -105,8 +105,8 @@ Each query library also implicitly defines a module. .. note:: - Queries are always contained in query files with the file extension ``.ql``. `Quick queries `__, run in `QL for Eclipse `__, are no exception: the quick query window maintains a temporary QL file in the background. - + Queries are always contained in query files with the file extension ``.ql``. + Parts of queries can be lifted into `library files `__ with the extension ``.qll``. Definitions within such libraries can be brought into scope using “import” statements, and similarly QLL files can import each other’s definitions using “import” statements. Logic can be encapsulated as user-defined `predicates `__ and `classes `__, and organized into `modules `__. Each QLL file implicitly defines a module, but QL and QLL files can also contain explicit module definitions, as we will see later. diff --git a/docs/language/ql-training/slide-snippets/intro-ql-general.rst b/docs/language/ql-training/slide-snippets/intro-ql-general.rst index ae0c0cdabb5..b53c6bbb015 100644 --- a/docs/language/ql-training/slide-snippets/intro-ql-general.rst +++ b/docs/language/ql-training/slide-snippets/intro-ql-general.rst @@ -109,7 +109,7 @@ Analysis overview Queries are written in `QL `__ and usually depend on one or more of the `standard CodeQL libraries `__ (and of course you can write your own custom libraries). They are compiled into an efficiently executable format by the QL compiler and then run on a CodeQL database by the QL evaluator, either on a remote worker machine or locally on a developer’s machine. - Query results can be interpreted and presented in a variety of ways, including displaying them in an `IDE plugin `__ such as QL for Eclipse, or in a web dashboard as on `LGTM `__. + Query results can be interpreted and presented in a variety of ways, including displaying them in an `IDE extension `__ such as CodeQL for Visual Studio Code, or in a web dashboard as on `LGTM `__. Introducing QL ============== From 931cc73d1ee733f58ac53e6e54df8f68c09213c9 Mon Sep 17 00:00:00 2001 From: james Date: Wed, 27 Nov 2019 15:05:50 +0000 Subject: [PATCH 03/17] docs: add brief instructions for using databases in VS Code --- .../slides-semmle-2/static/theme/css/default.css | 9 +++++++++ .../ql-training/slide-snippets/database-note.rst | 10 +++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/docs/language/ql-training/_static-training/slides-semmle-2/static/theme/css/default.css b/docs/language/ql-training/_static-training/slides-semmle-2/static/theme/css/default.css index 3f8a89c6555..dad8b305ed6 100644 --- a/docs/language/ql-training/_static-training/slides-semmle-2/static/theme/css/default.css +++ b/docs/language/ql-training/_static-training/slides-semmle-2/static/theme/css/default.css @@ -1667,6 +1667,15 @@ li > ul > li { margin-bottom: 0; } +.admonition.note ol { + width: 90%; + margin-left: 2.2em; +} + +.admonition.note ol > li { + margin-top: 0.5em; +} + /* * extra styles for more appropriate for syntax highlighting * diff --git a/docs/language/ql-training/slide-snippets/database-note.rst b/docs/language/ql-training/slide-snippets/database-note.rst index af6dce23728..f0bfbeca07f 100644 --- a/docs/language/ql-training/slide-snippets/database-note.rst +++ b/docs/language/ql-training/slide-snippets/database-note.rst @@ -1 +1,9 @@ -Note that results generated in the query console are likely to differ to those generated in the QL plugin as LGTM.com analyzes the most recent revisions of each project that has been added–the CodeQL database available to download above is based on an historical version of the codebase. \ No newline at end of file +You can download the database as a zip file by clicking the link on the slide above. To use the database in CodeQL for Visual Studio Code: + +#. Unzip the file +#. Add the unzipped database to Visual Studio Code +#. Upgrade the database if necessary + +For further information, see `Using the extension `__ in the CodeQL for Visual Studio Code help. + +Note that results generated in the query console are likely to differ to those generated in CodeQL for Visual Studio Code as LGTM.com analyzes the most recent revisions of each project that has been added–the CodeQL database available to download above is based on an historical version of the codebase. \ No newline at end of file From e59d65e67fb65ea1afafbc36f44d688aefe63221 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 6 Dec 2019 15:41:18 +0000 Subject: [PATCH 04/17] Update supported versions for 1.23 release --- docs/language/support/versions-compilers.csv | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/language/support/versions-compilers.csv b/docs/language/support/versions-compilers.csv index 74f69d662b6..90d4c66a0a3 100644 --- a/docs/language/support/versions-compilers.csv +++ b/docs/language/support/versions-compilers.csv @@ -6,14 +6,14 @@ GNU extensions (up to GCC 8.3), Microsoft extensions (up to VS 2019), Arm Compiler 5.0 [2]_.","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``" -C#,C# up to 7.3. with .NET up to 4.8 [3]_.,"Microsoft Visual Studio up to 2019, +C#,C# up to 8.0. with .NET up to 4.8 [3]_.,"Microsoft Visual Studio up to 2019, -.NET Core up to 2.2","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``" +.NET Core up to 3.0","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``" COBOL,ANSI 85 or newer [4]_.,Not applicable,"``.cbl``, ``.CBL``, ``.cpy``, ``.CPY``, ``.copy``, ``.COPY``" Go (aka Golang), "Go up to 1.13", "Go 1.11 or more recent", ``.go`` -Java,"Java 6 to 12 [5]_.","javac (OpenJDK and Oracle JDK), +Java,"Java 6 to 13 [5]_.","javac (OpenJDK and Oracle JDK), Eclipse compiler for Java (ECJ) [6]_.",``.java`` JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [7]_." -Python,"2.7, 3.5, 3.6, 3.7",Not applicable,``.py`` -TypeScript [8]_.,"2.6-3.5",Standard TypeScript compiler,"``.ts``, ``.tsx``" +Python,"2.7, 3.5, 3.6, 3.7, 3.8",Not applicable,``.py`` +TypeScript [8]_.,"2.6-3.7",Standard TypeScript compiler,"``.ts``, ``.tsx``" From f40b1b570c7aec96cf4f2d1fca6838f168efbd30 Mon Sep 17 00:00:00 2001 From: Shati Patel Date: Fri, 6 Dec 2019 15:56:01 +0000 Subject: [PATCH 05/17] Fix typo --- csharp/ql/src/semmle/code/csharp/dataflow/SSA.qll | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/csharp/ql/src/semmle/code/csharp/dataflow/SSA.qll b/csharp/ql/src/semmle/code/csharp/dataflow/SSA.qll index a4a5bfed774..adc92df444f 100644 --- a/csharp/ql/src/semmle/code/csharp/dataflow/SSA.qll +++ b/csharp/ql/src/semmle/code/csharp/dataflow/SSA.qll @@ -1765,7 +1765,7 @@ module Ssa { * * The write is live because of the implicit entry definition `def`, which can be * reached using one or more calls (as indicated by `additionalCalls`), starting - * from call `c`. That is, data can flow from the write at index `i` into the the + * from call `c`. That is, data can flow from the write at index `i` into the * callable containing `def`. * * Example: @@ -2329,7 +2329,7 @@ module Ssa { * ``` * * If this definition is the update of `i` on line 5, then the value may be read inside - * `M2` via the the call on line 6. + * `M2` via the call on line 6. */ predicate isCapturedVariableDefinitionFlowIn( ImplicitEntryDefinition def, ControlFlow::Nodes::ElementNode c, boolean additionalCalls @@ -2356,7 +2356,7 @@ module Ssa { * ``` * * If this definition is the update of `i` on line 4, then the value may be read outside - * of `M2` via the the call on line 5. + * of `M2` via the call on line 5. */ predicate isCapturedVariableDefinitionFlowOut( ImplicitCallDefinition cdef, boolean additionalCalls From 07f35e8b5892f7c11014267ac6f59982b0c55858 Mon Sep 17 00:00:00 2001 From: james Date: Mon, 9 Dec 2019 12:17:30 +0000 Subject: [PATCH 06/17] docs: remove some full stops --- docs/language/support/versions-compilers.csv | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/language/support/versions-compilers.csv b/docs/language/support/versions-compilers.csv index 90d4c66a0a3..bb5009cb1fa 100644 --- a/docs/language/support/versions-compilers.csv +++ b/docs/language/support/versions-compilers.csv @@ -5,15 +5,15 @@ GNU extensions (up to GCC 8.3), Microsoft extensions (up to VS 2019), -Arm Compiler 5.0 [2]_.","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``" -C#,C# up to 8.0. with .NET up to 4.8 [3]_.,"Microsoft Visual Studio up to 2019, +Arm Compiler 5.0 [2]_","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``" +C#,C# up to 8.0. with .NET up to 4.8 [3]_,"Microsoft Visual Studio up to 2019, .NET Core up to 3.0","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``" -COBOL,ANSI 85 or newer [4]_.,Not applicable,"``.cbl``, ``.CBL``, ``.cpy``, ``.CPY``, ``.copy``, ``.COPY``" +COBOL,ANSI 85 or newer [4]_,Not applicable,"``.cbl``, ``.CBL``, ``.cpy``, ``.CPY``, ``.copy``, ``.COPY``" Go (aka Golang), "Go up to 1.13", "Go 1.11 or more recent", ``.go`` -Java,"Java 6 to 13 [5]_.","javac (OpenJDK and Oracle JDK), +Java,"Java 6 to 13 [5]_","javac (OpenJDK and Oracle JDK), -Eclipse compiler for Java (ECJ) [6]_.",``.java`` -JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [7]_." +Eclipse compiler for Java (ECJ) [6]_",``.java`` +JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [7]_" Python,"2.7, 3.5, 3.6, 3.7, 3.8",Not applicable,``.py`` -TypeScript [8]_.,"2.6-3.7",Standard TypeScript compiler,"``.ts``, ``.tsx``" +TypeScript [8]_,"2.6-3.7",Standard TypeScript compiler,"``.ts``, ``.tsx``" From ff7b6e2ce754ab9aed3b0913464713d9050b1936 Mon Sep 17 00:00:00 2001 From: Jonas Jensen Date: Mon, 9 Dec 2019 15:36:51 +0100 Subject: [PATCH 07/17] C++: Add new queries in 1.23 to legacy suites I didn't add `JapaneseEraDate.ql` since it's not displayed on LGTM by default. --- cpp/config/suites/cpp/correctness | 2 ++ 1 file changed, 2 insertions(+) diff --git a/cpp/config/suites/cpp/correctness b/cpp/config/suites/cpp/correctness index d1079bdcf60..5e1350b4a27 100644 --- a/cpp/config/suites/cpp/correctness +++ b/cpp/config/suites/cpp/correctness @@ -24,6 +24,8 @@ + semmlecode-cpp-queries/Likely Bugs/Arithmetic/FloatComparison.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/Arithmetic/BitwiseSignCheck.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/Arithmetic/BadAdditionOverflowCheck.ql: /Correctness/Common Errors ++ semmlecode-cpp-queries/Likely Bugs/Arithmetic/SignedOverflowCheck.ql: /Correctness/Common Errors ++ semmlecode-cpp-queries/Likely Bugs/Memory Management/PointerOverflow.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/NestedLoopSameVar.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/UseInOwnInitializer.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Critical/NewArrayDeleteMismatch.ql: /Correctness/Common Errors From 9bbebfc01fbf7553b035aff3feb0eaa07bc7f977 Mon Sep 17 00:00:00 2001 From: Jonas Jensen Date: Mon, 9 Dec 2019 17:00:33 +0100 Subject: [PATCH 08/17] C++: Add new queries to C suite too --- cpp/config/suites/c/correctness | 2 ++ 1 file changed, 2 insertions(+) diff --git a/cpp/config/suites/c/correctness b/cpp/config/suites/c/correctness index b13c4f61752..fcf0f28fb02 100644 --- a/cpp/config/suites/c/correctness +++ b/cpp/config/suites/c/correctness @@ -23,6 +23,8 @@ + semmlecode-cpp-queries/Likely Bugs/Arithmetic/FloatComparison.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/Arithmetic/BitwiseSignCheck.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/Arithmetic/BadAdditionOverflowCheck.ql: /Correctness/Common Errors ++ semmlecode-cpp-queries/Likely Bugs/Arithmetic/SignedOverflowCheck.ql: /Correctness/Common Errors ++ semmlecode-cpp-queries/Likely Bugs/Memory Management/PointerOverflow.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/NestedLoopSameVar.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Likely Bugs/UseInOwnInitializer.ql: /Correctness/Common Errors + semmlecode-cpp-queries/Critical/NewArrayDeleteMismatch.ql: /Correctness/Common Errors From 7c151644f5ed7b60b016614434c4e5ba003cf7e4 Mon Sep 17 00:00:00 2001 From: Jonas Jensen Date: Tue, 10 Dec 2019 13:43:36 +0100 Subject: [PATCH 09/17] C++: Fix getTempVariable join order in IR This join order seems to have broken so it took forever on wireshark/wireshark. --- .../raw/internal/TranslatedElement.qll | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll index f118561546b..023ed92fa3a 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/TranslatedElement.qll @@ -684,9 +684,17 @@ abstract class TranslatedElement extends TTranslatedElement { * Gets the temporary variable generated by this element with tag `tag`. */ final IRTempVariable getTempVariable(TempVariableTag tag) { - result.getAST() = getAST() and - result.getTag() = tag and - hasTempVariable(tag, _) + exists(Locatable ast | + result.getAST() = ast and + result.getTag() = tag and + hasTempVariableAndAST(tag, ast) + ) + } + + pragma[noinline] + private predicate hasTempVariableAndAST(TempVariableTag tag, Locatable ast) { + hasTempVariable(tag, _) and + ast = getAST() } /** From 66876d0f637b0469e2246aca24965afa8d18f228 Mon Sep 17 00:00:00 2001 From: Jonas Jensen Date: Tue, 10 Dec 2019 16:03:39 +0100 Subject: [PATCH 10/17] C++: Compute isInCycle only for raw IR On wireshark/wireshark, `isInCycle` ran into a low-memory loop on the `aliased_ssa` stage. It shouldn't be necessary to detect cycles after the `raw` stage, so this commit moves cycle detection into the `Construction` modules and makes it a no-op in `SSAConstruction.qll`. --- .../ir/implementation/aliased_ssa/Operand.qll | 26 ++----------------- .../aliased_ssa/internal/SSAConstruction.qll | 10 +++++++ .../cpp/ir/implementation/raw/Operand.qll | 26 ++----------------- .../raw/internal/IRConstruction.qll | 23 ++++++++++++++++ .../implementation/unaliased_ssa/Operand.qll | 26 ++----------------- .../internal/SSAConstruction.qll | 10 +++++++ .../csharp/ir/implementation/raw/Operand.qll | 26 ++----------------- .../raw/internal/IRConstruction.qll | 23 ++++++++++++++++ .../implementation/unaliased_ssa/Operand.qll | 26 ++----------------- .../internal/SSAConstruction.qll | 10 +++++++ 10 files changed, 86 insertions(+), 120 deletions(-) diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/Operand.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/Operand.qll index 07dd107bf12..2d19b053188 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/Operand.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/Operand.qll @@ -11,13 +11,13 @@ cached private newtype TOperand = TRegisterOperand(Instruction useInstr, RegisterOperandTag tag, Instruction defInstr) { defInstr = Construction::getRegisterOperandDefinition(useInstr, tag) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TNonPhiMemoryOperand( Instruction useInstr, MemoryOperandTag tag, Instruction defInstr, Overlap overlap ) { defInstr = Construction::getMemoryOperandDefinition(useInstr, tag, overlap) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TPhiOperand( PhiInstruction useInstr, Instruction defInstr, IRBlock predecessorBlock, Overlap overlap @@ -25,28 +25,6 @@ private newtype TOperand = defInstr = Construction::getPhiOperandDefinition(useInstr, predecessorBlock, overlap) } -/** Gets a non-phi instruction that defines an operand of `instr`. */ -private Instruction getNonPhiOperandDef(Instruction instr) { - result = Construction::getRegisterOperandDefinition(instr, _) - or - result = Construction::getMemoryOperandDefinition(instr, _, _) -} - -/** - * Holds if `instr` is part of a cycle in the operand graph that doesn't go - * through a phi instruction and therefore should be impossible. - * - * If such cycles are present, either due to a programming error in the IR - * generation or due to a malformed database, it can cause infinite loops in - * analyses that assume a cycle-free graph of non-phi operands. Therefore it's - * better to remove these operands than to leave cycles in the operand graph. - */ -pragma[noopt] -private predicate isInCycle(Instruction instr) { - instr instanceof Instruction and - getNonPhiOperandDef+(instr) = instr -} - /** * A source operand of an `Instruction`. The operand represents a value consumed by the instruction. */ diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll index a789edc7590..72edd497fcc 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/SSAConstruction.qll @@ -133,6 +133,16 @@ private module Cached { overlap instanceof MustExactlyOverlap } + /** + * Holds if `instr` is part of a cycle in the operand graph that doesn't go + * through a phi instruction and therefore should be impossible. + * + * For performance reasons, this predicate is not implemented (never holds) + * for the SSA stages of the IR. + */ + cached + predicate isInCycle(Instruction instr) { none() } + cached Language::LanguageType getInstructionOperandType(Instruction instr, TypedOperandTag tag) { exists(OldInstruction oldInstruction, OldIR::TypedOperand oldOperand | diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/Operand.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/Operand.qll index 07dd107bf12..2d19b053188 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/Operand.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/Operand.qll @@ -11,13 +11,13 @@ cached private newtype TOperand = TRegisterOperand(Instruction useInstr, RegisterOperandTag tag, Instruction defInstr) { defInstr = Construction::getRegisterOperandDefinition(useInstr, tag) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TNonPhiMemoryOperand( Instruction useInstr, MemoryOperandTag tag, Instruction defInstr, Overlap overlap ) { defInstr = Construction::getMemoryOperandDefinition(useInstr, tag, overlap) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TPhiOperand( PhiInstruction useInstr, Instruction defInstr, IRBlock predecessorBlock, Overlap overlap @@ -25,28 +25,6 @@ private newtype TOperand = defInstr = Construction::getPhiOperandDefinition(useInstr, predecessorBlock, overlap) } -/** Gets a non-phi instruction that defines an operand of `instr`. */ -private Instruction getNonPhiOperandDef(Instruction instr) { - result = Construction::getRegisterOperandDefinition(instr, _) - or - result = Construction::getMemoryOperandDefinition(instr, _, _) -} - -/** - * Holds if `instr` is part of a cycle in the operand graph that doesn't go - * through a phi instruction and therefore should be impossible. - * - * If such cycles are present, either due to a programming error in the IR - * generation or due to a malformed database, it can cause infinite loops in - * analyses that assume a cycle-free graph of non-phi operands. Therefore it's - * better to remove these operands than to leave cycles in the operand graph. - */ -pragma[noopt] -private predicate isInCycle(Instruction instr) { - instr instanceof Instruction and - getNonPhiOperandDef+(instr) = instr -} - /** * A source operand of an `Instruction`. The operand represents a value consumed by the instruction. */ diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll index 628aae26995..f0c006111f5 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/raw/internal/IRConstruction.qll @@ -93,6 +93,29 @@ private module Cached { overlap instanceof MustTotallyOverlap } + /** Gets a non-phi instruction that defines an operand of `instr`. */ + private Instruction getNonPhiOperandDef(Instruction instr) { + result = getRegisterOperandDefinition(instr, _) + or + result = getMemoryOperandDefinition(instr, _, _) + } + + /** + * Holds if `instr` is part of a cycle in the operand graph that doesn't go + * through a phi instruction and therefore should be impossible. + * + * If such cycles are present, either due to a programming error in the IR + * generation or due to a malformed database, it can cause infinite loops in + * analyses that assume a cycle-free graph of non-phi operands. Therefore it's + * better to remove these operands than to leave cycles in the operand graph. + */ + pragma[noopt] + cached + predicate isInCycle(Instruction instr) { + instr instanceof Instruction and + getNonPhiOperandDef+(instr) = instr + } + cached CppType getInstructionOperandType(Instruction instruction, TypedOperandTag tag) { // For all `LoadInstruction`s, the operand type of the `LoadOperand` is the same as diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/Operand.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/Operand.qll index 07dd107bf12..2d19b053188 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/Operand.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/Operand.qll @@ -11,13 +11,13 @@ cached private newtype TOperand = TRegisterOperand(Instruction useInstr, RegisterOperandTag tag, Instruction defInstr) { defInstr = Construction::getRegisterOperandDefinition(useInstr, tag) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TNonPhiMemoryOperand( Instruction useInstr, MemoryOperandTag tag, Instruction defInstr, Overlap overlap ) { defInstr = Construction::getMemoryOperandDefinition(useInstr, tag, overlap) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TPhiOperand( PhiInstruction useInstr, Instruction defInstr, IRBlock predecessorBlock, Overlap overlap @@ -25,28 +25,6 @@ private newtype TOperand = defInstr = Construction::getPhiOperandDefinition(useInstr, predecessorBlock, overlap) } -/** Gets a non-phi instruction that defines an operand of `instr`. */ -private Instruction getNonPhiOperandDef(Instruction instr) { - result = Construction::getRegisterOperandDefinition(instr, _) - or - result = Construction::getMemoryOperandDefinition(instr, _, _) -} - -/** - * Holds if `instr` is part of a cycle in the operand graph that doesn't go - * through a phi instruction and therefore should be impossible. - * - * If such cycles are present, either due to a programming error in the IR - * generation or due to a malformed database, it can cause infinite loops in - * analyses that assume a cycle-free graph of non-phi operands. Therefore it's - * better to remove these operands than to leave cycles in the operand graph. - */ -pragma[noopt] -private predicate isInCycle(Instruction instr) { - instr instanceof Instruction and - getNonPhiOperandDef+(instr) = instr -} - /** * A source operand of an `Instruction`. The operand represents a value consumed by the instruction. */ diff --git a/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll b/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll index a789edc7590..72edd497fcc 100644 --- a/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll +++ b/cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll @@ -133,6 +133,16 @@ private module Cached { overlap instanceof MustExactlyOverlap } + /** + * Holds if `instr` is part of a cycle in the operand graph that doesn't go + * through a phi instruction and therefore should be impossible. + * + * For performance reasons, this predicate is not implemented (never holds) + * for the SSA stages of the IR. + */ + cached + predicate isInCycle(Instruction instr) { none() } + cached Language::LanguageType getInstructionOperandType(Instruction instr, TypedOperandTag tag) { exists(OldInstruction oldInstruction, OldIR::TypedOperand oldOperand | diff --git a/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/Operand.qll b/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/Operand.qll index 07dd107bf12..2d19b053188 100644 --- a/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/Operand.qll +++ b/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/Operand.qll @@ -11,13 +11,13 @@ cached private newtype TOperand = TRegisterOperand(Instruction useInstr, RegisterOperandTag tag, Instruction defInstr) { defInstr = Construction::getRegisterOperandDefinition(useInstr, tag) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TNonPhiMemoryOperand( Instruction useInstr, MemoryOperandTag tag, Instruction defInstr, Overlap overlap ) { defInstr = Construction::getMemoryOperandDefinition(useInstr, tag, overlap) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TPhiOperand( PhiInstruction useInstr, Instruction defInstr, IRBlock predecessorBlock, Overlap overlap @@ -25,28 +25,6 @@ private newtype TOperand = defInstr = Construction::getPhiOperandDefinition(useInstr, predecessorBlock, overlap) } -/** Gets a non-phi instruction that defines an operand of `instr`. */ -private Instruction getNonPhiOperandDef(Instruction instr) { - result = Construction::getRegisterOperandDefinition(instr, _) - or - result = Construction::getMemoryOperandDefinition(instr, _, _) -} - -/** - * Holds if `instr` is part of a cycle in the operand graph that doesn't go - * through a phi instruction and therefore should be impossible. - * - * If such cycles are present, either due to a programming error in the IR - * generation or due to a malformed database, it can cause infinite loops in - * analyses that assume a cycle-free graph of non-phi operands. Therefore it's - * better to remove these operands than to leave cycles in the operand graph. - */ -pragma[noopt] -private predicate isInCycle(Instruction instr) { - instr instanceof Instruction and - getNonPhiOperandDef+(instr) = instr -} - /** * A source operand of an `Instruction`. The operand represents a value consumed by the instruction. */ diff --git a/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll b/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll index 72881a7c720..5c5420ed7db 100644 --- a/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll +++ b/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll @@ -95,6 +95,29 @@ private module Cached { .getInstructionOperand(getInstructionTag(instruction), tag) } + /** Gets a non-phi instruction that defines an operand of `instr`. */ + private Instruction getNonPhiOperandDef(Instruction instr) { + result = getRegisterOperandDefinition(instr, _) + or + result = getMemoryOperandDefinition(instr, _, _) + } + + /** + * Holds if `instr` is part of a cycle in the operand graph that doesn't go + * through a phi instruction and therefore should be impossible. + * + * If such cycles are present, either due to a programming error in the IR + * generation or due to a malformed database, it can cause infinite loops in + * analyses that assume a cycle-free graph of non-phi operands. Therefore it's + * better to remove these operands than to leave cycles in the operand graph. + */ + pragma[noopt] + cached + predicate isInCycle(Instruction instr) { + instr instanceof Instruction and + getNonPhiOperandDef+(instr) = instr + } + cached CSharpType getInstructionOperandType(Instruction instruction, TypedOperandTag tag) { // For all `LoadInstruction`s, the operand type of the `LoadOperand` is the same as diff --git a/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/Operand.qll b/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/Operand.qll index 07dd107bf12..2d19b053188 100644 --- a/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/Operand.qll +++ b/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/Operand.qll @@ -11,13 +11,13 @@ cached private newtype TOperand = TRegisterOperand(Instruction useInstr, RegisterOperandTag tag, Instruction defInstr) { defInstr = Construction::getRegisterOperandDefinition(useInstr, tag) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TNonPhiMemoryOperand( Instruction useInstr, MemoryOperandTag tag, Instruction defInstr, Overlap overlap ) { defInstr = Construction::getMemoryOperandDefinition(useInstr, tag, overlap) and - not isInCycle(useInstr) + not Construction::isInCycle(useInstr) } or TPhiOperand( PhiInstruction useInstr, Instruction defInstr, IRBlock predecessorBlock, Overlap overlap @@ -25,28 +25,6 @@ private newtype TOperand = defInstr = Construction::getPhiOperandDefinition(useInstr, predecessorBlock, overlap) } -/** Gets a non-phi instruction that defines an operand of `instr`. */ -private Instruction getNonPhiOperandDef(Instruction instr) { - result = Construction::getRegisterOperandDefinition(instr, _) - or - result = Construction::getMemoryOperandDefinition(instr, _, _) -} - -/** - * Holds if `instr` is part of a cycle in the operand graph that doesn't go - * through a phi instruction and therefore should be impossible. - * - * If such cycles are present, either due to a programming error in the IR - * generation or due to a malformed database, it can cause infinite loops in - * analyses that assume a cycle-free graph of non-phi operands. Therefore it's - * better to remove these operands than to leave cycles in the operand graph. - */ -pragma[noopt] -private predicate isInCycle(Instruction instr) { - instr instanceof Instruction and - getNonPhiOperandDef+(instr) = instr -} - /** * A source operand of an `Instruction`. The operand represents a value consumed by the instruction. */ diff --git a/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll b/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll index a789edc7590..72edd497fcc 100644 --- a/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll +++ b/csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll @@ -133,6 +133,16 @@ private module Cached { overlap instanceof MustExactlyOverlap } + /** + * Holds if `instr` is part of a cycle in the operand graph that doesn't go + * through a phi instruction and therefore should be impossible. + * + * For performance reasons, this predicate is not implemented (never holds) + * for the SSA stages of the IR. + */ + cached + predicate isInCycle(Instruction instr) { none() } + cached Language::LanguageType getInstructionOperandType(Instruction instr, TypedOperandTag tag) { exists(OldInstruction oldInstruction, OldIR::TypedOperand oldOperand | From 5a8407749f64d8a5935778306e96ddd4282854f9 Mon Sep 17 00:00:00 2001 From: Jonas Jensen Date: Wed, 11 Dec 2019 09:10:23 +0100 Subject: [PATCH 11/17] C#: autoformat fixup --- .../raw/internal/IRConstruction.qll | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll b/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll index 5c5420ed7db..2e0e94b425c 100644 --- a/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll +++ b/csharp/ql/src/semmle/code/csharp/ir/implementation/raw/internal/IRConstruction.qll @@ -103,14 +103,14 @@ private module Cached { } /** - * Holds if `instr` is part of a cycle in the operand graph that doesn't go - * through a phi instruction and therefore should be impossible. - * - * If such cycles are present, either due to a programming error in the IR - * generation or due to a malformed database, it can cause infinite loops in - * analyses that assume a cycle-free graph of non-phi operands. Therefore it's - * better to remove these operands than to leave cycles in the operand graph. - */ + * Holds if `instr` is part of a cycle in the operand graph that doesn't go + * through a phi instruction and therefore should be impossible. + * + * If such cycles are present, either due to a programming error in the IR + * generation or due to a malformed database, it can cause infinite loops in + * analyses that assume a cycle-free graph of non-phi operands. Therefore it's + * better to remove these operands than to leave cycles in the operand graph. + */ pragma[noopt] cached predicate isInCycle(Instruction instr) { From d56c02b1b7c592bc497713cb1dc08957514f884c Mon Sep 17 00:00:00 2001 From: james Date: Tue, 10 Sep 2019 14:07:25 +0100 Subject: [PATCH 12/17] docs: start work on debugging queries topic --- .../writing-queries/debugging-queries.rst | 92 +++++++++++++++++++ .../writing-queries/writing-queries.rst | 1 + 2 files changed, 93 insertions(+) create mode 100644 docs/language/learn-ql/writing-queries/debugging-queries.rst diff --git a/docs/language/learn-ql/writing-queries/debugging-queries.rst b/docs/language/learn-ql/writing-queries/debugging-queries.rst new file mode 100644 index 00000000000..e1a9a964b1d --- /dev/null +++ b/docs/language/learn-ql/writing-queries/debugging-queries.rst @@ -0,0 +1,92 @@ +Query writing: common performance issues +======================================== + +This topic offers some simple tips on how to avoid commons problems that can affect the performance of your queries. +Before reading the tips below, it is worth reiterating a few important points about CodeQL and the QL language: + +- In CodeQL `predicates `__ and `classes `__ are all just database `relations `__---that is, sets of tuples in a table. Large predicates generate tables with large numbers of rows, and are therefore expensive to compute. +- The QL language is implemented using standard database operations and `relational algebra `__ (such as join, projection, union, etc.). For further information about query languages and databases, see :doc:`About QL <../about-ql>`. +- Queries is evaluated *bottom-up*, which means that a predicate is not evaluated until **all** of the predicates that it depends on are evaluated. For more information on query evaluation, see `Evaluation of QL programs `__ in the QL handbook. + +Performance tips +---------------- + +Follow the guidelines below to ensure that you don't get get tripped up by the most common CodeQL performance pitfalls. + +Eliminate cartesian products +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The performance of a predicate can often be judged by considering roughly how many results it has. +One way of creating badly performing predicates is by using two variables without relating them in any way, or only relating them using a negation. +This leads to computing the `Cartesian product `__ between the sets of possible values for each variable, potentially generating a huge table of results. + +This can occur whenever you inadvertently fail to specify restrictions on your variables. + +For instance, in the following case none of the parameters are related in the example predicate ``methodAndAClass``, and therefore the results are unrestricted:: + + // BAD! Cross-product + predicate methodAndAClass(Method m, Class c) { + any() + } + +This example shows a similar mistake in a member predicate:: + + class Foo extends Class { + ... + // BAD! Does not use ‘this’ + Method getToString() { + result.getName().matches("ToString") + } + ... + } + +Here, the class ``Method getToString()`` is equivalent to ``predicate getToString(Class this, Method result)``, in which the parameters are unrestricted. +To avoid making this mistake, ``this`` should be restricted in the member predicate ``getToString()`` on the class ``Foo``. + +Finally, consider a predicate of the following form:: + + predicate p(T1 x1, T2 x2) { + or + or + ... + } + +In this situation, if ``x1`` and ``x2`` are not mentioned in all ```` terms, the compiler will produce the Cartesian product between what you wanted and all possible values of the unused parameter. + +Use specific types +~~~~~~~~~~~~~~~~~~ + +`Types `__ provide an upper bound on the size of a relation. +This helps the query optimizer be more effective, so it's generally good to use the most specific types possible. For example:: + + predicate foo(LoggingCall e) + +is preferred over:: + + predicate foo(Expr e) + +From the type context, the query optimizer deduces that some parts of the program are redundant and removes them, or **specializes** them. + +Avoid complex recursion +~~~~~~~~~~~~~~~~~~~~~~~ + +`Recursion `__ is about self-referencing definitions. +It can be extremely powerful as long as it is used appropriately. +On the whole, you should try to make recursive predicates as simple as possible. +That is, you should define a **base case** that allows the predicate to *bottom out*, along with a single **recursive call**:: + + int depth(Stmt s) { + exists(Callable c | c.getBody() = s | result = 0) // base case + or + result = depth(s.getParent()) + 1 // recursive case + } + +.. pull-quote:: Note + + The query optimizer has special data structures for dealing with `transitive closures `__. + If possible, use a transitive closure over a simple recursive predicate, as it is likely to be computed faster. + +Further information +------------------- + +- Find out more about QL in the `QL language handbook `__ and `QL language specification `__. \ No newline at end of file diff --git a/docs/language/learn-ql/writing-queries/writing-queries.rst b/docs/language/learn-ql/writing-queries/writing-queries.rst index c51601b6838..e08048953c0 100644 --- a/docs/language/learn-ql/writing-queries/writing-queries.rst +++ b/docs/language/learn-ql/writing-queries/writing-queries.rst @@ -15,6 +15,7 @@ Information for query writers ../intro-to-data-flow select-statement ../locations + debugging-queries Visit `Learning CodeQL `__ to find basic information about CodeQL. This includes information about the underlying query language QL, as well as help and advice on writing queries for specific programming languages. From 61576caede9f28f1d012cbff32fdf3dd24385f82 Mon Sep 17 00:00:00 2001 From: James Fletcher <42464962+jf205@users.noreply.github.com> Date: Wed, 11 Dec 2019 11:55:45 +0000 Subject: [PATCH 13/17] Apply suggestions from code review Co-Authored-By: Max Schaefer <54907921+max-schaefer@users.noreply.github.com> --- .../learn-ql/writing-queries/debugging-queries.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/language/learn-ql/writing-queries/debugging-queries.rst b/docs/language/learn-ql/writing-queries/debugging-queries.rst index e1a9a964b1d..995384449d7 100644 --- a/docs/language/learn-ql/writing-queries/debugging-queries.rst +++ b/docs/language/learn-ql/writing-queries/debugging-queries.rst @@ -4,9 +4,9 @@ Query writing: common performance issues This topic offers some simple tips on how to avoid commons problems that can affect the performance of your queries. Before reading the tips below, it is worth reiterating a few important points about CodeQL and the QL language: -- In CodeQL `predicates `__ and `classes `__ are all just database `relations `__---that is, sets of tuples in a table. Large predicates generate tables with large numbers of rows, and are therefore expensive to compute. +- CodeQL `predicates `__ and `classes `__ are evaluated to database `tables `__. Large predicates generate large tables with many rows, and are therefore expensive to compute. - The QL language is implemented using standard database operations and `relational algebra `__ (such as join, projection, union, etc.). For further information about query languages and databases, see :doc:`About QL <../about-ql>`. -- Queries is evaluated *bottom-up*, which means that a predicate is not evaluated until **all** of the predicates that it depends on are evaluated. For more information on query evaluation, see `Evaluation of QL programs `__ in the QL handbook. +- Queries are evaluated *bottom-up*, which means that a predicate is not evaluated until **all** of the predicates that it depends on are evaluated. For more information on query evaluation, see `Evaluation of QL programs `__ in the QL handbook. Performance tips ---------------- @@ -35,12 +35,12 @@ This example shows a similar mistake in a member predicate:: ... // BAD! Does not use ‘this’ Method getToString() { - result.getName().matches("ToString") + result.getName() = "ToString" } ... } -Here, the class ``Method getToString()`` is equivalent to ``predicate getToString(Class this, Method result)``, in which the parameters are unrestricted. +Note that while `getToString()` does not declare any parameters, it has two implicit parameters `result` and `this`, which it fails to relate. Hence the table computed by `getToString()` contains a row for every combination of values of `result` and `this`, that is, for every combination of a method named `"ToString"` and an instance of `Foo`. To avoid making this mistake, ``this`` should be restricted in the member predicate ``getToString()`` on the class ``Foo``. Finally, consider a predicate of the following form:: @@ -89,4 +89,4 @@ That is, you should define a **base case** that allows the predicate to *bottom Further information ------------------- -- Find out more about QL in the `QL language handbook `__ and `QL language specification `__. \ No newline at end of file +- Find out more about QL in the `QL language handbook `__ and `QL language specification `__. From d6202da8765c0e27a24b118407d84fd35da1bc64 Mon Sep 17 00:00:00 2001 From: james Date: Wed, 11 Dec 2019 12:02:15 +0000 Subject: [PATCH 14/17] docs: address max's comments --- .../writing-queries/debugging-queries.rst | 27 ++++++++----------- 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/docs/language/learn-ql/writing-queries/debugging-queries.rst b/docs/language/learn-ql/writing-queries/debugging-queries.rst index 995384449d7..634b146a578 100644 --- a/docs/language/learn-ql/writing-queries/debugging-queries.rst +++ b/docs/language/learn-ql/writing-queries/debugging-queries.rst @@ -1,7 +1,7 @@ Query writing: common performance issues ======================================== -This topic offers some simple tips on how to avoid commons problems that can affect the performance of your queries. +This topic offers some simple tips on how to avoid common problems that can affect the performance of your queries. Before reading the tips below, it is worth reiterating a few important points about CodeQL and the QL language: - CodeQL `predicates `__ and `classes `__ are evaluated to database `tables `__. Large predicates generate large tables with many rows, and are therefore expensive to compute. @@ -22,13 +22,18 @@ This leads to computing the `Cartesian product or - or - ... - } - -In this situation, if ``x1`` and ``x2`` are not mentioned in all ```` terms, the compiler will produce the Cartesian product between what you wanted and all possible values of the unused parameter. - Use specific types ~~~~~~~~~~~~~~~~~~ From 2ce1c2bfee8212aa280a641cdc817ce4e82eb3ff Mon Sep 17 00:00:00 2001 From: James Fletcher <42464962+jf205@users.noreply.github.com> Date: Wed, 11 Dec 2019 12:44:35 +0000 Subject: [PATCH 15/17] Apply suggestions from code review Co-Authored-By: Max Schaefer <54907921+max-schaefer@users.noreply.github.com> --- docs/language/learn-ql/writing-queries/debugging-queries.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/language/learn-ql/writing-queries/debugging-queries.rst b/docs/language/learn-ql/writing-queries/debugging-queries.rst index 634b146a578..23263b55ab6 100644 --- a/docs/language/learn-ql/writing-queries/debugging-queries.rst +++ b/docs/language/learn-ql/writing-queries/debugging-queries.rst @@ -32,7 +32,7 @@ For instance, consider the following predicate that checks whether a Java method The predicate holds if ``m`` contains an access to ``f``, but also conservatively assumes that methods without bodies (for example, native methods) may access *any* field. -However, if ``m`` is a native method, the table computed by ``mayAccess`` will contain a row ``m, f`` for *all* fields ``f`` in the codebase, which could potentially be very large. +However, if ``m`` is a native method, the table computed by ``mayAccess`` will contain a row ``m, f`` for *all* fields ``f`` in the codebase, making it potentially very large. This example shows a similar mistake in a member predicate:: @@ -45,7 +45,7 @@ This example shows a similar mistake in a member predicate:: ... } -Note that while ``getToString()`` does not declare any parameters, it has two implicit parameters, ``result`` and ``this``, which it fails to relate. Therefore, the table computed by ``getToString()`` contains a row for every combination of ``result`` and ``this``. That is, a row for every combination of a method named ``"ToString"`` that is an instance of ``Foo``. +Note that while ``getToString()`` does not declare any parameters, it has two implicit parameters, ``result`` and ``this``, which it fails to relate. Therefore, the table computed by ``getToString()`` contains a row for every combination of ``result`` and ``this``. That is, a row for every combination of a method named ``"ToString"`` and an instance of ``Foo``. To avoid making this mistake, ``this`` should be restricted in the member predicate ``getToString()`` on the class ``Foo``. Use specific types From b2db72d3362b0bd0261560c70de05d615cf13d48 Mon Sep 17 00:00:00 2001 From: James Fletcher <42464962+jf205@users.noreply.github.com> Date: Wed, 11 Dec 2019 14:13:56 +0000 Subject: [PATCH 16/17] Apply suggestions from code review Co-Authored-By: shati-patel <42641846+shati-patel@users.noreply.github.com> --- .../learn-ql/writing-queries/debugging-queries.rst | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/language/learn-ql/writing-queries/debugging-queries.rst b/docs/language/learn-ql/writing-queries/debugging-queries.rst index 23263b55ab6..933d1b90e8f 100644 --- a/docs/language/learn-ql/writing-queries/debugging-queries.rst +++ b/docs/language/learn-ql/writing-queries/debugging-queries.rst @@ -5,13 +5,13 @@ This topic offers some simple tips on how to avoid common problems that can affe Before reading the tips below, it is worth reiterating a few important points about CodeQL and the QL language: - CodeQL `predicates `__ and `classes `__ are evaluated to database `tables `__. Large predicates generate large tables with many rows, and are therefore expensive to compute. -- The QL language is implemented using standard database operations and `relational algebra `__ (such as join, projection, union, etc.). For further information about query languages and databases, see :doc:`About QL <../about-ql>`. -- Queries are evaluated *bottom-up*, which means that a predicate is not evaluated until **all** of the predicates that it depends on are evaluated. For more information on query evaluation, see `Evaluation of QL programs `__ in the QL handbook. +- The QL language is implemented using standard database operations and `relational algebra `__ (such as join, projection, and union). For further information about query languages and databases, see :doc:`About QL <../about-ql>`. +- Queries are evaluated *bottom-up*, which means that a predicate is not evaluated until *all* of the predicates that it depends on are evaluated. For more information on query evaluation, see `Evaluation of QL programs `__ in the QL handbook. Performance tips ---------------- -Follow the guidelines below to ensure that you don't get get tripped up by the most common CodeQL performance pitfalls. +Follow the guidelines below to ensure that you don't get tripped up by the most common CodeQL performance pitfalls. Eliminate cartesian products ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -20,7 +20,7 @@ The performance of a predicate can often be judged by considering roughly how ma One way of creating badly performing predicates is by using two variables without relating them in any way, or only relating them using a negation. This leads to computing the `Cartesian product `__ between the sets of possible values for each variable, potentially generating a huge table of results. -This can occur whenever you inadvertently fail to specify restrictions on your variables. +This can occur if you don't specify restrictions on your variables. For instance, consider the following predicate that checks whether a Java method ``m`` may access a field ``f``:: @@ -60,7 +60,7 @@ is preferred over:: predicate foo(Expr e) -From the type context, the query optimizer deduces that some parts of the program are redundant and removes them, or **specializes** them. +From the type context, the query optimizer deduces that some parts of the program are redundant and removes them, or *specializes* them. Avoid complex recursion ~~~~~~~~~~~~~~~~~~~~~~~ @@ -68,7 +68,7 @@ Avoid complex recursion `Recursion `__ is about self-referencing definitions. It can be extremely powerful as long as it is used appropriately. On the whole, you should try to make recursive predicates as simple as possible. -That is, you should define a **base case** that allows the predicate to *bottom out*, along with a single **recursive call**:: +That is, you should define a *base case* that allows the predicate to *bottom out*, along with a single *recursive call*:: int depth(Stmt s) { exists(Callable c | c.getBody() = s | result = 0) // base case From ff4a6041192262d186b293e9a681d8ee5af44936 Mon Sep 17 00:00:00 2001 From: James Fletcher <42464962+jf205@users.noreply.github.com> Date: Wed, 11 Dec 2019 14:29:10 +0000 Subject: [PATCH 17/17] Update docs/language/learn-ql/writing-queries/debugging-queries.rst Co-Authored-By: shati-patel <42641846+shati-patel@users.noreply.github.com> --- docs/language/learn-ql/writing-queries/debugging-queries.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/language/learn-ql/writing-queries/debugging-queries.rst b/docs/language/learn-ql/writing-queries/debugging-queries.rst index 933d1b90e8f..9609fa0ac04 100644 --- a/docs/language/learn-ql/writing-queries/debugging-queries.rst +++ b/docs/language/learn-ql/writing-queries/debugging-queries.rst @@ -73,7 +73,7 @@ That is, you should define a *base case* that allows the predicate to *bottom ou int depth(Stmt s) { exists(Callable c | c.getBody() = s | result = 0) // base case or - result = depth(s.getParent()) + 1 // recursive case + result = depth(s.getParent()) + 1 // recursive call } .. pull-quote:: Note