mirror of
https://github.com/github/codeql.git
synced 2026-04-30 19:26:02 +02:00
Merge branch 'master' into merge-master-docs
@@ -1,41 +1,7 @@
|
||||
# Experimental CodeQL queries and libraries
|
||||
|
||||
In addition to our standard CodeQL queries and libraries, this repository may also contain queries and libraries of a more experimental nature. Experimental queries and libraries can be improved incrementally and may eventually reach a sufficient maturity to be included in our standard libraries and queries.
|
||||
In addition to [our supported queries and libraries](supported-queries.md), this repository also contains queries and libraries of a more experimental nature. Experimental queries and libraries can be improved incrementally and may eventually reach a sufficient maturity to be included in our supported queries and libraries.
|
||||
|
||||
Experimental queries and libraries may not be actively maintained as the standard libraries evolve. They may also be changed in backwards-incompatible ways or may be removed entirely in the future without deprecation warnings.
|
||||
Experimental queries and libraries may not be actively maintained as the [supported](supported-queries.md) libraries evolve. They may also be changed in backwards-incompatible ways or may be removed entirely in the future without deprecation warnings.
|
||||
|
||||
## Requirements
|
||||
|
||||
1. **Directory structure**
|
||||
|
||||
- Experimental queries and libraries are stored in the `experimental` subdirectory within each language-specific directory in the [CodeQL repository](https://github.com/Semmle/ql). For example, experimental Java queries and libraries are stored in `ql/java/ql/src/experimental` and any corresponding tests in `ql/java/ql/test/experimental`.
|
||||
- The structure of an `experimental` subdirectory mirrors the structure of standard queries and libraries (or tests) in the parent directory.
|
||||
|
||||
2. **Query metadata**
|
||||
|
||||
- The query `@id` must not clash with any other queries in the repository.
|
||||
- The query must have a `@name` and `@description` to explain its purpose.
|
||||
- The query must have a `@kind` and `@problem.severity` as required by CodeQL tools.
|
||||
|
||||
For details, see the [guide on query metadata](https://github.com/Semmle/ql/blob/master/docs/query-metadata-style-guide.md).
|
||||
|
||||
3. **Formatting**
|
||||
|
||||
- The queries and libraries must be [autoformatted](https://help.semmle.com/codeql/codeql-for-vscode/reference/editor.html#autoformatting).
|
||||
|
||||
4. **Compilation**
|
||||
|
||||
- Compilation of the query and any associated libraries and tests must be resilient to future development of the standard libraries. This means that the functionality cannot use internal APIs, cannot depend on the output of `getAQlClass`, and cannot make use of regexp matching on `toString`.
|
||||
- The query and any associated libraries and tests must not cause any compiler warnings to be emitted (such as use of deprecated functionality or missing `override` annotations).
|
||||
|
||||
5. **Results**
|
||||
|
||||
- The query must have at least one true positive result on some revision of a real project.
|
||||
|
||||
6. **Contributor License Agreement**
|
||||
|
||||
- The contributor can satisfy the [CLA](CONTRIBUTING.md#contributor-license-agreement).
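As a hedged illustration of the query metadata requirements in item 2, a minimal experimental query header might look like the sketch below. The `@id`, `@name`, `@description`, and the query body are hypothetical placeholders chosen for this example; they are not taken from the repository, and a real `@id` would need to be checked for clashes.

```ql
/**
 * @name Method named finalize (illustrative)
 * @description Hypothetical example showing the metadata properties
 *              an experimental query is expected to declare.
 * @kind problem
 * @problem.severity warning
 * @id java/experimental/example-finalize-method
 */

import java

// Placeholder logic: report every method whose name is "finalize".
from Method m
where m.getName() = "finalize"
select m, "Method named 'finalize'."
```

Under the directory structure in item 1, a query like this would be placed somewhere below `ql/java/ql/src/experimental`, mirroring the layout of the parent directory.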
|
||||
|
||||
## Non-requirements
|
||||
|
||||
Other criteria typically required for our standard queries and libraries are not required for experimental queries and libraries. In particular, fully disciplined query [metadata](docs/query-metadata-style-guide.md), query [help](docs/query-help-style-guide.md), tests, a low false positive rate and performance tuning are not required (but nonetheless recommended).
|
||||
See [CONTRIBUTING.md](../CONTRIBUTING.md) for guidelines on submitting a new experimental query.
|
||||
|
||||
@@ -34,7 +34,7 @@
|
||||
<div id="siteBanner">
|
||||
<div class="textContainer">
|
||||
<div class="logocontainer">
|
||||
<a href="https://semmle.com/" id="Header-logo" class="">
|
||||
<a href="https://help.semmle.com/" id="Header-logo" class="">
|
||||
<svg class="Header-logo-white" width="98" height="20" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
|
||||
<defs>
|
||||
<path id="a" d="M0 .149h12.872v18.814H0z"></path>
|
||||
@@ -102,7 +102,7 @@
|
||||
{{super()}}
|
||||
</div>
|
||||
<div class="privacy">
|
||||
<a target="_blank" href="https://semmle.com/privacy-policy" alt="Privacy policy and tracking preferences" title="Privacy policy and tracking preferences">Privacy policy</a>
|
||||
<a target="_blank" href="https://help.semmle.com/privacy-policy.html" alt="Privacy policy and tracking preferences" title="Privacy policy and tracking preferences">Privacy policy</a>
|
||||
</div>
|
||||
|
||||
</div>
|
||||
|
||||
@@ -61,7 +61,7 @@ These topics are discussed in detail in the `QL language handbook <https://help.
|
||||
References
|
||||
----------
|
||||
|
||||
Academic references available from the `Semmle website <https://semmle.com/publications>`__ also provide an overview of QL and its semantics. Other useful references on database query languages and Datalog:
|
||||
Academic references available from the `Semmle website <https://help.semmle.com/publications.html>`__ also provide an overview of QL and its semantics. Other useful references on database query languages and Datalog:
|
||||
|
||||
- `Database theory: Query languages <http://www.lsv.ens-cachan.fr/~segoufin/Papers/Mypapers/DB-chapter.pdf>`__
|
||||
- `Logic Programming and Databases book - Amazon page <http://www.amazon.co.uk/Programming-Databases-Surveys-Computer-Science/dp/3642839541>`__
|
||||
|
||||
@@ -1,151 +0,0 @@
|
||||
Introducing the CodeQL libraries for COBOL
|
||||
==========================================
|
||||
|
||||
Overview
|
||||
--------
|
||||
|
||||
There is an extensive library for analyzing COBOL code. The classes in this library present the data from a CodeQL database in an object-oriented form and provide abstractions and predicates to help you with common analysis tasks.
|
||||
|
||||
The library is implemented as a set of QL modules, that is, files with the extension ``.qll``. The module ``cobol.qll`` imports most other standard library modules, so you can include the complete library by beginning your query with:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import cobol
|
||||
|
||||
The rest of this tutorial briefly summarizes the most important classes and predicates provided by this library, including references to the `detailed API documentation <https://help.semmle.com/qldoc/cobol/>`__ where applicable.
|
||||
|
||||
Introducing the library
|
||||
-----------------------
|
||||
|
||||
The CodeQL library for COBOL presents information about COBOL source code at different levels:
|
||||
|
||||
- **Textual** — classes that represent source code as unstructured text files
|
||||
- **Lexical** — classes that represent comments and other tokens of interest
|
||||
- **Syntactic** — classes that represent source code as an abstract syntax tree
|
||||
- **Name binding** — classes that represent data entries and data references
|
||||
- **Control flow** — classes that represent the flow of control during execution
|
||||
- **Frameworks** — classes that represent interactions via CICS and SQL
|
||||
|
||||
Note that representations above the textual level (for example the lexical representation or the flow graphs) are only available for COBOL code that does not contain fatal syntax errors. For code with such errors, the only information available is at the textual level, as well as information about the errors themselves.
|
||||
|
||||
Textual level
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
At its most basic level, a COBOL code base can simply be viewed as a collection of files organized into folders.
|
||||
|
||||
Files and folders
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
Files are represented as entities of class `File <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$File.html>`__, and folders as entities of class `Folder <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$Folder.html>`__, both of which are subclasses of class `Container <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$Container.html>`__.
|
||||
|
||||
Class `Container <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$Container.html>`__ provides the following member predicates:
|
||||
|
||||
- ``Container.getParentContainer()`` returns the parent folder of the file or folder.
|
||||
- ``Container.getAFile()`` returns a file within the folder.
|
||||
- ``Container.getAFolder()`` returns a folder nested within the folder.
|
||||
|
||||
Note that while ``getAFile`` and ``getAFolder`` are declared on class `Container <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$Container.html>`__, they currently only have results for `Folder <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$Folder.html>`__\ s.
|
||||
|
||||
Both files and folders have paths, which can be accessed by the predicate ``Container.getAbsolutePath()``. For example, if ``f`` represents a file with the path ``/home/user/project/src/main.cbl``, then ``f.getAbsolutePath()`` evaluates to the string ``"/home/user/project/src/main.cbl"``, while ``f.getParentContainer().getAbsolutePath()`` returns ``"/home/user/project/src"``.
|
||||
|
||||
These paths are absolute file system paths. If you want to obtain the path of a file relative to the source location in the CodeQL database, use ``Container.getRelativePath()`` instead. Note, however, that a database may contain files that are not located underneath the source location; for such files, ``getRelativePath()`` will not return anything.
|
||||
|
||||
The following member predicates of class `Container <https://help.semmle.com/qldoc/cobol/semmle/cobol/Files.qll/type.Files$Container.html>`__ provide more information about the name of a file or folder:
|
||||
|
||||
- ``Container.getBaseName()`` returns the base name of a file or folder, not including its parent folder, but including its extension. In the above example, ``f.getBaseName()`` would return the string ``"main.cbl"``.
|
||||
- ``Container.getStem()`` is similar to ``Container.getBaseName()``, but it does *not* include the file extension; so ``f.getStem()`` returns ``"main"``.
|
||||
- ``Container.getExtension()`` returns the file extension, not including the dot; so ``f.getExtension()`` returns ``"cbl"``.
|
||||
|
||||
For example, the following query computes, for each folder, the number of COBOL files (that is, files with extension ``cbl``) contained in the folder:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import cobol
|
||||
|
||||
from Folder d
|
||||
select d.getRelativePath(), count(File f | f = d.getAFile() and f.getExtension() = "cbl")
|
||||
|
||||
Locations
|
||||
^^^^^^^^^
|
||||
|
||||
Most entities in a CodeQL database have an associated source location. Locations are identified by five pieces of information: a file, a start line, a start column, an end line, and an end column. Line and column counts are 1-based (so the first character of a file is at line 1, column 1), and the end position is inclusive.
|
||||
|
||||
All entities associated with a source location belong to the class `Locatable <https://help.semmle.com/qldoc/cobol/semmle/cobol/Location.qll/type.Location$Locatable.html>`__. The location itself is modeled by the class `Location <https://help.semmle.com/qldoc/cobol/semmle/cobol/Location.qll/type.Location$Location.html>`__ and can be accessed through the member predicate ``Locatable.getLocation()``. The `Location <https://help.semmle.com/qldoc/cobol/semmle/cobol/Location.qll/type.Location$Location.html>`__ class provides the following member predicates:
|
||||
|
||||
- ``Location.getFile()``, ``Location.getStartLine()``, ``Location.getStartColumn()``, ``Location.getEndLine()``, ``Location.getEndColumn()`` return detailed information about the location.
|
||||
- ``Location.getNumLines()`` returns the number of (whole or partial) lines covered by the location.
|
||||
- ``Location.startsBefore(Location)`` and ``Location.endsAfter(Location)`` determine whether one location starts before or ends after another location.
|
||||
- ``Location.contains(Location)`` indicates whether one location completely contains another location; ``l1.contains(l2)`` holds if, and only if, ``l1.startsBefore(l2)`` and ``l1.endsAfter(l2)``.
|
||||
|
||||
Lexical level
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
At this level we represent comments through the `Comment <https://help.semmle.com/qldoc/cobol/semmle/cobol/Comments.qll/type.Comments$Comment.html>`__ class. We do not currently retain any tokens other than scope terminators (for example ``END-IF``), which are represented by the `ScopeTerminator <https://help.semmle.com/qldoc/cobol/semmle/cobol/Stmts.qll/type.Stmts$ScopeTerminator.html>`__ class.
|
||||
|
||||
Comments
|
||||
^^^^^^^^
|
||||
|
||||
The class `Comment <https://help.semmle.com/qldoc/cobol/semmle/cobol/Comments.qll/type.Comments$Comment.html>`__ represents the comments that occur in COBOL programs.
|
||||
|
||||
The most important member predicates are as follows:
|
||||
|
||||
- ``Comment.getText()`` returns the source text of the comment, not including delimiters.
|
||||
- ``Comment.getScope()`` returns the location of the source code to which the comment is bound.
|
||||
|
||||
Scope terminators
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
The class `ScopeTerminator <https://help.semmle.com/qldoc/cobol/semmle/cobol/Stmts.qll/type.Stmts$ScopeTerminator.html>`__ represents the scope terminators that occur in COBOL programs.
|
||||
|
||||
The most important member predicates are as follows:
|
||||
|
||||
- ``ScopeTerminator.getStmt()`` returns the statement whose scope this terminator is closing.
|
||||
|
||||
Syntactic level
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
The majority of classes in the CodeQL library for COBOL are concerned with representing a COBOL program as a collection of `abstract syntax trees <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`__ (ASTs).
|
||||
|
||||
The class `AstNode <https://help.semmle.com/qldoc/cobol/semmle/cobol/AstNode.qll/type.AstNode$AstNode.html>`__ contains all entities representing nodes in the abstract syntax trees and defines generic tree traversal predicates:
|
||||
|
||||
- ``AstNode.getParent()``: returns the parent node of this AST node, if any.
|
||||
|
||||
Please note that the libraries for COBOL do not currently represent all possible parts of a COBOL program. Due to the complexity of the language, and its many dialects, this is an ongoing task. We prioritize elements that are of interest to queries, and expand this selection over time. Please check the `detailed API documentation <https://help.semmle.com/qldoc/cobol/>`__ to see what is currently available.
|
||||
|
||||
The main structure of any COBOL program is represented by the `Unit <https://help.semmle.com/qldoc/cobol/semmle/cobol/Units.qll/type.Units$Unit.html>`__ class and its subclasses. For example, each program definition has a `ProgramDefinition <https://help.semmle.com/qldoc/cobol/semmle/cobol/Units.qll/type.Units$ProgramDefinition.html>`__ counterpart. Each ``PROCEDURE DIVISION`` in the program is represented by an entity of class `ProcedureDivision <https://help.semmle.com/qldoc/cobol/semmle/cobol/AST_extended.qll/type.AST_extended$ProcedureDivision.html>`__.
|
||||
|
||||
All data definitions are made accessible through the `DescriptionEntry <https://help.semmle.com/qldoc/cobol/semmle/cobol/DataEntries.qll/type.DataEntries$DescriptionEntry.html>`__ class and its subclasses. In particular, you can use `DataDescriptionEntry <https://help.semmle.com/qldoc/cobol/semmle/cobol/DataEntries.qll/type.DataEntries$DataDescriptionEntry.html>`__ to find the typical data entries defined in a ``WORKING-STORAGE SECTION``.
|
||||
|
||||
References to data items are modeled through the `DataReference <https://help.semmle.com/qldoc/cobol/semmle/cobol/References.qll/type.References$DataReference.html>`__ class. You can use ``DataReference.getTarget()`` to resolve the reference to the matching data item.
|
||||
|
||||
Individual statements are represented by the class `Stmt <https://help.semmle.com/qldoc/cobol/semmle/cobol/Stmts.qll/type.Stmts$Stmt.html>`__ and its subclasses. The name of the specific type starts with the statement's verb. For example, ``OPEN`` statements are covered by the class `Open <https://help.semmle.com/qldoc/cobol/semmle/cobol/Stmts.qll/type.Stmts$Open.html>`__. Unknown statement types are covered by the
|
||||
`OtherStmt <https://help.semmle.com/qldoc/cobol/semmle/cobol/AST_extended.qll/type.AST_extended$OtherStmt.html>`__ class.
|
||||
|
||||
Control flow
|
||||
~~~~~~~~~~~~
|
||||
|
||||
You can represent a program in terms of its control flow graph (CFG) using the ``AstNode.getASuccessor`` predicate. You can use this predicate to find possible successors to any statement, sentence, or unit in a procedure division.
|
||||
|
||||
Parse errors
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
COBOL code that contains breaking syntax errors cannot usually be analyzed. All that is available in this case is a value of class `Error <https://help.semmle.com/qldoc/cobol/semmle/cobol/Errors.qll/type.Errors$Error.html>`__ representing the parse error. It provides information about the syntax error location and the error message through predicates ``Error.getLocation`` and ``Error.getMessage`` respectively.
|
||||
|
||||
Frameworks
|
||||
~~~~~~~~~~
|
||||
|
||||
CICS
|
||||
^^^^
|
||||
|
||||
Calls to the CICS system through ``EXEC CICS`` are represented by the class `Cics <https://help.semmle.com/qldoc/cobol/semmle/cobol/AST_extended.qll/type.AST_extended$Cics.html>`__.
|
||||
|
||||
SQL
|
||||
^^^
|
||||
|
||||
Calls to the SQL system through ``EXEC SQL`` are represented by the class
|
||||
`SqlStmt <https://help.semmle.com/qldoc/cobol/semmle/cobol/Sql.qll/type.Sql$SqlStmt.html>`__ and its subclasses.
|
||||
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.
|
||||
@@ -1,20 +0,0 @@
|
||||
CodeQL for COBOL
|
||||
================
|
||||
|
||||
.. toctree::
|
||||
:glob:
|
||||
:hidden:
|
||||
|
||||
introduce-libraries-cobol
|
||||
|
||||
.. include:: ../../support/cobol-note.rst
|
||||
|
||||
This page provides an overview of the CodeQL for COBOL documentation that is currently available.
|
||||
|
||||
- :doc:`Introducing the CodeQL libraries for COBOL <introduce-libraries-cobol>` introduces the standard libraries used to write queries for COBOL code.
|
||||
|
||||
|
||||
Other resources
|
||||
---------------
|
||||
|
||||
- For more information about the library for COBOL see the `CodeQL library for COBOL <https://help.semmle.com/qldoc/cobol/>`__.
|
||||
@@ -1,7 +1,7 @@
|
||||
Learning CodeQL
|
||||
###############
|
||||
|
||||
CodeQL is the code analysis platform used by security researchers to automate `variant analysis <https://semmle.com/variant-analysis>`__.
|
||||
CodeQL is the code analysis platform used by security researchers to automate variant analysis.
|
||||
You can use CodeQL queries to explore code and quickly find variants of security vulnerabilities and bugs.
|
||||
These queries are easy to write and share. Visit the topics below and `our open source repository on GitHub <https://github.com/Semmle/ql>`__ to learn more.
|
||||
You can also try out CodeQL in the `query console <https://lgtm.com/query>`__ on `LGTM.com <https://lgtm.com>`__.
|
||||
@@ -24,7 +24,6 @@ CodeQL is based on a powerful query language called QL. The following topics hel
|
||||
writing-queries/writing-queries
|
||||
cpp/ql-for-cpp
|
||||
csharp/ql-for-csharp
|
||||
cobol/ql-for-cobol
|
||||
go/ql-for-go
|
||||
java/ql-for-java
|
||||
javascript/ql-for-javascript
|
||||
|
||||
@@ -4,9 +4,9 @@ CodeQL training and variant analysis examples
|
||||
CodeQL and variant analysis
|
||||
---------------------------
|
||||
|
||||
`Variant analysis <https://semmle.com/variant-analysis>`__ is the process of using a known vulnerability as a seed to find similar problems in your code. Security engineers typically perform variant analysis to identify possible vulnerabilities and to ensure that these threats are properly fixed across multiple code bases.
|
||||
Variant analysis is the process of using a known vulnerability as a seed to find similar problems in your code. Security engineers typically perform variant analysis to identify possible vulnerabilities and to ensure that these threats are properly fixed across multiple code bases.
|
||||
|
||||
`CodeQL <https://semmle.com/ql>`__ is the code analysis engine that underpins LGTM, Semmle's community driven security analysis platform. Together, CodeQL and LGTM provide continuous monitoring and scalable variant analysis for your projects, even if you don’t have your own team of dedicated security engineers. You can read more about using CodeQL and LGTM in variant analysis on the `Security Lab research page <https://securitylab.github.com/research>`__.
|
||||
CodeQL is the code analysis engine that underpins LGTM, the community-driven security analysis platform. Together, CodeQL and LGTM provide continuous monitoring and scalable variant analysis for your projects, even if you don’t have your own team of dedicated security engineers. You can read more about using CodeQL and LGTM in variant analysis on the `Security Lab research page <https://securitylab.github.com/research>`__.
|
||||
|
||||
CodeQL is easy to learn, and exploring code using CodeQL is the most efficient way to perform variant analysis.
|
||||
|
||||
|
||||
@@ -158,4 +158,4 @@ What next?
|
||||
- To learn more about writing path queries, see :doc:`Creating path queries <path-queries>`.
|
||||
- Take a look at the `built-in queries <https://help.semmle.com/wiki/display/QL/Built-in+queries>`__ to see examples of the queries included in CodeQL.
|
||||
- Explore the `query cookbooks <https://help.semmle.com/wiki/display/QL/QL+cookbooks>`__ to see how to access the basic language elements contained in the CodeQL libraries.
|
||||
- For a full list of resources to help you learn CodeQL, including beginner tutorials and language-specific examples, visit `Learning CodeQL <https://help.semmle.com/QL/learn-ql/>`__.
|
||||
- For a full list of resources to help you learn CodeQL, including beginner tutorials and language-specific examples, visit `Learning CodeQL <https://help.semmle.com/QL/learn-ql/>`__.
|
||||
|
||||
@@ -94,6 +94,23 @@ In a valid range, the start and end expression are integers, floats, or dates. I
|
||||
is a date, then both must be dates. If one of them is an integer and the other a float, then
|
||||
both are treated as floats.
|
||||
|
||||
.. index:: setliteral
|
||||
.. _setliteral:
|
||||
|
||||
Set literal expressions
|
||||
***********************
|
||||
|
||||
A set literal expression allows the explicit listing of a choice between several values.
|
||||
It consists of a comma-separated collection of expressions that are enclosed in brackets (``[`` and ``]``).
|
||||
For example, ``[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]`` is a valid set literal expression.
|
||||
Its values are the first ten prime numbers.
|
||||
|
||||
The values of the contained expressions need to be of :ref:`compatible types <type-compatibility>` for a valid set literal expression.
|
||||
Furthermore, at least one of the set elements has to be of a type that is a supertype of the types of all
|
||||
the other contained expressions.
|
||||
|
||||
Set literals are supported from release 2.1.0 of the CodeQL CLI, and release 1.24 of LGTM Enterprise.
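
As a minimal sketch of how a set literal might be used in a query (the variable name ``p`` and the filter on the last digit are illustrative only, and the query assumes a release that supports set literals), the following selects those of the listed primes that end in ``3``::

    from int p
    where p = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29] and p % 10 = 3
    select p

Here ``p = [...]`` holds when ``p`` is equal to any one of the listed values, so the query should return ``3``, ``13`` and ``23``.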
|
||||
|
||||
.. index:: super
|
||||
.. _super:
|
||||
|
||||
@@ -278,6 +295,22 @@ The following aggregates are available in QL:
|
||||
evaluates to the empty set (instead of defaulting to ``0`` or the empty string).
|
||||
This is useful if you're only interested in results where the aggregation body is non-trivial.
|
||||
|
||||
.. index:: unique
|
||||
|
||||
- ``unique``: This aggregate depends on the values of ``<expression>`` over all possible assignments to
|
||||
the aggregation variables. If there is a unique value of ``<expression>`` over the aggregation variables,
|
||||
then the aggregate evaluates to that value.
|
||||
Otherwise, the aggregate has no value.
|
||||
|
||||
For example, the following query returns the positive integers ``1``, ``2``, ``3``, ``4``, ``5``.
|
||||
For negative integers ``x``, the expressions ``x`` and ``x.abs()`` have different values, so the
|
||||
value for ``y`` in the aggregate expression is not uniquely determined. ::
|
||||
|
||||
from int x
|
||||
where x in [-5 .. 5] and x != 0
|
||||
select unique(int y | y = x or y = x.abs() | y)
|
||||
|
||||
The ``unique`` aggregate is supported from release 2.1.0 of the CodeQL CLI, and release 1.24 of LGTM Enterprise.
|
||||
|
||||
Evaluation of aggregates
|
||||
========================
|
||||
|
||||
@@ -1058,6 +1058,7 @@ An aggregation can be written in one of two forms:
|
||||
|
||||
aggregation ::= aggid ("[" expr "]")? "(" (var_decls)? ("|" (formula)? ("|" as_exprs ("order" "by" aggorderbys)?)?)? ")"
|
||||
| aggid ("[" expr "]")? "(" as_exprs ("order" "by" aggorderbys)? ")"
|
||||
| "unique" "(" var_decls "|" (formula)? ("|" as_exprs)? ")"
|
||||
|
||||
aggid ::= "avg" | "concat" | "count" | "max" | "min" | "rank" | "strictconcat" | "strictcount" | "strictsum" | "sum"
|
||||
|
||||
@@ -1098,7 +1099,7 @@ The typing environment for ordering directives is obtained by taking the typing
|
||||
|
||||
The number and types of the aggregation expressions are restricted as follows:
|
||||
|
||||
- A ``max``, ``min`` or ``rank`` aggregation must have a single expression.
|
||||
- A ``max``, ``min``, ``rank`` or ``unique`` aggregation must have a single expression.
|
||||
- The type of the expression in a ``max``, ``min`` or ``rank`` aggregation without an ordering directive expression must be an orderable type.
|
||||
- A ``count`` or ``strictcount`` aggregation must not have an expression.
|
||||
- A ``sum``, ``strictsum`` or ``avg`` aggregation must have a single aggregation expression, which must have a type which is a subtype of ``float``.
|
||||
@@ -1140,6 +1141,8 @@ The values of the aggregation expression are given by applying the aggregation f
|
||||
|
||||
- If the aggregation id is ``strictconcat``, then the result is the same as for ``concat`` except in the case where there are no aggregation tuples in which case the aggregation has no value.
|
||||
|
||||
- If the aggregation id is ``unique``, then the result is the value of the aggregation variable if there is precisely one such value. Otherwise, the aggregation has no value.
|
||||
|
||||
Any
|
||||
~~~
|
||||
|
||||
@@ -1168,6 +1171,21 @@ If both expressions are subtypes of ``int`` then the type of the range is ``int`
|
||||
|
||||
The values of a range expression are those values which are ordered inclusively between a value of the first expression and a value of the second expression.
|
||||
|
||||
Set literals
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Set literals denote a choice from a collection of values.
|
||||
|
||||
::
|
||||
|
||||
setliteral ::= "[" expr ("," expr)* "]"
|
||||
|
||||
Set literals can be of any type, but the types within a set literal have to be consistent according to the following criterion: At least one of the set elements has to be of a type that is a supertype of all the set element types. This supertype is the type of the set literal. For example, ``float`` is a supertype of ``float`` and ``int``, therefore ``x = [4, 5.6]`` is valid. On the other hand, ``y = [5, "test"]`` does not adhere to the criterion.
|
||||
|
||||
The values of a set literal expression are all the values of all the contained element expressions.
|
||||
|
||||
Set literals are supported from release 2.1.0 of the CodeQL CLI, and release 1.24 of LGTM Enterprise.
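
The typing criterion can be sketched with a small query. This is an illustration only, and the variable name ``x`` is arbitrary::

    from float x
    // 4 has type int and 5.6 has type float; float is a supertype of
    // both element types, so the set literal as a whole has type float.
    where x = [4, 5.6]
    select x

By the same criterion, replacing the literal with ``[5, "test"]`` would not be valid, because neither ``int`` nor ``string`` is a supertype of the other.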
|
||||
|
||||
Disambiguation of expressions
|
||||
-----------------------------
|
||||
|
||||
@@ -1934,6 +1952,7 @@ The complete grammar for QL is as follows:
|
||||
| aggregation
|
||||
| any
|
||||
| range
|
||||
| setliteral
|
||||
|
||||
eparen ::= "(" expr ")"
|
||||
|
||||
@@ -1960,7 +1979,8 @@ The complete grammar for QL is as follows:
|
||||
|
||||
aggregation ::= aggid ("[" expr "]")? "(" (var_decls)? ("|" (formula)? ("|" as_exprs ("order" "by" aggorderbys)?)?)? ")"
|
||||
| aggid ("[" expr "]")? "(" as_exprs ("order" "by" aggorderbys)? ")"
|
||||
|
||||
| "unique" "(" var_decls "|" (formula)? ("|" as_exprs)? ")"
|
||||
|
||||
aggid ::= "avg" | "concat" | "count" | "max" | "min" | "rank" | "strictconcat" | "strictcount" | "strictsum" | "sum"
|
||||
|
||||
aggorderbys ::= aggorderby ("," aggorderby)*
|
||||
@@ -1973,6 +1993,8 @@ The complete grammar for QL is as follows:
|
||||
| primary "." predicateName (closure)? "(" (exprs)? ")"
|
||||
|
||||
range ::= "[" expr ".." expr "]"
|
||||
|
||||
setliteral ::= "[" expr ("," expr)* "]"
|
||||
|
||||
simpleId ::= lowerId | upperId
|
||||
|
||||
|
||||
@@ -141,7 +141,7 @@ Let’s look for overflow guards of the form ``v + b < v``, using the classes
|
||||
|
||||
.. note::
|
||||
|
||||
- When performing `variant analysis <https://semmle.com/variant-analysis>`__, it is usually helpful to write a simple query that finds the simple syntactic pattern, before trying to go on to describe the cases where it goes wrong.
|
||||
- When performing variant analysis, it usually helps to start with a simple query that finds the basic syntactic pattern, and only then go on to describe the cases where it goes wrong.
|
||||
- In this case, we start by looking for all the *overflow* checks, before trying to refine the query to find all *bad overflow* checks.
|
||||
- The ``select`` clause defines what this query is looking for:
|
||||
|
||||
|
||||
@@ -77,7 +77,7 @@ Let’s start by looking for calls to methods with names of the form ``sparql*Qu
|
||||
|
||||
.. note::
|
||||
|
||||
- When performing `variant analysis <https://semmle.com/variant-analysis>`__, it is usually helpful to write a simple query that finds the simple syntactic pattern, before trying to go on to describe the cases where it goes wrong.
|
||||
- When performing variant analysis, it usually helps to start with a simple query that finds the basic syntactic pattern, and only then go on to describe the cases where it goes wrong.
|
||||
- In this case, we start by looking for all the method calls that appear to run, before trying to refine the query to find cases which are vulnerable to query injection.
|
||||
- The ``select`` clause defines what this query is looking for:
|
||||
|
||||
|
||||
@@ -81,8 +81,6 @@ Find all instances!
|
||||
|
||||
- All were fixed with a mid-flight patch.
|
||||
|
||||
- For more detail on the collaboration between Semmle and NASA, see our case study: `Semmle at NASA: Landing Curiosity safely on Mars <https://semmle.com/case-studies/semmle-nasa-landing-curiosity-safely-mars>`__.
|
||||
|
||||
.. note::
|
||||
|
||||
The JPL team ran the query across the full Curiosity control software. It identified the original problem and more than 30 other variants, three of which were in the critical Entry, Descent, and Landing module.
|
||||
@@ -107,7 +105,7 @@ Analysis overview
|
||||
|
||||
Once the extraction finishes, all this information is collected into a single `CodeQL database <https://help.semmle.com/QL/learn-ql/database.html>`__, which is then ready to query, possibly on a different machine. A copy of the source files, made at the time the database was created, is also included in the CodeQL database so analysis results can be displayed at the correct location in the code. The database schema is (source) language specific.
|
||||
|
||||
Queries are written in `QL <https://semmle.com/ql>`__ and usually depend on one or more of the `standard CodeQL libraries <https://github.com/semmle/ql>`__ (and of course you can write your own custom libraries). They are compiled into an efficiently executable format by the QL compiler and then run on a CodeQL database by the QL evaluator, either on a remote worker machine or locally on a developer’s machine.
|
||||
Queries are written in QL and usually depend on one or more of the `standard CodeQL libraries <https://github.com/semmle/ql>`__ (and of course you can write your own custom libraries). They are compiled into an efficiently executable format by the QL compiler and then run on a CodeQL database by the QL evaluator, either on a remote worker machine or locally on a developer’s machine.
|
||||
|
||||
Query results can be interpreted and presented in a variety of ways, including displaying them in an `IDE extension <https://lgtm.com/help/lgtm/running-queries-ide>`__ such as CodeQL for Visual Studio Code, or in a web dashboard as on `LGTM <https://lgtm.com/help/lgtm/about-lgtm>`__.
|
||||
|
||||
|
||||
@@ -1,5 +0,0 @@
|
||||
.. pull-quote:: Important
|
||||
|
||||
CodeQL for COBOL is being deprecated after the 1.23 release of CodeQL.
|
||||
Future releases, starting with 1.24, will no longer contain support for analyzing COBOL source code.
|
||||
We are not aware of any customers who will be affected by this change. If you do have any concerns, please contact your account manager.
|
||||
@@ -80,4 +80,4 @@ htmlhelp_basename = 'Supported languages and frameworks'
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
exclude_patterns = ['read-me-project.rst', 'cobol-note.rst']
|
||||
exclude_patterns = ['read-me-project.rst']
|
||||
@@ -10,8 +10,6 @@ Customers with any questions should contact their usual Semmle contact with any
|
||||
If you're not a customer yet, contact us at info@semmle.com
|
||||
with any questions you have about language and compiler support.
|
||||
|
||||
.. include:: cobol-note.rst
|
||||
|
||||
.. csv-table::
|
||||
:file: versions-compilers.csv
|
||||
:header-rows: 1
|
||||
|
||||
@@ -1,15 +1,15 @@
|
||||
Language,Variants,Compilers,Extensions
|
||||
C/C++,"C89, C99, C11, C++98, C++03, C++11, C++14, C++17","Clang (and clang-cl [1]_) extensions (up to Clang 8.0),
|
||||
C/C++,"C89, C99, C11, C18, C++98, C++03, C++11, C++14, C++17","Clang (and clang-cl [1]_) extensions (up to Clang 9.0),
|
||||
|
||||
GNU extensions (up to GCC 8.3),
|
||||
GNU extensions (up to GCC 9.2),
|
||||
|
||||
Microsoft extensions (up to VS 2019),
|
||||
|
||||
Arm Compiler 5.0 [2]_","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``"
|
||||
Arm Compiler 5 [2]_","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``"
|
||||
C#,C# up to 8.0 with .NET up to 4.8 [3]_,"Microsoft Visual Studio up to 2019,
|
||||
|
||||
.NET Core up to 3.0","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``"
|
||||
Go (aka Golang), "Go up to 1.13", "Go 1.11 or more recent", ``.go``
|
||||
Go (aka Golang), "Go up to 1.14", "Go 1.11 or more recent", ``.go``
|
||||
Java,"Java 6 to 13 [4]_","javac (OpenJDK and Oracle JDK),
|
||||
|
||||
Eclipse compiler for Java (ECJ) [5]_",``.java``
|
||||
|
||||
|
@@ -2,7 +2,7 @@
|
||||
|
||||
## Introduction
|
||||
|
||||
This document outlines the structure of Semmle query files. You should adopt this structure when contributing custom queries to this repository, in order to ensure that new queries are consistent with the standard Semmle queries.
|
||||
This document outlines the structure of CodeQL query files. You should adopt this structure when contributing custom queries to this repository, in order to ensure that new queries are consistent with the standard CodeQL queries.
|
||||
|
||||
## Query files (.ql extension)
|
||||
|
||||
@@ -67,11 +67,11 @@ You must define an `@description` property for your query. This property defines
|
||||
|
||||
### Query ID `@id`
|
||||
|
||||
You must specify an `@id` property for your query. It must be unique in the Semmle namespace and should follow the standard Semmle convention. That is, it should begin with the 'language code' for the language that the query analyzes followed by a forward slash. The following language codes are supported:
|
||||
You must specify an `@id` property for your query. It must be unique and should follow the standard CodeQL convention. That is, it should begin with the 'language code' for the language that the query analyzes followed by a forward slash. The following language codes are supported:
|
||||
|
||||
* C and C++: `cpp`
|
||||
* C#: `cs`
|
||||
* COBOL: `cobol`
|
||||
* Go: `go`
|
||||
* Java: `java`
|
||||
* JavaScript and TypeScript: `js`
|
||||
* Python: `py`
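
The prefix rule described above can be sketched as a small check. This is only an illustration, not part of any CodeQL tooling: the function name `has_language_prefix` is hypothetical, and the set of codes contains only the entries visible in the list above, so the full guide may support more.

```python
# Language codes copied from the list above (assumption: the list may be longer).
LANGUAGE_CODES = {"cpp", "cs", "cobol", "go", "java", "js", "py"}


def has_language_prefix(query_id: str) -> bool:
    """Return True if query_id looks like '<language code>/<rest>'."""
    prefix, separator, rest = query_id.partition("/")
    # The ID must contain a slash, start with a known code, and have a non-empty remainder.
    return separator == "/" and prefix in LANGUAGE_CODES and rest != ""
```

For example, `has_language_prefix("cpp/todo-comment")` holds, while an ID without a slash or with an unlisted code does not pass the check. Uniqueness of the ID is not checked here.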
|
||||
@@ -105,7 +105,7 @@ Note, `@id` properties should be consistent for queries that highlight the same
|
||||
* alerts (`@kind problem`)
|
||||
* alerts containing path information (`@kind path-problem`)
|
||||
|
||||
Alert queries (`@kind problem` or `path-problem`) support two further properties. These are added by Semmle after the query has been tested, prior to deployment to LGTM. The following information is for reference:
|
||||
Alert queries (`@kind problem` or `path-problem`) support two further properties. These are added by GitHub staff after the query has been tested, prior to deployment to LGTM. The following information is for reference:
|
||||
|
||||
|
||||
|
||||
|
||||
79
docs/supported-queries.md
Normal file
@@ -0,0 +1,79 @@
|
||||
# Supported CodeQL queries and libraries
|
||||
|
||||
Queries and libraries outside [the `experimental` directories](experimental.md) are _supported_ by GitHub, allowing our users to rely on their continued existence and functionality in the future:
|
||||
|
||||
1. Once a query or library has appeared in a stable release, a one-year deprecation period is required before we can remove it. There can be exceptions to this when it's not technically possible to mark it as deprecated.
|
||||
2. Major changes to supported queries and libraries are always announced in the [change notes for stable releases](../change-notes/).
|
||||
3. We will do our best to address user reports of false positives or false negatives.
|
||||
|
||||
Because of these commitments, we set a high bar for accepting new supported queries. The requirements are detailed in the rest of this document.
|
||||
|
||||
## Steps for introducing a new supported query
|
||||
|
||||
The process must begin with the first step and must conclude with the final step. The remaining steps can be performed in any order.
|
||||
|
||||
1. **Have the query merged into the appropriate `experimental` subdirectory**
|
||||
|
||||
See [CONTRIBUTING.md](../CONTRIBUTING.md).
|
||||
|
||||
2. **Write a query help file**
|
||||
|
||||
Query help files explain the purpose of your query to other users. Write your query help in a `.qhelp` file and save it in the same directory as your query. For more information on writing query help, see the [Query help style guide](query-help-style-guide.md).
|
||||
|
||||
- Note, in particular, that almost all queries need to have a pair of "before" and "after" examples demonstrating the kind of problem the query identifies and how to fix it. Make sure that the examples are actually consistent with what the query does, for example by including them in your unit tests.
|
||||
- At the time of writing, there is no way of previewing help locally. Once you've opened a PR, a preview will be created as part of the CI checks. A GitHub employee will review this and let you know of any problems.
|
||||
|
||||
3. **Write unit tests**
|
||||
|
||||
Add one or more unit tests for the query (and for any library changes you make) to the `ql/<language>/ql/test/experimental` directory. Tests for library changes go into the `library-tests` subdirectory, and tests for queries go into `query-tests` with their relative path mirroring the query's location under `ql/<language>/ql/src/experimental`.
|
||||
|
||||
See the section on [Testing custom queries](https://help.semmle.com/codeql/codeql-cli/procedures/test-queries.html) in the [CodeQL documentation](https://help.semmle.com/codeql/) for more information.
|
||||
|
||||
4. **Test for correctness on real-world code**
|
||||
|
||||
Test the query on a number of large real-world projects to make sure it doesn't give too many false positive results. Adjust the `@precision` and `@problem.severity` attributes in accordance with the real-world results you observe. See the advice on query metadata below.
|
||||
|
||||
You can use the LGTM.com [query console](https://lgtm.com/query) to get an overview of true and false positive results on a large number of projects. The simplest way to do this is to:
|
||||
|
||||
1. [Create a list of prominent projects](https://lgtm.com/help/lgtm/managing-project-lists) on LGTM.
|
||||
2. In the query console, [run your query against your custom project list](https://lgtm.com/help/lgtm/using-query-console).
|
||||
3. Save links to your query console results and include them in discussions on issues and pull requests.
|
||||
|
||||
5. **Test and improve performance**
|
||||
|
||||
There must be a balance between the execution time of a query and the value of its results: queries that are highly valuable and broadly applicable can be allowed to take longer to run. In all cases, you need to address any easy-to-fix performance issues before the query is put into production.
|
||||
|
||||
QL performance profiling and tuning is an advanced topic, and some tasks will require assistance from GitHub employees. With that said, there are several things you can do.
|
||||
|
||||
- Understand [the evaluation model of QL](https://help.semmle.com/QL/ql-handbook/evaluation.html). It's more similar to SQL than to any mainstream programming language.
|
||||
- Most performance tuning in QL boils down to computing as few tuples (rows of data) as possible. As a mental model, think of predicate evaluation as enumerating all combinations of parameters that satisfy the predicate body. This includes the implicit parameters `this` and `result`.
|
||||
- The major libraries in CodeQL are _cached_ and will only be computed once for the entire suite of queries. The first query that needs a cached _stage_ will trigger its evaluation. This means that query authors should usually only look at the run time of the last stage of evaluation.
|
||||
- In [the settings for the VSCode extension](https://help.semmle.com/codeql/codeql-for-vscode/reference/settings.html), check the box "Running Queries: Debug" (`codeQL.runningQueries.debug`). Then find "CodeQL Query Server" in the VSCode Output panel (View -> Output) and capture the output when running the query. That output contains timing and tuple counts for all computed predicates.
|
||||
- To clear the entire cache, invoke "CodeQL: Clear Cache" from the VSCode command palette.
|
||||
|
||||
6. **Make sure your query has the correct metadata**
|
||||
|
||||
For the full reference on writing query metadata, see the [Query metadata style guide](query-metadata-style-guide.md). The following constitutes a checklist.
|
||||
|
||||
a. Each query needs a `@name`, a `@description`, and a `@kind`.
|
||||
|
||||
b. Alert queries also need a `@problem.severity` and a `@precision`.
|
||||
|
||||
- The severity is one of `error`, `warning`, or `recommendation`.
|
||||
- The precision is one of `very-high`, `high`, `medium` or `low`. It may take a few iterations to get this right.
|
||||
- Currently, LGTM runs all `error` or `warning` queries with a `very-high`, `high`, or `medium` precision. In addition, `recommendation` queries with `very-high` or `high` precision are run.
|
||||
- However, results from `error` and `warning` queries with `medium` precision, as well as `recommendation` queries with `high` precision, are not shown by default.
|
||||
|
||||
c. All queries need an `@id`.
|
||||
|
||||
- The ID should be consistent with the IDs of similar queries for other languages. For example, the C/C++ query that looks for comments containing the word "TODO" has the ID `cpp/todo-comment`, and its C# counterpart has the ID `cs/todo-comment`.
|
||||
|
||||
d. Provide one or more `@tags` describing the query.
|
||||
|
||||
- Tags are free-form, but we have some conventions, especially for tagging security queries with corresponding CWE numbers.
|
||||
|
||||
7. **Move your query out of `experimental`**
|
||||
|
||||
- The structure of an `experimental` subdirectory mirrors the structure of its parent directory, so this step may just be a matter of removing the `experimental/` prefix of the query and test paths. Be sure to also edit any references to the query path in tests.
|
||||
- Add the query to one of the legacy suite files in `ql/<language>/config/suites/<language>/` if it exists. Note that there are separate suite directories for C and C++, `c` and `cpp` respectively, and the query should be added to one or both as appropriate.
|
||||
- Add a release note to `change-notes/<next-version>/analysis-<language>.md`.
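
The metadata checklist in step 6 can be sketched as a short Python helper. This is a minimal illustration under stated assumptions, not an official tool: the function names `missing_properties` and `lgtm_status` are hypothetical, metadata is assumed to be already parsed into a dictionary keyed by property name without the `@`, and the label `"not listed"` simply means the combination is not mentioned in the LGTM rules quoted in step 6.

```python
from typing import Dict, List

SEVERITIES = {"error", "warning", "recommendation"}
PRECISIONS = {"very-high", "high", "medium", "low"}
ALERT_KINDS = {"problem", "path-problem"}


def missing_properties(metadata: Dict[str, str]) -> List[str]:
    """Return the checklist properties that are absent or empty."""
    # Items a, c and d of the checklist apply to every query.
    required = ["name", "description", "kind", "id", "tags"]
    # Item b: alert queries also need a severity and a precision.
    if metadata.get("kind") in ALERT_KINDS:
        required += ["problem.severity", "precision"]
    return [key for key in required if not metadata.get(key)]


def lgtm_status(severity: str, precision: str) -> str:
    """Classify a severity/precision pair using the rules quoted in step 6.

    Returns 'shown' (run and displayed by default), 'hidden' (run but not
    shown by default), or 'not listed' (not mentioned in those rules).
    """
    if severity not in SEVERITIES or precision not in PRECISIONS:
        raise ValueError(f"unknown severity or precision: {severity}, {precision}")
    if severity in ("error", "warning"):
        if precision in ("very-high", "high"):
            return "shown"
        if precision == "medium":
            return "hidden"
        return "not listed"
    # Remaining case: severity == "recommendation".
    if precision == "very-high":
        return "shown"
    if precision == "high":
        return "hidden"
    return "not listed"
```

A helper like this could be run over a query's metadata before moving it out of `experimental`, to catch a missing `@precision` or to see why results for a `medium` precision query do not appear by default.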
|
||||