mirror of
https://github.com/github/codeql.git
synced 2026-04-25 16:55:19 +02:00
Merge branch 'main' into MathiasVP-patch-1
@@ -7,8 +7,6 @@ CodeQL has a large selection of classes for representing the abstract syntax tre
|
||||
|
||||
.. include:: ../reusables/abstract-syntax-tree.rst
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
.. include:: ../reusables/kotlin-java-differences.rst
|
||||
|
||||
Statement classes
|
||||
@@ -385,4 +383,4 @@ Further reading
|
||||
.. _TypeLiteral: https://codeql.github.com/codeql-standard-libraries/java/semmle/code/java/Expr.qll/type.Expr$TypeLiteral.html
|
||||
.. _ClassInstanceExpr: https://codeql.github.com/codeql-standard-libraries/java/semmle/code/java/Expr.qll/type.Expr$ClassInstanceExpr.html
|
||||
.. _ArrayInit: https://codeql.github.com/codeql-standard-libraries/java/semmle/code/java/Expr.qll/type.Expr$ArrayInit.html
|
||||
.. _Annotation: https://codeql.github.com/codeql-standard-libraries/java/semmle/code/java/Annotation.qll/type.Annotation$Annotation.html
|
||||
.. _Annotation: https://codeql.github.com/codeql-standard-libraries/java/semmle/code/java/Annotation.qll/type.Annotation$Annotation.html
|
||||
|
||||
@@ -408,7 +408,7 @@ Exercise 4
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/cpp-further-reading.rst
|
||||
|
||||
@@ -380,7 +380,7 @@ Exercise 4
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/cpp-further-reading.rst
|
||||
|
||||
@@ -541,7 +541,7 @@ This can be adapted from the ``SystemUriFlow`` class:
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/csharp-further-reading.rst
|
||||
|
||||
@@ -3,9 +3,7 @@
|
||||
Analyzing data flow in Java and Kotlin
|
||||
======================================
|
||||
|
||||
You can use CodeQL to track the flow of data through a Java/Kotlin program to its use.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
You can use CodeQL to track the flow of data through a Java/Kotlin program to its use.
|
||||
|
||||
.. include:: ../reusables/kotlin-java-differences.rst
|
||||
|
||||
@@ -171,7 +169,7 @@ Global data flow tracks data flow throughout the entire program, and is therefor
|
||||
.. pull-quote:: Note
|
||||
|
||||
.. include:: ../reusables/path-problem.rst
|
||||
|
||||
|
||||
Using global data flow
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
@@ -362,7 +360,7 @@ Exercise 4
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/java-further-reading.rst
|
||||
|
||||
@@ -16,7 +16,7 @@ For a more general introduction to modeling data flow, see ":ref:`About data flo
|
||||
Data flow nodes
|
||||
---------------
|
||||
|
||||
Both local and global data flow, as well as taint tracking, work on a representation of the program known as the :ref:`data flow graph <data-flow-graph>`.
|
||||
Both local and global data flow, as well as taint tracking, work on a representation of the program known as the :ref:`data flow graph <data-flow-graph>`.
|
||||
Nodes on the data flow graph may also correspond to nodes on the abstract syntax tree, but they are not the same.
|
||||
While AST nodes belong to class ``ASTNode`` and its subclasses, data flow nodes belong to class ``DataFlow::Node`` and its subclasses:
|
||||
|
||||
@@ -557,8 +557,8 @@ Exercise 4
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/java-further-reading.rst
|
||||
.. include:: ../reusables/codeql-ref-tools-further-reading.rst
|
||||
.. include:: ../reusables/codeql-ref-tools-further-reading.rst
|
||||
|
||||
@@ -359,7 +359,7 @@ This data flow configuration tracks data flow from environment variables to open
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/python-further-reading.rst
|
||||
|
||||
@@ -111,7 +111,7 @@ This query finds the filename argument passed in each call to ``File.open``:
|
||||
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call
|
||||
where call = API::getTopLevelMember("File").getAMethodCall("open")
|
||||
select call.getArgument(0)
|
||||
@@ -126,7 +126,7 @@ So we use local data flow to find all expressions that flow into the argument:
|
||||
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call, DataFlow::ExprNode expr
|
||||
where
|
||||
call = API::getTopLevelMember("File").getAMethodCall("open") and
|
||||
@@ -143,7 +143,7 @@ We can update the query to specify that ``expr`` is an instance of a ``LocalSour
|
||||
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call, DataFlow::ExprNode expr
|
||||
where
|
||||
call = API::getTopLevelMember("File").getAMethodCall("open") and
|
||||
@@ -158,7 +158,7 @@ That would allow us to use the member predicate ``flowsTo`` on ``LocalSourceNode
|
||||
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call, DataFlow::ExprNode expr
|
||||
where
|
||||
call = API::getTopLevelMember("File").getAMethodCall("open") and
|
||||
@@ -171,7 +171,7 @@ As an alternative, we can ask more directly that ``expr`` is a local source of t
|
||||
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call, DataFlow::ExprNode expr
|
||||
where
|
||||
call = API::getTopLevelMember("File").getAMethodCall("open") and
|
||||
@@ -190,7 +190,7 @@ This query finds instances where a parameter is used as the name when opening a
|
||||
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call, DataFlow::ParameterNode p
|
||||
where
|
||||
call = API::getTopLevelMember("File").getAMethodCall("open") and
|
||||
@@ -206,7 +206,7 @@ This query finds calls to ``File.open`` where the file name is derived from a pa
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.TaintTracking
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
from DataFlow::CallNode call, DataFlow::ParameterNode p
|
||||
where
|
||||
call = API::getTopLevelMember("File").getAMethodCall("open") and
|
||||
@@ -327,17 +327,17 @@ The following global taint-tracking query finds path arguments in filesystem acc
|
||||
import codeql.ruby.TaintTracking
|
||||
import codeql.ruby.Concepts
|
||||
import codeql.ruby.dataflow.RemoteFlowSources
|
||||
|
||||
|
||||
module RemoteToFileConfiguration implements DataFlow::ConfigSig {
|
||||
predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }
|
||||
|
||||
|
||||
predicate isSink(DataFlow::Node sink) {
|
||||
sink = any(FileSystemAccess fa).getAPathArgument()
|
||||
}
|
||||
}
|
||||
|
||||
module RemoteToFileFlow = TaintTracking::Global<RemoteToFileConfiguration>;
|
||||
|
||||
|
||||
from DataFlow::Node input, DataFlow::Node fileAccess
|
||||
where RemoteToFileFlow::flow(input, fileAccess)
|
||||
select fileAccess, "This file access uses data from $@.", input, "user-controllable input."
|
||||
@@ -352,7 +352,7 @@ The following global data-flow query finds calls to ``File.open`` where the file
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.controlflow.CfgNodes
|
||||
import codeql.ruby.ApiGraphs
|
||||
|
||||
|
||||
module EnvironmentToFileConfiguration implements DataFlow::ConfigSig {
|
||||
predicate isSource(DataFlow::Node source) {
|
||||
exists(ExprNodes::ConstantReadAccessCfgNode env |
|
||||
@@ -367,7 +367,7 @@ The following global data-flow query finds calls to ``File.open`` where the file
|
||||
}
|
||||
|
||||
module EnvironmentToFileFlow = DataFlow::Global<EnvironmentToFileConfiguration>;
|
||||
|
||||
|
||||
from DataFlow::Node environment, DataFlow::Node fileOpen
|
||||
where EnvironmentToFileFlow::flow(environment, fileOpen)
|
||||
select fileOpen, "This call to 'File.open' uses data from $@.", environment,
|
||||
@@ -376,7 +376,7 @@ The following global data-flow query finds calls to ``File.open`` where the file
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/ruby-further-reading.rst
|
||||
|
||||
@@ -32,7 +32,7 @@ The ``Node`` class has a number of useful subclasses, such as ``ExprNode`` for e
|
||||
Expr asExpr() { ... }
|
||||
|
||||
/**
|
||||
* Gets the control flow node that corresponds to this data flow node.
|
||||
* Gets the control flow node that corresponds to this data flow node.
|
||||
*/
|
||||
ControlFlowNode getCfgNode() { ... }
|
||||
|
||||
@@ -282,7 +282,7 @@ The following global taint-tracking query finds places where a value from a remo
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/swift-further-reading.rst
|
||||
|
||||
@@ -5,8 +5,6 @@ Annotations in Java and Kotlin
|
||||
|
||||
CodeQL databases of Java/Kotlin projects contain information about all annotations attached to program elements.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
About working with annotations
|
||||
------------------------------
|
||||
|
||||
|
||||
@@ -21,6 +21,8 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
|
||||
using-range-analsis-in-cpp
|
||||
hash-consing-and-value-numbering
|
||||
advanced-dataflow-scenarios-cpp
|
||||
customizing-library-models-for-cpp
|
||||
|
||||
|
||||
|
||||
- :doc:`Basic query for C and C++ code <basic-query-for-cpp-code>`: Learn to write and run a simple CodeQL query.
|
||||
@@ -46,3 +48,5 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
|
||||
- :doc:`Hash consing and value numbering <hash-consing-and-value-numbering>`: You can use specialized CodeQL libraries to recognize expressions that are syntactically identical or compute the same value at runtime in C and C++ codebases.
|
||||
|
||||
- :doc:`Advanced C/C++ dataflow scenarios <advanced-dataflow-scenarios-cpp>`: You can track precise data flow in C and C++ codebases by distinguishing between a pointer and its indirection(s).
|
||||
|
||||
- :doc:`Customizing library models for C and C++ <customizing-library-models-for-cpp>`: You can model frameworks and libraries that your codebase depends on using data extensions and publish them as CodeQL model packs.
|
||||
|
||||
@@ -5,9 +5,6 @@ CodeQL for Java and Kotlin
|
||||
|
||||
Experiment and learn how to write effective and efficient queries for CodeQL databases generated from Java and Kotlin codebases.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
|
||||
.. pull-quote:: Enabling Kotlin support
|
||||
|
||||
CodeQL treats Java and Kotlin as parts of the same language, so to enable Kotlin support you should enable ``java-kotlin`` as a language.
|
||||
|
||||
@@ -15,6 +15,7 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
|
||||
functions-in-python
|
||||
expressions-and-statements-in-python
|
||||
analyzing-control-flow-in-python
|
||||
customizing-library-models-for-python
|
||||
|
||||
- :doc:`Basic query for Python code <basic-query-for-python-code>`: Learn to write and run a simple CodeQL query.
|
||||
|
||||
@@ -29,3 +30,5 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
|
||||
- :doc:`Expressions and statements in Python <expressions-and-statements-in-python>`: You can use syntactic classes from the CodeQL library to explore how Python expressions and statements are used in a codebase.
|
||||
|
||||
- :doc:`Analyzing control flow in Python <analyzing-control-flow-in-python>`: You can write CodeQL queries to explore the control-flow graph of a Python program, for example, to discover unreachable code or mutually exclusive blocks of code.
|
||||
|
||||
- :doc:`Customizing library models for Python <customizing-library-models-for-python>`: You can model frameworks and libraries that your codebase depends on using data extensions and publish them as CodeQL model packs.
|
||||
|
||||
@@ -5,8 +5,6 @@ CodeQL library for Java and Kotlin
|
||||
|
||||
When you're analyzing a Java/Kotlin program, you can make use of the large collection of classes in the CodeQL library for Java/Kotlin.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
About the CodeQL library for Java and Kotlin
|
||||
--------------------------------------------
|
||||
|
||||
|
||||
@@ -0,0 +1,184 @@
|
||||
.. _customizing-library-models-for-cpp:
|
||||
|
||||
Customizing library models for C and C++
|
||||
========================================
|
||||
|
||||
You can model the methods and callables that control data flow in any framework or library. This is especially useful for custom frameworks or niche libraries that are not supported by the standard CodeQL libraries.
|
||||
|
||||
.. include:: ../reusables/beta-note-customizing-library-models.rst
|
||||
|
||||
About this article
|
||||
------------------
|
||||
|
||||
This article contains reference material about how to define custom models for sources, sinks, and flow summaries for C and C++ dependencies in data extension files.
|
||||
|
||||
About data extensions
|
||||
---------------------
|
||||
|
||||
You can customize analysis by defining models (summaries, sinks, and sources) of your code's C and C++ dependencies in data extension files. Each model defines the behavior of one or more elements of your library or framework, such as callables. When you run dataflow analysis, these models expand the potential sources and sinks tracked by dataflow analysis and improve the precision of results.
|
||||
|
||||
Many of the security queries search for paths from a source of untrusted input to a sink that represents a vulnerability. This is known as taint tracking. Each source is a starting point for dataflow analysis to track tainted data and each sink is an end point.
|
||||
|
||||
Taint tracking queries also need to know how data can flow through elements that are not included in the source code. These are modeled as summaries. A summary model enables queries to synthesize the flow behavior through elements in dependency code that is not stored in your repository.
|
||||
|
||||
Syntax used to define an element in an extension file
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Each model of an element is defined using a data extension where each tuple constitutes a model.
|
||||
A data extension file to extend the standard CPP queries included with CodeQL is a YAML file with the form:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/cpp-all
         extensible: <name of extensible predicate>
       data:
         - <tuple1>
         - <tuple2>
         - ...
|
||||
|
||||
Each YAML file may contain one or more top-level extensions.
|
||||
|
||||
- ``addsTo`` defines the CodeQL pack name and extensible predicate that the extension is injected into.
|
||||
- ``data`` defines one or more rows of tuples that are injected as values into the extensible predicate. The number of columns and their types must match the definition of the extensible predicate.
|
||||
|
||||
Data extensions use union semantics, which means that the tuples of all extensions for a single extensible predicate are combined, duplicates are removed, and all of the remaining tuples are queryable by referencing the extensible predicate.
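The union semantics described above can be pictured with a small sketch. This is only an illustration in plain Python, not how CodeQL itself loads extensions: the dictionaries stand in for parsed YAML, ``combine_extensions`` is a hypothetical helper, and the ``read`` row is a made-up second tuple added purely to show the merge.

```python
# Illustrative only: mimics union semantics for data extension tuples.
# The dictionaries below stand in for parsed YAML extension entries.

def combine_extensions(extensions):
    """Group rows by (pack, extensible) and drop duplicates, keeping first-seen order."""
    combined = {}
    for ext in extensions:
        key = (ext["addsTo"]["pack"], ext["addsTo"]["extensible"])
        rows = combined.setdefault(key, [])
        for row in ext["data"]:
            row = tuple(row)
            if row not in rows:
                rows.append(row)
    return combined


file_a = {
    "addsTo": {"pack": "codeql/cpp-all", "extensible": "sourceModel"},
    "data": [
        ["boost::asio", "", False, "read_until", "", "", "Argument[*1]", "remote", "manual"],
    ],
}
file_b = {
    "addsTo": {"pack": "codeql/cpp-all", "extensible": "sourceModel"},
    "data": [
        # Same row as in file_a: it is kept only once after combining.
        ["boost::asio", "", False, "read_until", "", "", "Argument[*1]", "remote", "manual"],
        # Hypothetical extra row, present only to demonstrate the union.
        ["boost::asio", "", False, "read", "", "", "Argument[*1]", "remote", "manual"],
    ],
}

combined = combine_extensions([file_a, file_b])
for row in combined[("codeql/cpp-all", "sourceModel")]:
    print(row[3])  # prints "read_until", then "read"
```

Both files target the same extensible predicate, so their rows end up in one set and the repeated ``read_until`` tuple appears a single time.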
|
||||
|
||||
Publish data extension files in a CodeQL model pack to share
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You can group one or more data extension files into a CodeQL model pack and publish it to the GitHub Container Registry. This makes it easy for anyone to download the model pack and use it to extend their analysis. For more information, see `Creating a CodeQL model pack <https://docs.github.com/en/code-security/codeql-cli/using-the-advanced-functionality-of-the-codeql-cli/creating-and-working-with-codeql-packs#creating-a-codeql-model-pack>`__ and `Publishing and using CodeQL packs <https://docs.github.com/en/code-security/codeql-cli/using-the-advanced-functionality-of-the-codeql-cli/publishing-and-using-codeql-packs/>`__ in the CodeQL CLI documentation.
|
||||
|
||||
Extensible predicates used to create custom models in C and C++
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The CodeQL library for CPP analysis exposes the following extensible predicates:
|
||||
|
||||
- ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model sources of potentially tainted data. The ``kind`` of the sources defined using this predicate determines which threat model they are associated with. Different threat models can be used to customize the sources used in an analysis. For more information, see ":ref:`Threat models <threat-models-cpp>`."
|
||||
- ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)``. This is used to model sinks where tainted data may be used in a way that makes the code vulnerable.
|
||||
- ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to model flow through elements.
|
||||
|
||||
The extensible predicates are populated using the models defined in data extension files.
|
||||
|
||||
Example of custom model definitions
|
||||
------------------------------------
|
||||
|
||||
The examples in this section are taken from the standard CodeQL CPP query pack published by GitHub. They demonstrate how to add tuples to extend extensible predicates that are used by the standard queries.
|
||||
|
||||
Example: Taint source from the ``boost::asio`` namespace
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This example shows how the CPP query pack models the data that the ``read_until`` function writes to its second argument as a ``remote`` source.
|
||||
|
||||
.. code-block:: cpp

   boost::asio::read_until(socket, recv_buffer, '\0', error);
|
||||
|
||||
We need to add a tuple to the ``sourceModel``\(namespace, type, subtypes, name, signature, ext, output, kind, provenance) extensible predicate by updating a data extension file.
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/cpp-all
         extensible: sourceModel
       data:
         - ["boost::asio", "", False, "read_until", "", "", "Argument[*1]", "remote", "manual"]
|
||||
|
||||
Since we are adding a new source, we need to add a tuple to the ``sourceModel`` extensible predicate.
|
||||
The first five values identify the callable (in this case a free function) to be modeled as a source.
|
||||
|
||||
- The first value ``"boost::asio"`` is the namespace name.
|
||||
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
|
||||
- The third value ``False`` is a flag that indicates whether or not the source also applies to all overrides of the method. For a free function, this should be ``False``.
|
||||
- The fourth value ``"read_until"`` is the function name.
|
||||
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``read_until``.
|
||||
|
||||
The sixth value should be left empty and is out of scope for this documentation.
|
||||
The remaining values are used to define the output specification, the ``kind``, and the ``provenance`` (origin) of the source.
|
||||
|
||||
- The seventh value ``"Argument[*1]"`` is the output specification, which means in this case that the source is the first indirection (or pointed-to value, ``*``) of the second argument (``Argument[1]``) passed to the function.
|
||||
- The eighth value ``"remote"`` is the kind of the source. The source kind is used to define the threat model where the source is in scope. ``remote`` applies to many of the security related queries as it means a remote source of untrusted data. For more information, see ":ref:`Threat models <threat-models-cpp>`."
|
||||
- The ninth value ``"manual"`` is the provenance of the source, which is used to identify the origin of the source model.
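To make the ``Argument[*1]`` notation concrete, here is a minimal Python sketch that splits such a specification into an indirection count and a zero-based argument index. The helper ``parse_argument_spec`` is hypothetical and handles only this one shape of specification; it is not the parser CodeQL uses, and other specification forms are not covered.

```python
import re


def parse_argument_spec(spec):
    """Return (indirections, index) for specifications shaped like 'Argument[*1]'.

    Each leading '*' counts as one level of indirection (pointed-to value),
    and the number is the zero-based position of the argument.
    """
    match = re.fullmatch(r"Argument\[(\**)(\d+)\]", spec)
    if match is None:
        raise ValueError(f"unsupported specification: {spec!r}")
    stars, index = match.groups()
    return len(stars), int(index)


print(parse_argument_spec("Argument[*1]"))  # (1, 1): first indirection of the second argument
print(parse_argument_spec("Argument[0]"))   # (0, 0): the first argument itself
```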
|
||||
|
||||
Example: Taint sink in the ``boost::asio`` namespace
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This example shows how the CPP query pack models the second argument of the ``boost::asio::write`` function as a remote flow sink. A remote flow sink is where data is transmitted to other machines across a network. It is used, for example, by the "Cleartext transmission of sensitive information" (``cpp/cleartext-transmission``) query.
|
||||
|
||||
.. code-block:: cpp

   boost::asio::write(socket, send_buffer, error);
|
||||
|
||||
We need to add a tuple to the ``sinkModel``\(namespace, type, subtypes, name, signature, ext, input, kind, provenance) extensible predicate by updating a data extension file.
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/cpp-all
         extensible: sinkModel
       data:
         - ["boost::asio", "", False, "write", "", "", "Argument[*1]", "remote-sink", "manual"]
|
||||
|
||||
Since we want to add a new sink, we need to add a tuple to the ``sinkModel`` extensible predicate.
|
||||
The first five values identify the callable (in this case a free function) to be modeled as a sink.
|
||||
|
||||
- The first value ``"boost::asio"`` is the namespace name.
|
||||
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
|
||||
- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
|
||||
- The fourth value ``"write"`` is the function name.
|
||||
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``write``.
|
||||
|
||||
The sixth value should be left empty and is out of scope for this documentation.
|
||||
The remaining values are used to define the input specification, the ``kind``, and the ``provenance`` (origin) of the sink.
|
||||
|
||||
- The seventh value ``"Argument[*1]"`` is the input specification, which means in this case that the sink is the first indirection (or pointed-to value, ``*``) of the second argument (``Argument[1]``) passed to the function.
|
||||
- The eighth value ``"remote-sink"`` is the kind of the sink. The sink kind is used to define the queries where the sink is in scope.
|
||||
- The ninth value ``"manual"`` is the provenance of the sink, which is used to identify the origin of the sink model.
|
||||
|
||||
Example: Add flow through the ``boost::asio::buffer`` method
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This example shows how the CPP query pack models flow through a function for a simple case.
|
||||
|
||||
.. code-block:: cpp

   boost::asio::write(socket, boost::asio::buffer(send_str), error);
|
||||
|
||||
We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance) extensible predicate by updating a data extension file:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/cpp-all
         extensible: summaryModel
       data:
         - ["boost::asio", "", False, "buffer", "", "", "Argument[*0]", "ReturnValue", "taint", "manual"]
|
||||
|
||||
Since we are adding flow through a function, we need to add tuples to the ``summaryModel`` extensible predicate.
|
||||
|
||||
The first five values identify the callable (in this case a free function) to be modeled as a summary.
|
||||
|
||||
- The first value ``"boost::asio"`` is the namespace name.
|
||||
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
|
||||
- The third value ``False`` is a flag that indicates whether or not the summary also applies to all overrides of the method. For a free function, this should be ``False``.
|
||||
- The fourth value ``"buffer"`` is the function name.
|
||||
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``buffer``.
|
||||
|
||||
The sixth value should be left empty and is out of scope for this documentation.
|
||||
The remaining values are used to define the input and output specifications, the ``kind``, and the ``provenance`` (origin) of the summary.
|
||||
|
||||
- The seventh value is the input specification (where data flows from). ``Argument[*0]`` specifies the first indirection (or pointed-to value, ``*``) of the first argument (``Argument[0]``) passed to the function.
|
||||
- The eighth value ``"ReturnValue"`` is the output specification (where data flows to), in this case the return value.
|
||||
- The ninth value ``"taint"`` is the kind of the flow. ``taint`` means that taint is propagated through the call.
|
||||
- The tenth value ``"manual"`` is the provenance of the summary, which is used to identify the origin of the summary model.
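As a quick way to check a summary tuple against the column order described above, the following Python sketch pairs each value with its column name. The column names come from the ``summaryModel`` signature in this article; the helper ``label_summary_row`` is hypothetical and exists only for illustration.

```python
# Column order taken from the summaryModel signature shown in this article.
SUMMARY_COLUMNS = (
    "namespace", "type", "subtypes", "name", "signature",
    "ext", "input", "output", "kind", "provenance",
)


def label_summary_row(row):
    """Pair each value of a summaryModel tuple with its column name."""
    if len(row) != len(SUMMARY_COLUMNS):
        raise ValueError(
            f"expected {len(SUMMARY_COLUMNS)} columns, got {len(row)}"
        )
    return dict(zip(SUMMARY_COLUMNS, row))


row = ["boost::asio", "", False, "buffer", "", "", "Argument[*0]", "ReturnValue", "taint", "manual"]
labeled = label_summary_row(row)
print(labeled["input"], "->", labeled["output"])  # Argument[*0] -> ReturnValue
```

A row with a missing or extra value raises ``ValueError``, which mirrors the requirement that the number of columns must match the definition of the extensible predicate.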
|
||||
|
||||
.. _threat-models-cpp:
|
||||
|
||||
Threat models
|
||||
-------------
|
||||
|
||||
.. include:: ../reusables/threat-model-description.rst
|
||||
@@ -5,8 +5,6 @@ Customizing library models for Java and Kotlin
|
||||
|
||||
You can model the methods and callables that control data flow in any framework or library. This is especially useful for custom frameworks or niche libraries that are not supported by the standard CodeQL libraries.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
.. include:: ../reusables/beta-note-customizing-library-models.rst
|
||||
|
||||
About this article
|
||||
@@ -16,7 +14,8 @@ This article contains reference material about how to define custom models for s
|
||||
|
||||
The best way to create your own models is using the CodeQL model editor in the CodeQL extension for Visual Studio Code. The model editor automatically guides you through the process of defining models, displaying the properties you need to define and the options available. You can save the resulting models as data extension files in CodeQL model packs and use them without worrying about the syntax.
|
||||
|
||||
For more information, see ":ref:`Using the CodeQL model editor <using-the-codeql-model-editor>`."
|
||||
For more information, see `Using the CodeQL model editor <https://docs.github.com/en/code-security/codeql-for-vs-code/using-the-advanced-functionality-of-the-codeql-for-vs-code-extension/using-the-codeql-model-editor>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
About data extensions
|
||||
---------------------
|
||||
|
||||
@@ -0,0 +1,451 @@
|
||||
.. _customizing-library-models-for-python:
|
||||
|
||||
Customizing library models for Python
|
||||
=========================================
|
||||
|
||||
.. include:: ../reusables/beta-note-customizing-library-models.rst
|
||||
|
||||
Python analysis can be customized by adding library models in data extension files.
|
||||
|
||||
A data extension for Python is a YAML file of the form:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: <name of extensible predicate>
       data:
         - <tuple1>
         - <tuple2>
         - ...
|
||||
|
||||
The CodeQL library for Python exposes the following extensible predicates:
|
||||
|
||||
- **sourceModel**\(type, path, kind)
|
||||
- **sinkModel**\(type, path, kind)
|
||||
- **typeModel**\(type1, type2, path)
|
||||
- **summaryModel**\(type, path, input, output, kind)
|
||||
|
||||
We'll explain how to use these using a few examples, and provide some reference material at the end of this article.
|
||||
|
||||
Example: Taint sink in the 'fabric' package
|
||||
-------------------------------------------
|
||||
|
||||
In this example, we'll show how to add the following argument, passed to **sudo** from the **fabric** package, as a command-line injection sink:
|
||||
|
||||
.. code-block:: python

   from fabric.operations import sudo
   sudo(cmd) # <-- add 'cmd' as a taint sink
|
||||
|
||||
Note that this sink is already recognized by the CodeQL Python analysis, but for this example, you could use the following data extension:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: sinkModel
       data:
         - ["fabric", "Member[operations].Member[sudo].Argument[0]", "command-injection"]
|
||||
|
||||
|
||||
- Since we're adding a new sink, we add a tuple to the **sinkModel** extensible predicate.
|
||||
- The first column, **"fabric"**, identifies a set of values from which to begin the search for the sink.
|
||||
The string **"fabric"** means we start at the places where the codebase imports the package **fabric**.
|
||||
- The second column is an access path that is evaluated from left to right, starting at the values that were identified by the first column.
|
||||
|
||||
- **Member[operations]** selects accesses to the **operations** module.
|
||||
- **Member[sudo]** selects accesses to the **sudo** function in the **operations** module.
|
||||
- **Argument[0]** selects the first argument to calls to that function.
|
||||
|
||||
- **"command-injection"** indicates that this is considered a sink for the command injection query.
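The tuple above can also be assembled programmatically. The following is a minimal sketch, assuming you want to check column counts before writing a file: the helper name ``make_data_extension`` is hypothetical and not part of CodeQL, and the output is JSON text, emitted on the assumption that it will be read by a YAML parser (YAML 1.2 accepts JSON syntax).

```python
import json

# Column counts taken from the extensible predicate signatures listed earlier.
COLUMN_COUNTS = {
    "sourceModel": 3,   # type, path, kind
    "sinkModel": 3,     # type, path, kind
    "typeModel": 3,     # type1, type2, path
    "summaryModel": 5,  # type, path, input, output, kind
}


def make_data_extension(extensible, rows, pack="codeql/python-all"):
    """Build the data-extension structure for one extensible predicate.

    Hypothetical helper for illustration only; it is not part of CodeQL.
    """
    expected = COLUMN_COUNTS[extensible]
    for row in rows:
        if len(row) != expected:
            raise ValueError(
                f"{extensible} expects {expected} columns, got {len(row)}: {row!r}"
            )
    return {
        "extensions": [
            {
                "addsTo": {"pack": pack, "extensible": extensible},
                "data": [list(row) for row in rows],
            }
        ]
    }


extension = make_data_extension(
    "sinkModel",
    [("fabric", "Member[operations].Member[sudo].Argument[0]", "command-injection")],
)
# JSON text is printed here on the assumption that it will be read as YAML.
print(json.dumps(extension, indent=2))
```

Passing a tuple with the wrong number of columns, for example three columns for **summaryModel**, raises a ``ValueError`` instead of producing a malformed file.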
|
||||
|
||||
Example: Taint sink in the 'invoke' package
|
||||
-------------------------------------------
|
||||
|
||||
Often sinks are found as arguments to methods rather than functions. In this example, we'll show how to add the following argument, passed to **run** from the **invoke** package, as a command-line injection sink:
|
||||
|
||||
.. code-block:: python

   import invoke
   c = invoke.Context()
   c.run(cmd)  # <-- add 'cmd' as a taint sink
|
||||
|
||||
Note that this sink is already recognized by the CodeQL Python analysis, but for this example, you could use the following data extension:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: sinkModel
       data:
         - ["invoke", "Member[Context].Instance.Member[run].Argument[0]", "command-injection"]
|
||||
|
||||
- The first column, **"invoke"**, begins the search at places where the codebase imports the package **invoke**.
|
||||
- The second column is an access path that is evaluated from left to right, starting at the values that were identified by the first column.
|
||||
|
||||
- **Member[Context]** selects accesses to the **Context** class.
|
||||
- **Instance** selects instances of the **Context** class.
|
||||
- **Member[run]** selects accesses to the **run** method in the **Context** class.
|
||||
- **Argument[0]** selects the first argument to calls to that method.
|
||||
|
||||
- **"command-injection"** indicates that this is considered a sink for the command injection query.
|
||||
|
||||
Note that the **Instance** component is used to select instances of a class, including instances of its subclasses.
|
||||
Since methods on instances are common targets, we have a more compact syntax for selecting them. The first column, the type, is allowed to contain a dotted path ending in a class name.
|
||||
This will begin the search at instances of that class. Using this syntax, the previous example could be written as:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: sinkModel
       data:
         - ["invoke.Context", "Member[run].Argument[0]", "command-injection"]
|
||||
|
||||
Continued example: Multiple ways to obtain a type
|
||||
-------------------------------------------------
|
||||
|
||||
The invoke package provides multiple ways to obtain a **Context** instance. The following example shows how to add a new way to obtain a **Context** instance:
|
||||
|
||||
.. code-block:: python

   from invoke import context
   c = context.Context()
   c.run(cmd)  # <-- add 'cmd' as a taint sink
|
||||
|
||||
Compared to the previous Python snippet, the **Context** class is now found as **invoke.context.Context** instead of **invoke.Context**.
|
||||
We could add a data extension similar to the previous one, but with the type **invoke.context.Context**. However, we can also use the **typeModel** extensible predicate to describe how to reach **invoke.Context** from **invoke.context.Context**:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: typeModel
       data:
         - ["invoke.Context", "invoke.context.Context", ""]
|
||||
|
||||
- The first column, **"invoke.Context"**, is the name of the type to reach.
|
||||
- The second column, **"invoke.context.Context"**, is the name of the type from which to evaluate the path.
|
||||
- The third column is just an empty string, indicating that any instance of **invoke.context.Context** is also an instance of **invoke.Context**.
|
||||
|
||||
Combining this with the sink model we added earlier, the sink in the example is detected by the model.
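One way to picture the effect of the empty path is as an "is also a" relation between type names. The sketch below is a toy model of that reading, not how the CodeQL library evaluates type models; the function name ``types_flowing_into`` is hypothetical.

```python
def types_flowing_into(target, type_model_rows):
    """Return every type whose values also count as values of `target`.

    Only rows with an empty path are followed, since those state that any
    value of type2 is also a value of type1. Toy model for illustration.
    """
    reached = {target}
    worklist = [target]
    while worklist:
        current = worklist.pop()
        for type1, type2, path in type_model_rows:
            if type1 == current and path == "" and type2 not in reached:
                reached.add(type2)
                worklist.append(type2)
    return reached


rows = [("invoke.Context", "invoke.context.Context", "")]
print(sorted(types_flowing_into("invoke.Context", rows)))
```

Under this reading, a sink declared on **invoke.Context** also applies to values typed as **invoke.context.Context**, but not the other way around.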
|
||||
|
||||
Example: Taint sources from Django 'upload_to' argument
|
||||
-------------------------------------------------------
|
||||
|
||||
This example is a bit more advanced, involving both a callback function and a class constructor.
|
||||
The Django web framework allows you to specify a function that determines the path where uploaded files are stored (see the `Django documentation <https://docs.djangoproject.com/en/5.0/ref/models/fields/#django.db.models.FileField.upload_to>`_).
|
||||
This function is passed as an argument to the **FileField** constructor.
|
||||
The function is called with two arguments: the instance of the model and the filename of the uploaded file.
|
||||
This filename is what we want to mark as a taint source. An example use looks as follows:
|
||||
|
||||
.. code-block:: python

   from django.db import models

   def user_directory_path(instance, filename):  # <-- add 'filename' as a taint source
       # file will be uploaded to MEDIA_ROOT/user_<id>/<filename>
       return "user_{0}/{1}".format(instance.user.id, filename)

   class MyModel(models.Model):
       upload = models.FileField(upload_to=user_directory_path)  # <-- the 'upload_to' parameter defines our custom function
|
||||
|
||||
Note that this source is already known by the CodeQL Python analysis, but for this example, you could use the following data extension:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: sourceModel
       data:
         - [
             "django.db.models.FileField!",
             "Call.Argument[0,upload_to:].Parameter[1]",
             "remote",
           ]
|
||||
|
||||
|
||||
- Since we're adding a new taint source, we add a tuple to the **sourceModel** extensible predicate.
|
||||
- The first column, **"django.db.models.FileField!"**, is a dotted path to the **FileField** class from the **django.db.models** package.
|
||||
The **!** at the end of the type name indicates that we are looking for the class itself rather than instances of this class.
|
||||
|
||||
- The second column is an access path that is evaluated from left to right, starting at the values that were identified by the first column.
|
||||
|
||||
- **Call** selects calls to the class. That is, constructor calls.
|
||||
- **Argument[0,upload_to:]** selects the first positional argument, or the named argument named **upload_to**. Note that the colon at the end of the argument name indicates that we are looking for a named argument.
|
||||
- **Parameter[1]** selects the second parameter of the callback function, which is the parameter receiving the filename.
|
||||
|
||||
- Finally, the kind **"remote"** indicates that this is considered a source of remote flow.
|
||||
|
||||
Example: Adding flow through 're.compile'
|
||||
----------------------------------------------
|
||||
|
||||
In this example, we'll show how to add flow through calls to ``re.compile``.
|
||||
``re.compile`` returns a compiled regular expression for efficient evaluation, but the pattern to be compiled is stored in the ``pattern`` attribute of the resulting object.
|
||||
|
||||
.. code-block:: python

   import re

   y = re.compile(pattern=x)  # add value flow from 'x' to 'y.pattern'
|
||||
|
||||
Note that this flow is already recognized by the CodeQL Python analysis, but for this example, you could use the following data extension:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: summaryModel
       data:
         - [
             "re",
             "Member[compile]",
             "Argument[0,pattern:]",
             "ReturnValue.Attribute[pattern]",
             "value",
           ]
|
||||
|
||||
|
||||
- Since we're adding flow through a function call, we add a tuple to the **summaryModel** extensible predicate.
|
||||
- The first column, **"re"**, begins the search for relevant calls at places where the **re** package is imported.
|
||||
- The second column, **"Member[compile]"**, is a path leading to the function calls we wish to model.
|
||||
In this case, we select references to the **compile** function from the ``re`` package.
|
||||
- The third column, **"Argument[0,pattern:]"**, indicates the input of the flow. In this case, either the first argument to the function call or the argument named **pattern**.
|
||||
- The fourth column, **"ReturnValue.Attribute[pattern]"**, indicates the output of the flow. In this case, the ``pattern`` attribute of the return value of the function call.
|
||||
- The last column, **"value"**, indicates the kind of flow to add. The value **value** means the input value is unchanged as
|
||||
it flows to the output.
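The runtime behavior that this summary describes can be checked directly in Python. This is a small sketch using only the standard library; the sample pattern is arbitrary.

```python
import re

x = r"\d+-\d+"
y = re.compile(pattern=x)

# The compiled object keeps the original pattern string unchanged,
# which is the value-preserving step the summary describes.
assert y.pattern == x
print(y.pattern)
```

Because the string stored in ``y.pattern`` is equal to ``x``, **value** is the appropriate kind here rather than **taint**.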
|
||||
|
||||
Example: Adding flow through 'sorted'
|
||||
-------------------------------------------------
|
||||
|
||||
In this example, we'll show how to add flow through calls to the built-in function **sorted**:
|
||||
|
||||
.. code-block:: python

   y = sorted(x)  # add taint flow from 'x' to 'y'
|
||||
|
||||
Note that this flow is already recognized by the CodeQL Python analysis, but for this example, you could use the following data extension:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: summaryModel
       data:
         - [
             "builtins",
             "Member[sorted]",
             "Argument[0]",
             "ReturnValue",
             "taint",
           ]
|
||||
|
||||
|
||||
- Since we're adding flow through a function call, we add a tuple to the **summaryModel** extensible predicate.
|
||||
- The first column, **"builtins"**, begins the search for relevant calls among references to the built-in names.
|
||||
In Python, many built-in functions are available. Technically, most of these are part of the **builtins** package, but they can be accessed without an explicit import. When we write **builtins** in the first column, we will find both the implicit and explicit references to the built-in functions.
|
||||
- The second column, **"Member[sorted]"**, selects references to the **sorted** function from the **builtins** package; that is, the built-in function **sorted**.
|
||||
- The third column, **"Argument[0]"**, indicates the input of the flow. In this case, the first argument to the function call.
|
||||
- The fourth column, **"ReturnValue"**, indicates the output of the flow. In this case, the return value of the function call.
|
||||
- The last column, **"taint"**, indicates the kind of flow to add. The value **taint** means the output is not necessarily equal
|
||||
to the input, but was derived from the input in a taint-preserving way.
|
||||
|
||||
We might also provide a summary stating that the elements of the input list are preserved in the output list:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: summaryModel
       data:
         - [
             "builtins",
             "Member[sorted]",
             "Argument[0].ListElement",
             "ReturnValue.ListElement",
             "value",
           ]
|
||||
|
||||
The tracking of list elements is imprecise in that the analysis does not know where in the list the tracked value is found.
|
||||
So this summary simply states that if the value is found somewhere in the input list, it will also be found somewhere in the output list, unchanged.
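The following short sketch shows the Python behavior behind both summaries: the result is a new list, the element itself is carried over unchanged, and its position is not preserved. The sample values are arbitrary.

```python
tainted = "zz-user-input"
x = ["b", tainted, "a"]
y = sorted(x)

assert y is not x                            # sorted() builds a new list
assert any(item is tainted for item in y)    # the same element is still present
assert x.index(tainted) != y.index(tainted)  # but its position has changed
print(y)
```

The first assertion is why the list-level summary uses **taint**, and the second is why the element-level summary can use **value**.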
|
||||
|
||||
Reference material
|
||||
------------------
|
||||
|
||||
The following sections provide reference material for extensible predicates, access paths, types, and kinds.
|
||||
|
||||
Extensible predicates
|
||||
---------------------
|
||||
|
||||
sourceModel(type, path, kind)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Adds a new taint source. Most taint-tracking queries will use the new source.
|
||||
|
||||
- **type**: Name of a type from which to evaluate **path**.
|
||||
- **path**: Access path leading to the source.
|
||||
- **kind**: Kind of source to add. Currently only **remote** is used.
|
||||
|
||||
Example:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: sourceModel
       data:
         - ["flask", "Member[request]", "remote"]
|
||||
|
||||
sinkModel(type, path, kind)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Adds a new taint sink. Sinks are query-specific and will typically affect one or two queries.
|
||||
|
||||
- **type**: Name of a type from which to evaluate **path**.
|
||||
- **path**: Access path leading to the sink.
|
||||
- **kind**: Kind of sink to add. See the section on sink kinds for a list of supported kinds.
|
||||
|
||||
Example:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: sinkModel
       data:
         - ["builtins", "Member[exec].Argument[0]", "code-injection"]
|
||||
|
||||
summaryModel(type, path, input, output, kind)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Adds flow through a function call.
|
||||
|
||||
- **type**: Name of a type from which to evaluate **path**.
|
||||
- **path**: Access path leading to a function call.
|
||||
- **input**: Path relative to the function call that leads to input of the flow.
|
||||
- **output**: Path relative to the function call leading to the output of the flow.
|
||||
- **kind**: Kind of summary to add. Can be **taint** for taint-propagating flow, or **value** for value-preserving flow.
|
||||
|
||||
Example:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: summaryModel
       data:
         - [
             "builtins",
             "Member[reversed]",
             "Argument[0]",
             "ReturnValue",
             "taint",
           ]
|
||||
|
||||
typeModel(type1, type2, path)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
A description of how to reach **type1** from **type2**.
|
||||
If this is the only way to reach **type1**, for instance if **type1** is a name we made up to represent the inner workings of a library, we think of this as a definition of **type1**.
|
||||
In the context of instances, this describes how to obtain an instance of **type1** from an instance of **type2**.
|
||||
|
||||
- **type1**: Name of the type to reach.
|
||||
- **type2**: Name of the type from which to evaluate **path**.
|
||||
- **path**: Access path leading from **type2** to **type1**.
|
||||
|
||||
Example:
|
||||
|
||||
.. code-block:: yaml

   extensions:
     - addsTo:
         pack: codeql/python-all
         extensible: typeModel
       data:
         - [
             "flask.Response",
             "flask",
             "Member[jsonify].ReturnValue",
           ]
|
||||
|
||||
Types
|
||||
-----
|
||||
|
||||
A type is a string that identifies a set of values.
|
||||
In each of the extensible predicates mentioned in the previous section, the first column is always the name of a type.
|
||||
A type can be defined by adding **typeModel** tuples for that type. Additionally, the following built-in types are available:
|
||||
|
||||
- The name of a package matches imports of that package. For example, the type **django** matches the expression **import django**.
|
||||
- The type **builtins** identifies the builtins package. In Python, all built-in values are found in this package, so they can be identified using this type.
|
||||
- A dotted path ending in a class name identifies instances of that class. If the suffix **!** is added, the type refers to the class itself.
|
||||
|
||||
Access paths
|
||||
------------
|
||||
|
||||
The **path**, **input**, and **output** columns consist of a **.**-separated list of components, which is evaluated from left to right, with each step selecting a new set of values derived from the previous set of values.
|
||||
|
||||
The following components are supported:
|
||||
|
||||
- **Argument[**\ ``number``\ **]** selects the argument at the given index.
|
||||
- **Argument[**\ ``name``:\ **]** selects the argument with the given name.
|
||||
- **Argument[this]** selects the receiver of a method call.
|
||||
- **Parameter[**\ ``number``\ **]** selects the parameter at the given index.
|
||||
- **Parameter[**\ ``name``:\ **]** selects the named parameter with the given name.
|
||||
- **Parameter[this]** selects the **this** parameter of a function.
|
||||
- **ReturnValue** selects the return value of a function or call.
|
||||
- **Member[**\ ``name``\ **]** selects the function/method/class/value with the given name.
|
||||
- **Instance** selects instances of a class, including instances of its subclasses.
|
||||
- **Attribute[**\ ``name``\ **]** selects the attribute with the given name.
|
||||
- **ListElement** selects an element of a list.
|
||||
- **SetElement** selects an element of a set.
|
||||
- **TupleElement[**\ ``number``\ **]** selects the subscript at the given index.
|
||||
- **DictionaryElement[**\ ``name``\ **]** selects the subscript at the given name.
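To make the left-to-right structure of an access path concrete, here is a toy tokenizer that splits a path into its components and their operands. It is only a sketch of the syntax described above, not the parser CodeQL uses, and the name ``parse_access_path`` is hypothetical. Bracketed operands are matched as a unit so that characters inside the brackets are never treated as component separators.

```python
import re

# One component: a name, optionally followed by bracketed operands.
_COMPONENT = re.compile(r"([A-Za-z]+)(?:\[([^\]]*)\])?")


def parse_access_path(path):
    """Split an access path into (name, operands) pairs, left to right.

    Toy tokenizer for illustration; it does not validate component names.
    """
    components = []
    pos = 0
    while pos < len(path):
        match = _COMPONENT.match(path, pos)
        if match is None:
            raise ValueError(f"unexpected text at offset {pos} in {path!r}")
        name, operands = match.groups()
        components.append((name, operands.split(",") if operands is not None else []))
        pos = match.end()
        if pos < len(path):
            if path[pos] != ".":
                raise ValueError(f"expected '.' at offset {pos} in {path!r}")
            pos += 1  # skip the separator between components
    return components


for name, operands in parse_access_path("Member[Context].Instance.Member[run].Argument[0]"):
    print(name, operands)
```

An empty path, as used in the **typeModel** example earlier, yields an empty list of components.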
|
||||
|
||||
|
||||
Additional notes about the syntax of operands:
|
||||
|
||||
- Multiple operands may be given to a single component, as a shorthand for the union of the operands. For example, **Member[foo,bar]** matches the union of **Member[foo]** and **Member[bar]**.
|
||||
- Numeric operands to **Argument**, **Parameter**, and **WithArity** may be given as an interval. For example, **Argument[0..2]** matches argument 0, 1, or 2.
|
||||
- **Argument[N-1]** selects the last argument of a call, and **Parameter[N-1]** selects the last parameter of a function, with **N-2** being the second-to-last and so on.
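The interval and **N-1** notations can be read as shorthand for concrete indices once the number of arguments or parameters is known. The sketch below illustrates that reading under the assumption that intervals are inclusive at both ends, as in the **Argument[0..2]** example; the function name ``expand_index_operand`` is hypothetical and this is not CodeQL's implementation.

```python
def expand_index_operand(operand, arity):
    """Expand a numeric operand into the concrete indices it selects.

    `arity` is the number of arguments (or parameters) of the call or
    function being matched. Toy model for illustration only.
    """
    def resolve(token):
        token = token.strip()
        if token.startswith("N-"):
            # N-1 is the last position, N-2 the second-to-last, and so on.
            return arity - int(token[2:])
        return int(token)

    if ".." in operand:
        low, high = operand.split("..", 1)
        return list(range(resolve(low), resolve(high) + 1))
    return [resolve(operand)]


print(expand_index_operand("0..2", arity=5))
print(expand_index_operand("N-1", arity=3))
```

For a call with three arguments, **N-1** resolves to index 2 and **N-2** to index 1.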
|
||||
|
||||
Kinds
|
||||
-----
|
||||
|
||||
Source kinds
|
||||
~~~~~~~~~~~~
|
||||
|
||||
- **remote**: A generic source of remote flow. Most taint-tracking queries will use such a source. Currently this is the only supported source kind.
|
||||
|
||||
Sink kinds
|
||||
~~~~~~~~~~
|
||||
|
||||
Unlike sources, sinks tend to be highly query-specific, rarely affecting more than one or two queries. Not every query supports customizable sinks. If the following sinks are not suitable for your use case, you should add a new query.
|
||||
|
||||
- **code-injection**: A sink that can be used to inject code, such as in calls to **exec**.
|
||||
- **command-injection**: A sink that can be used to inject shell commands, such as in calls to **os.system**.
|
||||
- **path-injection**: A sink that can be used for path injection in a file system access, such as in calls to **flask.send_from_directory**.
|
||||
- **sql-injection**: A sink that can be used for SQL injection, such as in a MySQL **query** call.
|
||||
- **html-injection**: A sink that can be used for HTML injection, such as a server response body.
|
||||
- **js-injection**: A sink that can be used for JS injection, such as a server response body.
|
||||
- **url-redirection**: A sink that can be used to redirect the user to a malicious URL.
|
||||
- **unsafe-deserialization**: A deserialization sink that can lead to code execution or other unsafe behavior, such as an unsafe YAML parser.
|
||||
- **log-injection**: A sink that can be used for log injection, such as in a **logging.info** call.
|
||||
|
||||
Summary kinds
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
- **taint**: A summary that propagates taint. This means the output is not necessarily equal to the input, but it was derived from the input in an unrestrictive way. An attacker who controls the input will have significant control over the output as well.
|
||||
- **value**: A summary that preserves the value of the input or creates a copy of the input such that all of its object properties are preserved.
|
||||
@@ -254,8 +254,8 @@ Troubleshooting
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/javascript-further-reading.rst
|
||||
.. include:: ../reusables/codeql-ref-tools-further-reading.rst
|
||||
.. include:: ../reusables/codeql-ref-tools-further-reading.rst
|
||||
|
||||
@@ -5,8 +5,6 @@ Overflow-prone comparisons in Java and Kotlin
|
||||
|
||||
You can use CodeQL to check for comparisons in Java/Kotlin code where one side of the comparison is prone to overflow.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
About this article
|
||||
------------------
|
||||
|
||||
|
||||
@@ -5,8 +5,6 @@ Types in Java and Kotlin
|
||||
|
||||
You can use CodeQL to find out information about data types used in Java/Kotlin code. This allows you to write queries to identify specific type-related issues.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
About working with Java types
|
||||
-----------------------------
|
||||
|
||||
|
||||
@@ -405,7 +405,7 @@ string may be an absolute path and whether it may contain ``..`` components.
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
|
||||
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
|
||||
|
||||
|
||||
.. include:: ../reusables/javascript-further-reading.rst
|
||||
|
||||
@@ -5,8 +5,6 @@ Working with source locations
|
||||
|
||||
You can use the location of entities within Java/Kotlin code to look for potential errors. Locations allow you to deduce the presence, or absence, of white space which, in some cases, may indicate a problem.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
About source locations
|
||||
----------------------
|
||||
|
||||
|
||||
@@ -0,0 +1,170 @@
|
||||
.. _codeql-cli-2.17.2:
|
||||
|
||||
==========================
|
||||
CodeQL 2.17.2 (2024-05-07)
|
||||
==========================
|
||||
|
||||
.. contents:: Contents
|
||||
:depth: 2
|
||||
:local:
|
||||
:backlinks: none
|
||||
|
||||
This is an overview of changes in the CodeQL CLI and relevant CodeQL query and library packs. For additional updates on changes to the CodeQL code scanning experience, check out the `code scanning section on the GitHub blog <https://github.blog/tag/code-scanning/>`__, `relevant GitHub Changelog updates <https://github.blog/changelog/label/code-scanning/>`__, `changes in the CodeQL extension for Visual Studio Code <https://marketplace.visualstudio.com/items/GitHub.vscode-codeql/changelog>`__, and the `CodeQL Action changelog <https://github.com/github/codeql-action/blob/main/CHANGELOG.md>`__.
|
||||
|
||||
Security Coverage
|
||||
-----------------
|
||||
|
||||
CodeQL 2.17.2 runs a total of 413 security queries when configured with the Default suite (covering 161 CWE). The Extended suite enables an additional 130 queries (covering 34 more CWE). 1 security query has been added with this release.
|
||||
|
||||
CodeQL CLI
|
||||
----------
|
||||
|
||||
Improvements
|
||||
~~~~~~~~~~~~
|
||||
|
||||
* When uploading a SARIF file to GitHub using :code:`codeql github upload-results`, the CodeQL CLI now waits for the file to be processed by GitHub. If any errors occurred during processing of the analysis results, the command will log these and return a non-zero exit code. To disable this behaviour, pass the
|
||||
:code:`--no-wait-for-processing` flag.
|
||||
|
||||
By default, the command waits for the SARIF file to be processed for a maximum of 2 minutes, but this is configurable with the
|
||||
:code:`--wait-for-processing-timeout` option.
|
||||
|
||||
* The build tracer is no longer enabled when using the |link-code-none-build-mode-1|_
|
||||
to analyze a compiled language, thus improving performance.
|
||||
|
||||
Known Issues
|
||||
~~~~~~~~~~~~
|
||||
|
||||
* The beta support for analyzing Swift in this release and all previous releases requires :code:`g++-13` when running on Linux. Users analyzing Swift using the :code:`ubuntu-latest`, :code:`ubuntu-22.04`, or
|
||||
:code:`ubuntu-20.04` runner images for GitHub Actions should update their workflows to install :code:`g++-13`. For more information, see `the runner images announcement <https://github.com/actions/runner-images/issues/9679>`__.
|
||||
|
||||
Query Packs
|
||||
-----------
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* The "Uncontrolled data used in path expression" query (:code:`cpp/path-injection`) produces fewer near-duplicate results.
|
||||
* The "Global variable may be used before initialization" query (:code:`cpp/global-use-before-init`) no longer raises an alert on global variables that are initialized when they are declared.
|
||||
* The "Inconsistent null check of pointer" query (:code:`cpp/inconsistent-nullness-testing`) no longer raises an alert when the guarded check is in a macro expansion.
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* The query :code:`go/incomplete-hostname-regexp` now recognizes more sources involving concatenation of string literals and also follows flow through string concatenation. This may lead to more alerts.
|
||||
* Added some more barriers to flow for :code:`go/incorrect-integer-conversion` to reduce false positives, especially around type switches.
|
||||
|
||||
JavaScript/TypeScript
|
||||
"""""""""""""""""""""
|
||||
|
||||
* The JavaScript extractor will no longer report syntax errors related to "strict mode".
  Files containing such errors are now being fully analyzed along with other source files.
  This improves our support for source files that technically break the "strict mode" rules,
  but where a build step transforms the code such that it ends up working at runtime.
|
||||
|
||||
Language Libraries
|
||||
------------------
|
||||
|
||||
Breaking Changes
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* Deleted the deprecated :code:`GlobalValueNumberingImpl.qll` implementation.
|
||||
|
||||
C#
|
||||
""
|
||||
|
||||
* Deleted the deprecated :code:`getAssemblyName` predicate from the :code:`Operator` class. Use :code:`getFunctionName` instead.
|
||||
* Deleted the deprecated :code:`LShiftOperator`, :code:`RShiftOperator`, :code:`AssignLShiftExpr`, :code:`AssignRShiftExpr`, :code:`LShiftExpr`, and :code:`RShiftExpr` aliases.
|
||||
* Deleted the deprecated :code:`getCallableDescription` predicate from the :code:`ExternalApiDataNode` class. Use :code:`hasQualifiedName` instead.
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* Deleted the deprecated :code:`CsvRemoteSource` alias. Use :code:`MaDRemoteSource` instead.
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* Deleted the deprecated :code:`AssignLShiftExpr`, :code:`AssignRShiftExpr`, :code:`AssignURShiftExpr`, :code:`LShiftExpr`, :code:`RShiftExpr`, and :code:`URShiftExpr` aliases.
|
||||
|
||||
JavaScript/TypeScript
|
||||
"""""""""""""""""""""
|
||||
|
||||
* Deleted the deprecated :code:`getInput` predicate from the :code:`CryptographicOperation` class. Use :code:`getAnInput` instead.
|
||||
* Deleted the deprecated :code:`RegExpPatterns` module from :code:`Regexp.qll`.
|
||||
* Deleted the deprecated :code:`semmle/javascript/security/BadTagFilterQuery.qll`, :code:`semmle/javascript/security/OverlyLargeRangeQuery.qll`, :code:`semmle/javascript/security/regexp/RegexpMatching.qll`, and :code:`Security/CWE-020/HostnameRegexpShared.qll` files.
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* Deleted the deprecated :code:`RegExpPatterns` module from :code:`Regexp.qll`.
|
||||
* Deleted the deprecated :code:`Security/CWE-020/HostnameRegexpShared.qll` file.
|
||||
|
||||
Ruby
|
||||
""""
|
||||
|
||||
* Deleted the deprecated :code:`RegExpPatterns` module from :code:`Regexp.qll`.
|
||||
* Deleted the deprecated :code:`security/cwe-020/HostnameRegexpShared.qll` file.
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* Source models have been added for the standard library function :code:`getc` (and variations).
|
||||
* Source, sink and flow models for the ZeroMQ (ZMQ) networking library have been added.
|
||||
* Parameters of functions without definitions now have :code:`ParameterNode`\ s.
|
||||
* The alias analysis used internally by various libraries has been improved to answer alias questions more conservatively. As a result, some queries may report fewer false positives.
|
||||
|
||||
C#
|
||||
""
|
||||
|
||||
* Generated .NET Runtime models for properties with both getters and setters have been removed as this is now handled by the data flow library.
|
||||
|
||||
JavaScript/TypeScript
|
||||
"""""""""""""""""""""
|
||||
|
||||
* Improved detection of whether a file uses the CommonJS module system.
|
||||
|
||||
Deprecated APIs
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* To make Go consistent with other language libraries, the :code:`UntrustedFlowSource` name has been deprecated throughout. Use :code:`RemoteFlowSource` instead, which replaces it.
|
||||
* Where modules have classes named :code:`UntrustedFlowAsSource`, these are also deprecated and the :code:`Source` class in the same module or the :code:`RemoteFlowSource` class should be used instead.
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* Renamed the :code:`StrConst` class to :code:`StringLiteral`, for greater consistency with other languages. The :code:`StrConst` and :code:`Str` classes are now deprecated and will be removed in a future release.
|
||||
|
||||
New Features
|
||||
~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* Models-as-Data support has been added for C/C++. This feature allows flow sources, sinks and summaries to be expressed in compact strings as an alternative to modelling each source / sink / summary with explicit QL. See :code:`dataflow/ExternalFlow.qll` for documentation and specification of the model format, and :code:`models/implementations/ZMQ.qll` for a simple example of models. Importing models from :code:`.yml` is not yet supported.
|
||||
|
||||
Shared Libraries
|
||||
----------------
|
||||
|
||||
Major Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Dataflow Analysis
|
||||
"""""""""""""""""
|
||||
|
||||
* The data flow library performs heuristic filtering of code paths that have a high degree of control-flow uncertainty for improved performance in cases that are deemed unlikely to yield true positive flow paths. This filtering can be controlled with the :code:`fieldFlowBranchLimit` predicate in configurations. Two bugs have been fixed in relation to this: Some cases of high uncertainty were not being correctly identified. This fix improves performance in certain scenarios. Another group of cases of low uncertainty were also being misidentified, which led to false negatives. Taken together, we generally expect some additional query results with more true positives and fewer false positives.
|
||||
|
||||
.. |link-code-none-build-mode-1| replace:: :code:`none` build mode
|
||||
.. _link-code-none-build-mode-1: https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages#codeql-build-modes
|
||||
|
||||
@@ -0,0 +1,74 @@
|
||||
.. _codeql-cli-2.17.3:
|
||||
|
||||
==========================
|
||||
CodeQL 2.17.3 (2024-05-17)
|
||||
==========================
|
||||
|
||||
.. contents:: Contents
|
||||
:depth: 2
|
||||
:local:
|
||||
:backlinks: none
|
||||
|
||||
This is an overview of changes in the CodeQL CLI and relevant CodeQL query and library packs. For additional updates on changes to the CodeQL code scanning experience, check out the `code scanning section on the GitHub blog <https://github.blog/tag/code-scanning/>`__, `relevant GitHub Changelog updates <https://github.blog/changelog/label/code-scanning/>`__, `changes in the CodeQL extension for Visual Studio Code <https://marketplace.visualstudio.com/items/GitHub.vscode-codeql/changelog>`__, and the `CodeQL Action changelog <https://github.com/github/codeql-action/blob/main/CHANGELOG.md>`__.
|
||||
|
||||
Security Coverage
|
||||
-----------------
|
||||
|
||||
CodeQL 2.17.3 runs a total of 414 security queries when configured with the Default suite (covering 161 CWE). The Extended suite enables an additional 131 queries (covering 35 more CWE). 2 security queries have been added with this release.
|
||||
|
||||
CodeQL CLI
|
||||
----------
|
||||
|
||||
Improvements
|
||||
~~~~~~~~~~~~
|
||||
|
||||
* The language server that our IDE integration is built on now defaults to fine-grained dependency tracking for incremental error-checking after file changes. This slightly improves the latency of refreshing errors after local source code edits and will enable significant speedups in the future.
|
||||
* We now properly handle globs (such as :code:`folder/**/*.py`) in :code:`paths` configuration to specify what files to include for Python analysis (see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning#specifying-directories-to-scan).
|
||||
* TRAP import (a part of :code:`codeql database create` and :code:`codeql database finalize`)
|
||||
now supports allocating 2^32 IDs during the import process. The previous limit was 2^31 IDs.
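
To put the new limit in perspective, the following Python snippet simply evaluates the two powers of two quoted above. It is plain arithmetic and makes no claim about how TRAP import allocates IDs internally; the constant names are illustrative.

.. code-block:: python

    # Number of IDs available under the previous and the new limit.
    OLD_LIMIT = 2 ** 31
    NEW_LIMIT = 2 ** 32

    print(OLD_LIMIT)               # 2147483648
    print(NEW_LIMIT)               # 4294967296
    print(NEW_LIMIT // OLD_LIMIT)  # 2, i.e. the ID space has doubled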
|
||||
|
||||
Query Packs
|
||||
-----------
|
||||
|
||||
New Queries
|
||||
~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* Added a new query, :code:`cpp/iterator-to-expired-container`, to detect the creation of iterators owned by temporary objects that are about to be destroyed.
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* The :code:`py/header-injection` query, originally contributed to the experimental query pack by @jorgectf, has been promoted to the main query pack and renamed to :code:`py/http-response-splitting`. This query finds instances of HTTP header injection / response splitting vulnerabilities.
|
||||
|
||||
Language Libraries
|
||||
------------------
|
||||
|
||||
Breaking Changes
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* The Java extractor no longer supports the :code:`ODASA_JAVA_LAYOUT`, :code:`ODASA_TOOLS` and :code:`ODASA_HOME` legacy environment variables.
|
||||
* The Java extractor no longer supports the :code:`ODASA_BUILD_ERROR_DIR` legacy environment variable.
|
||||
|
||||
Major Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* Added modeling of the :code:`pyramid` framework, leading to new remote flow sources and sinks.
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* Fixed a bug that stopped built-in functions from being referenced using the predicate :code:`hasQualifiedName` because technically they do not belong to any package. Now you can use the empty string as the package, e.g. :code:`f.hasQualifiedName("", "len")`.
|
||||
* Fixed a bug that stopped data flow models for built-in functions from having any effect because the package "" was not parsed correctly.
|
||||
* Fixed a bug that stopped data flow from being followed through variadic arguments to built-in functions or to functions called using a variable.
|
||||
@@ -0,0 +1,130 @@
|
||||
.. _codeql-cli-2.17.4:
|
||||
|
||||
==========================
|
||||
CodeQL 2.17.4 (2024-06-03)
|
||||
==========================
|
||||
|
||||
.. contents:: Contents
|
||||
:depth: 2
|
||||
:local:
|
||||
:backlinks: none
|
||||
|
||||
This is an overview of changes in the CodeQL CLI and relevant CodeQL query and library packs. For additional updates on changes to the CodeQL code scanning experience, check out the `code scanning section on the GitHub blog <https://github.blog/tag/code-scanning/>`__, `relevant GitHub Changelog updates <https://github.blog/changelog/label/code-scanning/>`__, `changes in the CodeQL extension for Visual Studio Code <https://marketplace.visualstudio.com/items/GitHub.vscode-codeql/changelog>`__, and the `CodeQL Action changelog <https://github.com/github/codeql-action/blob/main/CHANGELOG.md>`__.
|
||||
|
||||
Security Coverage
|
||||
-----------------
|
||||
|
||||
CodeQL 2.17.4 runs a total of 414 security queries when configured with the Default suite (covering 161 CWE). The Extended suite enables an additional 131 queries (covering 35 more CWE).
|
||||
|
||||
CodeQL CLI
|
||||
----------
|
||||
|
||||
New Features
|
||||
~~~~~~~~~~~~
|
||||
|
||||
* CodeQL package management is now generally available, and all GitHub-produced CodeQL packages have had their version numbers increased to 1.0.0.
|
||||
|
||||
Query Packs
|
||||
-----------
|
||||
|
||||
Breaking Changes
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* Removed :code:`local` query variants. The results pertaining to local sources can be found using the non-local counterpart query. As an example, the results previously found by :code:`java/unvalidated-url-redirection-local` can be found by :code:`java/unvalidated-url-redirection`, if the :code:`local` threat model is enabled. The removed queries are :code:`java/path-injection-local`, :code:`java/command-line-injection-local`, :code:`java/xss-local`, :code:`java/sql-injection-local`, :code:`java/http-response-splitting-local`, :code:`java/improper-validation-of-array-construction-local`, :code:`java/improper-validation-of-array-index-local`, :code:`java/tainted-format-string-local`, :code:`java/tainted-arithmetic-local`, :code:`java/unvalidated-url-redirection-local`, :code:`java/xxe-local` and :code:`java/tainted-numeric-cast-local`.
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* The "Use of unique pointer after lifetime ends" query (:code:`cpp/use-of-unique-pointer-after-lifetime-ends`) no longer reports an alert when the pointer is converted to a boolean.
|
||||
* The "Variable not initialized before use" query (:code:`cpp/not-initialised`) no longer reports an alert on static variables.
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* The query :code:`go/incorrect-integer-conversion` has now been restricted to only use flow through value-preserving steps. This reduces false positives, especially around type switches.
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* The alert message for the query "Trust boundary violation" (:code:`java/trust-boundary-violation`) has been updated to include a link to the remote source.
|
||||
* The sanitizer of the query :code:`java/zipslip` has been improved to include nodes that are safe due to having certain safe types. This reduces false positives.
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* Added models of the :code:`gradio` PyPI package.
|
||||
|
||||
Language Libraries
|
||||
------------------
|
||||
|
||||
Bug Fixes
|
||||
~~~~~~~~~
|
||||
|
||||
JavaScript/TypeScript
|
||||
"""""""""""""""""""""
|
||||
|
||||
* Fixed a bug where very large TypeScript files would cause database creation to crash. Large files over 10MB were already excluded from analysis, but the file size check was not applied to TypeScript files.
|
||||
|
||||
Major Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* Added support for data flow through side-effects on static fields, for example when a static field containing an array is updated.
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* A bug has been fixed which meant that the query :code:`go/incorrect-integer-conversion` did not consider type assertions and type switches which use a defined type whose underlying type is an integer type. This may lead to fewer false positive alerts.
|
||||
* A bug has been fixed which meant flow was not followed through some ranged for loops. This may lead to more alerts being found.
|
||||
* Added value flow models for the built-in functions :code:`append`, :code:`copy`, :code:`max` and :code:`min` using Models-as-Data. Removed the old-style models for :code:`max` and :code:`min`.
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* JDK version detection based on Gradle projects has been improved. Java extraction using build-modes :code:`autobuild` or :code:`none` is more likely to pick an appropriate JDK version, particularly when the Android Gradle Plugin or Spring Boot Plugin are in use.
|
||||
|
||||
JavaScript/TypeScript
|
||||
"""""""""""""""""""""
|
||||
|
||||
* Additional heuristics for a new sensitive data classification for private information (e.g. credit card numbers) have been added to the shared :code:`SensitiveDataHeuristics.qll` library. This may result in additional results for queries that use sensitive data such as :code:`js/clear-text-storage-sensitive-data` and :code:`js/clear-text-logging`.
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* The :code:`request` parameter of Flask :code:`SessionInterface.open_session` method is now modeled as a remote flow source.
|
||||
* Additional heuristics for a new sensitive data classification for private information (e.g. credit card numbers) have been added to the shared :code:`SensitiveDataHeuristics.qll` library. This may result in additional results for queries that use sensitive data such as :code:`py/clear-text-storage-sensitive-data` and :code:`py/clear-text-logging-sensitive-data`.
|
||||
|
||||
Ruby
|
||||
""""
|
||||
|
||||
* Additional heuristics for a new sensitive data classification for private information (e.g. credit card numbers) have been added to the shared :code:`SensitiveDataHeuristics.qll` library. This may result in additional results for queries that use sensitive data such as :code:`rb/sensitive-get-query`.
|
||||
|
||||
New Features
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* A Python MaD (Models as Data) row may now contain a dotted path in the :code:`type` column. Like in Ruby, a path to a class will refer to instances of that class. This means that the summary :code:`["foo", "Member[MyClass].Instance.Member[instance_method]", "Argument[0]", "ReturnValue", "value"]` can now be written :code:`["foo.MyClass", "Member[instance_method]", "Argument[0]", "ReturnValue", "value"]`. To refer to an actual class, one may add a :code:`!` at the end of the path.
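
The textual correspondence between the two row forms can be sketched in Python. This is only an illustration of the rewrite described above, not part of CodeQL: the helper name ``to_dotted_type`` and the regular expression are assumptions, and the sketch handles only access paths that begin with ``Member[<Class>].Instance.``.

.. code-block:: python

    import re

    # Matches an access path of the form "Member[<Class>].Instance.<rest>".
    _INSTANCE_PREFIX = re.compile(
        r"^Member\[(?P<cls>[^\]]+)\]\.Instance\.(?P<rest>.+)$"
    )


    def to_dotted_type(row):
        """Fold the class into the ``type`` column when the path allows it."""
        type_, path, *rest = row
        match = _INSTANCE_PREFIX.match(path)
        if match is None:
            return list(row)  # nothing to fold; return the row unchanged
        return [f"{type_}.{match['cls']}", match["rest"], *rest]


    old_row = [
        "foo",
        "Member[MyClass].Instance.Member[instance_method]",
        "Argument[0]",
        "ReturnValue",
        "value",
    ]
    print(to_dotted_type(old_row))
    # ['foo.MyClass', 'Member[instance_method]', 'Argument[0]', 'ReturnValue', 'value']

Rows whose path does not start with the ``Member[...].Instance.`` prefix are returned unchanged, so the helper can be applied to a whole list of rows without special-casing.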
|
||||
|
||||
Shared Libraries
|
||||
----------------
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Dataflow Analysis
|
||||
"""""""""""""""""
|
||||
|
||||
* The data flow library now adds intermediate nodes when data flows out of a function via a parameter, in order to make path explanations easier to follow. The intermediate nodes have the same location as the underlying parameter, but must be accessed via :code:`PathNode.asParameterReturnNode` instead of :code:`PathNode.asNode`.
|
||||
@@ -0,0 +1,127 @@
|
||||
.. _codeql-cli-2.17.5:
|
||||
|
||||
==========================
|
||||
CodeQL 2.17.5 (2024-06-12)
|
||||
==========================
|
||||
|
||||
.. contents:: Contents
|
||||
:depth: 2
|
||||
:local:
|
||||
:backlinks: none
|
||||
|
||||
This is an overview of changes in the CodeQL CLI and relevant CodeQL query and library packs. For additional updates on changes to the CodeQL code scanning experience, check out the `code scanning section on the GitHub blog <https://github.blog/tag/code-scanning/>`__, `relevant GitHub Changelog updates <https://github.blog/changelog/label/code-scanning/>`__, `changes in the CodeQL extension for Visual Studio Code <https://marketplace.visualstudio.com/items/GitHub.vscode-codeql/changelog>`__, and the `CodeQL Action changelog <https://github.com/github/codeql-action/blob/main/CHANGELOG.md>`__.
|
||||
|
||||
Security Coverage
|
||||
-----------------
|
||||
|
||||
CodeQL 2.17.5 runs a total of 414 security queries when configured with the Default suite (covering 161 CWE). The Extended suite enables an additional 131 queries (covering 35 more CWE).
|
||||
|
||||
CodeQL CLI
|
||||
----------
|
||||
|
||||
Breaking Changes
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
* All the commands that output SARIF will output a minified version to reduce the size.
  The :code:`codeql database analyze`, :code:`codeql database interpret-results`, :code:`codeql generate query-help`, and :code:`codeql bqrs interpret` commands support the option :code:`--no-sarif-minify` to output a pretty printed SARIF file.

* A number of breaking changes have been made to the :code:`semmle-extractor-options` functionality available for C and C++ CodeQL tests.

  * The Arm, Intel, and CodeWarrior compilers are no longer supported and the :code:`--armcc`, :code:`--intel`, :code:`--codewarrior` flags are now ignored, as are all the flags that only applied to those compilers.
  * The :code:`--threads` and :code:`-main-file-name` options, which did not have any effect on tests, are now ignored. Any specification of these options as part of :code:`semmle-extractor-options` should be removed.
  * Support for :code:`--linker`, all flags that would only invoke the preprocessor, and the :code:`/clr` flag has been removed, as those flags would never produce any usable test output.
  * Support for the :code:`--include_path_environment` flag has been removed. All include paths should directly be specified as part of :code:`semmle-extractor-options`.
  * Microsoft C/C++ compiler response files specified via :code:`@some_file_name` are now ignored. Instead, all options should directly be specified as part of :code:`semmle-extractor-options`.
  * Support for the Microsoft :code:`#import` preprocessor directive has been removed, as support depends on the availability of the Microsoft C/C++ compiler, and availability cannot be guaranteed on all platforms while executing tests.
  * Support for the Microsoft :code:`/EHa`, :code:`/EHs`, :code:`/GX`, :code:`/GZ`, :code:`/Tc`, :code:`/Tp`, and :code:`/Zl` flags, and all :code:`/RTC` flags, has been removed. Any specification of these options as part of :code:`semmle-extractor-options` should be removed.
  * Support for the Apple-specific :code:`-F` and :code:`-iframework` flags has been removed. The :code:`-F` flag can still be used by replacing :code:`-F <directory>` by :code:`--edg -F --edg <directory>`. Any occurrence of :code:`-iframework <arg>` should be replaced by :code:`--edg --sys_framework --edg <arg>`.
  * Support for the :code:`/TC`, :code:`/TP`, and :code:`-x` flags has been removed. Please ensure all C, respectively C++, source files have a :code:`.c`, respectively :code:`.cpp`, extension.
  * The :code:`--build_error_dir`, :code:`-db`, :code:`--edg_base_dir`, :code:`--error_limit`, :code:`--src_archive`, :code:`--trapfolder`, and :code:`--variadic_macros` flags are now ignored.

  The above changes do not affect the creation of databases through the CodeQL CLI, or when calling the C/C++ extractor directly with the :code:`--mimic` or :code:`--linker` flags.
  Similar functionality continues to be supported in those scenarios, except for CodeWarrior and the :code:`--edg_base_dir`, :code:`--include_path_environment`, :code:`/Tc`, and :code:`/Tp` flags, which were never supported.
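
The replacement rule for the Apple-specific flags can be sketched as a small Python helper for migrating existing :code:`semmle-extractor-options` lines. This is a sketch under stated assumptions rather than a tool shipped with CodeQL: the function name and the list-of-tokens interface are hypothetical, and only the space-separated forms :code:`-F <directory>` and :code:`-iframework <arg>` quoted above are handled.

.. code-block:: python

    def migrate_apple_framework_flags(options):
        """Rewrite ``-F <dir>`` and ``-iframework <arg>`` as described above.

        ``options`` is a list of already-tokenised option strings; every
        other option is passed through unchanged.
        """
        migrated = []
        i = 0
        while i < len(options):
            opt = options[i]
            if opt in ("-F", "-iframework") and i + 1 < len(options):
                # -F keeps its name; -iframework becomes --sys_framework.
                replacement = "-F" if opt == "-F" else "--sys_framework"
                migrated += ["--edg", replacement, "--edg", options[i + 1]]
                i += 2
            else:
                migrated.append(opt)
                i += 1
        return migrated


    print(" ".join(migrate_apple_framework_flags(
        ["-std=c++17", "-F", "Frameworks", "-iframework", "SysFrameworks"]
    )))
    # -std=c++17 --edg -F --edg Frameworks --edg --sys_framework --edg SysFrameworks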
|
||||
|
||||
Improvements
|
||||
~~~~~~~~~~~~
|
||||
|
||||
* :code:`codeql generate log-summary` now reports completed pipeline runs that are part of an incomplete recursive predicate.
|
||||
|
||||
Miscellaneous
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
* The OWASP Java HTML Sanitizer library used by the CodeQL CLI for internal documentation generation commands has been updated to version
|
||||
\ `20240325.1 <https://github.com/OWASP/java-html-sanitizer/releases/tag/release-20240325.1>`__.
|
||||
|
||||
Query Packs
|
||||
-----------
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* The :code:`cpp/dangerous-function-overflow` query no longer produces a false positive alert when the :code:`gets` function does not have exactly one parameter.
|
||||
|
||||
C#
|
||||
""
|
||||
|
||||
* .NET 8 Runtime models have been updated based on the newest version of the model generator. Furthermore, the database sources have been changed slightly to reduce result multiplicity.
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* The query :code:`java/spring-disabled-csrf-protection` detects disabling CSRF via :code:`ServerHttpSecurity$CsrfSpec::disable`.
|
||||
* Added more :code:`java.io.File`\ -related sinks to the path injection query.
|
||||
|
||||
Python
|
||||
""""""
|
||||
|
||||
* Added models for the :code:`opml` library.
|
||||
|
||||
Language Libraries
|
||||
------------------
|
||||
|
||||
Major Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* The precision of virtual dispatch has been improved. This increases precision in general for all data flow queries.
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* A partial model for the :code:`Boost.Asio` network library has been added. This includes sources, sinks and summaries for certain functions in :code:`Boost.Asio`, such as :code:`read_until` and :code:`write`.
|
||||
|
||||
Java
|
||||
""""
|
||||
|
||||
* Support for Eclipse Compiler for Java (ecj) has been fixed to work with (a) runs that don't pass :code:`-noExit` and (b) runs that use post-Java-9 command-line arguments.
|
||||
|
||||
New Features
|
||||
~~~~~~~~~~~~
|
||||
|
||||
C/C++
|
||||
"""""
|
||||
|
||||
* Data models can now be added with data extensions. In this way source, sink and summary models can be added in extension :code:`.model.yml` files, rather than by writing classes in QL code. New models should be added in the :code:`lib/ext` folder.
|
||||
|
||||
Golang
|
||||
""""""
|
||||
|
||||
* When writing models-as-data models, the receiver is now referred to as :code:`Argument[receiver]` rather than :code:`Argument[-1]`.
|
||||
* Neutral models are now supported. They have no effect except that a manual neutral summary model will stop a generated summary model from having any effect.
|
||||
@@ -82,7 +82,7 @@ Bug Fixes
|
||||
Python
|
||||
""""""
|
||||
|
||||
* The `View AST functionality <https://docs.github.com/en/code-security/codeql-for-vs-code/using-the-advanced-functionality-of-the-codeql-for-vs-code-extension/exploring-the-structure-of-your-source-code/>`__ no longer prints detailed information about regular expressions, greatly improving performance.
|
||||
* The `View AST functionality <https://docs.github.com/en/code-security/codeql-for-vs-code/using-the-advanced-functionality-of-the-codeql-for-vs-code-extension/exploring-the-structure-of-your-source-code>`__ no longer prints detailed information about regular expressions, greatly improving performance.
|
||||
|
||||
Minor Analysis Improvements
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
@@ -11,6 +11,10 @@ A list of queries for each suite and language `is available here <https://docs.g
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
codeql-cli-2.17.5
|
||||
codeql-cli-2.17.4
|
||||
codeql-cli-2.17.3
|
||||
codeql-cli-2.17.2
|
||||
codeql-cli-2.17.1
|
||||
codeql-cli-2.17.0
|
||||
codeql-cli-2.16.6
|
||||
|
||||
@@ -61,7 +61,7 @@ The DIL format may change without warning between CLI releases.
|
||||
When you specify the ``--dump-dil`` option for ``codeql query compile``, CodeQL
|
||||
prints DIL to standard output for the queries it compiles. You can also
|
||||
view results in DIL format when you run queries in VS Code.
|
||||
For more information, see ":ref:`Analyzing your projects <viewing-query-results>`" in the CodeQL for VS Code help.
|
||||
For more information, see `Running CodeQL queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/running-codeql-queries#understanding-your-query-results>`__ in the GitHub documentation.
|
||||
|
||||
.. _extractor:
|
||||
|
||||
|
||||
@@ -12,8 +12,6 @@ Supported platforms
|
||||
|
||||
.. include:: ../reusables/supported-platforms.rst
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
Additional software requirements
|
||||
################################
|
||||
|
||||
@@ -26,6 +24,10 @@ For extraction of compiled languages (C/C++, C#, Go, Java) and Ruby on Linux:
|
||||
- ``glibc`` version 2.17 or greater must be installed.
|
||||
- ``musl-c``-based Linux distributions, such as Alpine Linux, are not supported.
|
||||
|
||||
For extraction of compiled languages on Windows:
|
||||
|
||||
- The ``PowerShell.exe`` executable must be available on the ``PATH``.
|
||||
|
||||
For TypeScript extraction on all platforms:
|
||||
|
||||
- Node.js 14 or higher must be installed and available on the ``PATH`` as ``node``.
|
||||
@@ -39,3 +41,10 @@ For Python extraction:
|
||||
For Ruby extraction:
|
||||
|
||||
- On Windows, the ``msvcp140.dll`` must be installed and available on the system. This can be installed by downloading the appropriate Microsoft Visual C++ Redistributable for Visual Studio.
|
||||
|
||||
For Java extraction:
|
||||
|
||||
- There must be a ``java`` or ``java.exe`` executable available on the ``PATH``, and the ``JAVA_HOME`` environment variable must point to the corresponding JDK's home directory.
|
||||
- If you need to analyse projects using varying JDK versions, it may be useful to supply alternate JDK versions using environment variables of the form ``JAVA_HOME_$VERSION_$PLATFORM``, following the example of `the GitHub Actions runner images <https://github.com/actions/runner-images/blob/main/images/ubuntu/Ubuntu2404-Readme.md#java>`__. An Apache Maven `toolchains.xml file <https://maven.apache.org/guides/mini/guide-using-toolchains.html#using-toolchains-in-your-project>`__ can also be used for the same purpose.
|
||||
- Having a ``mvn`` or ``mvn.exe`` executable available on the ``PATH`` is recommended if your project uses Apache Maven and does not use the ``mvnw`` wrapper script.
|
||||
- Having a ``gradle`` or ``gradle.exe`` executable available on the ``PATH`` is recommended if your project uses Gradle and does not use the ``gradlew`` wrapper script.
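
The ``JAVA_HOME_$VERSION_$PLATFORM`` naming scheme can be illustrated with a short Python sketch. It only shows how such variable names are formed and looked up; it does not describe how the Java extractor itself selects a JDK. The helper name, the fallback to ``JAVA_HOME``, and the example paths are assumptions, and ``X64`` follows the naming used by the GitHub Actions runner images.

.. code-block:: python

    import os


    def find_java_home(version, platform, env=None):
        """Return the JDK home for ``version``, or None if nothing is set.

        Hypothetical helper: checks JAVA_HOME_<VERSION>_<PLATFORM> first
        and falls back to JAVA_HOME.
        """
        env = os.environ if env is None else env
        specific = f"JAVA_HOME_{version}_{platform}"
        return env.get(specific) or env.get("JAVA_HOME")


    # Example environment with made-up installation paths.
    example_env = {
        "JAVA_HOME": "/opt/jdk-21",
        "JAVA_HOME_17_X64": "/opt/jdk-17",
    }
    print(find_java_home(17, "X64", example_env))  # /opt/jdk-17
    print(find_java_home(11, "X64", example_env))  # /opt/jdk-21 (fallback)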
|
||||
|
||||
@@ -2,13 +2,13 @@
|
||||
#
|
||||
# The Sphinx config values used in the CodeQL documentation that is published
|
||||
# at codeql.github.com/docs
|
||||
#
|
||||
#
|
||||
# Note that not all possible configuration values are present in this file.
|
||||
#
|
||||
# All configuration values have a default; values that are commented out
|
||||
# serve to show the default.
|
||||
#
|
||||
# For details of all possible config values,
|
||||
# For details of all possible config values,
|
||||
# see https://www.sphinx-doc.org/en/master/usage/configuration.html
|
||||
#
|
||||
# -- GENERAL CONFIG VALUES ------------------------------------------------
|
||||
@@ -53,7 +53,7 @@ import sphinx as sphinx_mod
|
||||
|
||||
|
||||
def setup(sphinx):
|
||||
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
|
||||
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
|
||||
from qllexer import QLLexer
|
||||
sphinx.add_lexer("ql", QLLexer() if sphinx_mod.version_info[0] <= 3 else QLLexer)
|
||||
|
||||
@@ -86,7 +86,7 @@ html_theme = 'alabaster'
|
||||
|
||||
# HTML theme options used to customize the look and feel of the docs.
|
||||
html_theme_options = {'font_size': '16px',
|
||||
'body_text': '#333',
|
||||
'body_text': '#333',
|
||||
'link': '#2F1695',
|
||||
'link_hover': '#2F1695',
|
||||
'show_powered_by': False,
|
||||
@@ -106,4 +106,4 @@ html_extra_path = ['index.html']
|
||||
html_favicon = 'images/site/favicon.ico'
|
||||
|
||||
# Exclude these paths from being built by Sphinx
|
||||
exclude_patterns = ['vale*', '_static', '_templates', 'reusables', 'images', 'support', 'ql-training', 'query-help', '_build', '*.py*', 'README.rst']
|
||||
exclude_patterns = ['vale*', '_static', '_templates', 'reusables', 'images', 'support', 'ql-training', 'query-help', '_build', '*.py*', 'README.rst', 'codeql-for-visual-studio-code']
|
||||
|
||||
@@ -6,7 +6,6 @@ CodeQL documentation
|
||||
:maxdepth: 3
|
||||
|
||||
codeql-overview/index
|
||||
codeql-for-visual-studio-code/index
|
||||
writing-codeql-queries/index
|
||||
codeql-language-guides/index
|
||||
ql-language-reference/index
|
||||
|
||||
@@ -446,7 +446,7 @@ The ``pragma[assume_small_delta]`` annotation has no effect and can be safely re
|
||||
Language pragmas
|
||||
================
|
||||
|
||||
**Available for**: |classes|, |characteristic predicates|, |member predicates|, |non-member predicates|
|
||||
**Available for**: |modules|, |classes|, |characteristic predicates|, |member predicates|, |non-member predicates|
|
||||
|
||||
``language[monotonicAggregates]``
|
||||
---------------------------------
|
||||
|
||||
@@ -285,9 +285,9 @@ Built-in modules
|
||||
****************
|
||||
|
||||
QL defines a ``QlBuiltins`` module that is always in scope.
|
||||
Currently, it defines a single parameterized sub-module
|
||||
``EquivalenceRelation``, that provides an efficient abstraction for working with
|
||||
(partial) equivalence relations in QL.
|
||||
``QlBuiltins`` defines parameterized sub-modules for working with
|
||||
(partial) equivalence relations (``EquivalenceRelation``) and sets
|
||||
(``InternSets``) in QL.
|
||||
|
||||
Equivalence relations
|
||||
=====================
|
||||
@@ -347,3 +347,80 @@ The above select clause returns the following partial equivalence relation:
|
||||
+---+---+
|
||||
| 4 | 4 |
|
||||
+---+---+
|
||||
|
||||
Sets
====

The built-in ``InternSets`` module is parameterized by ``Key`` and ``Value`` types
and a ``Value getAValue(Key key)`` relation. The module groups the ``Value``
column by ``Key`` and creates a set for each group of values related by a key.

The ``InternSets`` module exports a functional ``Set getSet(Key key)`` relation
that relates keys with the set of values related to the given key by
``getAValue``. Sets are represented by the exported ``Set`` type which exposes
a ``contains(Value v)`` member predicate that holds for values contained in the
given set. ``getSet(k).contains(v)`` is thus equivalent to ``v = getAValue(k)`` as
illustrated by the following ``InternSets`` example:

.. code-block:: ql

    int getAValue(int key) {
      key = 1 and result = 1
      or
      key = 2 and
      (result = 1 or result = 2)
      or
      key = 3 and result = 1
      or
      key = 4 and result = 2
    }

    module Sets = QlBuiltins::InternSets<int, int, getAValue/1>;

    from int k, int v
    where Sets::getSet(k).contains(v)
    select k, v

This evaluates to the ``getAValue`` relation:

+---+---+
| k | v |
+===+===+
| 1 | 1 |
+---+---+
| 2 | 1 |
+---+---+
| 2 | 2 |
+---+---+
| 3 | 1 |
+---+---+
| 4 | 2 |
+---+---+

If two keys ``k1`` and ``k2`` relate to the same set of values, then ``getSet(k1) = getSet(k2)``.
For the above example, keys 1 and 3 relate to the same set of values (namely the singleton
set containing 1) and are therefore related to the same set by ``getSet``:

.. code-block:: ql

    from int k1, int k2
    where Sets::getSet(k1) = Sets::getSet(k2)
    select k1, k2

The above query therefore evaluates to:

+----+----+
| k1 | k2 |
+====+====+
| 1  | 1  |
+----+----+
| 1  | 3  |
+----+----+
| 2  | 2  |
+----+----+
| 3  | 1  |
+----+----+
| 3  | 3  |
+----+----+
| 4  | 4  |
+----+----+
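
For readers more familiar with general-purpose languages, the grouping behavior can be mimicked in Python. This is only an analogy for the semantics described above, not how QL evaluates ``InternSets``; the names ``GET_A_VALUE`` and ``get_sets`` are illustrative, and ``frozenset`` equality stands in for the fact that keys with the same values are related to the same set.

.. code-block:: python

    from collections import defaultdict

    # The getAValue relation from the example, as (key, value) pairs.
    GET_A_VALUE = [(1, 1), (2, 1), (2, 2), (3, 1), (4, 2)]


    def get_sets(pairs):
        """Group values by key; equal groups compare equal, like interned sets."""
        groups = defaultdict(set)
        for key, value in pairs:
            groups[key].add(value)
        return {key: frozenset(values) for key, values in groups.items()}


    sets = get_sets(GET_A_VALUE)

    # Pairs of keys that are related to the same set of values.
    same_set = sorted(
        (k1, k2) for k1 in sets for k2 in sets if sets[k1] == sets[k2]
    )
    print(same_set)
    # [(1, 1), (1, 3), (2, 2), (3, 1), (3, 3), (4, 4)]

The printed pairs match the rows of the ``k1``/``k2`` table above.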
|
||||
|
||||
@@ -3,8 +3,6 @@ CodeQL CWE coverage
|
||||
|
||||
You can view the full coverage of MITRE's Common Weakness Enumeration (CWE) or coverage by language for the latest release of CodeQL.
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
About CWEs
|
||||
##########
|
||||
|
||||
|
||||
@@ -12,8 +12,6 @@ View the query help for the queries included in the ``default``, ``security-exte
|
||||
- :doc:`CodeQL query help for Ruby <ruby>`
|
||||
- :doc:`CodeQL query help for Swift <swift>`
|
||||
|
||||
.. include:: ../reusables/kotlin-beta-note.rst
|
||||
|
||||
.. pull-quote:: Information
|
||||
|
||||
Each query help article includes:
|
||||
|
||||
@@ -3,6 +3,5 @@
|
||||
Note
|
||||
|
||||
You can use the CodeQL template (beta) in `GitHub Codespaces <https://github.com/codespaces/new?template_repository=github/codespaces-codeql>`__ to try out the QL concepts and programming-language-agnostic examples in these tutorials. The template includes a guided introduction to working with QL, and makes it easy to get started.
|
||||
|
||||
When you're ready to run CodeQL queries on actual codebases, you will need to install the CodeQL extension in Visual Studio Code. For instructions, see ":ref:`Setting up CodeQL in Visual Studio Code <setting-up-codeql-in-visual-studio-code>`."
|
||||
|
||||
|
||||
When you're ready to run CodeQL queries on actual codebases, you will need to install the CodeQL extension in Visual Studio Code. For instructions, see `Installing CodeQL for Visual Studio Code <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/installing-codeql-for-vs-code>`__ in the GitHub documentation.
|
||||
|
||||
@@ -1,4 +0,0 @@
|
||||
.. pull-quote:: Note
|
||||
|
||||
CodeQL analysis for Kotlin is currently in beta. During the beta, analysis of Kotlin code,
|
||||
and the accompanying documentation, will not be as comprehensive as for other languages.
|
||||
@@ -1 +1 @@
|
||||
For information about installing the CodeQL extension for Visual Studio code, see ":ref:`Setting up CodeQL in Visual Studio Code <setting-up-codeql-in-visual-studio-code>`."
|
||||
For information about installing the CodeQL extension for Visual Studio Code, see `Installing CodeQL for Visual Studio Code <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/installing-codeql-for-vs-code>`__ in the GitHub documentation.
|
||||
|
||||
@@ -97,8 +97,6 @@ and the CodeQL library pack ``codeql/go-all`` (`changelog <https://github.com/gi
Java and Kotlin built-in support
==================================

.. include:: ../reusables/kotlin-beta-note.rst

Provided by the current versions of the
CodeQL query pack ``codeql/java-queries`` (`changelog <https://github.com/github/codeql/tree/codeql-cli/latest/java/ql/src/CHANGELOG.md>`__, `source <https://github.com/github/codeql/tree/codeql-cli/latest/java/ql/src>`__)
and the CodeQL library pack ``codeql/java-all`` (`changelog <https://github.com/github/codeql/tree/codeql-cli/latest/java/ql/lib/CHANGELOG.md>`__, `source <https://github.com/github/codeql/tree/codeql-cli/latest/java/ql/lib>`__).
@@ -202,6 +200,7 @@ and the CodeQL library pack ``codeql/python-all`` (`changelog <https://github.co
Flask-Admin, Web framework
Tornado, Web framework
Twisted, Web framework
Gradio, Web framework
starlette, Asynchronous Server Gateway Interface (ASGI)
ldap3, Lightweight Directory Access Protocol (LDAP)
python-ldap, Lightweight Directory Access Protocol (LDAP)
@@ -20,12 +20,12 @@
Java,"Java 7 to 22 [5]_","javac (OpenJDK and Oracle JDK),

Eclipse compiler for Java (ECJ) [6]_",``.java``
Kotlin [7]_,"Kotlin 1.5.0 to 1.9.2\ *x*","kotlinc",``.kt``
JavaScript,ECMAScript 2022 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhtm``, ``.xhtml``, ``.vue``, ``.hbs``, ``.ejs``, ``.njk``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [8]_"
Python [9]_,"2.7, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, 3.11, 3.12",Not applicable,``.py``
Ruby [10]_,"up to 3.3",Not applicable,"``.rb``, ``.erb``, ``.gemspec``, ``Gemfile``"
Swift [11]_,"Swift 5.4-5.10","Swift compiler","``.swift``"
TypeScript [12]_,"2.6-5.4",Standard TypeScript compiler,"``.ts``, ``.tsx``, ``.mts``, ``.cts``"
Kotlin,"Kotlin 1.5.0 to 2.0.0\ *x*","kotlinc",``.kt``
JavaScript,ECMAScript 2022 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhtm``, ``.xhtml``, ``.vue``, ``.hbs``, ``.ejs``, ``.njk``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [7]_"
Python [8]_,"2.7, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, 3.11, 3.12",Not applicable,``.py``
Ruby [9]_,"up to 3.3",Not applicable,"``.rb``, ``.erb``, ``.gemspec``, ``Gemfile``"
Swift [10]_,"Swift 5.4-5.10","Swift compiler","``.swift``"
TypeScript [11]_,"2.6-5.5",Standard TypeScript compiler,"``.ts``, ``.tsx``, ``.mts``, ``.cts``"

.. container:: footnote-group

@@ -35,9 +35,8 @@
.. [4] Support for the Arm Compiler (armcc) is preliminary.
.. [5] Builds that execute on Java 7 to 22 can be analyzed. The analysis understands Java 22 standard language features.
.. [6] ECJ is supported when the build invokes it via the Maven Compiler plugin or the Takari Lifecycle plugin.
.. [7] Kotlin support is currently in beta.
.. [8] JSX and Flow code, YAML, JSON, HTML, and XML files may also be analyzed with JavaScript files.
.. [9] The extractor requires Python 3 to run. To analyze Python 2.7 you should install both versions of Python.
.. [10] Requires glibc 2.17.
.. [11] Support for the analysis of Swift requires macOS or Linux.
.. [12] TypeScript analysis is performed by running the JavaScript extractor with TypeScript enabled. This is the default.
.. [7] JSX and Flow code, YAML, JSON, HTML, and XML files may also be analyzed with JavaScript files.
.. [8] The extractor requires Python 3 to run. To analyze Python 2.7 you should install both versions of Python.
.. [9] Requires glibc 2.17.
.. [10] Support for the analysis of Swift requires macOS or Linux.
.. [11] TypeScript analysis is performed by running the JavaScript extractor with TypeScript enabled. This is the default.
@@ -26,7 +26,7 @@ Basic query structure
.. code-block:: ql

/**
*
*
* Query metadata
*
*/
@@ -39,18 +39,18 @@ Basic query structure
where /* ... logical formula ... */
select /* ... expressions ... */

The following sections describe the information that is typically included in a query file for alerts. Path queries are discussed in more detail in ":doc:`Creating path queries <creating-path-queries>`."
The following sections describe the information that is typically included in a query file for alerts. Path queries are discussed in more detail in ":doc:`Creating path queries <creating-path-queries>`."

Query metadata
==============

Query metadata is used to identify your custom queries when they are added to the GitHub repository or used in your analysis. Metadata provides information about the query's purpose, and also specifies how to interpret and display the query results. For a full list of metadata properties, see ":doc:`Metadata for CodeQL queries <metadata-for-codeql-queries>`." The exact metadata requirement depends on how you are going to run your query:

- If you are contributing a query to the GitHub repository, please read the `query metadata style guide <https://github.com/github/codeql/blob/main/docs/query-metadata-style-guide.md>`__.
- If you are contributing a query to the GitHub repository, please read the `query metadata style guide <https://github.com/github/codeql/blob/main/docs/query-metadata-style-guide.md>`__.
- If you are analyzing a database using the `CodeQL CLI <https://docs.github.com/en/code-security/codeql-cli>`__, your query metadata must contain ``@kind``.
- If you are running a query with the CodeQL extension for VS Code, metadata is not mandatory. However, if you want your results to be displayed as either an 'alert' or a 'path', you must specify the correct ``@kind`` property, as explained below. For more information, see ":ref:`Analyzing your projects <analyzing-your-projects>`" in the CodeQL for VS Code help.
- If you are running a query with the CodeQL extension for VS Code, metadata is not mandatory. However, if you want your results to be displayed as either an 'alert' or a 'path', you must specify the correct ``@kind`` property, as explained below. For more information, see `Running CodeQL queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/running-codeql-queries>`__ in the GitHub documentation.

.. pull-quote::
.. pull-quote::

Note

@@ -66,8 +66,8 @@ Query metadata is used to identify your custom queries when they are added to th
Import statements
=================

Each query generally contains one or more ``import`` statements, which define the :ref:`libraries <library-modules>` or :ref:`modules <modules>` to import into the query. Libraries and modules provide a way of grouping together related :ref:`types <types>`, :ref:`predicates <predicates>`, and other modules. The contents of each library or module that you import can then be accessed by the query.
Our `open source repository on GitHub <https://github.com/github/codeql>`__ contains the standard CodeQL libraries for each supported language.
Each query generally contains one or more ``import`` statements, which define the :ref:`libraries <library-modules>` or :ref:`modules <modules>` to import into the query. Libraries and modules provide a way of grouping together related :ref:`types <types>`, :ref:`predicates <predicates>`, and other modules. The contents of each library or module that you import can then be accessed by the query.
Our `open source repository on GitHub <https://github.com/github/codeql>`__ contains the standard CodeQL libraries for each supported language.

When writing your own alert queries, you would typically import the standard library for the language of the project that you are querying. For more information about importing the standard CodeQL libraries, see the CodeQL library guides:

@@ -87,33 +87,33 @@ You can explore the contents of all the standard libraries in the `CodeQL librar
Optional CodeQL classes and predicates
--------------------------------------

You can customize your analysis by defining your own predicates and classes in the query. For further information, see :ref:`Defining a predicate <defining-a-predicate>` and :ref:`Defining a class <defining-a-class>`.
You can customize your analysis by defining your own predicates and classes in the query. For further information, see :ref:`Defining a predicate <defining-a-predicate>` and :ref:`Defining a class <defining-a-class>`.

From clause
===========

The ``from`` clause declares the variables that are used in the query. Each declaration must be of the form ``<type> <variable name>``.
The ``from`` clause declares the variables that are used in the query. Each declaration must be of the form ``<type> <variable name>``.
For more information on the available :ref:`types <types>`, and to learn how to define your own types using :ref:`classes <classes>`, see the :ref:`QL language reference <ql-language-reference>`.

Where clause
============

The ``where`` clause defines the logical conditions to apply to the variables declared in the ``from`` clause to generate your results. This clause uses :ref:`aggregations <aggregations>`, :ref:`predicates <predicates>`, and logical :ref:`formulas <formulas>` to limit the variables of interest to a smaller set, which meet the defined conditions.
The ``where`` clause defines the logical conditions to apply to the variables declared in the ``from`` clause to generate your results. This clause uses :ref:`aggregations <aggregations>`, :ref:`predicates <predicates>`, and logical :ref:`formulas <formulas>` to limit the variables of interest to a smaller set, which meet the defined conditions.
The CodeQL libraries group commonly used predicates for specific languages and frameworks. You can also define your own predicates in the body of the query file or in your own custom modules, as described above.

Select clause
=============

The ``select`` clause specifies the results to display for the variables that meet the conditions defined in the ``where`` clause. The valid structure for the select clause is defined by the ``@kind`` property specified in the metadata.
The ``select`` clause specifies the results to display for the variables that meet the conditions defined in the ``where`` clause. The valid structure for the select clause is defined by the ``@kind`` property specified in the metadata.

Select clauses for alert queries (``@kind problem``) consist of two 'columns', with the following structure::

select element, string

- ``element``: a code element that is identified by the query, which defines where the alert is displayed.
- ``string``: a message, which can also include links and placeholders, explaining why the alert was generated.
- ``string``: a message, which can also include links and placeholders, explaining why the alert was generated.

You can modify the alert message defined in the final column of the ``select`` statement to give more detail about the alert or path found by the query using links and placeholders. For more information, see ":doc:`Defining the results of a query <defining-the-results-of-a-query>`."
You can modify the alert message defined in the final column of the ``select`` statement to give more detail about the alert or path found by the query using links and placeholders. For more information, see ":doc:`Defining the results of a query <defining-the-results-of-a-query>`."

Select clauses for path queries (``@kind path-problem``) are crafted to display both an alert and the source and sink of an associated path graph. For more information, see ":doc:`Creating path queries <creating-path-queries>`."

@@ -140,4 +140,4 @@ Query contributions to the open source GitHub repository may also have an accomp
Query help files
****************

When you write a custom query, we also recommend that you write a query help file to explain the purpose of the query to other users. For more information, see the `Query help style guide <https://github.com/github/codeql/blob/main/docs/query-help-style-guide.md>`__ on GitHub, and the ":doc:`Query help files <query-help-files>`."
When you write a custom query, we also recommend that you write a query help file to explain the purpose of the query to other users. For more information, see the `Query help style guide <https://github.com/github/codeql/blob/main/docs/query-help-style-guide.md>`__ on GitHub, and the ":doc:`Query help files <query-help-files>`."
@@ -85,4 +85,4 @@ These flow steps are modeled in the taint-tracking library using predicates that
Further reading
***************

- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"
- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
@@ -180,6 +180,5 @@ The alert message defined in the final column in the ``select`` statement can be
Further reading
***************

- ":ref:`Exploring data flow with path queries <exploring-data-flow-with-path-queries>`"

- `Exploring data flow with path queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/exploring-data-flow-with-path-queries>`__ in the GitHub documentation.
- `CodeQL repository <https://github.com/github/codeql>`__
@@ -34,12 +34,12 @@ The same query can be slightly simplified by rewriting it without :ref:`path exp
select sink, "Sink is reached from $@.", source.getNode(), "here"

If a data-flow query that you have written doesn't produce the results you expect it to, there may be a problem with your query.
You can try to debug the potential problem by following the steps described below.
You can try to debug the potential problem by following the steps described below.

Checking sources and sinks
--------------------------

Initially, you should make sure that the source and sink definitions contain what you expect. If either the source or sink is empty then there can never be any data flow. The easiest way to check this is using quick evaluation in CodeQL for VS Code. Select the text ``node instanceof MySource``, right-click, and choose "CodeQL: Quick Evaluation". This will evaluate the highlighted text, which in this case means the set of sources. For more information, see :ref:`Analyzing your projects <running-a-specific-part-of-a-query-or-library>` in the CodeQL for VS Code help.
Initially, you should make sure that the source and sink definitions contain what you expect. If either the source or sink is empty then there can never be any data flow. The easiest way to check this is using quick evaluation in CodeQL for VS Code. Select the text ``node instanceof MySource``, right-click, and choose "CodeQL: Quick Evaluation". This will evaluate the highlighted text, which in this case means the set of sources. For more information, see `Running CodeQL queries <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/running-codeql-queries#running-a-specific-part-of-a-query-or-library>`__ in the GitHub documentation.

If both source and sink definitions look good then we will need to look for missing flow steps.

@@ -106,9 +106,9 @@ To do quick evaluations of partial flow it is often easiest to add a predicate t
If you are focusing on a single source then the ``src`` column is superfluous. You may of course also add other columns of interest based on ``n``, but including the enclosing callable and the distance to the source at the very least is generally recommended, as they can be useful columns to sort on to better inspect the results.


If you see a large number of partial flow results, you can focus them in a couple of ways:
If you see a large number of partial flow results, you can focus them in a couple of ways:

- If flow travels a long distance following an expected path, that can result in a lot of uninteresting flow being included in the exploration radius. To reduce the amount of uninteresting flow, you can replace the source definition with a suitable ``node`` that appears along the path and restart the partial flow exploration from that point.
- If flow travels a long distance following an expected path, that can result in a lot of uninteresting flow being included in the exploration radius. To reduce the amount of uninteresting flow, you can replace the source definition with a suitable ``node`` that appears along the path and restart the partial flow exploration from that point.
- Creative use of barriers can be used to cut off flow paths that are uninteresting. This also reduces the number of partial flow results to explore while debugging.

Further reading
@@ -1,6 +1,6 @@
.. _introduction-to-ql:

Introduction to QL
Introduction to QL
==================

Work through some simple exercises and examples to learn about the basics of QL and CodeQL.
@@ -109,12 +109,12 @@ Example CodeQL queries
----------------------

The previous examples used the primitive types built in to QL. Although we chose a project to query, we didn't use the information in that project's database.
The following example queries *do* use these databases and give you an idea of how to use CodeQL to analyze projects.
The following example queries *do* use these databases and give you an idea of how to use CodeQL to analyze projects.

Queries using the CodeQL libraries can find errors and uncover variants of important security vulnerabilities in codebases.
Visit `GitHub Security Lab <https://securitylab.github.com/>`__ to read about examples of vulnerabilities that we have recently found in open source projects.

Before you can run the following examples, you will need to install the CodeQL extension for Visual Studio Code. For more information, see :ref:`Setting up CodeQL in Visual Studio Code <setting-up-codeql-in-visual-studio-code>`. You will also need to import and select a database in the corresponding programming language. For more information about obtaining CodeQL databases, see `Managing CodeQL databases <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/managing-codeql-databases/>`__ in the CodeQL for VS Code documentation.
Before you can run the following examples, you will need to install the CodeQL extension for Visual Studio Code. For more information, see `Installing CodeQL for Visual Studio Code <https://docs.github.com/en/code-security/codeql-for-vs-code/getting-started-with-codeql-for-vs-code/installing-codeql-for-vs-code>`__ in the GitHub documentation. You will also need to import and select a database in the corresponding programming language.

To import the CodeQL library for a specific programming language, type ``import <language>`` at the start of the query.

@@ -166,7 +166,7 @@ Exercise 1
from string s
where s = "lgtm"
select s.length()


There is often more than one way to define a query. For example, we can also write the above query in the shorter form:

.. code-block:: ql