Merge pull request #18467 from github/js/shared-dataflow-branch

JS: Migrate to shared data flow library (targeting main!) 🚀
Asger F
2025-01-16 11:28:57 +01:00
committed by GitHub
531 changed files with 28648 additions and 35414 deletions


@@ -204,58 +204,45 @@ data flow solver that can check whether there is (global) data flow from a sourc
Optionally, configurations may specify extra data flow edges to be added to the data flow graph, and may also specify `barriers`. Barriers are data flow nodes or edges through
which data should not be tracked for the purposes of this analysis.
To define a configuration, extend the class ``DataFlow::Configuration`` as follows:
To define a configuration, add a module that implements the signature ``DataFlow::ConfigSig`` and pass it to ``DataFlow::Global`` as follows:
.. code-block:: ql
class MyDataFlowConfiguration extends DataFlow::Configuration {
MyDataFlowConfiguration() { this = "MyDataFlowConfiguration" }
module MyAnalysisConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { /* ... */ }
override predicate isSource(DataFlow::Node source) { /* ... */ }
predicate isSink(DataFlow::Node sink) { /* ... */ }
override predicate isSink(DataFlow::Node sink) { /* ... */ }
// optional overrides:
override predicate isBarrier(DataFlow::Node nd) { /* ... */ }
override predicate isBarrierEdge(DataFlow::Node pred, DataFlow::Node succ) { /* ... */ }
override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) { /* ... */ }
// optional predicates:
predicate isBarrier(DataFlow::Node nd) { /* ... */ }
predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) { /* ... */ }
}
The characteristic predicate ``MyDataFlowConfiguration()`` defines the name of the configuration, so ``"MyDataFlowConfiguration"`` should be replaced by a suitable
name describing your particular analysis configuration.
module MyAnalysisFlow = DataFlow::Global<MyAnalysisConfig>;
The data flow analysis is performed using the predicate ``hasFlow(source, sink)``:
The data flow analysis is performed using the predicate ``MyAnalysisFlow::flow(source, sink)``:
.. code-block:: ql
from MyDataFlowConfiguration dataflow, DataFlow::Node source, DataFlow::Node sink
where dataflow.hasFlow(source, sink)
from DataFlow::Node source, DataFlow::Node sink
where MyAnalysisFlow::flow(source, sink)
select source, "Data flow from $@ to $@.", source, source.toString(), sink, sink.toString()
Using global taint tracking
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Global taint tracking extends global data flow with additional non-value-preserving steps, such as flow through string-manipulating operations. To use it, simply extend
``TaintTracking::Configuration`` instead of ``DataFlow::Configuration``:
Global taint tracking extends global data flow with additional non-value-preserving steps, such as flow through string-manipulating operations. To use it, simply
use ``TaintTracking::Global<...>`` instead of ``DataFlow::Global<...>``:
.. code-block:: ql
class MyTaintTrackingConfiguration extends TaintTracking::Configuration {
MyTaintTrackingConfiguration() { this = "MyTaintTrackingConfiguration" }
override predicate isSource(DataFlow::Node source) { /* ... */ }
override predicate isSink(DataFlow::Node sink) { /* ... */ }
module MyAnalysisConfig implements DataFlow::ConfigSig {
/* ... */
}
Analogous to ``isAdditionalFlowStep``, there is a predicate ``isAdditionalTaintStep`` that you can override to specify custom flow steps to consider in the analysis.
Instead of the ``isBarrier`` and ``isBarrierEdge`` predicates, the taint tracking configuration includes ``isSanitizer`` and ``isSanitizerEdge`` predicates that specify
data flow nodes or edges that act as taint sanitizers and hence stop flow from a source to a sink.
module MyAnalysisFlow = TaintTracking::Global<MyAnalysisConfig>;
Similar to global data flow, the characteristic predicate ``MyTaintTrackingConfiguration()`` defines the unique name of the configuration, so ``"MyTaintTrackingConfiguration"``
should be replaced by an appropriate descriptive name.
The taint tracking analysis is again performed using the predicate ``hasFlow(source, sink)``.
The taint tracking analysis is again performed using the predicate ``MyAnalysisFlow::flow(source, sink)``.
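As a rough sketch of what a complete taint-tracking query might look like, the following tracks the URL fragment into ``eval``. The source and sink (``location.hash`` and the first argument of ``eval``) and the names ``HashToEvalConfig`` and ``HashToEvalFlow`` are illustrative assumptions, not part of the library:

.. code-block:: ql

    import javascript

    // Hypothetical configuration: names and source/sink choices are for illustration only.
    module HashToEvalConfig implements DataFlow::ConfigSig {
      predicate isSource(DataFlow::Node source) {
        // reads of `location.hash`
        source = DataFlow::globalVarRef("location").getAPropertyRead("hash")
      }

      predicate isSink(DataFlow::Node sink) {
        // first argument of a call to the global `eval`
        sink = DataFlow::globalVarRef("eval").getACall().getArgument(0)
      }
    }

    // Taint tracking also follows non-value-preserving steps such as `hash.substring(1)`.
    module HashToEvalFlow = TaintTracking::Global<HashToEvalConfig>;

    from DataFlow::Node source, DataFlow::Node sink
    where HashToEvalFlow::flow(source, sink)
    select sink, "Value from $@ is evaluated as code.", source, "the URL fragment"

Swapping ``TaintTracking::Global`` for ``DataFlow::Global`` in the module alias would restrict the same configuration to value-preserving flow.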
Examples
~~~~~~~~
@@ -267,20 +254,20 @@ time using global taint tracking.
import javascript
class CommandLineFileNameConfiguration extends TaintTracking::Configuration {
CommandLineFileNameConfiguration() { this = "CommandLineFileNameConfiguration" }
override predicate isSource(DataFlow::Node source) {
module CommandLineFileNameConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) {
DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead() = source
}
override predicate isSink(DataFlow::Node sink) {
predicate isSink(DataFlow::Node sink) {
DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0) = sink
}
}
from CommandLineFileNameConfiguration cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
module CommandLineFileNameFlow = TaintTracking::Global<CommandLineFileNameConfig>;
from DataFlow::Node source, DataFlow::Node sink
where CommandLineFileNameFlow::flow(source, sink)
select source, sink
This query will now find flows that involve inter-procedural steps, like in the following example (where the individual steps have been marked with comments
@@ -325,15 +312,15 @@ with an error if it does not. We could then use that function in ``readFileHelpe
}
For the purposes of our above analysis, ``checkPath`` is a `sanitizer`: its output is always untainted, even if its input is tainted. To model this
we can add an override of ``isSanitizer`` to our taint-tracking configuration like this:
we can add an ``isBarrier`` predicate to our taint-tracking configuration like this:
.. code-block:: ql
class CommandLineFileNameConfiguration extends TaintTracking::Configuration {
module CommandLineFileNameConfig implements DataFlow::ConfigSig {
// ...
override predicate isSanitizer(DataFlow::Node nd) {
predicate isBarrier(DataFlow::Node nd) {
nd.(DataFlow::CallNode).getCalleeName() = "checkPath"
}
}
@@ -359,36 +346,36 @@ Note that ``checkPath`` is now no longer a sanitizer in the sense described abov
through ``checkPath`` any more. The flow is, however, `guarded` by ``checkPath`` in the sense that the expression ``checkPath(p)`` has to evaluate
to ``true`` (or, more precisely, to a truthy value) in order for the flow to happen.
Such sanitizer guards can be supported by defining a new subclass of ``TaintTracking::SanitizerGuardNode`` and overriding the predicate
``isSanitizerGuard`` in the taint-tracking configuration class to add all instances of this class as sanitizer guards to the configuration.
Such sanitizer guards can be supported by defining a class with a ``blocksExpr`` predicate and using the ``DataFlow::MakeBarrierGuard`` module
to implement the ``isBarrier`` predicate.
For our above example, we would begin by defining a subclass of ``SanitizerGuardNode`` that identifies guards of the form ``checkPath(...)``:
For our above example, we would begin by defining a subclass of ``DataFlow::CallNode`` that identifies guards of the form ``checkPath(...)``:
.. code-block:: ql
class CheckPathSanitizerGuard extends TaintTracking::SanitizerGuardNode, DataFlow::CallNode {
class CheckPathSanitizerGuard extends DataFlow::CallNode {
CheckPathSanitizerGuard() { this.getCalleeName() = "checkPath" }
override predicate sanitizes(boolean outcome, Expr e) {
predicate blocksExpr(boolean outcome, Expr e) {
outcome = true and
e = getArgument(0).asExpr()
e = this.getArgument(0).asExpr()
}
}
The characteristic predicate of this class checks that the sanitizer guard is a call to a function named ``checkPath``. The overriding definition
of ``sanitizes`` says such a call sanitizes its first argument (that is, ``getArgument(0)``) if it evaluates to ``true`` (or rather, a truthy
The characteristic predicate of this class checks that the sanitizer guard is a call to a function named ``checkPath``. The definition
of ``blocksExpr`` says such a call sanitizes its first argument (that is, ``getArgument(0)``) if it evaluates to ``true`` (or rather, a truthy
value).
Now we can override ``isSanitizerGuard`` to add these sanitizer guards to our configuration:
Now we can implement ``isBarrier`` to add this sanitizer guard to our configuration:
.. code-block:: ql
class CommandLineFileNameConfiguration extends TaintTracking::Configuration {
module CommandLineFileNameConfig implements DataFlow::ConfigSig {
// ...
override predicate isSanitizerGuard(TaintTracking::SanitizerGuardNode nd) {
nd instanceof CheckPathSanitizerGuard
predicate isBarrier(DataFlow::Node node) {
node = DataFlow::MakeBarrierGuard<CheckPathSanitizerGuard>::getABarrierNode()
}
}
@@ -399,7 +386,7 @@ reach there if ``checkPath(p)`` evaluates to a truthy value. Consequently, there
Additional taint steps
~~~~~~~~~~~~~~~~~~~~~~
Sometimes the default data flow and taint steps provided by ``DataFlow::Configuration`` and ``TaintTracking::Configuration`` are not sufficient
Sometimes the default data flow and taint steps provided by the data flow library are not sufficient
and we need to add additional flow or taint steps to our configuration to make it find the expected flow. For example, this can happen because
the analyzed program uses a function from an external library whose source code is not available to the analysis, or because it uses a function
that is too difficult to analyze.
@@ -420,20 +407,20 @@ to resolve any symlinks in the path ``p`` before passing it to ``readFile``:
Resolving symlinks does not make an unsafe path any safer, so we would still like our query to flag this, but since the standard library does
not have a model of ``resolve-symlinks`` it will no longer return any results.
We can fix this quite easily by adding an overriding definition of the ``isAdditionalTaintStep`` predicate to our configuration, introducing an
We can fix this quite easily by adding a definition of the ``isAdditionalFlowStep`` predicate to our configuration, introducing an
additional taint step from the first argument of ``resolveSymlinks`` to its result:
.. code-block:: ql
class CommandLineFileNameConfiguration extends TaintTracking::Configuration {
module CommandLineFileNameConfig implements DataFlow::ConfigSig {
// ...
override predicate isAdditionalTaintStep(DataFlow::Node pred, DataFlow::Node succ) {
predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
exists(DataFlow::CallNode c |
c = DataFlow::moduleImport("resolve-symlinks").getACall() and
pred = c.getArgument(0) and
succ = c
node1 = c.getArgument(0) and
node2 = c
)
}
}
@@ -444,11 +431,11 @@ to wrap it in a new subclass of ``TaintTracking::SharedTaintStep`` like this:
.. code-block:: ql
class StepThroughResolveSymlinks extends TaintTracking::SharedTaintStep {
override predicate step(DataFlow::Node pred, DataFlow::Node succ) {
override predicate step(DataFlow::Node node1, DataFlow::Node node2) {
exists(DataFlow::CallNode c |
c = DataFlow::moduleImport("resolve-symlinks").getACall() and
pred = c.getArgument(0) and
succ = c
node1 = c.getArgument(0) and
node2 = c
)
}
}
@@ -494,18 +481,18 @@ Exercise 2
import javascript
class HardCodedTagNameConfiguration extends DataFlow::Configuration {
HardCodedTagNameConfiguration() { this = "HardCodedTagNameConfiguration" }
module HardCodedTagNameConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source.asExpr() instanceof ConstantString }
override predicate isSource(DataFlow::Node source) { source.asExpr() instanceof ConstantString }
override predicate isSink(DataFlow::Node sink) {
predicate isSink(DataFlow::Node sink) {
sink = DataFlow::globalVarRef("document").getAMethodCall("createElement").getArgument(0)
}
}
from HardCodedTagNameConfiguration cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
module HardCodedTagNameFlow = DataFlow::Global<HardCodedTagNameConfig>;
from DataFlow::Node source, DataFlow::Node sink
where HardCodedTagNameFlow::flow(source, sink)
select source, sink
Exercise 3
@@ -540,18 +527,18 @@ Exercise 4
}
}
class HardCodedTagNameConfiguration extends DataFlow::Configuration {
HardCodedTagNameConfiguration() { this = "HardCodedTagNameConfiguration" }
module HardCodedTagNameConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) { source instanceof ArrayEntryCallResult }
override predicate isSource(DataFlow::Node source) { source instanceof ArrayEntryCallResult }
override predicate isSink(DataFlow::Node sink) {
predicate isSink(DataFlow::Node sink) {
sink = DataFlow::globalVarRef("document").getAMethodCall("createElement").getArgument(0)
}
}
from HardCodedTagNameConfiguration cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
module HardCodedTagNameFlow = DataFlow::Global<HardCodedTagNameConfig>;
from DataFlow::Node source, DataFlow::Node sink
where HardCodedTagNameFlow::flow(source, sink)
select source, sink
Further reading


@@ -18,6 +18,7 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
abstract-syntax-tree-classes-for-working-with-javascript-and-typescript-programs
data-flow-cheat-sheet-for-javascript
customizing-library-models-for-javascript
migrating-javascript-dataflow-queries
- :doc:`Basic query for JavaScript and TypeScript code <basic-query-for-javascript-code>`: Learn to write and run a simple CodeQL query.
@@ -37,4 +38,6 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- :doc:`Data flow cheat sheet for JavaScript <data-flow-cheat-sheet-for-javascript>`: This article describes parts of the JavaScript libraries commonly used for variant analysis and in data flow queries.
- :doc:`Customizing library models for JavaScript <customizing-library-models-for-javascript>`: You can model frameworks and libraries that your codebase depends on using data extensions and publish them as CodeQL model packs.
- :doc:`Customizing library models for JavaScript <customizing-library-models-for-javascript>`: You can model frameworks and libraries that your codebase depends on using data extensions and publish them as CodeQL model packs.
- :doc:`Migrating JavaScript dataflow queries <migrating-javascript-dataflow-queries>`: Guide on migrating data flow queries to the new data flow library.


@@ -700,19 +700,16 @@ The data flow graph-based analyses described so far are all intraprocedural: the
We distinguish here between data flow proper, and *taint tracking*: the latter not only considers value-preserving flow (such as from variable definitions to uses), but also cases where one value influences ("taints") another without determining it entirely. For example, in the assignment ``s2 = s1.substring(i)``, the value of ``s1`` influences the value of ``s2``, because ``s2`` is assigned a substring of ``s1``. In general, ``s2`` will not be assigned ``s1`` itself, so there is no data flow from ``s1`` to ``s2``, but ``s1`` still taints ``s2``.
It is a common pattern that we wish to specify data flow or taint analysis in terms of its *sources* (where flow starts), *sinks* (where it should be tracked), and *barriers* or *sanitizers* (where flow is interrupted). Sanitizers are very common in security analyses: for example, an analysis that tracks the flow of untrusted user input into, say, a SQL query has to keep track of code that validates the input, thereby making it safe to use. Such a validation step is an example of a sanitizer.
It is a common pattern that we wish to specify data flow or taint analysis in terms of its *sources* (where flow starts), *sinks* (where it should be tracked), and *barriers* (also called *sanitizers*) where flow is interrupted. Sanitizers are very common in security analyses: for example, an analysis that tracks the flow of untrusted user input into, say, a SQL query has to keep track of code that validates the input, thereby making it safe to use. Such a validation step is an example of a sanitizer.
The classes ``DataFlow::Configuration`` and ``TaintTracking::Configuration`` allow specifying a data flow or taint analysis, respectively, by overriding the following predicates:
A module implementing the signature ``DataFlow::ConfigSig`` may specify a data flow or taint analysis by implementing the following predicates:
- ``isSource(DataFlow::Node nd)`` selects all nodes ``nd`` from where flow tracking starts.
- ``isSink(DataFlow::Node nd)`` selects all nodes ``nd`` to which the flow is tracked.
- ``isBarrier(DataFlow::Node nd)`` selects all nodes ``nd`` that act as a barrier for data flow; ``isSanitizer`` is the corresponding predicate for taint tracking configurations.
- ``isBarrierEdge(DataFlow::Node src, DataFlow::Node trg)`` is a variant of ``isBarrier(nd)`` that allows specifying barrier *edges* in addition to barrier nodes; again, ``isSanitizerEdge`` is the corresponding predicate for taint tracking;
- ``isAdditionalFlowStep(DataFlow::Node src, DataFlow::Node trg)`` allows specifying custom additional flow steps for this analysis; ``isAdditionalTaintStep`` is the corresponding predicate for taint tracking configurations.
- ``isBarrier(DataFlow::Node nd)`` selects all nodes ``nd`` that act as a barrier/sanitizer for data flow.
- ``isAdditionalFlowStep(DataFlow::Node src, DataFlow::Node trg)`` allows specifying custom additional flow steps for this analysis.
Since for technical reasons both ``Configuration`` classes are subtypes of ``string``, you have to choose a unique name for each flow configuration and equate ``this`` with it in the characteristic predicate (as in the example below).
The predicate ``Configuration.hasFlow`` performs the actual flow tracking, starting at a source and looking for flow to a sink that does not pass through a barrier node or edge.
Such a module can be passed to ``DataFlow::Global<...>``. This will produce a module with a ``flow`` predicate that performs the actual flow tracking, starting at a source and looking for flow to a sink that does not pass through a barrier node.
For example, suppose that we are developing an analysis to find hard-coded passwords. We might write a simple query that looks for string constants flowing into variables named ``"password"``.
@@ -720,35 +717,27 @@ For example, suppose that we are developing an analysis to find hard-coded passw
import javascript
class PasswordTracker extends DataFlow::Configuration {
PasswordTracker() {
// unique identifier for this configuration
this = "PasswordTracker"
}
module PasswordConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node nd) { nd.asExpr() instanceof StringLiteral }
override predicate isSource(DataFlow::Node nd) {
nd.asExpr() instanceof StringLiteral
}
override predicate isSink(DataFlow::Node nd) {
passwordVarAssign(_, nd)
}
predicate passwordVarAssign(Variable v, DataFlow::Node nd) {
v.getAnAssignedExpr() = nd.asExpr() and
v.getName().toLowerCase() = "password"
}
predicate isSink(DataFlow::Node nd) { passwordVarAssign(_, nd) }
}
Now we can rephrase our query to use ``Configuration.hasFlow``:
predicate passwordVarAssign(Variable v, DataFlow::Node nd) {
v.getAnAssignedExpr() = nd.asExpr() and
v.getName().toLowerCase() = "password"
}
module PasswordFlow = DataFlow::Global<PasswordConfig>;
Now we can rephrase our query to use ``PasswordFlow::flow``:
.. code-block:: ql
from PasswordTracker pt, DataFlow::Node source, DataFlow::Node sink, Variable v
where pt.hasFlow(source, sink) and pt.passwordVarAssign(v, sink)
from DataFlow::Node source, DataFlow::Node sink, Variable v
where PasswordFlow::flow(source, sink) and passwordVarAssign(v, sink)
select sink, "Password variable " + v + " is assigned a constant string."
Syntax errors
~~~~~~~~~~~~~


@@ -16,18 +16,17 @@ Use the following template to create a taint tracking path query:
* @kind path-problem
*/
import javascript
import DataFlow
import DataFlow::PathGraph
class MyConfig extends TaintTracking::Configuration {
MyConfig() { this = "MyConfig" }
override predicate isSource(Node node) { ... }
override predicate isSink(Node node) { ... }
override predicate isAdditionalTaintStep(Node pred, Node succ) { ... }
module MyConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node node) { ... }
predicate isSink(DataFlow::Node node) { ... }
predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) { ... }
}
from MyConfig cfg, PathNode source, PathNode sink
where cfg.hasFlowPath(source, sink)
module MyFlow = TaintTracking::Global<MyConfig>;
from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "taint from $@.", source.getNode(), "here"
This query reports flow paths which:


@@ -0,0 +1,301 @@
.. _migrating-javascript-dataflow-queries:
Migrating JavaScript Dataflow Queries
=====================================
The JavaScript analysis used to have its own data flow library, which differed from the shared data flow
library used by other languages. This library has now been deprecated in favor of the shared library.
This article explains how to migrate JavaScript data flow queries to use the shared data flow library,
and some important differences to be aware of. Note that the article on :ref:`analyzing data flow in JavaScript and TypeScript <analyzing-data-flow-in-javascript-and-typescript>`
provides a general guide to the new data flow library, whereas this article aims to help with migrating existing queries from the old data flow library.
Note that the ``DataFlow::Configuration`` class is still backed by the original data flow library, but has been marked as deprecated.
This means data flow queries using this class will continue to work, albeit with deprecation warnings, until the 1-year deprecation period expires in early 2026.
It is recommended that all custom queries are migrated before this time, to ensure they continue to work in the future.
Data flow queries should be migrated to use ``DataFlow::ConfigSig``-style modules instead of the ``DataFlow::Configuration`` class.
This is identical to the interface found in other languages.
When making this switch, the query will become backed by the shared data flow library instead. That is, data flow queries will only work
with the shared data flow library when they have been migrated to ``ConfigSig``-style, as shown in the following table:
.. list-table:: Data flow libraries
:widths: 20 80
:header-rows: 1
* - API
- Implementation
* - ``DataFlow::Configuration``
- Old library (deprecated, to be removed in early 2026)
* - ``DataFlow::ConfigSig``
- Shared library
A straightforward translation to ``DataFlow::ConfigSig``-style is usually possible, although there are some complications
that may cause the query to behave differently.
We'll first cover some straightforward migration examples, and then go over some of the complications that may arise.
Simple migration example
------------------------
A simple example of a query using the old data flow library is shown below:
.. code-block:: ql
/** @kind path-problem */
import javascript
import DataFlow::PathGraph
class MyConfig extends DataFlow::Configuration {
MyConfig() { this = "MyConfig" }
override predicate isSource(DataFlow::Node node) { ... }
override predicate isSink(DataFlow::Node node) { ... }
}
from MyConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "Flow found"
With the new style this would look like this:
.. code-block:: ql
/** @kind path-problem */
import javascript
module MyConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node node) { ... }
predicate isSink(DataFlow::Node node) { ... }
}
module MyFlow = DataFlow::Global<MyConfig>;
import MyFlow::PathGraph
from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink, source, sink, "Flow found"
The changes can be summarized as:
- The ``DataFlow::Configuration`` class was replaced with a module implementing ``DataFlow::ConfigSig``.
- The characteristic predicate was removed (modules have no characteristic predicates).
- Predicates such as ``isSource`` no longer have the ``override`` keyword (as they are defined in a module now).
- The configuration module is being passed to ``DataFlow::Global``, resulting in a new module, called ``MyFlow`` in this example.
- The query imports ``MyFlow::PathGraph`` instead of ``DataFlow::PathGraph``.
- The ``MyConfig cfg`` variable was removed from the ``from`` clause.
- The ``hasFlowPath`` call was replaced with ``MyFlow::flowPath``.
- The type ``DataFlow::PathNode`` was replaced with ``MyFlow::PathNode``.
With these changes, we have produced an equivalent query that is backed by the new data flow library.
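To make the template concrete, here is a sketch of a fully migrated path query with the ``...`` placeholders filled in. The source and sink reuse the command-line file name example from the data flow guide; the alert message is an illustrative choice:

.. code-block:: ql

    /** @kind path-problem */
    import javascript

    module CommandLineFileNameConfig implements DataFlow::ConfigSig {
      predicate isSource(DataFlow::Node source) {
        // any element read from `process.argv`
        source = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead()
      }

      predicate isSink(DataFlow::Node sink) {
        // the file name passed to `fs.readFile`
        sink = DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0)
      }
    }

    module CommandLineFileNameFlow = DataFlow::Global<CommandLineFileNameConfig>;

    // The path graph now comes from the instantiated module, not from `DataFlow`.
    import CommandLineFileNameFlow::PathGraph

    from CommandLineFileNameFlow::PathNode source, CommandLineFileNameFlow::PathNode sink
    where CommandLineFileNameFlow::flowPath(source, sink)
    select sink.getNode(), source, sink, "File name depends on a $@.", source.getNode(),
      "command-line argument"

Queries that do not need path explanations can skip the ``PathGraph`` import and call ``CommandLineFileNameFlow::flow(source, sink)`` on plain ``DataFlow::Node`` values instead.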
Taint tracking
--------------
For configuration classes extending ``TaintTracking::Configuration``, the migration is similar but with a few differences:
- The ``TaintTracking::Global`` module should be used instead of ``DataFlow::Global``.
- Some predicates originating from ``TaintTracking::Configuration`` should be renamed to match the ``DataFlow::ConfigSig`` interface:
- ``isSanitizer`` should be renamed to ``isBarrier``.
- ``isAdditionalTaintStep`` should be renamed to ``isAdditionalFlowStep``.
Note that there is no such thing as ``TaintTracking::ConfigSig``. The ``DataFlow::ConfigSig`` interface is used for both data flow and taint tracking.
For example:
.. code-block:: ql
class MyConfig extends TaintTracking::Configuration {
MyConfig() { this = "MyConfig" }
override predicate isSanitizer(DataFlow::Node node) { ... }
override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) { ... }
...
}
The above configuration can be migrated to the shared data flow library as follows:
.. code-block:: ql
module MyConfig implements DataFlow::ConfigSig {
predicate isBarrier(DataFlow::Node node) { ... }
predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) { ... }
...
}
module MyFlow = TaintTracking::Global<MyConfig>;
Flow labels and flow states
---------------------------
The ``DataFlow::FlowLabel`` class has been deprecated. Queries that relied on flow labels should use the new `flow state` concept instead.
This is done by implementing ``DataFlow::StateConfigSig`` instead of ``DataFlow::ConfigSig``, and passing the module to ``DataFlow::GlobalWithState``
or ``TaintTracking::GlobalWithState``. See :ref:`using flow state <using-flow-labels-for-precise-data-flow-analysis>` for more details about flow state.
Some changes to be aware of:
- The 4-argument version of ``isAdditionalFlowStep`` now takes parameters in a different order.
It now takes ``node1, state1, node2, state2`` instead of ``node1, node2, state1, state2``.
- Taint steps apply to all flow states, not just the ``taint`` flow label. See more details further down in this article.
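The following is a minimal sketch of a configuration using flow state, under some assumptions: the state names ``"json"`` and ``"parsed"``, the choice of ``lodash``'s ``merge`` as a sink, and the module names are all hypothetical and only serve to show the shape of the interface, including the ``node1, state1, node2, state2`` parameter order.

.. code-block:: ql

    import javascript

    // Hypothetical example: track command-line input that is parsed as JSON
    // and then passed to `merge` from lodash.
    module ParsedInputConfig implements DataFlow::StateConfigSig {
      class FlowState extends string {
        FlowState() { this = ["json", "parsed"] }
      }

      predicate isSource(DataFlow::Node source, FlowState state) {
        source = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead() and
        state = "json"
      }

      predicate isSink(DataFlow::Node sink, FlowState state) {
        sink = DataFlow::moduleImport("lodash").getAMemberCall("merge").getAnArgument() and
        state = "parsed"
      }

      // Note the parameter order: node1, state1, node2, state2.
      predicate isAdditionalFlowStep(
        DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2
      ) {
        exists(DataFlow::CallNode parse |
          parse = DataFlow::globalVarRef("JSON").getAMemberCall("parse") and
          node1 = parse.getArgument(0) and
          node2 = parse and
          state1 = "json" and
          state2 = "parsed"
        )
      }
    }

    module ParsedInputFlow = DataFlow::GlobalWithState<ParsedInputConfig>;

    from DataFlow::Node source, DataFlow::Node sink
    where ParsedInputFlow::flow(source, sink)
    select sink, "Parsed value from $@ is merged here.", source, "a command-line argument"

Because the sink only accepts the ``"parsed"`` state, an argument that reaches ``merge`` without going through ``JSON.parse`` is not reported.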
Barrier guards
--------------
The predicates ``isBarrierGuard`` and ``isSanitizerGuard`` have been removed.
Instead, the ``isBarrier`` predicate must be used to define all barriers. To do this, barrier guards can be reduced to a set of barrier nodes using the ``DataFlow::MakeBarrierGuard`` module.
For example, consider this data flow configuration using a barrier guard:
.. code-block:: ql
class MyConfig extends DataFlow::Configuration {
override predicate isBarrierGuard(DataFlow::BarrierGuardNode node) {
node instanceof MyBarrierGuard
}
...
}
class MyBarrierGuard extends DataFlow::BarrierGuardNode {
MyBarrierGuard() { ... }
override predicate blocks(boolean outcome, Expr e) { ... }
}
This can be migrated to the shared data flow library as follows:
.. code-block:: ql
module MyConfig implements DataFlow::ConfigSig {
predicate isBarrier(DataFlow::Node node) {
node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode()
}
...
}
class MyBarrierGuard extends DataFlow::Node {
MyBarrierGuard() { ... }
predicate blocksExpr(boolean outcome, Expr e) { ... }
}
The changes can be summarized as:
- The contents of ``isBarrierGuard`` have been moved to ``isBarrier``.
- The ``node instanceof MyBarrierGuard`` check was replaced with ``node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode()``.
- The ``MyBarrierGuard`` class no longer has ``DataFlow::BarrierGuardNode`` as a base class. We simply use ``DataFlow::Node`` instead.
- The ``blocks`` predicate has been renamed to ``blocksExpr`` and no longer has the ``override`` keyword.
See :ref:`using flow state <using-flow-labels-for-precise-data-flow-analysis>` for examples of how to use barrier guards with flow state.
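As an illustration of the ``outcome`` parameter, the sketch below models a hypothetical validator ``isUnsafe(x)`` that blocks flow when it evaluates to a falsy value, as in ``if (!isUnsafe(p)) { readFile(p) }``. The function name ``isUnsafe`` and the class and module names are assumptions made for this example; the parameter order ``(boolean outcome, Expr e)`` follows the ``blocksExpr`` example in the data flow guide.

.. code-block:: ql

    import javascript

    /** A call of the form `isUnsafe(x)`, where `isUnsafe` is a hypothetical validator. */
    class IsUnsafeGuard extends DataFlow::CallNode {
      IsUnsafeGuard() { this.getCalleeName() = "isUnsafe" }

      predicate blocksExpr(boolean outcome, Expr e) {
        // the first argument is considered safe where the call evaluated to a falsy value
        outcome = false and
        e = this.getArgument(0).asExpr()
      }
    }

    module GuardedFileNameConfig implements DataFlow::ConfigSig {
      predicate isSource(DataFlow::Node source) {
        source = DataFlow::globalVarRef("process").getAPropertyRead("argv").getAPropertyRead()
      }

      predicate isSink(DataFlow::Node sink) {
        sink = DataFlow::moduleMember("fs", "readFile").getACall().getArgument(0)
      }

      predicate isBarrier(DataFlow::Node node) {
        // all nodes protected by the guard become ordinary barrier nodes
        node = DataFlow::MakeBarrierGuard<IsUnsafeGuard>::getABarrierNode()
      }
    }

    module GuardedFileNameFlow = TaintTracking::Global<GuardedFileNameConfig>;

    from DataFlow::Node source, DataFlow::Node sink
    where GuardedFileNameFlow::flow(source, sink)
    select sink, "Unchecked file name from $@.", source, "a command-line argument"

Several guard classes can be combined by adding further ``MakeBarrierGuard<...>::getABarrierNode()`` disjuncts to ``isBarrier``.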
Query-specific load and store steps
-----------------------------------
The predicates ``isAdditionalLoadStep``, ``isAdditionalStoreStep``, and ``isAdditionalLoadStoreStep`` have been removed. There is no way to emulate the original behavior.
Library models can still contribute such steps, but they will be applicable to all queries. Also see the section on jump steps further down.
Changes in behavior
--------------------
When the query has been migrated to the new interface, it may seem to behave differently due to some technical differences in the internals of
the two data flow libraries. The most significant changes are described below.
Taint steps now propagate all flow states
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There's an important change from the old data flow library when using flow state and taint-tracking together.
When using ``TaintTracking::GlobalWithState``, all flow states can propagate along taint steps.
In the old data flow library, only the ``taint`` flow label could propagate along taint steps.
A straightforward translation of such a query may therefore result in new flow paths being found, which might be unexpected.
To emulate the old behavior, use ``DataFlow::GlobalWithState`` instead of ``TaintTracking::GlobalWithState``,
and manually add taint steps using ``isAdditionalFlowStep``. The predicate ``TaintTracking::defaultTaintStep`` can be used to access the set of taint steps.
For example:
.. code-block:: ql
module MyConfig implements DataFlow::StateConfigSig {
class FlowState extends string {
FlowState() { this = ["taint", "foo"] }
}
predicate isAdditionalFlowStep(DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2) {
// Allow taint steps to propagate the "taint" flow state
TaintTracking::defaultTaintStep(node1, node2) and
state1 = "taint" and
state2 = "taint"
}
...
}
module MyFlow = DataFlow::GlobalWithState<MyConfig>;
Jump steps across function boundaries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When a flow step crosses a function boundary, that is, it starts and ends in two different functions, it will now be classified as a "jump" step.
Jump steps can be problematic in some cases. Roughly speaking, the data flow library will "forget" which call site it came from when following a jump step.
This can lead to spurious flow paths that go into a function through one call site, and back out of a different call site.
If the step was generated by a library model, that is, the step is applicable to all queries, this is best mitigated by converting the step to a flow summary.
For example, the following library model adds a taint step from ``x`` to ``y`` in ``foo.bar(x, y => {})``:
.. code-block:: ql
class MyStep extends TaintTracking::SharedTaintStep {
override predicate step(DataFlow::Node node1, DataFlow::Node node2) {
exists(DataFlow::CallNode call |
call = DataFlow::moduleMember("foo", "bar").getACall() and
node1 = call.getArgument(0) and
node2 = call.getCallback(1).getParameter(0)
)
}
}
Because this step crosses a function boundary, it becomes a jump step. This can be avoided by converting it to a flow summary as follows:
.. code-block:: ql
class MySummary extends DataFlow::SummarizedCallable {
MySummary() { this = "MySummary" }
override DataFlow::CallNode getACall() { result = DataFlow::moduleMember("foo", "bar").getACall() }
override predicate propagatesFlow(string input, string output, boolean preservesValue) {
input = "Argument[0]" and
output = "Argument[1].Parameter[0]" and
preservesValue = false // taint step
}
}
See :ref:`customizing library models for JavaScript <customizing-library-models-for-javascript>` for details about the format of the ``input`` and ``output`` strings.
The aforementioned article also provides guidance on how to store the flow summary in a data extension.
For query-specific steps that cross function boundaries, that is, steps added with ``isAdditionalFlowStep``, there is currently no way to emulate the original behavior.
A possible workaround is to convert the query-specific step to a flow summary. In this case it should be stored in a data extension to avoid performance issues, although this also means
that all other queries will be able to use the flow summary.
Barriers block all flows
~~~~~~~~~~~~~~~~~~~~~~~~
In the shared data flow library, a barrier blocks all flows, even if the tracked value is inside a content.
In the old data flow library, only barriers specific to the ``data`` flow label blocked flows when the tracked value was inside a content.
This rarely has a significant impact, but some users may see changed results because of it.
There is currently no way to emulate the original behavior.
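
As an illustration, here is a minimal sketch of a configuration with a barrier. The global functions ``sanitize`` and ``sink`` are hypothetical names chosen for this example and are not part of any library; ``RemoteFlowSource`` is assumed to be available through ``import javascript``. Under the shared library, a path is cut at the barrier node whether that node holds the tracked value itself or an object that contains the tracked value in one of its properties.

.. code-block:: ql

    import javascript

    module BarrierSketchConfig implements DataFlow::ConfigSig {
      predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource }

      predicate isSink(DataFlow::Node sink) {
        // Hypothetical sink: the first argument of a call to a global function named "sink"
        sink = DataFlow::globalVarRef("sink").getACall().getArgument(0)
      }

      predicate isBarrier(DataFlow::Node node) {
        // Hypothetical sanitizer: the result of a call to a global function named "sanitize"
        node = DataFlow::globalVarRef("sanitize").getACall()
      }
    }

    module BarrierSketchFlow = DataFlow::Global<BarrierSketchConfig>;

    from DataFlow::Node source, DataFlow::Node sink
    where BarrierSketchFlow::flow(source, sink)
    select sink, "Flow from $@.", source, "here"
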
Further reading
---------------
- :ref:`Analyzing data flow in JavaScript and TypeScript <analyzing-data-flow-in-javascript-and-typescript>` provides a general guide to the new data flow library.
- :ref:`Using flow state for precise data flow analysis <using-flow-labels-for-precise-data-flow-analysis>` provides a general guide on using flow state.

View File

@@ -1,9 +1,9 @@
.. _using-flow-labels-for-precise-data-flow-analysis:
Using flow labels for precise data flow analysis
Using flow state for precise data flow analysis
================================================
You can associate flow labels with each value tracked by the flow analysis to determine whether the flow contains potential vulnerabilities.
You can associate a flow state with each value tracked by the flow analysis to determine whether the flow contains potential vulnerabilities.
Overview
--------
@@ -16,9 +16,9 @@ program, and associates a flag with every data value telling us whether it might
source node.
In some cases, you may want to track more detailed information about data values. This can be done
by associating flow labels with data values, as shown in this tutorial. We will first discuss the
general idea behind flow labels and then show how to use them in practice. Finally, we will give an
overview of the API involved and provide some pointers to standard queries that use flow labels.
by associating flow states with data values, as shown in this tutorial. We will first discuss the
general idea behind flow states and then show how to use them in practice. Finally, we will give an
overview of the API involved and provide some pointers to standard queries that use flow states.
Limitations of basic data-flow analysis
---------------------------------------
@@ -47,22 +47,21 @@ contain ``..`` components. Untrusted user input has both bits set initially, ind
off individual bits, and if a value that has at least one bit set is interpreted as a path, a
potential vulnerability is flagged.
Using flow labels
Using flow states
-----------------
You can handle these cases and others like them by associating a set of `flow labels` (sometimes
also referred to as `taint kinds`) with each value being tracked by the analysis. Value-preserving
You can handle these cases and others like them by associating a set of `flow states` (sometimes
also referred to as `flow labels` or `taint kinds`) with each value being tracked by the analysis. Value-preserving
data-flow steps (such as flow steps from writes to a variable to its reads) preserve the set of flow
labels, but other steps may add or remove flow labels. Sanitizers, in particular, are simply flow
steps that remove some or all flow labels. The initial set of flow labels for a value is determined
states, but other steps may add or remove flow states. The initial set of flow states for a value is determined
by the source node that gives rise to it. Similarly, sink nodes can specify that an incoming value
needs to have a certain flow label (or one of a set of flow labels) in order for the flow to be
needs to have a certain flow state (or one of a set of flow states) in order for the flow to be
flagged as a potential vulnerability.
Example
-------
As an example of using flow labels, we will show how to write a query that flags property accesses
As an example of using flow state, we will show how to write a query that flags property accesses
on JSON values that come from user-controlled input where we have not checked whether the value is
``null``, so that the property access may cause a runtime exception.
@@ -88,8 +87,8 @@ This code, on the other hand, should not be flagged:
}
}
We will first try to write a query to find this kind of problem without flow labels, and use the
difficulties we encounter as a motivation for bringing flow labels into play, which will make the
We will first try to write a query to find this kind of problem without flow state, and use the
difficulties we encounter as a motivation for bringing flow state into play, which will make the
query much easier to implement.
To get started, let's write a query that simply flags any flow from ``JSON.parse`` into the base of
@@ -99,24 +98,24 @@ a property access:
import javascript
class JsonTrackingConfig extends DataFlow::Configuration {
JsonTrackingConfig() { this = "JsonTrackingConfig" }
override predicate isSource(DataFlow::Node nd) {
module JsonTrackingConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node nd) {
exists(JsonParserCall jpc |
nd = jpc.getOutput()
)
}
override predicate isSink(DataFlow::Node nd) {
predicate isSink(DataFlow::Node nd) {
exists(DataFlow::PropRef pr |
nd = pr.getBase()
)
}
}
from JsonTrackingConfig cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
module JsonTrackingFlow = DataFlow::Global<JsonTrackingConfig>;
from DataFlow::Node source, DataFlow::Node sink
where JsonTrackingFlow::flow(source, sink)
select sink, "Property access on JSON value originating $@.", source, "here"
Note that we use the ``JsonParserCall`` class from the standard library to model various JSON
@@ -127,8 +126,7 @@ introduced any sanitizers yet.
There are many ways of checking for nullness directly or indirectly. Since this is not the main
focus of this tutorial, we will only show how to model one specific case: if some variable ``v`` is
known to be truthy, it cannot be ``null``. This kind of condition is easily expressed using a
``BarrierGuardNode`` (or its counterpart ``SanitizerGuardNode`` for taint-tracking configurations).
known to be truthy, it cannot be ``null``. This kind of condition is expressed using a "barrier guard".
A barrier guard node is a data-flow node ``b`` that blocks flow through some other node ``nd``,
provided that some condition checked at ``b`` is known to hold, that is, evaluate to a truthy value.
@@ -139,29 +137,29 @@ is a barrier guard blocking flow through the use of ``data`` on the right-hand s
At this point we know that ``data`` has evaluated to a truthy value, so it cannot be ``null``
anymore.
Implementing this additional condition is easy. We implement a subclass of ``DataFlow::BarrierGuardNode``:
Implementing this additional condition is easy. We implement a class with a predicate called ``blocksExpr``:
.. code-block:: ql
class TruthinessCheck extends DataFlow::BarrierGuardNode, DataFlow::ValueNode {
class TruthinessCheck extends DataFlow::ValueNode {
SsaVariable v;
TruthinessCheck() {
astNode = v.getAUse()
}
override predicate blocks(boolean outcome, Expr e) {
predicate blocksExpr(boolean outcome, Expr e) {
outcome = true and
e = astNode
}
}
and then use it to override predicate ``isBarrierGuard`` in our configuration class:
and then use it to implement the predicate ``isBarrier`` in our configuration module:
.. code-block:: ql
override predicate isBarrierGuard(DataFlow::BarrierGuardNode guard) {
guard instanceof TruthinessCheck
predicate isBarrier(DataFlow::Node node) {
node = DataFlow::MakeBarrierGuard<TruthinessCheck>::getABarrierNode()
}
With this change, we now flag the problematic case and don't flag the unproblematic case above.
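
For reference, the query at this stage of the tutorial might look like the following sketch. It only combines the snippets shown so far and introduces no new names.

.. code-block:: ql

    import javascript

    // Barrier guard: a use of an SSA variable that is known to be truthy
    class TruthinessCheck extends DataFlow::ValueNode {
      SsaVariable v;

      TruthinessCheck() { astNode = v.getAUse() }

      predicate blocksExpr(boolean outcome, Expr e) {
        outcome = true and
        e = astNode
      }
    }

    module JsonTrackingConfig implements DataFlow::ConfigSig {
      predicate isSource(DataFlow::Node nd) {
        exists(JsonParserCall jpc | nd = jpc.getOutput())
      }

      predicate isSink(DataFlow::Node nd) {
        exists(DataFlow::PropRef pr | nd = pr.getBase())
      }

      // Block flow through nodes guarded by a truthiness check
      predicate isBarrier(DataFlow::Node node) {
        node = DataFlow::MakeBarrierGuard<TruthinessCheck>::getABarrierNode()
      }
    }

    module JsonTrackingFlow = DataFlow::Global<JsonTrackingConfig>;

    from DataFlow::Node source, DataFlow::Node sink
    where JsonTrackingFlow::flow(source, sink)
    select sink, "Property access on JSON value originating $@.", source, "here"
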
@@ -182,11 +180,11 @@ checked for null-guardedness:
}
}
We could try to remedy the situation by overriding ``isAdditionalFlowStep`` in our configuration class to track values through property reads:
We could try to remedy the situation by adding ``isAdditionalFlowStep`` in our configuration module to track values through property reads:
.. code-block:: ql
override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) {
predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) {
succ.(DataFlow::PropRead).getBase() = pred
}
@@ -199,79 +197,86 @@ altogether, it should simply record the fact that ``root`` itself is known to be
Any property read from ``root``, on the other hand, may well be null and needs to be checked
separately.
We can achieve this by introducing two different flow labels, ``json`` and ``maybe-null``. The former
We can achieve this by introducing two different flow states, ``json`` and ``maybe-null``. The former
means that the value we are dealing with comes from a JSON object, the latter that it may be
``null``. The result of any call to ``JSON.parse`` has both labels. A property read from a value
with label ``json`` also has both labels. Checking truthiness removes the ``maybe-null`` label.
Accessing a property on a value that has the ``maybe-null`` label should be flagged.
``null``. The result of any call to ``JSON.parse`` has both states. A property read from a value
with state ``json`` also results in a value with both states. Checking truthiness removes the ``maybe-null`` state.
Accessing a property on a value that has the ``maybe-null`` state should be flagged.
To implement this, we start by defining two new subclasses of the class ``DataFlow::FlowLabel``:
To implement this, we first change the signature of our configuration module to ``DataFlow::StateConfigSig``, and
replace ``DataFlow::Global<...>`` with ``DataFlow::GlobalWithState<...>``:
.. code-block:: ql
class JsonLabel extends DataFlow::FlowLabel {
JsonLabel() {
this = "json"
}
module JsonTrackingConfig implements DataFlow::StateConfigSig {
/* ... */
}
class MaybeNullLabel extends DataFlow::FlowLabel {
MaybeNullLabel() {
this = "maybe-null"
}
}
module JsonTrackingFlow = DataFlow::GlobalWithState<JsonTrackingConfig>;
Then we extend our ``isSource`` predicate from above to track flow labels by overriding the two-argument version instead of the one-argument version:
We then add a class called ``FlowState`` which has one value for each flow state:
.. code-block:: ql
override predicate isSource(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
module JsonTrackingConfig implements DataFlow::StateConfigSig {
class FlowState extends string {
FlowState() {
this = ["json", "maybe-null"]
}
}
/* ... */
}
Then we extend our ``isSource`` predicate with an additional parameter to specify the flow state:
.. code-block:: ql
predicate isSource(DataFlow::Node nd, FlowState state) {
exists(JsonParserCall jpc |
nd = jpc.getOutput() and
(lbl instanceof JsonLabel or lbl instanceof MaybeNullLabel)
state = ["json", "maybe-null"] // start in either state
)
}
Similarly, we make ``isSink`` flow-label aware and require the base of the property read to have the ``maybe-null`` label:
Similarly, we update ``isSink`` and require the base of the property read to have the ``maybe-null`` state:
.. code-block:: ql
override predicate isSink(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
predicate isSink(DataFlow::Node nd, FlowState state) {
exists(DataFlow::PropRef pr |
nd = pr.getBase() and
lbl instanceof MaybeNullLabel
state = "maybe-null"
)
}
Our overriding definition of ``isAdditionalFlowStep`` now needs to specify two flow labels, a
predecessor label ``predlbl`` and a successor label ``succlbl``. In addition to specifying flow from
the predecessor node ``pred`` to the successor node ``succ``, it requires that ``pred`` has label
``predlbl``, and adds label ``succlbl`` to ``succ``. In our case, we use this to add both the
``json`` label and the ``maybe-null`` label to any property read from a value labeled with ``json``
(no matter whether it has the ``maybe-null`` label):
Our definition of ``isAdditionalFlowStep`` now needs to specify two flow states, a
predecessor state ``predState`` and a successor state ``succState``. In addition to specifying flow from
the predecessor node ``pred`` to the successor node ``succ``, it requires that ``pred`` has state
``predState``, and adds state ``succState`` to ``succ``. In our case, we use this to add both the
``json`` state and the ``maybe-null`` state to any property read from a value in the ``json`` state
(no matter whether it has the ``maybe-null`` state):
.. code-block:: ql
override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ,
DataFlow::FlowLabel predlbl, DataFlow::FlowLabel succlbl) {
predicate isAdditionalFlowStep(DataFlow::Node pred, FlowState predState,
DataFlow::Node succ, FlowState succState) {
succ.(DataFlow::PropRead).getBase() = pred and
predlbl instanceof JsonLabel and
(succlbl instanceof JsonLabel or succlbl instanceof MaybeNullLabel)
predState = "json" and
succState = ["json", "maybe-null"]
}
Finally, we turn ``TruthinessCheck`` from a ``BarrierGuardNode`` into a ``LabeledBarrierGuardNode``,
specifying that it only removes the ``maybe-null`` label (but not the ``json`` label) from the
sanitized value:
Finally, we add an additional parameter to the ``isBarrier`` predicate to specify the flow state
to block at the ``TruthinessCheck`` barrier:
.. code-block:: ql
class TruthinessCheck extends DataFlow::LabeledBarrierGuardNode, DataFlow::ValueNode {
...
module JsonTrackingConfig implements DataFlow::StateConfigSig {
/* ... */
override predicate blocks(boolean outcome, Expr e, DataFlow::FlowLabel lbl) {
outcome = true and
e = astNode and
lbl instanceof MaybeNullLabel
predicate isBarrier(DataFlow::Node node, FlowState state) {
node = DataFlow::MakeBarrierGuard<TruthinessCheck>::getABarrierNode() and
state = "maybe-null"
}
}
@@ -283,66 +288,60 @@ step by step in the UI:
/** @kind path-problem */
import javascript
import DataFlow::PathGraph
class JsonLabel extends DataFlow::FlowLabel {
JsonLabel() {
this = "json"
}
}
class MaybeNullLabel extends DataFlow::FlowLabel {
MaybeNullLabel() {
this = "maybe-null"
}
}
class TruthinessCheck extends DataFlow::LabeledBarrierGuardNode, DataFlow::ValueNode {
class TruthinessCheck extends DataFlow::ValueNode {
SsaVariable v;
TruthinessCheck() {
astNode = v.getAUse()
}
override predicate blocks(boolean outcome, Expr e, DataFlow::FlowLabel lbl) {
predicate blocksExpr(boolean outcome, Expr e) {
outcome = true and
e = astNode and
lbl instanceof MaybeNullLabel
e = astNode
}
}
class JsonTrackingConfig extends DataFlow::Configuration {
JsonTrackingConfig() { this = "JsonTrackingConfig" }
module JsonTrackingConfig implements DataFlow::StateConfigSig {
class FlowState extends string {
FlowState() {
this = ["json", "maybe-null"]
}
}
override predicate isSource(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
predicate isSource(DataFlow::Node nd, FlowState state) {
exists(JsonParserCall jpc |
nd = jpc.getOutput() and
(lbl instanceof JsonLabel or lbl instanceof MaybeNullLabel)
state = ["json", "maybe-null"] // start in either state
)
}
override predicate isSink(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
predicate isSink(DataFlow::Node nd, FlowState state) {
exists(DataFlow::PropRef pr |
nd = pr.getBase() and
lbl instanceof MaybeNullLabel
state = "maybe-null"
)
}
override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ,
DataFlow::FlowLabel predlbl, DataFlow::FlowLabel succlbl) {
predicate isAdditionalFlowStep(DataFlow::Node pred, FlowState predState,
DataFlow::Node succ, FlowState succState) {
succ.(DataFlow::PropRead).getBase() = pred and
predlbl instanceof JsonLabel and
(succlbl instanceof JsonLabel or succlbl instanceof MaybeNullLabel)
predState = "json" and
succState = ["json", "maybe-null"]
}
override predicate isBarrierGuard(DataFlow::BarrierGuardNode guard) {
guard instanceof TruthinessCheck
predicate isBarrier(DataFlow::Node node, FlowState state) {
node = DataFlow::MakeBarrierGuard<TruthinessCheck>::getABarrierNode() and
state = "maybe-null"
}
}
from JsonTrackingConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "Property access on JSON value originating $@.", source, "here"
module JsonTrackingFlow = DataFlow::GlobalWithState<JsonTrackingConfig>;
from DataFlow::Node source, DataFlow::Node sink
where JsonTrackingFlow::flow(source, sink)
select sink, "Property access on JSON value originating $@.", source, "here"
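
Because the query metadata declares ``@kind path-problem``, the select clause would normally report path nodes rather than plain data-flow nodes. The following is a minimal sketch of that variant. It assumes that ``JsonTrackingFlow`` exposes ``PathGraph``, ``PathNode`` and ``flowPath`` in the same way as modules generated by the shared data flow library for other languages, and it is meant to replace the final module alias and ``from``/``where``/``select`` clauses of the query:

.. code-block:: ql

    module JsonTrackingFlow = DataFlow::GlobalWithState<JsonTrackingConfig>;

    // Makes the path edges available so that results can be explored step by step
    import JsonTrackingFlow::PathGraph

    from JsonTrackingFlow::PathNode source, JsonTrackingFlow::PathNode sink
    where JsonTrackingFlow::flowPath(source, sink)
    select sink.getNode(), source, sink, "Property access on JSON value originating $@.",
      source.getNode(), "here"
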
We ran this query on the https://github.com/finos/plexus-interop repository. Many of the
results were false positives since the query does not currently model many ways in which we can check
@@ -354,52 +353,30 @@ this tutorial.
API
---
Plain data-flow configurations implicitly use a single flow label "data", which indicates that a
data value originated from a source. You can use the predicate ``DataFlow::FlowLabel::data()``,
which returns this flow label, as a symbolic name for it.
Flow state can be used in modules implementing the ``DataFlow::StateConfigSig`` signature. Compared to a ``DataFlow::ConfigSig`` the main differences are:
Taint-tracking configurations add a second flow label "taint" (``DataFlow::FlowLabel::taint()``),
which is similar to "data", but includes values that have passed through non-value preserving steps
such as string operations.
- The module must be passed to ``DataFlow::GlobalWithState<...>`` or ``TaintTracking::GlobalWithState<...>``
instead of ``DataFlow::Global<...>`` or ``TaintTracking::Global<...>``.
- The module must contain a type named ``FlowState``.
- ``isSource`` expects an additional parameter specifying the flow state.
- ``isSink`` can optionally take an additional parameter specifying the flow state.
If omitted, the sinks are in effect for all flow states.
- ``isAdditionalFlowStep`` can optionally take two additional parameters specifying the predecessor and successor flow states.
If omitted, the generated steps apply for any flow state and preserve the current flow state.
- ``isBarrier`` can optionally take an additional parameter specifying the flow state to block.
If omitted, the barriers block all flow states.
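
The points above can be sketched as a single module. This is an illustrative skeleton rather than a complete query: it reuses the ``json`` and ``maybe-null`` states from the example, while ``StateSketchConfig`` and the global function ``sanitize`` are hypothetical names introduced here. The flow step and the barrier deliberately use the forms without a flow state, so the step preserves the current state and the barrier blocks every state.

.. code-block:: ql

    import javascript

    module StateSketchConfig implements DataFlow::StateConfigSig {
      // One value per flow state
      class FlowState extends string {
        FlowState() { this = ["json", "maybe-null"] }
      }

      predicate isSource(DataFlow::Node nd, FlowState state) {
        exists(JsonParserCall jpc | nd = jpc.getOutput()) and
        state = ["json", "maybe-null"]
      }

      predicate isSink(DataFlow::Node nd, FlowState state) {
        exists(DataFlow::PropRef pr | nd = pr.getBase()) and
        state = "maybe-null"
      }

      // No flow state parameters: applies in any state and preserves it
      predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) {
        succ.(DataFlow::PropRead).getBase() = pred
      }

      // No flow state parameter: blocks all flow states
      predicate isBarrier(DataFlow::Node node) {
        // Hypothetical sanitizer: the result of a call to a global function named "sanitize"
        node = DataFlow::globalVarRef("sanitize").getACall()
      }
    }

    module StateSketchFlow = DataFlow::GlobalWithState<StateSketchConfig>;
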
Each of the three member predicates ``isSource``, ``isSink`` and
``isAdditionalFlowStep``/``isAdditionalTaintStep`` has one version that uses the default flow
labels, and one version that allows specifying custom flow labels through additional arguments.
For ``isSource``, there is one additional argument specifying which flow label(s) should be
associated with values originating from this source. If multiple flow labels are specified, each
value is associated with `all` of them.
For ``isSink``, the additional argument specifies which flow label(s) a value that flows into this
source may be associated with. If multiple flow labels are specified, then any value that is
associated with `at least one` of them will be considered by the configuration.
For ``isAdditionalFlowStep`` there are two additional arguments ``predlbl`` and ``succlbl``, which
allow flow steps to act as flow label transformers. If a value associated with ``predlbl`` arrives
at the start node of the additional step, it is propagated to the end node and associated with
``succlbl``. Of course, ``predlbl`` and ``succlbl`` may be the same, indicating that the flow step
preserves this label. There can also be multiple values of ``succlbl`` for a single ``predlbl`` or
vice versa.
Note that if you do not restrict ``succlbl`` then it will be allowed to range over all flow labels.
This may cause labels that were previously blocked on a path to reappear, which is not usually what
you want.
The flow label-aware version of ``isBarrier`` is called ``isLabeledBarrier``: unlike ``isBarrier``,
which prevents any flow past the given node, it only blocks flow of values associated with one of
the specified flow labels.
Standard queries using flow labels
Standard queries using flow state
----------------------------------
Some of our standard security queries use flow labels. You can look at their implementation
to get a feeling for how to use flow labels in practice.
Some of our standard security queries use flow state. You can look at their implementation
to get a feeling for how to use flow state in practice.
In particular, both of the examples mentioned in the section on limitations of basic data flow above
are from standard security queries that use flow labels. The `Prototype-polluting merge call
<https://codeql.github.com/codeql-query-help/javascript/js-prototype-pollution/>`_ query uses two flow labels to distinguish completely
are from standard security queries that use flow state. The `Prototype-polluting merge call
<https://codeql.github.com/codeql-query-help/javascript/js-prototype-pollution/>`_ query uses two flow states to distinguish completely
tainted objects from partially tainted objects. The `Uncontrolled data used in path expression
<https://codeql.github.com/codeql-query-help/javascript/js-path-injection/>`_ query uses four flow labels to track whether a user-controlled
<https://codeql.github.com/codeql-query-help/javascript/js-path-injection/>`_ query uses four flow states to track whether a user-controlled
string may be an absolute path and whether it may contain ``..`` components.
Further reading