Apply suggestions from code review

Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>
This commit is contained in:
Asger F
2025-01-08 12:26:18 +01:00
committed by GitHub
parent 062391334e
commit 26d85d5ece

View File

@@ -8,10 +8,10 @@ library used by other languages. This library has now been deprecated in favor o
This article explains how to migrate JavaScript data flow queries to use the shared data flow library,
and some important differences to be aware of. Note that the article on :ref:`analyzing data flow in JavaScript and TypeScript <analyzing-data-flow-in-javascript-and-typescript>`
provides a general guide to new data flow library, whereas this article aims to help with migrating existing queries from the old data flow library.
provides a general guide to the new data flow library, whereas this article aims to help with migrating existing queries from the old data flow library.
Note that the ``DataFlow::Configuration`` class is still backed by the original data flow library, but has been marked as deprecated.
This means data flow queries using this class will continue work, albeit with deprecation warnings, until the 1-year deprecation period expires in early 2026.
This means data flow queries using this class will continue to work, albeit with deprecation warnings, until the 1-year deprecation period expires in early 2026.
It is recommended that all custom queries are migrated before this time, to ensure they continue to work in the future.
Data flow queries should be migrated to use ``DataFlow::ConfigSig``-style modules instead of the ``DataFlow::Configuration`` class.
@@ -30,9 +30,9 @@ with the shared data flow library when they have been migrated to ``ConfigSig``-
* - ``DataFlow::ConfigSig``
- Shared library
A straight-forward translation to ``DataFlow::ConfigSig``-style is usually possible, although there are some complications
A straightforward translation to ``DataFlow::ConfigSig``-style is usually possible, although there are some complications
that may cause the query to behave differently.
We'll first cover some straight-forward migration examples, and then go over some of the complications that may arise.
We'll first cover some straightforward migration examples, and then go over some of the complications that may arise.
Simple migration example
------------------------
@@ -78,10 +78,10 @@ With the new style this would look like this:
where MyFlow::flowPath(source, sink)
select sink, source, sink, "Flow found"
The changes can be summarised as:
The changes can be summarized as:
- The ``DataFlow::Configuration`` class was replaced with a module implementing ``DataFlow::ConfigSig``.
- The characteristic predicate was removed (modules have no characteristic predicates)
- The characteristic predicate was removed (modules have no characteristic predicates).
- Predicates such as ``isSource`` no longer have the ``override`` keyword (as they are defined in a module now).
- The configuration module is being passed to ``DataFlow::Global``, resulting in a new module, called ``MyFlow`` in this example.
- The query imports ``MyFlow::PathGraph`` instead of ``DataFlow::PathGraph``.
@@ -94,7 +94,7 @@ With these changes, we have produced an equivalent query that is backed by the n
Taint tracking
--------------
For configuration classes extending ``TaintTracking::Configuration``, the migration is similar but with few differences:
For configuration classes extending ``TaintTracking::Configuration``, the migration is similar but with a few differences:
- The ``TaintTracking::Global`` module should be used instead of ``DataFlow::Global``.
- The ``isSanitizer`` predicate should be renamed to ``isBarrier``.
@@ -134,7 +134,7 @@ or ``TaintTracking::GlobalWithState``. See :ref:`using flow state <using-flow-la
Some changes to be aware of:
- The 4-argument version of ``isAdditionalFlowStep`` now takes parameter in a different order.
- The 4-argument version of ``isAdditionalFlowStep`` now takes parameters in a different order.
It now takes ``node1, state1, node2, state2`` instead of ``node1, node2, state1, state2``.
- Taint steps apply to all flow states, not just the ``taint`` flow label. See more details further down in this article.
@@ -143,7 +143,7 @@ Barrier guards
The predicates ``isBarrierGuard`` and ``isSanitizerGuard`` have been removed.
Instead, the ``isBarrier`` predicate must used to define all barriers. To do this, barrier guards can be reduced to a set of barrier nodes using the ``DataFlow::MakeBarrierGuard`` module.
Instead, the ``isBarrier`` predicate must be used to define all barriers. To do this, barrier guards can be reduced to a set of barrier nodes using the ``DataFlow::MakeBarrierGuard`` module.
For example, consider this data flow configuration using a barrier guard:
@@ -179,7 +179,7 @@ This can be migrated to the shared data flow library as follows:
predicate blocksExpr(Expr e, boolean outcome) { ... }
}
The changes can be summarised as:
The changes can be summarized as:
- The contents of ``isBarrierGuard`` have been moved to ``isBarrier``.
- The ``node instanceof MyBarrierGuard`` check was replaced with ``node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode()``.
- The ``MyBarrierGuard`` class no longer has ``DataFlow::BarrierGuardNode`` as a base class. We simply use ``DataFlow::Node`` instead.
@@ -190,11 +190,11 @@ See :ref:`using flow state <using-flow-labels-for-precise-data-flow-analysis>` f
Query-specific load and store steps
-----------------------------------
The predicates ``isAdditionalLoadStep``, ``isAdditionalStoreStep``, and ``isAdditionalLoadStoreStep`` have been removed. There is no way to emulate the original behaviour.
The predicates ``isAdditionalLoadStep``, ``isAdditionalStoreStep``, and ``isAdditionalLoadStoreStep`` have been removed. There is no way to emulate the original behavior.
Library models can still contribute such steps, but they will be applicable to all queries. Also see the section on jump steps further down.
Changes in behaviour
Changes in behavior
--------------------
When the query has been migrated to the new interface, it may seem to behave differently due to some technical differences in the internals of
@@ -205,11 +205,11 @@ Taint steps now propagate all flow states
There's an important change from the old data flow library when using flow state and taint-tracking together.
When using when using ``TaintTracking::GlobalWithState``, all flow states can propagate along taint steps.
When using ``TaintTracking::GlobalWithState``, all flow states can propagate along taint steps.
In the old data flow library, only the ``taint`` flow label could propagate along taint steps.
A straight-forward translation of such a query may therefore result in new flow paths being found, which might be unexpected.
To emulate the old behaviour, use ``DataFlow::GlobalWithState`` instead of ``TaintTracking::GlobalWithState``,
To emulate the old behavior, use ``DataFlow::GlobalWithState`` instead of ``TaintTracking::GlobalWithState``,
and manually add taint steps using ``isAdditionalFlowStep``. The predicate ``TaintTracking::defaultTaintStep`` can be used to access to the set of taint steps.
For example:
@@ -276,7 +276,7 @@ Because this step crosses a function boundary, it becomes a jump step. This can
See :ref:`customizing library models for JavaScript <customizing-library-models-for-javascript>` for details about the format of the ``input`` and ``output`` strings.
The aforementioned article also provides guidance on how to store the flow summary in a data extension.
For query-specific steps that cross function boundaries, that is, steps added with ``isAdditionalFlowStep``, there is currently no way to emulate the original behaviour.
For query-specific steps that cross function boundaries, that is, steps added with ``isAdditionalFlowStep``, there is currently no way to emulate the original behavior.
A possible workaround is to convert the query-specific step to a flow summary. In this case it should be stored in a data extension to avoid performance issues, although this also means
that all other queries will be able to use the flow summary.
@@ -289,7 +289,7 @@ In the old data flow library, only barriers specific to the ``data`` flow label
This rarely has significant impact, but some users may observe some result changes because of this.
There is currently no way to emulate the original behavour.
There is currently no way to emulate the original behavior.
Further reading
---------------