Merge branch 'main' into python-add-typetrackingnode

This commit is contained in:
Taus
2021-07-02 20:55:37 +02:00
committed by GitHub
668 changed files with 9299 additions and 8461 deletions

View File

@@ -0,0 +1,108 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<include src="ReDoSIntroduction.inc.qhelp" />
<example>
<p>
Consider this use of a regular expression, which removes
all leading and trailing whitespace in a string:
</p>
<sample language="python">
re.sub(r"^\s+|\s+$", "", text) # BAD
</sample>
<p>
The sub-expression <code>"\s+$"</code> will match the
whitespace characters in <code>text</code> from left to right, but it
can start matching anywhere within a whitespace sequence. This is
problematic for strings that do <strong>not</strong> end with a whitespace
character. Such a string will force the regular expression engine to
process each whitespace sequence once per whitespace character in the
sequence.
</p>
<p>
This ultimately means that the time cost of trimming a
string is quadratic in the length of the string. So a string like
<code>"a b"</code> will take milliseconds to process, but a similar
string with a million spaces instead of just one will take several
minutes.
</p>
<p>
Avoid this problem by rewriting the regular expression to
not contain the ambiguity about when to start matching whitespace
sequences. For instance, by using a negative look-behind
(<code>^\s+|(?&lt;!\s)\s+$</code>), or just by using the built-in strip
method (<code>text.strip()</code>).
</p>
<p>
Note that the sub-expression <code>"^\s+"</code> is
<strong>not</strong> problematic as the <code>^</code> anchor restricts
when that sub-expression can start matching, and as the regular
expression engine matches from left to right.
</p>
</example>
<example>
<p>
As a similar, but slightly subtler problem, consider the
regular expression that matches lines with numbers, possibly written
using scientific notation:
</p>
<sample language="python">
^0\.\d+E?\d+$ # BAD
</sample>
<p>
The problem with this regular expression is in the
sub-expression <code>\d+E?\d+</code> because the second
<code>\d+</code> can start matching digits anywhere after the first
match of the first <code>\d+</code> if there is no <code>E</code> in
the input string.
</p>
<p>
This is problematic for strings that do <strong>not</strong>
end with a digit. Such a string will force the regular expression
engine to process each digit sequence once per digit in the sequence,
again leading to a quadratic time complexity.
</p>
<p>
To make the processing faster, the regular expression
should be rewritten such that the two <code>\d+</code> sub-expressions
do not have overlapping matches: <code>^0\.\d+(E\d+)?$</code>.
</p>
</example>
<include src="ReDoSReferences.inc.qhelp"/>
</qhelp>

View File

@@ -0,0 +1,34 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<include src="ReDoSIntroduction.inc.qhelp" />
<example>
<p>
Consider this regular expression:
</p>
<sample language="python">
^_(__|.)+_$
</sample>
<p>
Its sub-expression <code>"(__|.)+?"</code> can match the string <code>"__"</code> either by the
first alternative <code>"__"</code> to the left of the <code>"|"</code> operator, or by two
repetitions of the second alternative <code>"."</code> to the right. Thus, a string consisting
of an odd number of underscores followed by some other character will cause the regular
expression engine to run for an exponential amount of time before rejecting the input.
</p>
<p>
This problem can be avoided by rewriting the regular expression to remove the ambiguity between
the two branches of the alternative inside the repetition:
</p>
<sample language="python">
^_(__|[^_])+_$
</sample>
</example>
<include src="ReDoSReferences.inc.qhelp"/>
</qhelp>

View File

@@ -0,0 +1,54 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>
Some regular expressions take a long time to match certain
input strings to the point where the time it takes to match a string
of length <i>n</i> is proportional to <i>n<sup>k</sup></i> or even
<i>2<sup>n</sup></i>. Such regular expressions can negatively affect
performance, or even allow a malicious user to perform a Denial of
Service ("DoS") attack by crafting an expensive input string for the
regular expression to match.
</p>
<p>
The regular expression engine provided by Python uses a backtracking non-deterministic finite
automata to implement regular expression matching. While this approach
is space-efficient and allows supporting advanced features like
capture groups, it is not time-efficient in general. The worst-case
time complexity of such an automaton can be polynomial or even
exponential, meaning that for strings of a certain shape, increasing
the input length by ten characters may make the automaton about 1000
times slower.
</p>
<p>
Typically, a regular expression is affected by this
problem if it contains a repetition of the form <code>r*</code> or
<code>r+</code> where the sub-expression <code>r</code> is ambiguous
in the sense that it can match some string in multiple ways. More
information about the precise circumstances can be found in the
references.
</p>
</overview>
<recommendation>
<p>
Modify the regular expression to remove the ambiguity, or
ensure that the strings matched with the regular expression are short
enough that the time-complexity does not matter.
</p>
</recommendation>
</qhelp>

View File

@@ -0,0 +1,16 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<references>
<li>
OWASP:
<a href="https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.
</li>
<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/ReDoS">ReDoS</a>.</li>
<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/Time_complexity">Time complexity</a>.</li>
<li>James Kirrage, Asiri Rathnayake, Hayo Thielecke:
<a href="http://www.cs.bham.ac.uk/~hxt/research/reg-exp-sec.pdf">Static Analysis for Regular Expression Denial-of-Service Attack</a>.
</li>
</references>
</qhelp>

View File

@@ -76,28 +76,16 @@ private module SensitiveDataModeling {
}
/**
* Gets a reference to a string constant that, if used as the key in a lookup,
* indicates the presence of sensitive data with `classification`.
*/
private DataFlow::TypeTrackingNode sensitiveLookupStringConst(
DataFlow::TypeTracker t, SensitiveDataClassification classification
) {
t.start() and
nameIndicatesSensitiveData(result.asExpr().(StrConst).getText(), classification)
or
exists(DataFlow::TypeTracker t2 |
result = sensitiveLookupStringConst(t2, classification).track(t2, t)
)
}
/**
* Gets a reference to a string constant that, if used as the key in a lookup,
* indicates the presence of sensitive data with `classification`.
*
* Also see `extraStepForCalls`.
* Gets a reference (in local scope) to a string constant that, if used as the key in
* a lookup, indicates the presence of sensitive data with `classification`.
*/
DataFlow::Node sensitiveLookupStringConst(SensitiveDataClassification classification) {
sensitiveLookupStringConst(DataFlow::TypeTracker::end(), classification).flowsTo(result)
// Note: If this is implemented with type-tracking, we will get cross-talk as
// illustrated in python/ql/test/experimental/dataflow/sensitive-data/test.py
exists(DataFlow::LocalSourceNode source |
nameIndicatesSensitiveData(source.asExpr().(StrConst).getText(), classification) and
source.flowsTo(result)
)
}
/** A function call that is considered a source of sensitive data. */
@@ -118,6 +106,8 @@ private module SensitiveDataModeling {
/**
* Tracks any modeled source of sensitive data (with any classification),
* to limit the scope of `extraStepForCalls`. See it's QLDoc for more context.
*
* Also see `extraStepForCalls`.
*/
private DataFlow::TypeTrackingNode possibleSensitiveCallable(DataFlow::TypeTracker t) {
t.start() and
@@ -129,6 +119,8 @@ private module SensitiveDataModeling {
/**
* Tracks any modeled source of sensitive data (with any classification),
* to limit the scope of `extraStepForCalls`. See it's QLDoc for more context.
*
* Also see `extraStepForCalls`.
*/
private DataFlow::Node possibleSensitiveCallable() {
possibleSensitiveCallable(DataFlow::TypeTracker::end()).flowsTo(result)

View File

@@ -17,6 +17,13 @@ abstract class AttrRef extends Node {
*/
abstract Node getObject();
/**
* Holds if this data flow node accesses attribute named `attrName` on object `object`.
*/
predicate accesses(Node object, string attrName) {
this.getObject() = object and this.getAttributeName() = attrName
}
/**
* Gets the expression node that defines the attribute being accessed, if any. This is
* usually an identifier or literal.

View File

@@ -1393,9 +1393,9 @@ private module Stage2 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2087,9 +2087,9 @@ private module Stage3 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2850,9 +2850,9 @@ private module Stage4 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself

View File

@@ -1393,9 +1393,9 @@ private module Stage2 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2087,9 +2087,9 @@ private module Stage3 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2850,9 +2850,9 @@ private module Stage4 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself

View File

@@ -1393,9 +1393,9 @@ private module Stage2 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2087,9 +2087,9 @@ private module Stage3 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2850,9 +2850,9 @@ private module Stage4 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself

View File

@@ -1393,9 +1393,9 @@ private module Stage2 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2087,9 +2087,9 @@ private module Stage3 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself
@@ -2850,9 +2850,9 @@ private module Stage4 {
exists(RetNodeEx ret, Ap ap0, ReturnKindExt kind, int pos |
parameterFlow(p, ap, ap0, c, config) and
c = ret.getEnclosingCallable() and
revFlow(ret, true, apSome(_), ap0, config) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), pragma[only_bind_into](ap0),
revFlow(pragma[only_bind_into](ret), true, apSome(_), pragma[only_bind_into](ap0),
pragma[only_bind_into](config)) and
fwdFlow(ret, any(CcCall ccc), apSome(ap), ap0, config) and
kind = ret.getKind() and
p.getPosition() = pos and
// we don't expect a parameter to return stored in itself

View File

@@ -1,7 +1,8 @@
private import python
private import semmle.python.dataflow.new.DataFlow
private import semmle.python.dataflow.new.internal.DataFlowPrivate
private import semmle.python.dataflow.new.internal.DataFlowPrivate as DataFlowPrivate
private import semmle.python.dataflow.new.internal.TaintTrackingPublic
private import semmle.python.ApiGraphs
/**
* Holds if `node` should be a sanitizer in all global taint flow configurations
@@ -82,13 +83,13 @@ predicate subscriptStep(DataFlow::CfgNode nodeFrom, DataFlow::CfgNode nodeTo) {
*/
predicate stringManipulation(DataFlow::CfgNode nodeFrom, DataFlow::CfgNode nodeTo) {
// transforming something tainted into a string will make the string tainted
exists(CallNode call | call = nodeTo.getNode() |
call.getFunction().(NameNode).getId() in ["str", "bytes", "unicode"] and
exists(DataFlow::CallCfgNode call | call = nodeTo |
(
nodeFrom.getNode() = call.getArg(0)
call = API::builtin(["str", "bytes", "unicode"]).getACall()
or
nodeFrom.getNode() = call.getArgByName("object")
)
call.getFunction().asCfgNode().(NameNode).getId() in ["str", "bytes", "unicode"]
) and
nodeFrom in [call.getArg(0), call.getArgByName("object")]
)
or
// String methods. Note that this doesn't recognize `meth = "foo".upper; meth()`
@@ -155,39 +156,37 @@ predicate stringManipulation(DataFlow::CfgNode nodeFrom, DataFlow::CfgNode nodeT
predicate containerStep(DataFlow::CfgNode nodeFrom, DataFlow::Node nodeTo) {
// construction by literal
// TODO: Not limiting the content argument here feels like a BIG hack, but we currently get nothing for free :|
storeStep(nodeFrom, _, nodeTo)
DataFlowPrivate::storeStep(nodeFrom, _, nodeTo)
or
// constructor call
exists(CallNode call | call = nodeTo.asCfgNode() |
call.getFunction().(NameNode).getId() in [
"list", "set", "frozenset", "dict", "defaultdict", "tuple"
] and
call.getArg(0) = nodeFrom.getNode()
exists(DataFlow::CallCfgNode call | call = nodeTo |
call = API::builtin(["list", "set", "frozenset", "dict", "tuple"]).getACall() and
call.getArg(0) = nodeFrom
// TODO: Properly handle defaultdict/namedtuple
)
or
// functions operating on collections
exists(CallNode call | call = nodeTo.asCfgNode() |
call.getFunction().(NameNode).getId() in ["sorted", "reversed", "iter", "next"] and
call.getArg(0) = nodeFrom.getNode()
exists(DataFlow::CallCfgNode call | call = nodeTo |
call = API::builtin(["sorted", "reversed", "iter", "next"]).getACall() and
call.getArg(0) = nodeFrom
)
or
// methods
exists(CallNode call, string name | call = nodeTo.asCfgNode() |
name in [
exists(DataFlow::MethodCallNode call, string methodName | call = nodeTo |
methodName in [
// general
"copy", "pop",
// dict
"values", "items", "get", "popitem"
] and
call.getFunction().(AttrNode).getObject(name) = nodeFrom.asCfgNode()
call.calls(nodeFrom, methodName)
)
or
// list.append, set.add
exists(CallNode call, string name |
name in ["append", "add"] and
call.getFunction().(AttrNode).getObject(name) =
nodeTo.(DataFlow::PostUpdateNode).getPreUpdateNode().asCfgNode() and
call.getArg(0) = nodeFrom.getNode()
exists(DataFlow::MethodCallNode call, DataFlow::Node obj |
call.calls(obj, ["append", "add"]) and
obj = nodeTo.(DataFlow::PostUpdateNode).getPreUpdateNode() and
call.getArg(0) = nodeFrom
)
}
@@ -195,14 +194,9 @@ predicate containerStep(DataFlow::CfgNode nodeFrom, DataFlow::Node nodeTo) {
* Holds if taint can flow from `nodeFrom` to `nodeTo` with a step related to copying.
*/
predicate copyStep(DataFlow::CfgNode nodeFrom, DataFlow::CfgNode nodeTo) {
exists(CallNode call | call = nodeTo.getNode() |
// Fully qualified: copy.copy, copy.deepcopy
(
call.getFunction().(NameNode).getId() in ["copy", "deepcopy"]
or
call.getFunction().(AttrNode).getObject(["copy", "deepcopy"]).(NameNode).getId() = "copy"
) and
call.getArg(0) = nodeFrom.getNode()
exists(DataFlow::CallCfgNode call | call = nodeTo |
call = API::moduleImport("copy").getMember(["copy", "deepcopy"]).getACall() and
call.getArg(0) = nodeFrom
)
}

View File

@@ -51,7 +51,7 @@ module CleartextLogging {
}
/** A piece of data printed, considered as a flow sink. */
class PrintedDataAsSink extends Sink, DataFlow::CallCfgNode {
class PrintedDataAsSink extends Sink {
PrintedDataAsSink() {
this = API::builtin("print").getACall().getArg(_)
or

View File

@@ -68,14 +68,14 @@ import ReDoSUtil
* Holds if state `s` might be inside a backtracking repetition.
*/
pragma[noinline]
predicate stateInsideBacktracking(State s) {
private predicate stateInsideBacktracking(State s) {
s.getRepr().getParent*() instanceof MaybeBacktrackingRepetition
}
/**
* A infinitely repeating quantifier that might backtrack.
*/
class MaybeBacktrackingRepetition extends InfiniteRepetitionQuantifier {
private class MaybeBacktrackingRepetition extends InfiniteRepetitionQuantifier {
MaybeBacktrackingRepetition() {
exists(RegExpTerm child |
child instanceof RegExpAlt or
@@ -89,7 +89,7 @@ class MaybeBacktrackingRepetition extends InfiniteRepetitionQuantifier {
/**
* A state in the product automaton.
*/
newtype TStatePair =
private newtype TStatePair =
/**
* We lazily only construct those states that we are actually
* going to need: `(q, q)` for every fork state `q`, and any
@@ -112,7 +112,7 @@ newtype TStatePair =
* Gets a unique number for a `state`.
* Is used to create an ordering of states, where states with the same `toString()` will be ordered differently.
*/
int rankState(State state) {
private int rankState(State state) {
state =
rank[result](State s, Location l |
l = s.getRepr().getLocation()
@@ -124,7 +124,7 @@ int rankState(State state) {
/**
* A state in the product automaton.
*/
class StatePair extends TStatePair {
private class StatePair extends TStatePair {
State q1;
State q2;
@@ -145,7 +145,7 @@ class StatePair extends TStatePair {
*
* Used in `statePairDist`
*/
predicate isStatePair(StatePair p) { any() }
private predicate isStatePair(StatePair p) { any() }
/**
* Holds if there are transitions from the components of `q` to the corresponding
@@ -153,13 +153,13 @@ predicate isStatePair(StatePair p) { any() }
*
* Used in `statePairDist`
*/
predicate delta2(StatePair q, StatePair r) { step(q, _, _, r) }
private predicate delta2(StatePair q, StatePair r) { step(q, _, _, r) }
/**
* Gets the minimum length of a path from `q` to `r` in the
* product automaton.
*/
int statePairDist(StatePair q, StatePair r) =
private int statePairDist(StatePair q, StatePair r) =
shortestDistances(isStatePair/1, delta2/2)(q, r, result)
/**
@@ -172,7 +172,7 @@ int statePairDist(StatePair q, StatePair r) =
* expression cannot be vulnerable to ReDoS attacks anyway).
*/
pragma[noopt]
predicate isFork(State q, InputSymbol s1, InputSymbol s2, State r1, State r2) {
private predicate isFork(State q, InputSymbol s1, InputSymbol s2, State r1, State r2) {
stateInsideBacktracking(q) and
exists(State q1, State q2 |
q1 = epsilonSucc*(q) and
@@ -222,7 +222,7 @@ predicate isFork(State q, InputSymbol s1, InputSymbol s2, State r1, State r2) {
* Gets the state pair `(q1, q2)` or `(q2, q1)`; note that only
* one or the other is defined.
*/
StatePair mkStatePair(State q1, State q2) {
private StatePair mkStatePair(State q1, State q2) {
result = MkStatePair(q1, q2) or result = MkStatePair(q2, q1)
}
@@ -230,7 +230,7 @@ StatePair mkStatePair(State q1, State q2) {
* Holds if there are transitions from the components of `q` to the corresponding
* components of `r` labelled with `s1` and `s2`, respectively.
*/
predicate step(StatePair q, InputSymbol s1, InputSymbol s2, StatePair r) {
private predicate step(StatePair q, InputSymbol s1, InputSymbol s2, StatePair r) {
exists(State r1, State r2 | step(q, s1, s2, r1, r2) and r = mkStatePair(r1, r2))
}
@@ -242,7 +242,7 @@ predicate step(StatePair q, InputSymbol s1, InputSymbol s2, StatePair r) {
* inside a repetition that might backtrack.
*/
pragma[noopt]
predicate step(StatePair q, InputSymbol s1, InputSymbol s2, State r1, State r2) {
private predicate step(StatePair q, InputSymbol s1, InputSymbol s2, State r1, State r2) {
exists(State q1, State q2 | q.getLeft() = q1 and q.getRight() = q2 |
deltaClosed(q1, s1, r1) and
deltaClosed(q2, s2, r2) and
@@ -268,7 +268,7 @@ private newtype TTrace =
* A list of pairs of input symbols that describe a path in the product automaton
* starting from some fork state.
*/
class Trace extends TTrace {
private class Trace extends TTrace {
/** Gets a textual representation of this element. */
string toString() {
this = Nil() and result = "Nil()"
@@ -282,7 +282,7 @@ class Trace extends TTrace {
/**
* Gets a string corresponding to the trace `t`.
*/
string concretise(Trace t) {
private string concretise(Trace t) {
t = Nil() and result = ""
or
exists(InputSymbol s1, InputSymbol s2, Trace rest | t = Step(s1, s2, rest) |
@@ -294,7 +294,7 @@ string concretise(Trace t) {
* Holds if `r` is reachable from `(fork, fork)` under input `w`, and there is
* a path from `r` back to `(fork, fork)` with `rem` steps.
*/
predicate isReachableFromFork(State fork, StatePair r, Trace w, int rem) {
private predicate isReachableFromFork(State fork, StatePair r, Trace w, int rem) {
// base case
exists(InputSymbol s1, InputSymbol s2, State q1, State q2 |
isFork(fork, s1, s2, q1, q2) and
@@ -316,7 +316,7 @@ predicate isReachableFromFork(State fork, StatePair r, Trace w, int rem) {
* Gets a state in the product automaton from which `(fork, fork)` is
* reachable in zero or more epsilon transitions.
*/
StatePair getAForkPair(State fork) {
private StatePair getAForkPair(State fork) {
isFork(fork, _, _, _, _) and
result = MkStatePair(epsilonPred*(fork), epsilonPred*(fork))
}
@@ -324,7 +324,7 @@ StatePair getAForkPair(State fork) {
/**
* Holds if `fork` is a pumpable fork with word `w`.
*/
predicate isPumpable(State fork, string w) {
private predicate isPumpable(State fork, string w) {
exists(StatePair q, Trace t |
isReachableFromFork(fork, q, t, _) and
q = getAForkPair(fork) and