mirror of
https://github.com/github/codeql.git
synced 2025-12-24 20:56:33 +01:00
56 lines
1.7 KiB
XML
56 lines
1.7 KiB
XML
<!DOCTYPE qhelp PUBLIC
|
|
"-//Semmle//qhelp//EN"
|
|
"qhelp.dtd">
|
|
<qhelp>
|
|
<overview>
|
|
<p>
|
|
|
|
Some regular expressions take a long time to match certain
|
|
input strings to the point where the time it takes to match a string
|
|
of length <i>n</i> is proportional to <i>n<sup>k</sup></i> or even
|
|
<i>2<sup>n</sup></i>. Such regular expressions can negatively affect
|
|
performance, or even allow a malicious user to perform a Denial of
|
|
Service ("DoS") attack by crafting an expensive input string for the
|
|
regular expression to match.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
The regular expression engines provided by many popular
|
|
JavaScript platforms use backtracking non-deterministic finite
|
|
automata to implement regular expression matching. While this approach
|
|
is space-efficient and allows supporting advanced features like
|
|
capture groups, it is not time-efficient in general. The worst-case
|
|
time complexity of such an automaton can be polynomial or even
|
|
exponential, meaning that for strings of a certain shape, increasing
|
|
the input length by ten characters may make the automaton about 1000
|
|
times slower.
|
|
|
|
</p>
|
|
|
|
<p>
|
|
|
|
Typically, a regular expression is affected by this
|
|
problem if it contains a repetition of the form <code>r*</code> or
|
|
<code>r+</code> where the sub-expression <code>r</code> is ambiguous
|
|
in the sense that it can match some string in multiple ways. More
|
|
information about the precise circumstances can be found in the
|
|
references.
|
|
|
|
</p>
|
|
</overview>
|
|
|
|
<recommendation>
|
|
|
|
<p>
|
|
|
|
Modify the regular expression to remove the ambiguity, or
|
|
ensure that the strings matched with the regular expression are short
|
|
enough that the time-complexity does not matter.
|
|
|
|
</p>
|
|
|
|
</recommendation>
|
|
</qhelp>
|