mirror of
https://github.com/github/codeql.git
synced 2026-04-28 02:05:14 +02:00
Add query for polynomial ReDoS
This commit is contained in:
113
ql/src/queries/security/cwe-1333/PolynomialReDoS.qhelp
Normal file
113
ql/src/queries/security/cwe-1333/PolynomialReDoS.qhelp
Normal file
@@ -0,0 +1,113 @@
|
||||
<!DOCTYPE qhelp PUBLIC
|
||||
"-//Semmle//qhelp//EN"
|
||||
"qhelp.dtd">
|
||||
|
||||
<qhelp>
|
||||
|
||||
<include src="ReDoSIntroduction.inc.qhelp" />
|
||||
|
||||
<example>
|
||||
<p>
|
||||
|
||||
Consider this use of a regular expression, which removes
|
||||
all leading and trailing whitespace in a string:
|
||||
|
||||
</p>
|
||||
|
||||
<sample language="ruby">
|
||||
text.gsub!(/^\s+|\s+$/, '') # BAD
|
||||
</sample>
|
||||
|
||||
<p>
|
||||
|
||||
The sub-expression <code>"\s+$"</code> will match the
|
||||
whitespace characters in <code>text</code> from left to
|
||||
right, but it can start matching anywhere within a
|
||||
whitespace sequence. This is problematic for strings
|
||||
that do <strong>not</strong> end with a whitespace
|
||||
character. Such a string will force the regular
|
||||
expression engine to process each whitespace sequence
|
||||
once per whitespace character in the sequence.
|
||||
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
||||
This ultimately means that the time cost of trimming a
|
||||
string is quadratic in the length of the string. So a
|
||||
string like <code>"a b"</code> will take milliseconds to
|
||||
process, but a similar string with a million spaces
|
||||
instead of just one will take several minutes.
|
||||
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
||||
Avoid this problem by rewriting the regular expression
|
||||
to not contain the ambiguity about when to start
|
||||
matching whitespace sequences. For instance, by using a
|
||||
negative look-behind
|
||||
(<code>/^\s+|(?<!\s)\s+$/</code>), or just by using
|
||||
the built-in strip method (<code>text.strip!</code>).
|
||||
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
||||
Note that the sub-expression <code>"^\s+"</code> is
|
||||
<strong>not</strong> problematic as the <code>^</code>
|
||||
anchor restricts when that sub-expression can start
|
||||
matching, and as the regular expression engine matches
|
||||
from left to right.
|
||||
|
||||
</p>
|
||||
|
||||
</example>
|
||||
|
||||
<example>
|
||||
|
||||
<p>
|
||||
|
||||
As a similar, but slightly subtler problem, consider the
|
||||
regular expression that matches lines with numbers, possibly written
|
||||
using scientific notation:
|
||||
</p>
|
||||
|
||||
<sample language="ruby">
|
||||
/^0\.\d+E?\d+$/ # BAD
|
||||
</sample>
|
||||
|
||||
<p>
|
||||
|
||||
The problem with this regular expression is in the
|
||||
sub-expression <code>\d+E?\d+</code> because the second
|
||||
<code>\d+</code> can start matching digits anywhere
|
||||
after the first match of the first <code>\d+</code> if
|
||||
there is no <code>E</code> in the input string.
|
||||
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
||||
This is problematic for strings that do
|
||||
<strong>not</strong> end with a digit. Such a string
|
||||
will force the regular expression engine to process each
|
||||
digit sequence once per digit in the sequence, again
|
||||
leading to a quadratic time complexity.
|
||||
|
||||
</p>
|
||||
|
||||
<p>
|
||||
|
||||
To make the processing faster, the regular expression
|
||||
should be rewritten such that the two <code>\d+</code>
|
||||
sub-expressions do not have overlapping matches:
|
||||
<code>/^0\.\d+(E\d+)?$/</code>.
|
||||
|
||||
</p>
|
||||
|
||||
</example>
|
||||
|
||||
<include src="ReDoSReferences.inc.qhelp"/>
|
||||
|
||||
</qhelp>
|
||||
30
ql/src/queries/security/cwe-1333/PolynomialReDoS.ql
Normal file
30
ql/src/queries/security/cwe-1333/PolynomialReDoS.ql
Normal file
@@ -0,0 +1,30 @@
|
||||
/**
|
||||
* @name Polynomial regular expression used on uncontrolled data
|
||||
* @description A regular expression that can require polynomial time
|
||||
* to match may be vulnerable to denial-of-service attacks.
|
||||
* @kind path-problem
|
||||
* @problem.severity warning
|
||||
* @precision high
|
||||
* @id rb/polynomial-redos
|
||||
* @tags security
|
||||
* external/cwe/cwe-1333
|
||||
* external/cwe/cwe-730
|
||||
* external/cwe/cwe-400
|
||||
*/
|
||||
|
||||
import DataFlow::PathGraph
|
||||
import codeql.ruby.DataFlow
|
||||
import codeql.ruby.regexp.PolynomialReDoS
|
||||
import codeql.ruby.regexp.SuperlinearBackTracking
|
||||
|
||||
from
|
||||
PolynomialReDoS::Configuration config, DataFlow::PathNode source, DataFlow::PathNode sink,
|
||||
PolynomialReDoS::Sink sinkNode, PolynomialBackTrackingTerm regexp
|
||||
where
|
||||
config.hasFlowPath(source, sink) and
|
||||
sinkNode = sink.getNode() and
|
||||
regexp = sinkNode.getRegExp()
|
||||
select sinkNode.getHighlight(), source, sink,
|
||||
"This $@ that depends on $@ may run slow on strings " + regexp.getPrefixMessage() +
|
||||
"with many repetitions of '" + regexp.getPumpString() + "'.", regexp, "regular expression",
|
||||
source.getNode(), "a user-provided value"
|
||||
Reference in New Issue
Block a user