add suspicious-regexp-range query

Erik Krogh Kristensen
2022-06-27 08:42:20 +02:00
parent 0346b6b67a
commit a343ceaf8b
29 changed files with 1736 additions and 0 deletions


@@ -0,0 +1,72 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>
A character range in a regular expression can accidentally match more
than was intended. For example, the regular expression <code>/[a-zA-z]/</code>
matches every lowercase and uppercase letter, but it also matches the
characters <code>[\]^_`</code>, which lie between <code>Z</code> and <code>a</code>.
</p>
<p>
In other cases, a dash inside a character class is left unescaped,
which causes it to be interpreted as a range operator rather than a
literal dash. For example, in the character class <code>[a-zA-Z0-9%=.,-_]</code>
the last character range <code>,-_</code> matches the 52 characters between
<code>,</code> and <code>_</code> (both included), which overlaps with the
range <code>0-9</code> and is thus clearly not intended.
</p>
</overview>
<recommendation>
<p>
Avoid character ranges where there might be confusion as to
which characters are included in the range.
</p>
</recommendation>
<example>
<p>
The following example code checks whether a string is a valid 6-digit hex color.
</p>
<sample language="java">
import java.util.regex.Pattern;

public class Tester {
    public static boolean is_valid_hex_color(String color) {
        return Pattern.matches("#[0-9a-fA-f]{6}", color);
    }
}
</sample>
<p>
However, the <code>A-f</code> range matches every uppercase letter, and
thus a "color" like <code>#XXYYZZ</code> is considered valid.
</p>
<p>
The fix is to use an uppercase <code>A-F</code> range instead.
</p>
<sample language="java">
import java.util.regex.Pattern;

public class Tester {
    public static boolean is_valid_hex_color(String color) {
        return Pattern.matches("#[0-9a-fA-F]{6}", color);
    }
}
</sample>
</example>
<references>
<li>Mitre.org: <a href="https://cwe.mitre.org/data/definitions/20.html">CWE-020</a></li>
<li>github.com: <a href="https://github.com/advisories/GHSA-g4rg-993r-mgx7">CVE-2021-42740</a></li>
<li>wh0.github.io: <a href="https://wh0.github.io/2021/10/28/shell-quote-rce-exploiting.html">Exploiting CVE-2021-42740</a></li>
</references>
</qhelp>
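
The claims in the help text above can be checked directly with a small Java program. This is only an illustrative sketch: the class `RangeDemo` and its helper methods are hypothetical names that are not part of this commit.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative and not part of the commit.
final class RangeDemo {
    // Collects every character in the inclusive range from-to
    // that is neither a letter nor a digit.
    static String unexpectedChars(char from, char to) {
        StringBuilder sb = new StringBuilder();
        for (char c = from; c <= to; c++) {
            if (!Character.isLetterOrDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Number of characters covered by the inclusive range from-to.
    static int rangeSize(char from, char to) {
        return to - from + 1;
    }

    public static void main(String[] args) {
        // The six punctuation characters that A-z picks up between 'Z' and 'a'.
        System.out.println(unexpectedChars('A', 'z'));
        // Size of the accidental ,-_ range in [a-zA-Z0-9%=.,-_].
        System.out.println(rangeSize(',', '_'));
        // The over-wide A-f range accepts letters that are not hex digits...
        System.out.println(Pattern.matches("#[0-9a-fA-f]{6}", "#XXYYZZ"));
        // ...while the intended A-F range rejects them.
        System.out.println(Pattern.matches("#[0-9a-fA-F]{6}", "#XXYYZZ"));
    }
}
```

Enumerating a range like this is a quick way to confirm what a character class really covers before relying on it for validation.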


@@ -0,0 +1,18 @@
/**
* @name Suspicious regexp range
* @description Some ranges in regular expressions might match more than intended.
* @kind problem
* @problem.severity warning
* @security-severity 5.0
* @precision high
* @id java/suspicious-regexp-range
* @tags correctness
* security
* external/cwe/cwe-020
*/
import semmle.code.java.security.SuspiciousRegexpRangeQuery
from RegExpCharacterRange range, string reason
where problem(range, reason)
select range, "Suspicious character range that " + reason + "."
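
The query above delegates all of its logic to the `problem` predicate in `SuspiciousRegexpRangeQuery`, which is not shown here. To give a rough feel for the kind of check involved, the following Java sketch flags a range under two assumed heuristics. These heuristics, the class `RangeHeuristic`, and its reason strings are assumptions made for illustration; they are not the library's actual rules or messages.

```java
import java.util.Optional;

// Hypothetical heuristic, NOT the logic of SuspiciousRegexpRangeQuery.
final class RangeHeuristic {
    private enum Kind { DIGIT, LOWER, UPPER, OTHER }

    private static Kind kind(char c) {
        if (c >= '0' && c <= '9') return Kind.DIGIT;
        if (c >= 'a' && c <= 'z') return Kind.LOWER;
        if (c >= 'A' && c <= 'Z') return Kind.UPPER;
        return Kind.OTHER;
    }

    // Returns a reason if the inclusive range lo-hi looks suspicious, else empty.
    static Optional<String> reason(char lo, char hi) {
        Kind a = kind(lo);
        Kind b = kind(hi);
        // Assumed rule 1: alphanumeric endpoints of different kinds, such as A-z.
        if (a != Kind.OTHER && b != Kind.OTHER && a != b) {
            return Optional.of("mixes different kinds of alphanumeric endpoints");
        }
        // Assumed rule 2: a punctuation endpoint whose range swallows letters
        // or digits, such as ,-_ (typically an unescaped dash).
        if (a == Kind.OTHER || b == Kind.OTHER) {
            for (char c = lo; c <= hi; c++) {
                if (kind(c) != Kind.OTHER) {
                    return Optional.of("starts or ends at punctuation but covers letters or digits");
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        char[][] ranges = { {'a', 'z'}, {'A', 'z'}, {',', '_'}, {'A', 'F'} };
        for (char[] r : ranges) {
            // Mirrors the message shape used in the select clause above.
            String msg = reason(r[0], r[1])
                .map(why -> "Suspicious character range that " + why + ".")
                .orElse("ok");
            System.out.println(r[0] + "-" + r[1] + ": " + msg);
        }
    }
}
```

Under these assumed rules, `a-z` and `A-F` pass, while `A-z` and `,-_` are reported.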