mirror of
https://github.com/github/codeql.git
synced 2026-04-28 02:05:14 +02:00
add suspicious-regexp-range query
66
python/ql/src/Security/CWE-020/SuspiciousRegexpRange.qhelp
Normal file
@@ -0,0 +1,66 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>

<overview>
<p>
A character range in a regular expression can accidentally match more than intended.
For example, the regular expression <code>[a-zA-z]</code> matches
every lowercase and uppercase letter, but it also matches
the characters <code>[\]^_`</code>, which lie between <code>Z</code> and <code>a</code>.
</p>
<p>
In other cases, a dash in a character class is left unescaped,
which causes it to be interpreted
as part of a range. For example, in the character class <code>[a-zA-Z0-9%=.,-_]</code>
the last character range matches the 52 characters between
<code>,</code> and <code>_</code> (both included), which overlaps with the
range <code>[0-9]</code> and is thus clearly not intended.
</p>
</overview>

<recommendation>
<p>

Avoid writing character ranges where there might be confusion as to
which characters are included in the range.

</p>
</recommendation>

<example>

<p>
The following example code checks whether a string is a valid six-digit hex color.
</p>

<sample language="python">
import re
def is_valid_hex_color(color):
    return re.match(r'^#[0-9a-fA-f]{6}$', color) is not None
</sample>

<p>
However, the <code>A-f</code> range also matches every uppercase letter, and
thus a "color" like <code>#XXYYZZ</code> is considered valid.
</p>

<p>
The fix is to use an uppercase <code>A-F</code> range instead.
</p>

<sample language="python">
import re
def is_valid_hex_color(color):
    return re.match(r'^#[0-9a-fA-F]{6}$', color) is not None
</sample>

</example>

<references>
<li>Mitre.org: <a href="https://cwe.mitre.org/data/definitions/20.html">CWE-020</a></li>
<li>github.com: <a href="https://github.com/advisories/GHSA-g4rg-993r-mgx7">CVE-2021-42740</a></li>
<li>wh0.github.io: <a href="https://wh0.github.io/2021/10/28/shell-quote-rce-exploiting.html">Exploiting CVE-2021-42740</a></li>
</references>
</qhelp>
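
The ranges discussed in the help file above can be inspected directly. The following is a minimal Python sketch, not part of the commit: the helper name `chars_in_range` is an assumption made for illustration. It lists the non-letter characters covered by `A-z`, counts the characters covered by `,-_` and the digits it overlaps with, and compares the flawed and fixed hex color patterns on an input made only of uppercase letters outside `A-F`.

```python
import re


def chars_in_range(first, last):
    """Return every character matched by the regexp range first-last.

    Hypothetical helper for illustration: a range covers all code points
    from ord(first) to ord(last), both included.
    """
    return [chr(cp) for cp in range(ord(first), ord(last) + 1)]


# Characters matched by A-z that are not letters.
extras = [c for c in chars_in_range("A", "z") if not c.isalpha()]

# Everything matched by the accidental ,-_ range, and the digits it overlaps with.
comma_to_underscore = chars_in_range(",", "_")
overlapping_digits = [c for c in comma_to_underscore if c.isdigit()]

# The flawed and the fixed patterns from the help file's example.
flawed = re.compile(r'^#[0-9a-fA-f]{6}$')
fixed = re.compile(r'^#[0-9a-fA-F]{6}$')

if __name__ == "__main__":
    print("A-z also matches:", "".join(extras))
    print(",-_ covers", len(comma_to_underscore), "characters")
    print(",-_ overlaps with the digits:", "".join(overlapping_digits))
    print("flawed accepts #XXYYZZ:", flawed.match("#XXYYZZ") is not None)
    print("fixed accepts #XXYYZZ:", fixed.match("#XXYYZZ") is not None)
```

Enumerating code points this way is a quick check to run before committing a character class whose range endpoints are not both digits, both lowercase letters, or both uppercase letters.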
18
python/ql/src/Security/CWE-020/SuspiciousRegexpRange.ql
Normal file
@@ -0,0 +1,18 @@
/**
 * @name Suspicious regexp range
 * @description Some ranges in regular expressions might match more than intended.
 * @kind problem
 * @problem.severity warning
 * @security-severity 5.0
 * @precision high
 * @id py/suspicious-regexp-range
 * @tags correctness
 *       security
 *       external/cwe/cwe-020
 */

import semmle.python.security.SuspiciousRegexpRangeQuery

from RegExpCharacterRange range, string reason
where problem(range, reason)
select range, "Suspicious character range that " + reason + "."
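
The `problem` predicate comes from the imported library, which this diff does not show, so its exact rules are not reproduced here. Purely as an illustration of the kind of check involved, the Python sketch below flags a range unless both endpoints are digits, both are lowercase ASCII letters, or both are uppercase ASCII letters. The function names `kind` and `suspicious_ranges`, the naive range parsing (no escape handling), and the rule itself are all assumptions, not the CodeQL implementation.

```python
import re


def kind(ch):
    """Classify a range endpoint (hypothetical categories for this sketch)."""
    if "0" <= ch <= "9":
        return "digit"
    if "a" <= ch <= "z":
        return "lower"
    if "A" <= ch <= "Z":
        return "upper"
    return "other"


def suspicious_ranges(class_body):
    """Return the ranges in a character class body that look suspicious.

    class_body is the text between the square brackets, for example
    "0-9a-fA-f". Parsing is deliberately naive: escapes such as \\- are
    not handled. A range is flagged when its endpoints are not of the
    same digit, lowercase, or uppercase kind.
    """
    findings = []
    for match in re.finditer(r"(.)-(.)", class_body):
        first, last = match.group(1), match.group(2)
        if kind(first) == "other" or kind(first) != kind(last):
            findings.append(match.group(0))
    return findings


if __name__ == "__main__":
    for body in ["a-zA-Z0-9%=.,-_", "0-9a-fA-f", "0-9a-fA-F"]:
        print(body, "->", suspicious_ranges(body))
```

A real analysis would work on a parsed regular expression rather than on raw text, which is what the `RegExpCharacterRange` class in the query suggests.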
@@ -0,0 +1,5 @@
---
category: newQuery
---
* Added a new query, `py/suspicious-regexp-range`, to detect character ranges in regular expressions that seem to match
  too many characters.