Merge pull request #13548 from geoffw0/redos

Swift: Query for REDOS (Regular Expression Denial Of Service)
Geoffrey White
2023-07-14 10:44:52 +01:00
committed by GitHub
8 changed files with 196 additions and 0 deletions

@@ -0,0 +1,4 @@
---
category: newQuery
---
* Added new query "Inefficient regular expression" (`swift/redos`). This query finds regular expressions that require exponential time to match certain inputs and may make an application vulnerable to denial-of-service attacks.

@@ -0,0 +1,26 @@
<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
<qhelp>
<include src="ReDoSIntroduction.inc.qhelp" />
<example>
<p>Consider the following regular expression:</p>
<sample language="swift">
/^_(__|.)+_$/</sample>
<p>
Its sub-expression <code>"(__|.)+"</code> can match the string
<code>"__"</code> either by the first alternative <code>"__"</code> to the
left of the <code>"|"</code> operator, or by two repetitions of the second
alternative <code>"."</code> to the right. Therefore, a string consisting of an
odd number of underscores followed by some other character will cause the
regular expression engine to run for a time that is exponential in the number
of underscores before rejecting the input.
</p>
<p>
This problem can be avoided by rewriting the regular expression to remove
the ambiguity between the two branches of the alternative inside the
repetition:
</p>
<sample language="swift">
/^_(__|[^_])+_$/</sample>
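<p>
The following is a minimal sketch, not part of the query, of how the rewritten
pattern behaves on the kind of input described above. It assumes that
<code>NSRegularExpression</code> from Foundation is available. The helper
<code>matches</code> and the variable names are hypothetical and chosen for
illustration only. The original pattern is deliberately not run on the long
input, since that is the case expected to be slow.
</p>
<sample language="swift">
import Foundation

// Hypothetical helper: reports whether `pattern` matches anywhere in `input`.
// Returns false if the pattern itself fails to compile.
func matches(_ pattern: String, _ input: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return false }
    let range = NSRange(input.startIndex..., in: input)
    return regex.firstMatch(in: input, range: range) != nil
}

// The rewritten pattern: the two alternatives can no longer match the same text.
let fixedPattern = "^_(__|[^_])+_$"

// An odd number of underscores followed by some other character.
let attackInput = String(repeating: "_", count: 31) + "!"

print(matches(fixedPattern, attackInput)) // prints "false"
print(matches(fixedPattern, "_a__b_"))    // prints "true"</sample>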
</example>
<include src="ReDoSReferences.inc.qhelp"/>
</qhelp>

@@ -0,0 +1,25 @@
/**
 * @name Inefficient regular expression
 * @description A regular expression that requires exponential time to match certain inputs
 *              can be a performance bottleneck, and may be vulnerable to denial-of-service
 *              attacks.
 * @kind problem
 * @problem.severity error
 * @security-severity 7.5
 * @precision high
 * @id swift/redos
 * @tags security
 *       external/cwe/cwe-1333
 *       external/cwe/cwe-730
 *       external/cwe/cwe-400
 */

import codeql.swift.regex.Regex
private import codeql.swift.regex.RegexTreeView::RegexTreeView as TreeView
import codeql.regex.nfa.ExponentialBackTracking::Make<TreeView>

from TreeView::RegExpTerm t, string pump, State s, string prefixMsg
where hasReDoSResult(t, pump, s, prefixMsg)
select t,
  "This part of the regular expression may cause exponential backtracking on strings " + prefixMsg +
    "containing many repetitions of '" + pump + "'."

@@ -0,0 +1,37 @@
<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
<qhelp>
<overview>
<p>
Some regular expressions take a long time to match certain input strings,
to the point where the time it takes to match a string of length <i>n</i>
is proportional to <i>n<sup>k</sup></i> or even <i>2<sup>n</sup></i>.
Such regular expressions can negatively affect performance, and potentially allow
a malicious user to perform a Denial of Service ("DoS") attack by crafting
an expensive input string for the regular expression to match.
</p>
<p>
The regular expression engine used by Swift uses
backtracking non-deterministic finite automata to implement regular
expression matching. While this approach is space-efficient and allows
supporting advanced features like capture groups, it is not time-efficient
in general. The worst-case time complexity of such an automaton can be
polynomial or exponential. In the exponential case, for strings of a certain
shape, increasing the input length by ten characters may make the
automaton about 1000 times slower (2<sup>10</sup> = 1024).
</p>
<p>
Typically, a regular expression is affected by this problem if it contains
a repetition of the form <code>r*</code> or <code>r+</code> where the
sub-expression <code>r</code> is ambiguous in the sense that it can match
some string in multiple ways. More information about the precise
circumstances can be found in the references.
</p>
</overview>
<recommendation>
<p>
Modify the regular expression to remove the ambiguity, or ensure that the
strings matched with the regular expression are short enough that the
time complexity does not matter.
</p>
</recommendation>
</qhelp>

@@ -0,0 +1,13 @@
<!DOCTYPE qhelp PUBLIC "-//Semmle//qhelp//EN" "qhelp.dtd">
<qhelp>
<references>
<li> OWASP:
<a href="https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS">Regular expression Denial of Service - ReDoS</a>.
</li>
<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/ReDoS">ReDoS</a>.</li>
<li>Wikipedia: <a href="https://en.wikipedia.org/wiki/Time_complexity">Time complexity</a>.</li>
<li>James Kirrage, Asiri Rathnayake, Hayo Thielecke:
<a href="https://arxiv.org/abs/1301.0849">Static Analysis for Regular Expression Denial-of-Service Attack</a>.
</li>
</references>
</qhelp>

@@ -0,0 +1,5 @@
| ReDoS.swift:65:22:65:22 | a* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'a'. |
| ReDoS.swift:66:22:66:22 | a* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'a'. |
| ReDoS.swift:69:18:69:18 | a* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'a'. |
| ReDoS.swift:77:57:77:57 | a* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'a'. |
| ReDoS.swift:80:57:80:57 | a* | This part of the regular expression may cause exponential backtracking on strings containing many repetitions of 'a'. |

@@ -0,0 +1 @@
queries/Security/CWE-1333/ReDoS.ql

@@ -0,0 +1,85 @@

// --- stubs ---

struct URL {
    init?(string: String) {}
}

struct AnyRegexOutput {
}

protocol RegexComponent {
}

struct Regex<Output> : RegexComponent {
    struct Match {
    }

    init(_ pattern: String) throws where Output == AnyRegexOutput { }

    func firstMatch(in string: String) throws -> Regex<Output>.Match? { return nil }

    typealias RegexOutput = Output
}

extension String {
    init(contentsOf: URL) {
        let data = ""
        self.init(data)
    }
}

class NSObject {
}

struct _NSRange {
    init(location: Int, length: Int) { }
}

typealias NSRange = _NSRange

class NSRegularExpression : NSObject {
    struct Options : OptionSet {
        var rawValue: UInt
    }

    struct MatchingOptions : OptionSet {
        var rawValue: UInt
    }

    init(pattern: String, options: NSRegularExpression.Options = []) throws { }

    func stringByReplacingMatches(in string: String, options: NSRegularExpression.MatchingOptions = [], range: NSRange, withTemplate templ: String) -> String { return "" }
}

// --- tests ---

func myRegexpTests(myUrl: URL) throws {
    let tainted = String(contentsOf: myUrl) // tainted
    let untainted = "abcdef"

    // Regex

    _ = "((a*)*b)" // GOOD (never used)
    _ = try Regex("((a*)*b)") // DUBIOUS (never used)
    _ = try Regex("((a*)*b)").firstMatch(in: untainted) // DUBIOUS (never used on tainted input) [FLAGGED]
    _ = try Regex("((a*)*b)").firstMatch(in: tainted) // BAD
    _ = try Regex(".*").firstMatch(in: tainted) // GOOD (safe regex)

    let str = "((a*)*b)" // BAD
    let regex = try Regex(str)
    _ = try regex.firstMatch(in: tainted)

    // NSRegularExpression

    _ = try? NSRegularExpression(pattern: "((a*)*b)") // DUBIOUS (never used)

    let nsregex1 = try? NSRegularExpression(pattern: "((a*)*b)") // DUBIOUS (never used on tainted input) [FLAGGED]
    _ = nsregex1?.stringByReplacingMatches(in: untainted, range: NSRange(location: 0, length: untainted.utf16.count), withTemplate: "")

    let nsregex2 = try? NSRegularExpression(pattern: "((a*)*b)") // BAD
    _ = nsregex2?.stringByReplacingMatches(in: tainted, range: NSRange(location: 0, length: tainted.utf16.count), withTemplate: "")

    let nsregex3 = try? NSRegularExpression(pattern: ".*") // GOOD (safe regex)
    _ = nsregex3?.stringByReplacingMatches(in: tainted, range: NSRange(location: 0, length: tainted.utf16.count), withTemplate: "")
}