diff --git a/java/ql/src/Security/CWE/CWE-730/PolynomialReDoS.qhelp b/java/ql/src/Security/CWE/CWE-730/PolynomialReDoS.qhelp index fa8a3563d23..dbb1f4c37f5 100644 --- a/java/ql/src/Security/CWE/CWE-730/PolynomialReDoS.qhelp +++ b/java/ql/src/Security/CWE/CWE-730/PolynomialReDoS.qhelp @@ -14,13 +14,13 @@

- - re.sub(r"^\s+|\s+$", "", text) # BAD + + Pattern.compile("^\\s+|\\s+$").matcher(text).replaceAll("") // BAD

- The sub-expression "\s+$" will match the + The sub-expression "\\s+$" will match the whitespace characters in text from left to right, but it can start matching anywhere within a whitespace sequence. This is problematic for strings that do not end with a whitespace @@ -45,14 +45,14 @@ Avoid this problem by rewriting the regular expression to not contain the ambiguity about when to start matching whitespace sequences. For instance, by using a negative look-behind - (^\s+|(?<!\s)\s+$), or just by using the built-in strip - method (text.strip()). + ("^\\s+|(?<!\\s)\\s+$"), or just by using the built-in trim + method (text.trim()).
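Reviewer sketch, not part of the patch: the two remedies named in this hunk can be compared side by side. The class and method names below are hypothetical. The sketch only assumes that the look-behind pattern and `String.trim()` are meant to strip the same leading and trailing whitespace for ordinary ASCII input; `trim()` removes every character at or below U+0020, so the two can differ on other inputs.

```java
import java.util.regex.Pattern;

class TrimWhitespace {
    // Look-behind variant quoted in the help text: the trailing branch can only
    // start matching at the first character of a whitespace run.
    private static final Pattern SAFE = Pattern.compile("^\\s+|(?<!\\s)\\s+$");

    // Hypothetical helper: strip leading/trailing whitespace with the regex.
    static String stripWithRegex(String text) {
        return SAFE.matcher(text).replaceAll("");
    }

    // Hypothetical helper: strip using the built-in method instead.
    static String stripWithTrim(String text) {
        return text.trim();
    }

    public static void main(String[] args) {
        String input = "  \t hello world \n ";
        System.out.println("[" + stripWithRegex(input) + "]");
        System.out.println("[" + stripWithTrim(input) + "]");
    }
}
```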

- Note that the sub-expression "^\s+" is + Note that the sub-expression "^\\s+" is not problematic as the ^ anchor restricts when that sub-expression can start matching, and as the regular expression engine matches from left to right. @@ -70,8 +70,8 @@ using scientific notation:

- - ^0\.\d+E?\d+$ # BAD + + "^0\\.\\d+E?\\d+$" // BAD

@@ -97,7 +97,7 @@ To make the processing faster, the regular expression should be rewritten such that the two \d+ sub-expressions - do not have overlapping matches: ^0\.\d+(E\d+)?$. + do not have overlapping matches: "^0\\.\\d+(E\\d+)?$".
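Reviewer sketch, not part of the patch: a minimal check of the rewritten pattern from this hunk. The class and method names are hypothetical. Because the second `\d+` is now reachable only after an `E`, the two digit runs can no longer compete for the same characters. One side effect worth noting: the rewritten pattern also accepts a single fractional digit such as `0.5`, which the original pattern rejects because it requires two non-empty digit runs.

```java
import java.util.regex.Pattern;

class ScientificNotation {
    // Rewritten pattern from the help text: the exponent digits are only
    // attempted after a literal 'E', so the two \d+ runs do not overlap.
    private static final Pattern SAFE = Pattern.compile("^0\\.\\d+(E\\d+)?$");

    // Hypothetical helper: true if the whole string is of the form 0.<digits>[E<digits>].
    static boolean isValid(String s) {
        return SAFE.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0.25", "0.25E10", "0.25E", "1.25"}) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```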

diff --git a/java/ql/src/Security/CWE/CWE-730/ReDoS.qhelp b/java/ql/src/Security/CWE/CWE-730/ReDoS.qhelp index 9cfbcc32354..08b67acb638 100644 --- a/java/ql/src/Security/CWE/CWE-730/ReDoS.qhelp +++ b/java/ql/src/Security/CWE/CWE-730/ReDoS.qhelp @@ -10,7 +10,7 @@

Consider this regular expression:

- + ^_(__|.)+_$

@@ -24,7 +24,7 @@ This problem can be avoided by rewriting the regular expression to remove the ambiguity between the two branches of the alternative inside the repetition:

- + ^_(__|[^_])+_$ diff --git a/java/ql/src/Security/CWE/CWE-730/ReDoSIntroduction.inc.qhelp b/java/ql/src/Security/CWE/CWE-730/ReDoSIntroduction.inc.qhelp index f533097c222..f6e4dbd0a5f 100644 --- a/java/ql/src/Security/CWE/CWE-730/ReDoSIntroduction.inc.qhelp +++ b/java/ql/src/Security/CWE/CWE-730/ReDoSIntroduction.inc.qhelp @@ -17,7 +17,7 @@

- The regular expression engine provided by Python uses a backtracking non-deterministic finite + The regular expression engine provided by Java uses a backtracking non-deterministic finite automata to implement regular expression matching. While this approach is space-efficient and allows supporting advanced features like capture groups, it is not time-efficient in general. The worst-case @@ -38,6 +38,11 @@ references.

+ +

+ Note that Java versions 9 and above have some mitigations against ReDoS; however, they are not perfect, + and more complex regular expressions can still be affected by this problem. +

@@ -48,6 +53,8 @@ ensure that the strings matched with the regular expression are short enough that the time-complexity does not matter. + Alternatively, a regex library that guarantees linear-time execution, such as Google's RE2J, may be used. +
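Reviewer sketch, not part of the patch: the "short enough" advice in this hunk could be applied as a length check before matching. The class name, method name, and the limit of 1000 characters are all hypothetical; the pattern is the `// BAD` one from PolynomialReDoS.qhelp, kept here only to show that bounding the input caps the cost even when the pattern itself is not rewritten.

```java
import java.util.regex.Pattern;

class BoundedMatch {
    // Hypothetical limit; pick a value that suits the application.
    static final int MAX_INPUT_LENGTH = 1000;

    // Polynomial-time pattern from the help text, left unmodified on purpose.
    private static final Pattern EDGE_WHITESPACE = Pattern.compile("^\\s+|\\s+$");

    // Hypothetical helper: reject over-long input before the regex ever runs.
    static String stripBounded(String text) {
        if (text.length() > MAX_INPUT_LENGTH) {
            throw new IllegalArgumentException("input exceeds " + MAX_INPUT_LENGTH + " characters");
        }
        return EDGE_WHITESPACE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("[" + stripBounded("  short input  ") + "]");
    }
}
```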