Merge branch 'main' into henrymercer/rc-3.11-mergeback

Henry Mercer
2023-10-03 16:30:23 +01:00
1450 changed files with 135236 additions and 95589 deletions


@@ -1,8 +1,137 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<include src="IncompleteSanitization.qhelp" />
<overview>
<p>
Sanitizing untrusted input is a common technique for preventing injection attacks and other security
vulnerabilities. Regular expressions are often used to perform this sanitization. However, when the
regular expression matches multiple consecutive characters, replacing it just once
can result in the unsafe text reappearing in the sanitized input.
</p>
<p>
Attackers can exploit this issue by crafting inputs that, when sanitized with an ineffective regular
expression, still contain malicious code or content. This can lead to code execution, data exposure,
or other vulnerabilities.
</p>
</overview>
<recommendation>
<p>
To prevent this issue, it is highly recommended to use a well-tested sanitization library whenever
possible. These libraries are more likely to handle corner cases and ensure effective sanitization.
</p>
<p>
If a library is not an option, you can consider alternative strategies to fix the issue. For example,
you can apply the regular expression replacement repeatedly until no more replacements can be performed,
or rewrite the regular expression to match single characters instead of the entire unsafe text.
</p>
</recommendation>
<example>
<p>
Consider the following JavaScript code that aims to remove all HTML comment start and end tags:
</p>
<sample language="javascript">
str.replace(/&lt;!--|--!?&gt;/g, "");
</sample>
<p>
Given the input string "&lt;!&lt;!--- comment ---&gt;&gt;", the output will be "&lt;!-- comment --&gt;",
which still contains an HTML comment.
</p>
<p>
One possible fix for this issue is to apply the regular expression replacement repeatedly until no
more replacements can be performed. This ensures that the unsafe text does not reappear in the sanitized input, effectively
removing all instances of the targeted pattern:
</p>
<sample language="javascript">
function removeHtmlComments(input) {
  let previous;
  do {
    previous = input;
    input = input.replace(/&lt;!--|--!?&gt;/g, "");
  } while (input !== previous);
  return input;
}
</sample>
</example>
<example>
<p>
Another example is the following regular expression intended to remove script tags:
</p>
<sample language="javascript">
str.replace(/&lt;script\b[^&lt;]*(?:(?!&lt;\/script&gt;)&lt;[^&lt;]*)*&lt;\/script&gt;/g, "");
</sample>
<p>
If the input string is "&lt;scrip&lt;script&gt;is removed&lt;/script&gt;t&gt;alert(123)&lt;/script&gt;",
the output will be "&lt;script&gt;alert(123)&lt;/script&gt;", which still contains a script tag.
</p>
<p>
A fix for this issue is to rewrite the regular expression to match single characters
("&lt;" and "&gt;") instead of the entire unsafe text. This simplifies the sanitization process
and ensures that all potentially unsafe characters are removed:
</p>
<sample language="javascript">
function removeAllHtmlTags(input) {
  return input.replace(/&lt;|&gt;/g, "");
}
</sample>
<p>
Another potential fix is to use the popular <code>sanitize-html</code> npm library.
It keeps most of the safe HTML tags while removing all unsafe tags and attributes.
</p>
<sample language="javascript">
const sanitizeHtml = require("sanitize-html");
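function removeAllHtmlTags(input) {
  // Note: despite the function name, the default options of sanitize-html
  // keep an allowlist of safe tags and strip the rest.
  return sanitizeHtml(input);
}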
</sample>
</example>
<example>
<p>
Lastly, consider a path sanitizer using the regular expression <code>/\.\.\//</code>:
</p>
<sample language="javascript">
str.replace(/\.\.\//g, "");
</sample>
<p>
The regular expression attempts to strip out all occurrences of <code>../</code> from <code>str</code>.
This will not work as expected: for the string <code>....//</code>, for example, it will remove the single
occurrence of <code>../</code> in the middle, but the remainder of the string then becomes
<code>../</code>, which is another instance of the substring we were trying to remove.
</p>
<p>
A possible fix for this issue is to use the <code>sanitize-filename</code> npm library for path sanitization.
This library is designed to make a string safe to use as a file name, so it is more likely to handle
corner cases than a hand-written regular expression:
</p>
<sample language="javascript">
const sanitize = require("sanitize-filename");
function sanitizePath(input) {
  return sanitize(input);
}
</sample>
</example>
<references>
<li>OWASP Top 10: <a href="https://www.owasp.org/index.php/Top_10-2017_A1-Injection">A1 Injection</a>.</li>
<li>Stack Overflow: <a href="https://stackoverflow.com/questions/6659351/removing-all-script-tags-from-html-with-js-regular-expression">Removing all script tags from HTML with JS regular expression</a>.</li>
</references>
</qhelp>


@@ -43,18 +43,6 @@ needed, for instance by using prepared statements for SQL queries.
Otherwise, make sure to use a regular expression with the <code>g</code> flag to ensure that
all occurrences are replaced, and remember to escape backslashes if applicable.
</p>
<p>
Note, however, that this is generally <i>not</i> sufficient for replacing multi-character strings:
the <code>String.prototype.replace</code> method only performs one pass over the input string,
and will not replace further instances of the string that result from earlier replacements.
</p>
<p>
For example, consider the code snippet <code>s.replace(/\/\.\.\//g, "")</code>, which attempts
to strip out all occurrences of <code>/../</code> from <code>s</code>. This will not work as
expected: for the string <code>/./.././</code>, for example, it will remove the single
occurrence of <code>/../</code> in the middle, but the remainder of the string then becomes
<code>/../</code>, which is another instance of the substring we were trying to remove.
</p>
</recommendation>
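<p>
As an illustration of the advice about the g flag and backslashes, here is a minimal sketch.
The helper name escapeDoubleQuoted and the target context (a double-quoted string literal) are
assumptions made for this example and are not part of the original help text.
</p>

```javascript
const input = 'a"b"c';

// A string pattern (like a regex without the g flag) replaces only the first match.
const firstOnly = input.replace('"', '\\"'); // a\"b"c

// With the g flag, every occurrence is replaced.
const allQuotes = input.replace(/"/g, '\\"'); // a\"b\"c

// Hypothetical helper: escape backslashes first, then quotes, so that a
// backslash already present in the input cannot cancel out the escaping.
function escapeDoubleQuoted(s) {
  return s.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

console.log(firstOnly);
console.log(allQuotes);
console.log(escapeDoubleQuoted('\\"')); // \\\"
```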
<example>


@@ -16,6 +16,10 @@ To guard against untrusted URL redirection, it is advisable to avoid putting use
directly into a redirect URL. Instead, maintain a list of authorized
redirects on the server; then choose from that list based on the user input provided.
</p>
<p>
If this is not possible, then the user input should be validated in some other way,
for example, by verifying that the target URL is on the same host as the current page.
</p>
</recommendation>
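<p>
The allowlist approach described above can be sketched as a lookup table. This is only an
illustration: the AUTHORIZED_REDIRECTS entries, the key names, and the resolveRedirect helper
are hypothetical and would be replaced by the application's own destinations.
</p>

```javascript
// Hypothetical server-side list of authorized redirect targets.
const AUTHORIZED_REDIRECTS = new Map([
  ["home", "/"],
  ["docs", "/docs/index.html"],
]);

// The user input only selects an entry; it never becomes part of the URL.
// Unknown keys fall back to the site root.
function resolveRedirect(key) {
  return AUTHORIZED_REDIRECTS.has(key) ? AUTHORIZED_REDIRECTS.get(key) : "/";
}

console.log(resolveRedirect("docs"));                  // /docs/index.html
console.log(resolveRedirect("https://evil.example/")); // /
```

<p>
In an Express handler, this could be used as res.redirect(resolveRedirect(req.query["target"])).
</p>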
<example>
@@ -32,6 +36,21 @@ before doing the redirection:
</p>
<sample src="examples/ServerSideUrlRedirectGood.js"/>
<p>
Alternatively, we can check that the target URL does not redirect to a different host
by parsing it relative to a base URL with a known host and verifying that the host
stays the same:
</p>
<sample src="examples/ServerSideUrlRedirectGood2.js"/>
<p>
Note that as written, the above code will allow redirects to URLs on <code>example.com</code>,
which is harmless but perhaps not intended. You can substitute your own domain (if known) for
<code>example.com</code> to prevent this.
</p>
</example>
<references>


@@ -1,6 +1,6 @@
 const app = require("express")();

-app.get('/some/path', function(req, res) {
+app.get("/redirect", function (req, res) {
   // BAD: a request parameter is incorporated without validation into a URL redirect
-  res.redirect(req.param("target"));
+  res.redirect(req.query["target"]);
 });


@@ -2,9 +2,12 @@ const app = require("express")();
 const VALID_REDIRECT = "http://cwe.mitre.org/data/definitions/601.html";

-app.get('/some/path', function(req, res) {
+app.get("/redirect", function (req, res) {
   // GOOD: the request parameter is validated against a known fixed string
-  let target = req.param("target");
-  if (VALID_REDIRECT === target)
+  let target = req.query["target"];
+  if (VALID_REDIRECT === target) {
     res.redirect(target);
+  } else {
+    res.redirect("/");
+  }
 });


@@ -0,0 +1,22 @@
const app = require("express")();

function isLocalUrl(path) {
  try {
    return (
      // TODO: consider substituting your own domain for example.com
      new URL(path, "https://example.com").origin === "https://example.com"
    );
  } catch (e) {
    return false;
  }
}

app.get("/redirect", function (req, res) {
  // GOOD: check that we don't redirect to a different host
  let target = req.query["target"];
  if (isLocalUrl(target)) {
    res.redirect(target);
  } else {
    res.redirect("/");
  }
});
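
To see how the origin check behaves, the sketch below repeats `isLocalUrl` from the file above without the Express handler and runs it on a few inputs. The inputs and the `evil.example` host are made up for this demonstration.

```javascript
// Same check as in the example file: resolve the target against a base URL
// with a known host and compare origins.
function isLocalUrl(path) {
  try {
    return new URL(path, "https://example.com").origin === "https://example.com";
  } catch (e) {
    return false;
  }
}

const cases = [
  "/account/settings",        // relative path: stays on the base host
  "https://evil.example/",    // absolute URL to another host
  "//evil.example/login",     // protocol-relative URL: changes the host
  "/\\evil.example",          // a backslash is parsed like a slash in https URLs
  "https://example.com/page", // same origin as the placeholder base
];

const results = cases.map(isLocalUrl);
cases.forEach((c, i) => console.log(JSON.stringify(c), results[i]));
// Results, in order: true, false, false, false, true
```

The last case is the one the help text warns about: absolute URLs on `example.com` pass the check unless the placeholder is replaced with your own domain.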