mirror of
https://github.com/github/codeql.git
synced 2026-04-25 08:45:14 +02:00
Merge branch 'main' into henrymercer/rc-3.11-mergeback
This commit is contained in:
@@ -1,8 +1,137 @@
|
||||
<!DOCTYPE qhelp PUBLIC
|
||||
"-//Semmle//qhelp//EN"
|
||||
"qhelp.dtd">
|
||||
"-//Semmle//qhelp//EN"
|
||||
"qhelp.dtd">
|
||||
<qhelp>
|
||||
|
||||
<include src="IncompleteSanitization.qhelp" />
|
||||
<overview>
|
||||
<p>
|
||||
Sanitizing untrusted input is a common technique for preventing injection attacks and other security
|
||||
vulnerabilities. Regular expressions are often used to perform this sanitization. However, when the
|
||||
regular expression matches multiple consecutive characters, replacing it just once
|
||||
can result in the unsafe text reappearing in the sanitized input.
|
||||
</p>
|
||||
<p>
|
||||
Attackers can exploit this issue by crafting inputs that, when sanitized with an ineffective regular
|
||||
expression, still contain malicious code or content. This can lead to code execution, data exposure,
|
||||
or other vulnerabilities.
|
||||
</p>
|
||||
</overview>
|
||||
|
||||
<recommendation>
|
||||
<p>
|
||||
To prevent this issue, it is highly recommended to use a well-tested sanitization library whenever
|
||||
possible. These libraries are more likely to handle corner cases and ensure effective sanitization.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
If a library is not an option, you can consider alternative strategies to fix the issue. For example,
|
||||
applying the regular expression replacement repeatedly until no more replacements can be performed, or rewriting the regular
|
||||
expression to match single characters instead of the entire unsafe text.
|
||||
</p>
|
||||
</recommendation>
|
||||
|
||||
<example>
|
||||
<p>
|
||||
Consider the following JavaScript code that aims to remove all HTML comment start and end tags:
|
||||
</p>
|
||||
|
||||
<sample language="javascript">
|
||||
str.replace(/<!--|--!?>/g, "");
|
||||
</sample>
|
||||
|
||||
<p>
|
||||
Given the input string "<!<!--- comment --->>", the output will be "<!-- comment -->",
|
||||
which still contains an HTML comment.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
One possible fix for this issue is to apply the regular expression replacement repeatedly until no
|
||||
more replacements can be performed. This ensures that the unsafe text does not re-appear in the sanitized input, effectively
|
||||
removing all instances of the targeted pattern:
|
||||
</p>
|
||||
|
||||
<sample language="javascript">
|
||||
function removeHtmlComments(input) {
|
||||
let previous;
|
||||
do {
|
||||
previous = input;
|
||||
input = input.replace(/<!--|--!?>/g, "");
|
||||
} while (input !== previous);
|
||||
return input;
|
||||
}
|
||||
</sample>
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<p>
|
||||
Another example is the following regular expression intended to remove script tags:
|
||||
</p>
|
||||
|
||||
<sample language="javascript">
|
||||
str.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/g, "");
|
||||
</sample>
|
||||
|
||||
<p>
|
||||
If the input string is "<scrip<script>is removed</script>t>alert(123)</script>",
|
||||
the output will be "<script>alert(123)</script>", which still contains a script tag.
|
||||
</p>
|
||||
<p>
|
||||
A fix for this issue is to rewrite the regular expression to match single characters
|
||||
("<" and ">") instead of the entire unsafe text. This simplifies the sanitization process
|
||||
and ensures that all potentially unsafe characters are removed:
|
||||
</p>
|
||||
<sample language="javascript">
|
||||
function removeAllHtmlTags(input) {
|
||||
return input.replace(/<|>/g, "");
|
||||
}
|
||||
</sample>
|
||||
<p>
|
||||
Another potential fix is to use the popular <code>sanitize-html</code> npm library.
|
||||
It keeps most of the safe HTML tags while removing all unsafe tags and attributes.
|
||||
</p>
|
||||
<sample language="javascript">
|
||||
const sanitizeHtml = require("sanitize-html");
|
||||
function removeAllHtmlTags(input) {
|
||||
return sanitizeHtml(input);
|
||||
}
|
||||
</sample>
|
||||
|
||||
</example>
|
||||
|
||||
<example>
|
||||
<p>
|
||||
Lastly, consider a path sanitizer using the regular expression <code>/\.\.\//</code>:
|
||||
</p>
|
||||
|
||||
<sample language="javascript">
|
||||
str.replace(/\.\.\//g, "");
|
||||
</sample>
|
||||
|
||||
<p>
|
||||
The regular expression attempts to strip out all occurences of <code>/../</code> from <code>str</code>.
|
||||
This will not work as expected: for the string <code>/./.././</code>, for example, it will remove the single
|
||||
occurrence of <code>/../</code> in the middle, but the remainder of the string then becomes
|
||||
<code>/../</code>, which is another instance of the substring we were trying to remove.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
A possible fix for this issue is to use the "sanitize-filename" npm library for path sanitization.
|
||||
This library is specifically designed to handle path sanitization, and should handle all corner cases
|
||||
and ensure effective sanitization:
|
||||
</p>
|
||||
|
||||
<sample language="javascript">
|
||||
const sanitize = require("sanitize-filename");
|
||||
|
||||
function sanitizePath(input) {
|
||||
return sanitize(input);
|
||||
}
|
||||
</sample>
|
||||
|
||||
</example>
|
||||
|
||||
<references>
|
||||
<li>OWASP Top 10: <a href="https://www.owasp.org/index.php/Top_10-2017_A1-Injection">A1 Injection</a>.</li>
|
||||
<li>Stack Overflow: <a href="https://stackoverflow.com/questions/6659351/removing-all-script-tags-from-html-with-js-regular-expression">Removing all script tags from HTML with JS regular expression</a>.</li>
|
||||
</references>
|
||||
</qhelp>
|
||||
|
||||
@@ -43,18 +43,6 @@ needed, for instance by using prepared statements for SQL queries.
|
||||
Otherwise, make sure to use a regular expression with the <code>g</code> flag to ensure that
|
||||
all occurrences are replaced, and remember to escape backslashes if applicable.
|
||||
</p>
|
||||
<p>
|
||||
Note, however, that this is generally <i>not</i> sufficient for replacing multi-character strings:
|
||||
the <code>String.prototype.replace</code> method only performs one pass over the input string,
|
||||
and will not replace further instances of the string that result from earlier replacements.
|
||||
</p>
|
||||
<p>
|
||||
For example, consider the code snippet <code>s.replace(/\/\.\.\//g, "")</code>, which attempts
|
||||
to strip out all occurences of <code>/../</code> from <code>s</code>. This will not work as
|
||||
expected: for the string <code>/./.././</code>, for example, it will remove the single
|
||||
occurrence of <code>/../</code> in the middle, but the remainder of the string then becomes
|
||||
<code>/../</code>, which is another instance of the substring we were trying to remove.
|
||||
</p>
|
||||
</recommendation>
|
||||
|
||||
<example>
|
||||
|
||||
@@ -16,6 +16,10 @@ To guard against untrusted URL redirection, it is advisable to avoid putting use
|
||||
directly into a redirect URL. Instead, maintain a list of authorized
|
||||
redirects on the server; then choose from that list based on the user input provided.
|
||||
</p>
|
||||
<p>
|
||||
If this is not possible, then the user input should be validated in some other way,
|
||||
for example, by verifying that the target URL is on the same host as the current page.
|
||||
</p>
|
||||
</recommendation>
|
||||
|
||||
<example>
|
||||
@@ -32,6 +36,21 @@ before doing the redirection:
|
||||
</p>
|
||||
|
||||
<sample src="examples/ServerSideUrlRedirectGood.js"/>
|
||||
|
||||
<p>
|
||||
Alternatively, we can check that the target URL does not redirect to a different host
|
||||
by parsing it relative to a base URL with a known host and verifying that the host
|
||||
stays the same:
|
||||
</p>
|
||||
|
||||
<sample src="examples/ServerSideUrlRedirectGood2.js"/>
|
||||
|
||||
<p>
|
||||
Note that as written, the above code will allow redirects to URLs on <code>example.com</code>,
|
||||
which is harmless but perhaps not intended. You can substitute your own domain (if known) for
|
||||
<code>example.com</code> to prevent this.
|
||||
</p>
|
||||
|
||||
</example>
|
||||
|
||||
<references>
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
const app = require("express")();
|
||||
|
||||
app.get('/some/path', function(req, res) {
|
||||
app.get("/redirect", function (req, res) {
|
||||
// BAD: a request parameter is incorporated without validation into a URL redirect
|
||||
res.redirect(req.param("target"));
|
||||
res.redirect(req.query["target"]);
|
||||
});
|
||||
|
||||
@@ -2,9 +2,12 @@ const app = require("express")();
|
||||
|
||||
const VALID_REDIRECT = "http://cwe.mitre.org/data/definitions/601.html";
|
||||
|
||||
app.get('/some/path', function(req, res) {
|
||||
app.get("/redirect", function (req, res) {
|
||||
// GOOD: the request parameter is validated against a known fixed string
|
||||
let target = req.param("target");
|
||||
if (VALID_REDIRECT === target)
|
||||
let target = req.query["target"];
|
||||
if (VALID_REDIRECT === target) {
|
||||
res.redirect(target);
|
||||
} else {
|
||||
res.redirect("/");
|
||||
}
|
||||
});
|
||||
|
||||
@@ -0,0 +1,22 @@
|
||||
const app = require("express")();
|
||||
|
||||
function isLocalUrl(path) {
|
||||
try {
|
||||
return (
|
||||
// TODO: consider substituting your own domain for example.com
|
||||
new URL(path, "https://example.com").origin === "https://example.com"
|
||||
);
|
||||
} catch (e) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
app.get("/redirect", function (req, res) {
|
||||
// GOOD: check that we don't redirect to a different host
|
||||
let target = req.query["target"];
|
||||
if (isLocalUrl(target)) {
|
||||
res.redirect(target);
|
||||
} else {
|
||||
res.redirect("/");
|
||||
}
|
||||
});
|
||||
Reference in New Issue
Block a user