mirror of
https://github.com/github/codeql.git
synced 2026-04-26 01:05:15 +02:00
docs: rename ql-training-rst > ql-training
(cherry picked from commit 65573492e7)
141
docs/language/ql-training/java/apache-struts-java.rst
Normal file
@@ -0,0 +1,141 @@
|
||||
=======================
|
||||
Exercise: Apache Struts
|
||||
=======================
|
||||
|
||||
.. container:: subheading
|
||||
|
||||
Unsafe deserialization leading to an RCE
|
||||
|
||||
CVE-2017-9805
|
||||
|
||||
.. container:: semmle-logo
|
||||
|
||||
Semmle :sup:`TM`
|
||||
|
||||
.. rst-class:: setup
|
||||
|
||||
Setup
|
||||
=====
|
||||
|
||||
For this example you should download:
|
||||
|
||||
- `QL for Eclipse <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/install-plugin-free.html>`__
|
||||
- `Apache Struts snapshot <https://downloads.lgtm.com/snapshots/java/apache/struts/apache-struts-7fd1622-CVE-2018-11776.zip>`__
|
||||
|
||||
.. note::
|
||||
|
||||
For this example, we will be analyzing `Apache Struts <https://github.com/apache/struts>`__.
|
||||
|
||||
You can also query the project in `the query console <https://lgtm.com/query/project:1878521151/lang:java/>`__ on LGTM.com.
|
||||
|
||||
.. insert snapshot-note.rst to explain differences between snapshot available to download and the version available in the query console.
|
||||
|
||||
.. include:: ../slide-snippets/snapshot-note.rst
|
||||
|
||||
.. resume slides
|
||||
|
||||
Unsafe deserialization in Struts
|
||||
================================
|
||||
|
||||
Apache Struts provides a ``ContentTypeHandler`` interface, which can be implemented for specific content types. It defines the following interface method:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
void toObject(Reader in, Object target);
|
||||
|
||||
|
||||
which is intended to populate the ``target`` object with data from the reader, usually through deserialization. However, the ``in`` parameter should be considered untrusted, and should not be deserialized without sanitization.
|
||||
|
||||
RCE in Apache Struts
|
||||
====================
|
||||
|
||||
- Vulnerable code looked like this (`original <https://lgtm.com/projects/g/apache/struts/snapshot/b434c23f95e0f9d5bde789bfa07f8fc1d5a8951d/files/plugins/rest/src/main/java/org/apache/struts2/rest/handler/XStreamHandler.java?sort=name&dir=ASC&mode=heatmap#L45>`__):
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
public void toObject(Reader in, Object target) {
|
||||
XStream xstream = createXStream();
|
||||
xstream.fromXML(in, target);
|
||||
}
|
||||
|
||||
- XStream allows deserialization of **dynamic proxies**, which permit remote code execution.
|
||||
|
||||
- Disclosed as `CVE-2017-9805 <http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-9805>`__
|
||||
|
||||
- Blog post: https://lgtm.com/blog/apache_struts_CVE-2017-9805
|
||||
|
||||
Finding the RCE yourself
|
||||
========================
|
||||
|
||||
#. Create a QL class to find the interface ``org.apache.struts2.rest.handler.ContentTypeHandler``
|
||||
|
||||
**Hint**: Use predicate ``hasQualifiedName(...)``
|
||||
|
||||
#. Identify methods called ``toObject``, which are defined on direct subtypes of ``ContentTypeHandler``
|
||||
|
||||
**Hint**: Use ``Method.getDeclaringType()`` and ``Type.getASupertype()``
|
||||
|
||||
#. Implement a ``DataFlow::Configuration``, defining the source as the first parameter of a ``toObject`` method, and the sink as an instance of ``UnsafeDeserializationSink``.
|
||||
|
||||
**Hint**: Use ``Node::asParameter()``
|
||||
|
||||
#. Construct the query as a path-problem query, and verify you find one result.
|
||||
|
||||
Model answer, step 1
|
||||
====================
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import java
|
||||
|
||||
/** The interface `org.apache.struts2.rest.handler.ContentTypeHandler`. */
|
||||
|
||||
class ContentTypeHandler extends RefType {
|
||||
ContentTypeHandler() {
|
||||
this.hasQualifiedName("org.apache.struts2.rest.handler", "ContentTypeHandler")
|
||||
}
|
||||
}
|
||||
|
||||
Model answer, step 2
|
||||
====================
|
||||
|
||||
.. code-block:: ql

   /** A `toObject` method on a subtype of `org.apache.struts2.rest.handler.ContentTypeHandler`. */
   class ContentTypeHandlerDeserialization extends Method {
     ContentTypeHandlerDeserialization() {
       this.getDeclaringType().getASupertype() instanceof ContentTypeHandler and
       this.hasName("toObject")
     }
   }
|
||||
|
||||
Model answer, step 3
|
||||
====================
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import UnsafeDeserialization
|
||||
import semmle.code.java.dataflow.DataFlow::DataFlow
|
||||
/**
|
||||
* Configuration that tracks the flow of taint from the first parameter of
|
||||
* `ContentTypeHandler.toObject` to an instance of unsafe deserialization.
|
||||
*/
|
||||
class StrutsUnsafeDeserializationConfig extends Configuration {
|
||||
StrutsUnsafeDeserializationConfig() { this = "StrutsUnsafeDeserializationConfig" }
|
||||
override predicate isSource(Node source) {
|
||||
source.asParameter() = any(ContentTypeHandlerDeserialization des).getParameter(0)
|
||||
}
|
||||
override predicate isSink(Node sink) { sink instanceof UnsafeDeserializationSink }
|
||||
}
|
||||
|
||||
Model answer, step 4
|
||||
====================
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import PathGraph
|
||||
...
|
||||
from PathNode source, PathNode sink, StrutsUnsafeDeserializationConfig conf
|
||||
where conf.hasFlowPath(source, sink)
|
||||
and sink.getNode() instanceof UnsafeDeserializationSink
|
||||
select sink.getNode().(UnsafeDeserializationSink).getMethodAccess(), source, sink, "Unsafe deserialization of $@.", source, "user input"
|
||||
|
||||
More full-featured version: https://github.com/Semmle/demos/tree/master/ql_demos/java/Apache_Struts_CVE-2017-9805
|
||||
146
docs/language/ql-training/java/data-flow-java.rst
Normal file
@@ -0,0 +1,146 @@
|
||||
=========================
|
||||
Introduction to data flow
|
||||
=========================
|
||||
|
||||
.. container:: semmle-logo
|
||||
|
||||
Semmle :sup:`TM`
|
||||
|
||||
Finding SPARQL injection vulnerabilities in Java
|
||||
|
||||
.. rst-class:: setup
|
||||
|
||||
Setup
|
||||
=====
|
||||
|
||||
For this example you should download:
|
||||
|
||||
- `QL for Eclipse <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/install-plugin-free.html>`__
|
||||
- `VIVO Vitro snapshot <http://downloads.lgtm.com/snapshots/java/vivo-project/Vitro/vivo-project_Vitro_java-srcVersion_47ae42c01954432c3c3b92d5d163551ce367f510-dist_odasa-lgtm-2019-04-23-7ceff95-linux64.zip>`__
|
||||
|
||||
.. note::
|
||||
|
||||
For this example, we will be analyzing `VIVO Vitro <https://github.com/vivo-project/Vitro>`__.
|
||||
|
||||
You can also query the project in `the query console <https://lgtm.com/query/project:14040005/lang:java/>`__ on LGTM.com.
|
||||
|
||||
.. insert snapshot-note.rst to explain differences between snapshot available to download and the version available in the query console.
|
||||
|
||||
.. include:: ../slide-snippets/snapshot-note.rst
|
||||
|
||||
.. resume slides
|
||||
|
||||
.. rst-class:: agenda
|
||||
|
||||
Agenda
|
||||
======
|
||||
|
||||
- SPARQL injection
|
||||
- Data flow
|
||||
- Modules and libraries
|
||||
- Local data flow
|
||||
- Local taint tracking
|
||||
|
||||
Motivation
|
||||
==========
|
||||
|
||||
`SPARQL <https://en.wikipedia.org/wiki/SPARQL>`__ is a language for querying key-value databases in RDF format, which can suffer from SQL injection-like vulnerabilities:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
sparqlAskQuery("ASK { <" + individualURI + "> ?p ?o }")
|
||||
|
||||
``individualURI`` is provided by a user, so an attacker can supply a ``>`` that closes the ``<`` prematurely, and then provide additional query content.
|
||||
|
||||
**Goal**: Find query strings that are created by concatenation.
|
||||
|
||||
.. note::
|
||||
|
||||
If you have completed the “Example: Query injection” slide deck which was part of the previous course, this example will look familiar to you.
|
||||
|
||||
To understand the scope of this vulnerability, consider what would happen if a malicious user could provide the following as the content of the ``individualURI`` variable:
|
||||
|
||||
``“http://vivoweb.org/ontology/core#FacultyMember> ?p ?o . FILTER regex("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!", "(.*a){50}") } #``
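To make the effect of the concatenation concrete, here is a minimal Java sketch. The class ``SparqlConcatDemo``, the helper ``buildAskQuery``, and the ``example.org`` URIs are hypothetical names used only for illustration; they are not part of Vitro. The sketch only builds the query strings and does not send them anywhere.

```java
public class SparqlConcatDemo {
    /** Hypothetical helper that mirrors the concatenation shown on the slide. */
    public static String buildAskQuery(String individualURI) {
        return "ASK { <" + individualURI + "> ?p ?o }";
    }

    public static void main(String[] args) {
        // A well-behaved caller passes a plain URI.
        String benign = "http://example.org/resource/1";
        // An attacker closes the '<' early and appends an extra pattern;
        // the trailing '#' is intended to comment out the rest of the query.
        String hostile = "http://example.org/x> ?p ?o . FILTER regex(\"aaaa!\", \"(.*a){50}\") } #";

        System.out.println(buildAskQuery(benign));
        System.out.println(buildAskQuery(hostile));
    }
}
```

Printing both strings side by side shows that the second query contains a ``FILTER`` clause that the application never intended to run.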
|
||||
|
||||
|
||||
Example: SPARQL injection
|
||||
=========================
|
||||
|
||||
We can write a simple query that finds string concatenations that occur in calls to SPARQL query APIs.
|
||||
|
||||
.. rst-class:: build
|
||||
|
||||
.. literalinclude:: ../query-examples/java/data-flow-java-1.ql
|
||||
:language: ql
|
||||
|
||||
.. note::
|
||||
|
||||
This is similar, but not identical, to the formulation we had in the previous training deck. It has been rewritten to make it easier for the next step.
|
||||
|
||||
Success! But also missing results...
|
||||
====================================
|
||||
|
||||
Query finds a CVE reported by Semmle (CVE-2019-6986), plus one other result, but misses other opportunities where:
|
||||
|
||||
- String concatenation occurs on a different line in the same method.
|
||||
- String concatenation occurs in a different method.
|
||||
- String concatenation occurs through ``StringBuilders`` or similar.
|
||||
- Entirety of user input is provided as the query.
|
||||
|
||||
We want to improve our query to catch more of these cases.
|
||||
|
||||
.. note::
|
||||
|
||||
For more details of the CVE, see: https://github.com/Semmle/SecurityExploits/tree/master/vivo-project/CVE-2019-6986
|
||||
|
||||
As an example, consider this SPARQL query call:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
String queryString = "ASK { <" + individualURI + "> ?p ?o }";
|
||||
sparqlAskQuery(queryString);
|
||||
|
||||
Here the concatenation occurs before the call, so the existing query would miss this - the string concatenation does not occur *directly* as the first argument of the call.
|
||||
|
||||
.. include general data flow slides
|
||||
|
||||
.. include:: ../slide-snippets/local-data-flow.rst
|
||||
|
||||
.. resume language-specific slides
|
||||
|
||||
Exercise: revisiting SPARQL injection
|
||||
=====================================
|
||||
|
||||
Refine the query to find string concatenation that occurs in the same method, but a different line.
|
||||
|
||||
**Hint**: Use ``DataFlow::localFlow`` to assert that the result flows to the SPARQL call argument, using ``DataFlow::exprNode`` to get the data flow nodes for the relevant expression nodes.
|
||||
|
||||
.. rst-class:: build
|
||||
|
||||
.. literalinclude:: ../query-examples/java/data-flow-java-2.ql
|
||||
:language: ql
|
||||
|
||||
Refinements (take home exercise)
|
||||
================================
|
||||
|
||||
In Java, strings are often created using ``StringBuilder`` and ``StringBuffer`` classes. For example:
|
||||
|
||||
.. code-block:: java

   StringBuilder queryBuilder = new StringBuilder();
   queryBuilder.append("ASK { <");
   queryBuilder.append(individualURI);
   queryBuilder.append("> ?p ?o }");
   sparqlAskQuery(queryBuilder.toString());
|
||||
|
||||
**Exercise**: Refine the query to consider strings created from ``StringBuilder`` and ``StringBuffer`` classes as sources of concatenation.
|
||||
|
||||
Beyond local data flow
|
||||
======================
|
||||
|
||||
- We are still missing possible results.
|
||||
|
||||
- Concatenation that occurs outside the enclosing method.
|
||||
|
||||
- Instead, let’s turn the problem around and find user-controlled data that flows into a SPARQL query call, potentially through other method calls.
|
||||
- This needs :doc:`global data flow <global-data-flow-java>`.
|
||||
190
docs/language/ql-training/java/global-data-flow-java.rst
Normal file
@@ -0,0 +1,190 @@
|
||||
================================
|
||||
Introduction to global data flow
|
||||
================================
|
||||
|
||||
QL for Java
|
||||
|
||||
.. container:: semmle-logo
|
||||
|
||||
Semmle :sup:`TM`
|
||||
|
||||
.. rst-class:: setup
|
||||
|
||||
Setup
|
||||
=====
|
||||
|
||||
For this example you should download:
|
||||
|
||||
- `QL for Eclipse <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/install-plugin-free.html>`__
|
||||
- `Apache Struts snapshot <https://downloads.lgtm.com/snapshots/java/apache/struts/apache-struts-7fd1622-CVE-2018-11776.zip>`__
|
||||
|
||||
.. note::
|
||||
|
||||
For this example, we will be analyzing `Apache Struts <https://github.com/apache/struts>`__.
|
||||
|
||||
You can also query the project in `the query console <https://lgtm.com/query/project:1878521151/lang:java/>`__ on LGTM.com.
|
||||
|
||||
.. insert snapshot-note.rst to explain differences between snapshot available to download and the version available in the query console.
|
||||
|
||||
.. include:: ../slide-snippets/snapshot-note.rst
|
||||
|
||||
.. resume slides
|
||||
|
||||
.. rst-class:: agenda
|
||||
|
||||
Agenda
|
||||
======
|
||||
|
||||
- Global taint tracking
|
||||
- Sanitizers
|
||||
- Path queries
|
||||
- Data flow models
|
||||
|
||||
.. insert common global data flow slides
|
||||
|
||||
.. include:: ../slide-snippets/global-data-flow.rst
|
||||
|
||||
.. resume language-specific global data flow slides
|
||||
|
||||
Code injection in Apache Struts
|
||||
===============================
|
||||
|
||||
- In April 2018, Man Yue Mo, a security researcher at Semmle, reported 5 remote code execution (RCE) vulnerabilities (CVE-2018-11776) in Apache Struts.
|
||||
|
||||
- These vulnerabilities were caused by untrusted, unsanitized data being evaluated as an OGNL (Object-Graph Navigation Language) expression, allowing malicious users to perform remote code execution.
|
||||
|
||||
- Conceptually, this is a global taint tracking problem - does untrusted remote input flow to a method call which evaluates OGNL?
|
||||
|
||||
.. note::
|
||||
|
||||
More details on the CVE can be found here: https://lgtm.com/blog/apache_struts_CVE-2018-11776 and
|
||||
https://github.com/Semmle/demos/tree/master/ql_demos/java/Apache_Struts_CVE-2018-11776
|
||||
|
||||
More details on OGNL can be found here: https://commons.apache.org/proper/commons-ognl/
|
||||
|
||||
.. rst-class:: java-data-flow-code-example
|
||||
|
||||
Code example
|
||||
============
|
||||
|
||||
Finding RCEs (outline)
|
||||
======================
|
||||
|
||||
.. literalinclude:: ../query-examples/java/global-data-flow-java-1.ql
|
||||
:language: ql
|
||||
|
||||
Defining sources
|
||||
================
|
||||
|
||||
We want to look for method calls where the method name is ``getNamespace()``, and the declaring type of the method is a class called ``ActionProxy``.
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import semmle.code.java.security.Security
|
||||
|
||||
class TaintedOGNLConfig extends TaintTracking::Configuration {
|
||||
override predicate isSource(DataFlow::Node source) {
|
||||
exists(Method m |
|
||||
m.getName() = "getNamespace" and
|
||||
m.getDeclaringType().getName() = "ActionProxy" and
|
||||
source.asExpr() = m.getAReference()
|
||||
)
|
||||
}
|
||||
...
|
||||
}
|
||||
|
||||
.. note::
|
||||
|
||||
We first define what it means to be a *source* of tainted data for this particular problem. In this case, we are interested in the value returned by calls to ``getNamespace()``.
|
||||
|
||||
|
||||
Exercise: Defining sinks
|
||||
========================
|
||||
|
||||
Fill in the definition of ``isSink``.
|
||||
|
||||
**Hint**: We want to find the first argument of calls to the method ``compileAndExecute``.
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import semmle.code.java.security.Security
|
||||
|
||||
class TaintedOGNLConfig extends TaintTracking::Configuration {
|
||||
override predicate isSink(DataFlow::Node sink) {
|
||||
/* Fill me in */
|
||||
}
|
||||
...
|
||||
}
|
||||
|
||||
.. note::
|
||||
|
||||
The second part is to define what it means to be a sink for this particular problem. The queries from :doc:`Introduction to data flow <data-flow-java>` will be useful for this exercise.
|
||||
|
||||
Solution: Defining sinks
|
||||
========================
|
||||
|
||||
Find a method access to ``compileAndExecute``, and mark the first argument.
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
import semmle.code.java.security.Security
|
||||
|
||||
class TaintedOGNLConfig extends TaintTracking::Configuration {
|
||||
override predicate isSink(DataFlow::Node sink) {
|
||||
exists(MethodAccess ma |
|
||||
ma.getMethod().getName() = "compileAndExecute" and
|
||||
ma.getArgument(0) = sink.asExpr()
|
||||
)
|
||||
}
|
||||
...
|
||||
}
|
||||
|
||||
.. insert path queries slides
|
||||
|
||||
.. include:: ../slide-snippets/path-queries.rst
|
||||
|
||||
.. resume language-specific global data flow slides
|
||||
|
||||
Defining sanitizers
|
||||
===================
|
||||
|
||||
A sanitizer allows us to *prevent* flow through a particular node in the graph. For example, flows that go via ``ValueStackShadowMap`` are not particularly interesting, because it is a class that is rarely used in practice. We can exclude them like so:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
class TaintedOGNLConfig extends TaintTracking::Configuration {
|
||||
override predicate isSanitizer(DataFlow::Node nd) {
|
||||
nd.getEnclosingCallable()
|
||||
.getDeclaringType()
|
||||
.getName() = "ValueStackShadowMap"
|
||||
}
|
||||
...
|
||||
}
|
||||
|
||||
Defining additional taint steps
|
||||
===============================
|
||||
|
||||
Add an additional taint step that (heuristically) propagates taint from a value assigned to a field to any access of that field, provided that the assignment and the access both occur in callables declared by the same type.

.. code-block:: ql

   class TaintedOGNLConfig extends TaintTracking::Configuration {
     override predicate isAdditionalTaintStep(DataFlow::Node pred,
                                              DataFlow::Node succ) {
       exists(Field f, RefType t |
         pred.asExpr() = f.getAnAssignedValue() and
         succ.asExpr() = f.getAnAccess() and
         pred.asExpr().getEnclosingCallable().getDeclaringType() = t and
         succ.asExpr().getEnclosingCallable().getDeclaringType() = t
       )
     }
     ...
   }
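The ``isAdditionalTaintStep`` predicate above relates a value assigned to a field to a later read of that field. The following Java sketch shows the kind of code shape this is meant to cover; ``FieldFlowDemo`` and its members are hypothetical and are not taken from Struts.

```java
public class FieldFlowDemo {
    // Plays the role of the Field `f` in the QL predicate.
    private String namespace;

    // The right-hand side `value` corresponds to f.getAnAssignedValue().
    public void setNamespace(String value) {
        this.namespace = value;
    }

    // The read of `namespace` corresponds to f.getAnAccess().
    public String describe() {
        return "namespace=" + namespace;
    }
}
```

Both methods are declared by the same type, which matches the condition on ``t`` in the predicate: a tainted argument to ``setNamespace`` would be treated as reaching the result of ``describe``, even though the two methods never call each other.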
|
||||
|
||||
|
||||
.. rst-class:: end-slide
|
||||
|
||||
Extra slides
|
||||
============
|
||||
|
||||
.. include:: ../slide-snippets/global-data-flow-extra-slides.rst
|
||||
211
docs/language/ql-training/java/intro-ql-java.rst
Normal file
@@ -0,0 +1,211 @@
|
||||
================================
|
||||
Introduction to variant analysis
|
||||
================================
|
||||
|
||||
QL for Java
|
||||
|
||||
.. container:: semmle-logo
|
||||
|
||||
Semmle :sup:`TM`
|
||||
|
||||
.. rst-class:: setup
|
||||
|
||||
Setup
|
||||
=====
|
||||
|
||||
For this example you should download:
|
||||
|
||||
- `QL for Eclipse <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/install-plugin-free.html>`__
|
||||
- `Apache Struts snapshot <https://downloads.lgtm.com/snapshots/java/apache/struts/apache-struts-7fd1622-CVE-2018-11776.zip>`__
|
||||
|
||||
.. note::
|
||||
|
||||
For this example, we will be analyzing `Apache Struts <https://github.com/apache/struts>`__.
|
||||
|
||||
You can also query the project in `the query console <https://lgtm.com/query/project:1878521151/lang:java/>`__ on LGTM.com.
|
||||
|
||||
.. insert snapshot-note.rst to explain differences between snapshot available to download and the version available in the query console.
|
||||
|
||||
.. include:: ../slide-snippets/snapshot-note.rst
|
||||
|
||||
.. resume slides
|
||||
|
||||
.. Include language-agnostic section here
|
||||
|
||||
.. include:: ../slide-snippets/intro-ql-general.rst
|
||||
|
||||
Oops
|
||||
====
|
||||
|
||||
.. code-block:: java
|
||||
:emphasize-lines: 3
|
||||
|
||||
int write(int[] buf, int size, int loc, int val) {
|
||||
if (loc >= size) {
|
||||
// return -1;
|
||||
}
|
||||
|
||||
buf[loc] = val;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
- The return statement has been commented out (during debugging?)
|
||||
- The ``if`` statement is now dead code
|
||||
- No explicit bounds checking, will throw ``ArrayIndexOutOfBoundsException``
|
||||
|
||||
.. note::
|
||||
|
||||
Here’s a simple (artificial) bug, which we’ll develop a QL query to catch.
|
||||
|
||||
This function writes a value to a given location in an array, first trying to do a bounds check to validate that the location is within bounds. However, the return statement has been commented out, leaving a redundant if statement and no bounds checking.
|
||||
|
||||
This case can act as our “patient zero” in the variant analysis game.
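For comparison, here is a sketch of what the function presumably looked like before the ``return`` was commented out. The class name ``BoundsCheckedWrite`` is hypothetical, and the extra ``loc < 0`` test is an assumption added for completeness rather than something shown on the slide.

```java
public class BoundsCheckedWrite {
    /** Writes val at buf[loc], or returns -1 when loc is out of range. */
    public static int write(int[] buf, int size, int loc, int val) {
        if (loc < 0 || loc >= size) {
            return -1; // the early return that was commented out on the slide
        }
        buf[loc] = val;
        return 0;
    }
}
```

With the early return in place, the ``then`` block is no longer empty, so a query for empty ``then`` blocks would not flag this version.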
|
||||
|
||||
A simple QL query
|
||||
=================
|
||||
|
||||
.. literalinclude:: ../query-examples/java/empty-if-java.ql
|
||||
:language: ql
|
||||
|
||||
.. note::
|
||||
|
||||
We are going to write a simple query which finds “if statements” with empty “then” blocks, so we can highlight the results like those on the previous slide. The query can be run in the `query console on LGTM <https://lgtm.com/query>`__, or in your `IDE <https://lgtm.com/help/lgtm/running-queries-ide>`__.
|
||||
|
||||
A `QL query <https://help.semmle.com/QL/ql-handbook/queries.html>`__ consists of a “select” clause that indicates what results should be returned. Typically it will also provide a “from” clause to declare some variables, and a “where” clause to state conditions over those variables. For more information on the structure of query files (including links to useful topics in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__), see `Introduction to query files <https://help.semmle.com/QL/learn-ql/ql/writing-queries/introduction-to-queries.html>`__.
|
||||
|
||||
In our example here, the first line of the query imports the `Java standard QL library <https://help.semmle.com/qldoc/java/>`__, which defines concepts like ``IfStmt`` and ``Block``.
|
||||
The query proper starts by declaring two variables, ``ifStmt`` and ``block``. These variables represent sets of values in the database, according to the type of each of the variables. For example, ``ifStmt`` has the type ``IfStmt``, which means it represents the set of all if statements in the program.
|
||||
|
||||
If we simply selected these two variables::
|
||||
|
||||
from IfStmt ifStmt, Block block
|
||||
select ifStmt, block
|
||||
|
||||
We would get a result row for every combination of blocks and if statements in the program. This is known as a cross-product, because there is no logical condition linking the two variables. We can use the where clause to specify the condition that we are only interested in rows where the “block” is the “then” part of the if statement. We do this by specifying::
|
||||
|
||||
block = ifStmt.getThen()
|
||||
|
||||
This states that the block is equal to (not assigned!) the “then” part of the ``ifStmt``. ``getThen()`` is an operation which is available on any ``IfStmt``. One way to interpret this is as a filtering operation: starting with every pair of block and if statement, check each one to see whether the logical condition holds, and only keep the row if that is the case.
|
||||
We can add a second condition that specifies the block must be empty::
|
||||
|
||||
and block.isEmpty()
|
||||
|
||||
The ``isEmpty()`` operation is available on any Block, and is only true if the “block” has no children.
|
||||
|
||||
Finally, we select a location, at which to report the problem, and a message, to explain what the problem is.
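If the from/where/select reading is unfamiliar, the following Java sketch spells out the "filter the cross-product" interpretation as nested loops. It is only an analogy: the ``Block`` and ``IfStmt`` classes here are hypothetical stand-ins, not the QL library classes, and the QL evaluator is not required to enumerate pairs this way.

```java
import java.util.ArrayList;
import java.util.List;

public class CrossProductDemo {
    /** Hypothetical stand-in for a block; not the QL `Block` class. */
    public static final class Block {
        final int numStmt;
        public Block(int numStmt) { this.numStmt = numStmt; }
    }

    /** Hypothetical stand-in for an if statement and its "then" block. */
    public static final class IfStmt {
        final String name;
        final Block then;
        public IfStmt(String name, Block then) { this.name = name; this.then = then; }
    }

    public static List<String> emptyThenBranches(List<IfStmt> ifStmts, List<Block> blocks) {
        List<String> rows = new ArrayList<>();
        for (IfStmt ifStmt : ifStmts) {              // from IfStmt ifStmt,
            for (Block block : blocks) {             //      Block block
                if (block == ifStmt.then             // where block = ifStmt.getThen()
                        && block.numStmt == 0) {     //   and the block is empty
                    rows.add(ifStmt.name);           // select ifStmt
                }
            }
        }
        return rows;
    }
}
```

Dropping the ``if`` test would add one row per (if statement, block) pair, which is the cross-product described above.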
|
||||
|
||||
|
||||
Structure of a QL query
|
||||
=======================
|
||||
|
||||
A **query file** has the extension ``.ql`` and contains a **query clause**, and optionally **predicates**, **classes**, and **modules**.
|
||||
|
||||
A **query library** has the extension ``.qll`` and does not contain a query clause, but may contain modules, classes, and predicates.
|
||||
|
||||
Each query library also implicitly defines a module.
|
||||
|
||||
**Import** statements allow the classes and predicates defined in one module to be used in another.
|
||||
|
||||
.. note::
|
||||
|
||||
QL queries are always contained in query files with the file extension ``.ql``. `Quick queries <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/quick-query.html>`__, run in `QL for Eclipse <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/home-page.html>`__, are no exception: the quick query window maintains a temporary QL file in the background.
|
||||
|
||||
Parts of queries can be lifted into `QL library files <https://help.semmle.com/QL/ql-handbook/modules.html#library-modules>`__ with the extension ``.qll``. Definitions within such libraries can be brought into scope using “import” statements, and similarly QLL files can import each other’s definitions using “import” statements.
|
||||
|
||||
Logic can be encapsulated as user-defined `predicates <https://help.semmle.com/QL/ql-handbook/predicates.html>`__ and `classes <https://help.semmle.com/QL/ql-handbook/types.html#classes>`__, and organized into `modules <https://help.semmle.com/QL/ql-handbook/modules.html>`__. Each QLL file implicitly defines a module, but QL and QLL files can also contain explicit module definitions, as we will see later.
|
||||
|
||||
Predicates in QL
|
||||
================
|
||||
|
||||
A predicate allows you to pull out and name parts of a query.
|
||||
|
||||
.. container:: column-left
|
||||
|
||||
.. literalinclude:: ../query-examples/java/empty-if-java.ql
|
||||
:language: ql
|
||||
:emphasize-lines: 6
|
||||
|
||||
.. container:: column-right
|
||||
|
||||
.. literalinclude:: ../query-examples/java/empty-if-java-predicate.ql
|
||||
:language: ql
|
||||
:emphasize-lines: 3-5
|
||||
|
||||
.. note::
|
||||
|
||||
A `QL predicate <https://help.semmle.com/QL/ql-handbook/predicates.html>`__ takes zero or more parameters, and its body is a condition on those parameters. The predicate may (or may not) hold. Predicates may also be `recursive <https://help.semmle.com/QL/ql-handbook/predicates.html#recursive-predicates>`__, simply by referring to themselves (directly or indirectly).
|
||||
|
||||
You can imagine a predicate to be a self-contained from-where-select statement, that produces an intermediate relation, or table. In this case, the ``isEmpty`` predicate will be the set of all blocks which are empty.
|
||||
|
||||
|
||||
Classes in QL
|
||||
=============
|
||||
|
||||
A QL class allows you to name a set of values and define (member) predicates on them.
|
||||
|
||||
A class has at least one supertype and optionally a **characteristic predicate**; it contains the values that belong to *all* supertypes *and* satisfy the characteristic predicate, if provided.
|
||||
|
||||
Member predicates are inherited and can be overridden.
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
class EmptyBlock extends Block {
|
||||
EmptyBlock() {
|
||||
this.getNumStmt() = 0
|
||||
}
|
||||
}
|
||||
|
||||
.. note::
|
||||
|
||||
`Classes <https://help.semmle.com/QL/ql-handbook/types.html#classes>`__ model sets of values from the database. A class has one or more supertypes, and inherits `member predicates <https://help.semmle.com/QL/ql-handbook/types.html#member-predicates>`__ (methods) from each of them. Each value in a class must be in every supertype, but additional conditions can be stated in a so-called **characteristic predicate**, which looks a bit like a zero-argument constructor.
|
||||
|
||||
In the example, declaring a variable “EmptyBlock e” will allow it to range over only those blocks that have zero statements.
|
||||
|
||||
Classes in QL continued
|
||||
=======================
|
||||
|
||||
.. container:: column-left
|
||||
|
||||
.. literalinclude:: ../query-examples/java/empty-if-java-predicate.ql
|
||||
:language: ql
|
||||
:emphasize-lines: 3-5
|
||||
|
||||
.. container:: column-right
|
||||
|
||||
.. literalinclude:: ../query-examples/java/empty-if-java-class.ql
|
||||
:language: ql
|
||||
:emphasize-lines: 3-7
|
||||
|
||||
.. note::
|
||||
|
||||
As shown in this example, classes behave much like unary predicates, but with ``instanceof`` instead of predicate calls to check membership. Later on, we will see how to define member predicates on classes.
|
||||
|
||||
Iterative query refinement
|
||||
==========================
|
||||
|
||||
- **Common workflow**: Start with a simple query, inspect a few results, refine, repeat.
|
||||
|
||||
- For example, empty ``then`` branches are not a problem if there is an ``else``.
|
||||
|
||||
- **Exercise**: How can we refine the query to take this into account?
|
||||
|
||||
**Hints**:
|
||||
|
||||
- Use member predicate ``IfStmt.getElse()``
|
||||
- Use ``not exists(...)``
|
||||
|
||||
.. note::
|
||||
|
||||
QL makes it very easy to experiment with analysis ideas. A common workflow is to start with a simple query (like our “redundant if-statement” example), examine a few results, refine the query based on any patterns that emerge and repeat.
|
||||
|
||||
As an exercise, refine the redundant-if query based on the observation that if the if-statement has an “else” clause, then even if the body of the “then” clause is empty, it’s not actually redundant.
|
||||
|
||||
Model answer: redundant if-statement
|
||||
====================================
|
||||
|
||||
.. literalinclude:: ../query-examples/java/empty-if-java-model.ql
|
||||
|
||||
.. note::
|
||||
|
||||
You can explore the results generated when this query is run on apache/struts in LGTM `here <https://lgtm.com/query/1269550358355690774/>`__.
|
||||
@@ -0,0 +1,98 @@
|
||||
======================
|
||||
Program representation
|
||||
======================
|
||||
|
||||
QL for Java
|
||||
|
||||
.. container:: semmle-logo
|
||||
|
||||
Semmle :sup:`TM`
|
||||
|
||||
.. rst-class:: agenda
|
||||
|
||||
Agenda
|
||||
======
|
||||
|
||||
- Abstract syntax trees
|
||||
- Database representation
|
||||
- Program elements
|
||||
- AST classes
|
||||
|
||||
.. insert abstract-syntax-tree.rst
|
||||
|
||||
.. include:: ../slide-snippets/abstract-syntax-tree.rst
|
||||
|
||||
.. resume slides
|
||||
|
||||
Program elements
|
||||
================
|
||||
|
||||
- The QL class ``Element`` represents program elements with a name.
|
||||
- This includes: packages (``Package``), compilation units (``CompilationUnit``), types (``Type``), methods (``Method``), constructors (``Constructor``), and variables (``Variable``).
|
||||
- It is often convenient to refer to an element that might either be a method or a constructor; the class ``Callable``, which is a common superclass of ``Method`` and ``Constructor``, can be used for this purpose.
|
||||
|
||||
|
||||
AST
|
||||
===
|
||||
|
||||
There are two primary AST classes, used within ``Callables``:
|
||||
|
||||
- ``Expr``: expressions such as assignments, variable references, function calls, ...
|
||||
- ``Stmt``: statements such as conditionals, loops, try statements, ...
|
||||
|
||||
Operations are provided for exploring the AST:
|
||||
|
||||
- ``Expr.getAChildExpr`` returns a sub-expression of a given expression.
|
||||
- ``Stmt.getAChild`` returns a statement or expression that is nested directly inside a given statement.
|
||||
- ``Expr.getParent`` and ``Stmt.getParent`` return the parent node of an AST node.
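As a rough mental model of these parent and child operations, the Java sketch below builds a toy tree for ``y = x + 1`` and walks it. ``ToyAst`` and ``Node`` are hypothetical illustration classes; they only echo the shape of the QL predicates and are not how the database stores the AST.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyAst {
    /** Hypothetical AST node; the method names only echo the QL predicates. */
    public static final class Node {
        final String label;
        private Node parent;
        private final List<Node> children = new ArrayList<>();

        public Node(String label, Node... kids) {
            this.label = label;
            for (Node kid : kids) {
                kid.parent = this;   // record the parent link for getParent()
                children.add(kid);
            }
        }

        public Node getParent() { return parent; }
        public List<Node> getChildren() { return children; }
    }

    /** Collects the labels of all transitive children, depth first. */
    public static List<String> descendants(Node root) {
        List<String> out = new ArrayList<>();
        for (Node child : root.getChildren()) {
            out.add(child.label);
            out.addAll(descendants(child));
        }
        return out;
    }
}
```

Calling ``descendants`` on the assignment node visits ``y``, then ``x + 1``, then ``x`` and ``1``, while ``getParent`` on the node for ``x`` leads back to the addition.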
|
||||
|
||||
Types
|
||||
=====
|
||||
|
||||
The database also includes information about the types used in a program:
|
||||
|
||||
- ``PrimitiveType`` represents a `primitive type <http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html>`__, that is, one of ``boolean``, ``byte``, ``char``, ``double``, ``float``, ``int``, ``long``, ``short``. QL also classifies ``void`` and ``<nulltype>`` (the type of the ``null`` literal) as primitive types.
|
||||
- ``RefType`` represents a reference type; it has several subclasses:
|
||||
|
||||
- ``Class`` represents a Java class.
|
||||
- ``Interface`` represents a Java interface.
|
||||
- ``EnumType`` represents a Java enum type.
|
||||
- ``Array`` represents a Java array type.
|
||||
|
||||
Working with variables
|
||||
======================
|
||||
|
||||
``Variable`` represents program variables, including locally scoped variables (``LocalScopeVariable``), fields (``Field``), and parameters (``Parameter``):
|
||||
|
||||
- ``string Variable.getName()``
|
||||
- ``Type Variable.getType()``
|
||||
|
||||
``Access`` represents references to declared entities such as methods (``MethodAccess``) and variables (``VariableAccess``), including fields (``FieldAccess``).
|
||||
|
||||
- ``Declaration Access.getTarget()``
|
||||
|
||||
``VariableDeclarationEntry`` represents declarations or definitions of a variable.
|
||||
|
||||
- ``Variable VariableDeclarationEntry.getVariable()``
|
||||
|
||||
Working with callables
|
||||
======================
|
||||
|
||||
Callables are represented by the ``Callable`` QL class.
|
||||
|
||||
Calls to callables are modeled by the QL class ``Call`` and its subclasses:
|
||||
|
||||
- ``Call.getCallee()`` gets the declared target of the call
|
||||
- ``Callable.getAReference()`` gets a call to this callable
|
||||
|
||||
Typically, callables are identified by name:
|
||||
|
||||
- ``string Callable.getName()``
|
||||
- ``string Callable.getQualifiedName()``
|
||||
|
||||
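To make the terminology for variables and callables concrete, here is a small, hypothetical Java class annotated with the kind of program element each declaration corresponds to. The class ``Greeter`` and all names in it are invented for this sketch and are not taken from any analyzed project.

```java
class Greeter {
    private final String name;            // a field (Field)

    Greeter(String name) {                // a constructor (a Callable); `name` is a parameter
        this.name = name;                 // `this.name` is a reference to the field
    }

    String greet() {                      // a method (a Callable)
        String message = "Hello, " + name;  // `message` is a locally scoped variable
        return message;
    }

    public static void main(String[] args) {
        Greeter g = new Greeter("QL");    // a call whose callee is the constructor
        System.out.println(g.greet());    // a call whose callee is the method `greet`
    }
}
```

Both the ``new Greeter("QL")`` expression and the ``g.greet()`` expression are calls, so a query written against ``Call`` and ``Callable`` covers constructors and methods alike.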
.. rst-class:: java-expression-ast
|
||||
|
||||
Example: Java expression AST
|
||||
============================
|
||||
|
||||
.. diagram copied from google slides
|
||||
148
docs/language/ql-training/java/query-injection-java.rst
Normal file
@@ -0,0 +1,148 @@
|
||||
========================
|
||||
Example: Query injection
|
||||
========================
|
||||
|
||||
QL for Java
|
||||
|
||||
.. container:: semmle-logo
|
||||
|
||||
Semmle :sup:`TM`
|
||||
|
||||
.. rst-class:: setup
|
||||
|
||||
Setup
|
||||
=====
|
||||
|
||||
For this example you should download:
|
||||
|
||||
- `QL for Eclipse <https://help.semmle.com/ql-for-eclipse/Content/WebHelp/install-plugin-free.html>`__
|
||||
- `VIVO Vitro snapshot <http://downloads.lgtm.com/snapshots/java/vivo-project/Vitro/vivo-project_Vitro_java-srcVersion_47ae42c01954432c3c3b92d5d163551ce367f510-dist_odasa-lgtm-2019-04-23-7ceff95-linux64.zip>`__
|
||||
|
||||
.. note::
|
||||
|
||||
For this example, we will be analyzing `VIVO Vitro <https://github.com/vivo-project/Vitro>`__.
|
||||
|
||||
You can also query the project in `the query console <https://lgtm.com/query/project:14040005/lang:java/>`__ on LGTM.com.
|
||||
|
||||
.. insert snapshot-note.rst to explain differences between snapshot available to download and the version available in the query console.
|
||||
|
||||
.. include:: ../slide-snippets/snapshot-note.rst
|
||||
|
||||
.. resume slides
|
||||
|
||||
SQL injection
|
||||
=============
|
||||
|
||||
- Occurs when user input is used to construct an SQL query without any sanitization or escaping.
|
||||
|
||||
- Classic example involves constructing a query using string concatenation:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
runQuery("SELECT * FROM users WHERE id='" + userId + "'");
|
||||
|
||||
|
||||
- If the ``userId`` can be provided by a user, and is not sanitized, then a malicious user can provide input that manipulates the intended query.
|
||||
|
||||
- For example, providing the input ``"' OR '1'='1"`` would allow the attacker to return all records in the users table.
|
||||
|
||||
.. note::
|
||||
|
||||
`SQL <https://en.wikipedia.org/wiki/SQL>`__ is a database query language, which is often used from within other programming languages to interact with a database. The typical case is that a query is to be executed to find some data, based on some input provided by the user - for example, the user’s ID. However, the interface between the host programming language and SQL is typically implemented by passing a string containing the query to some API.
|
||||
|
||||
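The effect of the malicious input is easiest to see by building the query string without running it. The sketch below is a minimal illustration: ``buildQuery`` is a hypothetical stand-in for the string construction in the ``runQuery`` call above, and it never contacts a database.

```java
class SqlConcatDemo {
    // Hypothetical helper: builds the query text exactly as the slide's
    // concatenation does, but only returns the string.
    static String buildQuery(String userId) {
        return "SELECT * FROM users WHERE id='" + userId + "'";
    }

    public static void main(String[] args) {
        // Benign input: the WHERE clause compares id against a single value.
        System.out.println(buildQuery("42"));
        // Malicious input: the quote closes the literal early and the
        // OR '1'='1' condition makes the WHERE clause true for every row.
        System.out.println(buildQuery("' OR '1'='1"));
    }
}
```

The second query text is ``SELECT * FROM users WHERE id='' OR '1'='1'``, which no longer has the structure the developer intended.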
SPARQL injection
|
||||
================
|
||||
|
||||
- `SPARQL <https://en.wikipedia.org/wiki/SPARQL>`__ is a language for querying data stored in RDF format, that is, as subject-predicate-object triples.
|
||||
|
||||
- The same type of vulnerability can occur for SPARQL as for SQL: if the SPARQL query is constructed through string concatenation, a malicious user can subvert the query:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
sparqlAskQuery("ASK { <" + individualURI + "> ?p ?o }");
|
||||
|
||||
- SPARQL is used by many projects, but we will be looking at `VIVO Vitro <https://github.com/vivo-project/Vitro/>`__.
|
||||
|
||||
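The same experiment works for the SPARQL case. In this sketch, ``buildAskQuery`` is a hypothetical helper that mirrors the concatenation in the ``sparqlAskQuery`` call above, and the ``example.org`` URIs are placeholders; nothing is sent to a triple store.

```java
class SparqlConcatDemo {
    // Hypothetical helper: reproduces the slide's string concatenation.
    static String buildAskQuery(String individualURI) {
        return "ASK { <" + individualURI + "> ?p ?o }";
    }

    public static void main(String[] args) {
        // Benign input: a single triple pattern about one individual.
        System.out.println(buildAskQuery("http://example.org/a"));
        // Malicious input: the '>' closes the IRI early, so the attacker
        // can append a second triple pattern about a different resource.
        System.out.println(buildAskQuery("http://example.org/a> ?p ?o . <http://example.org/b"));
    }
}
```

With the second input the query text becomes ``ASK { <http://example.org/a> ?p ?o . <http://example.org/b> ?p ?o }``, which asks a different question from the one the application meant to ask.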
.. rst-class:: background2
|
||||
|
||||
Developing a QL query
|
||||
======================
|
||||
|
||||
Finding a query concatenation
|
||||
|
||||
QL query: find SPARQL methods
|
||||
=============================
|
||||
|
||||
Let’s start by looking for calls to methods with names of the form ``sparql*Query``, using the classes ``Method`` and ``MethodAccess`` from the Java library.
|
||||
|
||||
.. rst-class:: build
|
||||
|
||||
.. literalinclude:: ../query-examples/java/query-injection-java-1.ql
|
||||
|
||||
.. note::
|
||||
|
||||
- When performing `variant analysis <https://semmle.com/variant-analysis>`__, it is usually helpful to first write a simple query that finds the basic syntactic pattern, before trying to describe the cases where it goes wrong.
|
||||
- In this case, we start by looking for all the method calls which appear to run a SPARQL query, before trying to refine the query to find cases which are vulnerable to query injection.
|
||||
- The ``select`` clause defines what this query is looking for:
|
||||
|
||||
- a ``MethodAccess``: the call to a SPARQL query method
|
||||
- a ``Method``: the SPARQL query method.
|
||||
|
||||
- The ``where`` part of the query ties these QL variables together using `predicates <https://help.semmle.com/QL/ql-handbook/predicates.html>`__ defined in the `standard QL for Java library <https://help.semmle.com/qldoc/java/>`__.
|
||||
|
||||
QL query: find string concatenation
|
||||
===================================
|
||||
|
||||
- We now need to define what would make these API calls unsafe.
|
||||
- A simple heuristic would be to look for string concatenation used in the query argument.
|
||||
- We may want to reuse this logic, so let us create a separate predicate.
|
||||
|
||||
Looking at autocomplete suggestions, we see that we can get the type of an expression using the ``getType()`` predicate.
|
||||
|
||||
.. rst-class:: build
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
predicate isStringConcat(AddExpr ae) {
|
||||
ae.getType() instanceof TypeString
|
||||
}
|
||||
|
||||
.. note::
|
||||
|
||||
- An important part of the query is to determine whether a given expression is string concatenation.
|
||||
- We therefore write a helper predicate for finding string concatenation.
|
||||
- This predicate effectively represents the set of all ``add`` expressions in the database where the type of the expression is ``TypeString`` - that is, the addition produces a ``String`` value.
|
||||
|
||||
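The reason the predicate checks the type of the expression, rather than matching every ``+``, is that Java uses the same operator for numeric addition and string concatenation. The following sketch shows the difference; the class and method names are invented for illustration.

```java
class AddTypeDemo {
    // Evaluated left to right: ("id=" + a) is already a String, so both
    // `+` expressions here have type String and are concatenations.
    static String prefixThenNumbers(int a, int b) {
        return "id=" + a + b;
    }

    // Here the inner `a + b` has type int (numeric addition); only the
    // outer `+`, which appends "=id", has type String.
    static String numbersThenSuffix(int a, int b) {
        return a + b + "=id";
    }

    public static void main(String[] args) {
        System.out.println(prefixThenNumbers(1, 2));
        System.out.println(numbersThenSuffix(1, 2));
    }
}
```

A predicate that holds only for ``add`` expressions of type ``String`` would therefore cover both ``+`` expressions in the first method, but only the outer one in the second.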
QL query: SPARQL injection
|
||||
==========================
|
||||
|
||||
We can now combine our predicate with the existing query.
|
||||
Note that we do not need to specify that the argument of the method access is an ``AddExpr`` - this is implied by the ``isStringConcat`` requirement.
|
||||
|
||||
Now our query becomes:
|
||||
|
||||
.. rst-class:: build
|
||||
|
||||
.. literalinclude:: ../query-examples/java/query-injection-java-2.ql
|
||||
:language: ql
|
||||
|
||||
The final query
|
||||
===============
|
||||
|
||||
.. literalinclude:: ../query-examples/java/query-injection-java-3.ql
|
||||
:language: ql
|
||||
|
||||
There are two results, one of which was assigned **CVE-2019-6986**.
|
||||
|
||||
.. note::
|
||||
|
||||
The full write-up and exploit can be found at https://github.com/Semmle/SecurityExploits/tree/master/vivo-project/CVE-2019-6986
|
||||
|
||||
Follow up
|
||||
=========
|
||||
|
||||
- Our query successfully finds cases where the concatenation occurs in the argument to the SPARQL API.
|
||||
|
||||
- However, in general, the concatenation could occur before the method call.
|
||||
|
||||
- For this, we would need to use :doc:`local data flow <data-flow-java>`, which is the topic of the next set of training slides.
|
||||
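As a hypothetical illustration of the limitation, the two methods below build the same query, but only the first has the concatenation directly in the argument position. The method ``sparqlAskQuery`` here is a stub invented for this sketch, not the real Vitro API.

```java
class FollowUpDemo {
    // Stub standing in for a SPARQL API method; it just echoes the query.
    static String sparqlAskQuery(String query) {
        return query;
    }

    // The concatenation is the argument itself, so a purely syntactic
    // check on the argument can see it.
    static String direct(String individualURI) {
        return sparqlAskQuery("ASK { <" + individualURI + "> ?p ?o }");
    }

    // The concatenation happens earlier and is stored in a local variable;
    // the argument is now just a variable reference.
    static String viaLocal(String individualURI) {
        String query = "ASK { <" + individualURI + "> ?p ?o }";
        return sparqlAskQuery(query);
    }

    public static void main(String[] args) {
        System.out.println(direct("http://example.org/a"));
        System.out.println(viaLocal("http://example.org/a"));
    }
}
```

Both methods pass an identical string to the API, yet connecting the call in ``viaLocal`` back to the concatenation requires following the value of ``query`` from its definition to its use.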