mirror of
https://github.com/github/codeql.git
synced 2026-04-28 02:05:14 +02:00
Merge pull request #2866 from hubwriter/alistairs-docs-preparation-1
CodeQL migration: Java topics - change titles & add intros (2164)
@@ -1,19 +1,19 @@
Tutorial: Annotations
=====================

Overview
--------
Annotations in Java
===================

CodeQL databases of Java projects contain information about all annotations attached to program elements.

Annotations are represented by the following CodeQL classes:
About working with annotations
------------------------------

Annotations are represented by these CodeQL classes:

- The class ``Annotatable`` represents all entities that may have an annotation attached to them (that is, packages, reference types, fields, methods, and local variables).
- The class ``AnnotationType`` represents a Java annotation type, such as ``java.lang.Override``; annotation types are interfaces.
- The class ``AnnotationElement`` represents an annotation element, that is, a member of an annotation type.
- The class ``Annotation`` represents an annotation such as ``@Override``; annotation values can be accessed through member predicate ``getValue``.

As an example, recall that the Java standard library defines an annotation ``SuppressWarnings`` that instructs the compiler not to emit certain kinds of warnings. It is defined as follows:
For example, the Java standard library defines an annotation ``SuppressWarnings`` that instructs the compiler not to emit certain kinds of warnings:

.. code-block:: java

@@ -25,7 +25,7 @@ As an example, recall that the Java standard library defines an annotation ``Sup

``SuppressWarnings`` is represented as an ``AnnotationType``, with ``value`` as its only ``AnnotationElement``.

A typical usage of ``SuppressWarnings`` would be the following annotation to prevent a warning about using raw types:
A typical usage of ``SuppressWarnings`` would be this annotation for preventing a warning about using raw types:

.. code-block:: java

@@ -37,7 +37,7 @@ A typical usage of ``SuppressWarnings`` would be the following annotation to pre

The expression ``@SuppressWarnings("rawtypes")`` is represented as an ``Annotation``. The string literal ``"rawtypes"`` is used to initialize the annotation element ``value``, and its value can be extracted from the annotation by means of the ``getValue`` predicate.
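
The three roles (annotation type, annotation element, annotation) can also be seen from the Java side. The following is a minimal sketch, not part of the tutorial's own example: because ``SuppressWarnings`` has source-only retention and cannot be read at runtime, it declares a hypothetical runtime-retained annotation type ``Tag`` with a single ``value`` element and reads that element back by reflection. This is only loosely analogous to what ``getValue`` does on the CodeQL side.

.. code-block:: java

   import java.lang.annotation.Retention;
   import java.lang.annotation.RetentionPolicy;
   import java.util.Arrays;

   class AnnotationDemo {
       // Hypothetical annotation type (assumption for illustration only).
       // "value" is its single annotation element.
       @Retention(RetentionPolicy.RUNTIME)
       @interface Tag {
           String[] value();
       }

       // An annotation: an instance of the annotation type attached to a method.
       @Tag("rawtypes")
       static void annotated() {}

       // Reads the "value" element of the @Tag annotation on annotated().
       static String[] readTagValue() {
           try {
               Tag tag = AnnotationDemo.class
                       .getDeclaredMethod("annotated")
                       .getAnnotation(Tag.class);
               return tag.value();
           } catch (NoSuchMethodException e) {
               throw new IllegalStateException(e);
           }
       }

       public static void main(String[] args) {
           System.out.println(Arrays.toString(readTagValue()));
       }
   }

A single string used for an array-typed element, as in ``@Tag("rawtypes")``, is treated as a one-element array, which is the same shorthand that ``@SuppressWarnings("rawtypes")`` relies on.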

We could then write the following query to find all ``@SuppressWarnings`` annotations attached to constructors, and return both the annotation itself and the value of its ``value`` element:
We could then write this query to find all ``@SuppressWarnings`` annotations attached to constructors, and return both the annotation itself and the value of its ``value`` element:

.. code-block:: ql

@@ -69,9 +69,9 @@ As another example, this query finds all annotation types that only have a singl
Example: Finding missing ``@Override`` annotations
--------------------------------------------------

In newer versions of Java, it is recommended (though not required) to annotate methods that override another method with an ``@Override`` annotation. These annotations, which are checked by the compiler, serve as documentation, and also help you avoid accidental overloading where overriding was intended.
In newer versions of Java, it's recommended (though not required) that you annotate methods that override another method with an ``@Override`` annotation. These annotations, which are checked by the compiler, serve as documentation, and also help you avoid accidental overloading where overriding was intended.
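
To illustrate what "accidental overloading" means here, consider a minimal sketch with hypothetical classes ``Super`` and ``Sub`` nested in ``OverrideDemo`` (these are assumptions for illustration, separate from the tutorial's example program). ``Sub.describe`` was meant to override ``Super.describe``, but its parameter type differs, so it is only an overload and is never selected when the call goes through a ``Super`` reference:

.. code-block:: java

   class OverrideDemo {
       static class Super {
           String describe(Object o) { return "Super.describe(Object)"; }
       }

       static class Sub extends Super {
           // Intended as an override, but the parameter type is String, not Object,
           // so this declares a new overload. Adding @Override to this method
           // would make the compiler reject it and expose the mistake.
           String describe(String s) { return "Sub.describe(String)"; }
       }

       // Calls describe through a reference of static type Super.
       static String callThroughSuper(Object arg) {
           Super receiver = new Sub();
           return receiver.describe(arg);
       }

       public static void main(String[] args) {
           System.out.println(callThroughSuper("hello"));
       }
   }

Even though the receiver is a ``Sub`` at runtime, the call resolves to ``Super.describe(Object)``, because ``Sub`` never overrides that method.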

For example, consider the following example program:
For example, consider this example program:

.. code-block:: java

@@ -89,9 +89,9 @@ For example, consider the following example program:

Here, both ``Sub1.m`` and ``Sub2.m`` override ``Super.m``, but only ``Sub1.m`` is annotated with ``@Override``.

We will now develop a query for finding methods like ``Sub2.m`` that should be annotated with ``@Override``, but are not.
We'll now develop a query for finding methods like ``Sub2.m`` that should be annotated with ``@Override``, but are not.

As a first step, let us write a query that finds all ``@Override`` annotations. Annotations are expressions, so their type can be accessed using ``getType``. Annotation types, on the other hand, are interfaces, so their qualified name can be queried using ``hasQualifiedName``. Therefore we can implement the query as follows:
As a first step, let's write a query that finds all ``@Override`` annotations. Annotations are expressions, so their type can be accessed using ``getType``. Annotation types, on the other hand, are interfaces, so their qualified name can be queried using ``hasQualifiedName``. Therefore we can implement the query like this:

.. code-block:: ql

@@ -111,7 +111,7 @@ As always, it is a good idea to try this query on a CodeQL database for a Java p
}
}

This makes it very easy to write our query for finding methods that override another method, but do not have an ``@Override`` annotation: we use predicate ``overrides`` to find out whether one method overrides another, and predicate ``getAnAnnotation`` (available on any ``Annotatable``) to retrieve some annotation.
This makes it very easy to write our query for finding methods that override another method, but don't have an ``@Override`` annotation: we use predicate ``overrides`` to find out whether one method overrides another, and predicate ``getAnAnnotation`` (available on any ``Annotatable``) to retrieve some annotation.

.. code-block:: ql

@@ -122,14 +122,14 @@ This makes it very easy to write our query for finding methods that override ano
 not overriding.getAnAnnotation() instanceof OverrideAnnotation
select overriding, "Method overrides another method, but does not have an @Override annotation."

➤ `See this in the query console <https://lgtm.com/query/1505752756202/>`__. In practice, this query may yield many results from compiled library code, which are not very interesting. Therefore, it is a good idea to add another conjunct ``overriding.fromSource()`` to restrict the result to only report methods for which source code is available.
➤ `See this in the query console <https://lgtm.com/query/1505752756202/>`__. In practice, this query may yield many results from compiled library code, which aren't very interesting. It's therefore a good idea to add another conjunct ``overriding.fromSource()`` to restrict the result to only report methods for which source code is available.

Example: Finding calls to deprecated methods
--------------------------------------------

As another example, we can write a query that finds calls to methods marked with a ``@Deprecated`` annotation.

For example, consider the following example program:
For example, consider this example program:

.. code-block:: java

@@ -147,7 +147,7 @@ For example, consider the following example program:

Here, both ``A.m`` and ``A.n`` are marked as deprecated. Methods ``n`` and ``r`` both call ``m``, but note that ``n`` itself is deprecated, so we probably should not warn about this call.

Like in the previous example, we start by defining a class for representing ``@Deprecated`` annotations:
As in the previous example, we'll start by defining a class for representing ``@Deprecated`` annotations:

.. code-block:: ql

@@ -167,7 +167,7 @@ Now we can define a class for representing deprecated methods:
}
}

Finally, we use these classes to find calls to deprecated methods, excluding calls that themselves appear in deprecated methods (see :doc:`Tutorial: Navigating the call graph <call-graph>` for more information on class ``Call``):
Finally, we use these classes to find calls to deprecated methods, excluding calls that themselves appear in deprecated methods:

.. code-block:: ql

@@ -178,7 +178,9 @@ Finally, we use these classes to find calls to deprecated methods, excluding cal
 and not call.getCaller() instanceof DeprecatedMethod
select call, "This call invokes a deprecated method."

On our example, this query flags the call to ``A.m`` in ``A.r``, but not the one in ``A.n``.
In our example, this query flags the call to ``A.m`` in ``A.r``, but not the one in ``A.n``.

For more information about the class ``Call``, see :doc:`Navigating the call graph <call-graph>`.

Improvements
~~~~~~~~~~~~
@@ -235,9 +237,9 @@ Now we can extend our query to filter out calls in methods carrying a ``Suppress

➤ `See this in the query console <https://lgtm.com/query/665760001>`__. It's fairly common for projects to contain calls to methods that appear to be deprecated.

What next?
----------
Further reading
---------------

- Take a look at some of the other tutorials: :doc:`Tutorial: Javadoc <javadoc>` and :doc:`Tutorial: Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
- Take a look at some of the other articles in this section: :doc:`Javadoc <javadoc>` and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.

@@ -1,5 +1,7 @@
AST class reference
===================
Classes for working with Java code
==================================

CodeQL has a large selection of classes for working with Java statements and expressions.

.. _Expr: https://help.semmle.com/qldoc/java/semmle/code/java/Expr.qll/type.Expr$Expr.html
.. _Stmt: https://help.semmle.com/qldoc/java/semmle/code/java/Statement.qll/type.Statement$Stmt.html

@@ -1,8 +1,10 @@
Tutorial: Navigating the call graph
===================================
Navigating the call graph
=========================

Call graph API
--------------
CodeQL has classes for identifying code that calls other code, and code that can be called from elsewhere. This allows you to find, for example, methods that are never used.

Call graph classes
------------------

The CodeQL library for Java provides two abstract classes for representing a program's call graph: ``Callable`` and ``Call``. The former is simply the common superclass of ``Method`` and ``Constructor``; the latter is a common superclass of ``MethodAccess``, ``ClassInstanceExpr``, ``ThisConstructorInvocationStmt`` and ``SuperConstructorInvocationStmt``. Simply put, a ``Callable`` is something that can be invoked, and a ``Call`` is something that invokes a ``Callable``.

@@ -56,7 +58,7 @@ Class ``Call`` provides two call graph navigation predicates:

For instance, in our example ``getCallee`` of the second call in ``Client.main`` would return ``Super.getX``. At runtime, though, this call would actually invoke ``Sub.getX``.

Class ``Callable`` defines a large number of member predicates; for our purposes, the two most important ones are as follows:
Class ``Callable`` defines a large number of member predicates; for our purposes, the two most important ones are:

- ``calls(Callable target)`` succeeds if this callable contains a call whose callee is ``target``.
- ``polyCalls(Callable target)`` succeeds if this callable may call ``target`` at runtime; this is the case if it contains a call whose callee is either ``target`` or a method that ``target`` overrides.
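
The gap between the statically resolved callee and the method that actually runs can be reproduced in plain Java. The following is a minimal sketch using hypothetical classes ``Base`` and ``Derived`` (assumptions for illustration, not the tutorial's own example). The call inside ``describe`` is resolved at compile time to ``Base.name``, which is what the description of ``calls`` above corresponds to, yet at runtime it may dispatch to ``Derived.name``, which is the case ``polyCalls`` is described as covering.

.. code-block:: java

   class DispatchDemo {
       static class Base {
           String name() { return "Base.name"; }
       }

       static class Derived extends Base {
           @Override
           String name() { return "Derived.name"; }
       }

       // The static callee of b.name() is Base.name; the method that runs
       // depends on the runtime class of b.
       static String describe(Base b) {
           return b.name();
       }

       public static void main(String[] args) {
           System.out.println(describe(new Base()));
           System.out.println(describe(new Derived()));
       }
   }

A query that only considered the static callee would conclude that ``Derived.name`` is never called from ``describe``, which is why a search for unused methods needs to take overriding into account.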
@@ -66,7 +68,7 @@ In our example, ``Client.main`` calls the constructor ``Sub(int)`` and the metho
Example: Finding unused methods
-------------------------------

Given this API, we can easily write a query that finds methods that are not called by any other method:
We can use the ``Callable`` class to write a query that finds methods that are not called by any other method:

.. code-block:: ql

@@ -84,7 +86,7 @@ Given this API, we can easily write a query that finds methods that are not call

We have to use ``polyCalls`` instead of ``calls`` here: we want to be reasonably sure that ``callee`` is not called, either directly or via overriding.

Running this query on a typical Java project results in lots of hits in the Java standard library. This makes sense, since no single client program uses every method of the standard library. More generally, we may want to exclude methods and constructors from compiled libraries. We can use the predicate ``fromSource`` to check whether a compilation unit is a source file, and refine our query as follows:
Running this query on a typical Java project results in lots of hits in the Java standard library. This makes sense, since no single client program uses every method of the standard library. More generally, we may want to exclude methods and constructors from compiled libraries. We can use the predicate ``fromSource`` to check whether a compilation unit is a source file, and refine our query:

.. code-block:: ql

@@ -142,7 +144,7 @@ A further special case is non-public default constructors: in the singleton patt

➤ `See this in the query console <https://lgtm.com/query/673060008/>`__. This change has a large effect on the results for some projects but little effect on the results for others. Use of this pattern varies widely between different projects.

Finally, on many Java projects there are methods that are invoked indirectly by reflection. Thus, while there are no calls invoking these methods, they are, in fact, used. It is in general very hard to identify such methods. A very common special case, however, is JUnit test methods, which are reflectively invoked by a test runner. The QL Java library has support for recognizing test classes of JUnit and other testing frameworks, which we can employ to filter out methods defined in such classes:
Finally, on many Java projects there are methods that are invoked indirectly by reflection. So, while there are no calls invoking these methods, they are, in fact, used. It is in general very hard to identify such methods. A very common special case, however, is JUnit test methods, which are reflectively invoked by a test runner. The QL Java library has support for recognizing test classes of JUnit and other testing frameworks, which we can employ to filter out methods defined in such classes:

.. code-block:: ql

@@ -159,9 +161,9 @@ Finally, on many Java projects there are methods that are invoked indirectly by

➤ `See this in the query console <https://lgtm.com/query/665760002/>`__. This should give a further reduction in the number of results returned.

What next?
----------
Further reading
---------------

- Find out how to query metadata and white space: :doc:`Tutorial: Annotations <annotations>`, :doc:`Tutorial: Javadoc <javadoc>`, and :doc:`Tutorial: Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
- Find out how to query metadata and white space: :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>`, and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.

@@ -1,11 +1,13 @@
Analyzing data flow in Java
============================
===========================

Overview
--------
You can use CodeQL to track the flow of data through a Java program to its use.

This topic describes how data flow analysis is implemented in the CodeQL libraries for Java and includes examples to help you write your own data flow queries.
The following sections describe how to utilize the libraries for local data flow, global data flow, and taint tracking.
About this article
------------------

This article describes how data flow analysis is implemented in the CodeQL libraries for Java and includes examples to help you write your own data flow queries.
The following sections describe how to use the libraries for local data flow, global data flow, and taint tracking.

For a more general introduction to modeling data flow, see :doc:`Introduction to data flow analysis with CodeQL <../intro-to-data-flow>`.

@@ -17,7 +19,7 @@ Local data flow is data flow within a single method or callable. Local data flow
Using local data flow
~~~~~~~~~~~~~~~~~~~~~

The local data flow library is in the module ``DataFlow``, which defines the class ``Node`` denoting any element that data can flow through. ``Node``\ s are divided into expression nodes (``ExprNode``) and parameter nodes (``ParameterNode``). It is possible to map between data flow nodes and expressions/parameters using the member predicates ``asExpr`` and ``asParameter``:
The local data flow library is in the module ``DataFlow``, which defines the class ``Node`` denoting any element that data can flow through. ``Node``\ s are divided into expression nodes (``ExprNode``) and parameter nodes (``ParameterNode``). You can map between data flow nodes and expressions/parameters using the member predicates ``asExpr`` and ``asParameter``:

.. code-block:: ql

@@ -45,9 +47,9 @@ or using the predicates ``exprNode`` and ``parameterNode``:
 */
ParameterNode parameterNode(Parameter p) { ... }

The predicate ``localFlowStep(Node nodeFrom, Node nodeTo)`` holds if there is an immediate data flow edge from the node ``nodeFrom`` to the node ``nodeTo``. The predicate can be applied recursively (using the ``+`` and ``*`` operators), or through the predefined recursive predicate ``localFlow``, which is equivalent to ``localFlowStep*``.
The predicate ``localFlowStep(Node nodeFrom, Node nodeTo)`` holds if there is an immediate data flow edge from the node ``nodeFrom`` to the node ``nodeTo``. You can apply the predicate recursively by using the ``+`` and ``*`` operators, or by using the predefined recursive predicate ``localFlow``, which is equivalent to ``localFlowStep*``.

For example, finding flow from a parameter ``source`` to an expression ``sink`` in zero or more local steps can be achieved as follows:
For example, you can find flow from a parameter ``source`` to an expression ``sink`` in zero or more local steps:

.. code-block:: ql

@@ -65,9 +67,9 @@ Local taint tracking extends local data flow by including non-value-preserving f

If ``x`` is a tainted string then ``y`` is also tainted.

The local taint tracking library is in the module ``TaintTracking``. Like local data flow, a predicate ``localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)`` holds if there is an immediate taint propagation edge from the node ``nodeFrom`` to the node ``nodeTo``. The predicate can be applied recursively (using the ``+`` and ``*`` operators), or through the predefined recursive predicate ``localTaint``, which is equivalent to ``localTaintStep*``.
The local taint tracking library is in the module ``TaintTracking``. Like local data flow, a predicate ``localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)`` holds if there is an immediate taint propagation edge from the node ``nodeFrom`` to the node ``nodeTo``. You can apply the predicate recursively by using the ``+`` and ``*`` operators, or by using the predefined recursive predicate ``localTaint``, which is equivalent to ``localTaintStep*``.

For example, finding taint propagation from a parameter ``source`` to an expression ``sink`` in zero or more local steps can be achieved as follows:
For example, you can find taint propagation from a parameter ``source`` to an expression ``sink`` in zero or more local steps:

.. code-block:: ql

@@ -76,7 +78,7 @@ For example, finding taint propagation from a parameter ``source`` to an express
Examples
~~~~~~~~

The following query finds the filename passed to ``new FileReader(..)``.
This query finds the filename passed to ``new FileReader(..)``.

.. code-block:: ql

@@ -88,7 +90,7 @@ The following query finds the filename passed to ``new FileReader(..)``.
 call.getCallee() = fileReader
select call.getArgument(0)

Unfortunately, this will only give the expression in the argument, not the values which could be passed to it. So we use local data flow to find all expressions that flow into the argument:
Unfortunately, this only gives the expression in the argument, not the values which could be passed to it. So we use local data flow to find all expressions that flow into the argument:

.. code-block:: ql

@@ -102,7 +104,7 @@ Unfortunately, this will only give the expression in the argument, not the value
 DataFlow::localFlow(DataFlow::exprNode(src), DataFlow::exprNode(call.getArgument(0)))
select src

Then we can make the source more specific, for example an access to a public parameter. The following query finds where a public parameter is passed to ``new FileReader(..)``:
Then we can make the source more specific, for example an access to a public parameter. This query finds where a public parameter is passed to ``new FileReader(..)``:

.. code-block:: ql

@@ -116,7 +118,7 @@ Then we can make the source more specific, for example an access to a public par
 DataFlow::localFlow(DataFlow::parameterNode(p), DataFlow::exprNode(call.getArgument(0)))
select p

The following example finds calls to formatting functions where the format string is not hard-coded.
This query finds calls to formatting functions where the format string is not hard-coded.

.. code-block:: ql

@@ -148,7 +150,7 @@ Global data flow tracks data flow throughout the entire program, and is therefor
Using global data flow
~~~~~~~~~~~~~~~~~~~~~~

The global data flow library is used by extending the class ``DataFlow::Configuration`` as follows:
You use the global data flow library by extending the class ``DataFlow::Configuration``:

.. code-block:: ql

@@ -166,7 +168,7 @@ The global data flow library is used by extending the class ``DataFlow::Configur
}
}

The following predicates are defined in the configuration:
These predicates are defined in the configuration:

- ``isSource``: defines where data may flow from
- ``isSink``: defines where data may flow to
@@ -186,7 +188,7 @@ The data flow analysis is performed using the predicate ``hasFlow(DataFlow::Node
Using global taint tracking
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Global taint tracking is to global data flow as local taint tracking is to local data flow. That is, global taint tracking extends global data flow with additional non-value-preserving steps. The global taint tracking library is used by extending the class ``TaintTracking::Configuration`` as follows:
Global taint tracking is to global data flow as local taint tracking is to local data flow. That is, global taint tracking extends global data flow with additional non-value-preserving steps. You use the global taint tracking library by extending the class ``TaintTracking::Configuration``:

.. code-block:: ql

@@ -204,7 +206,7 @@ Global taint tracking is to global data flow as local taint tracking is to local
}
}

The following predicates are defined in the configuration:
These predicates are defined in the configuration:

- ``isSource``: defines where taint may flow from
- ``isSink``: defines where taint may flow to
@@ -223,7 +225,7 @@ The data flow library contains some predefined flow sources. The class ``RemoteF
Examples
~~~~~~~~

The following example shows a taint-tracking configuration that uses remote user input as data sources.
This query shows a taint-tracking configuration that uses remote user input as data sources.

.. code-block:: ql

@@ -254,7 +256,7 @@ Exercise 4: Using the answers from 2 and 3, write a query which finds all global
What next?
----------

- Try the worked examples in the following topics: :doc:`Tutorial: Navigating the call graph <call-graph>` and :doc:`Tutorial: Working with source locations <source-locations>`.
- Try the worked examples in these articles: :doc:`Navigating the call graph <call-graph>` and :doc:`Working with source locations <source-locations>`.
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__.

@@ -1,12 +1,14 @@
Tutorial: Expressions and statements
====================================
Overflow-prone comparisons in Java
==================================

Overview
--------
You can use CodeQL to check for comparisons in Java code where one side of the comparison is prone to overflow.

This tutorial develops a query for finding comparisons between integers and long integers in loops that may lead to non-termination due to overflow.
About this article
------------------

Specifically, consider the following code snippet:
In this tutorial article you'll write a query for finding comparisons between integers and long integers in loops that may lead to non-termination due to overflow.

To begin, consider this code snippet:

.. code-block:: java

@@ -24,12 +26,12 @@ If ``l`` is bigger than 2\ :sup:`31`\ - 1 (the largest positive value of type ``

All primitive integer types have a maximum value, beyond which they wrap around to their lowest possible value (called an "overflow"). For ``int``, this maximum value is 2\ :sup:`31`\ - 1. Type ``long`` can accommodate larger values up to a maximum of 2\ :sup:`63`\ - 1. In this example, this means that ``l`` can take on a value that is higher than the maximum for type ``int``; ``i`` will never be able to reach this value, instead overflowing and returning to a low value.
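
The wrap-around can be checked directly in Java. The following is a minimal sketch; the class ``OverflowDemo`` and its helper methods are hypothetical names introduced only for illustration. ``increment`` shows what happens to an ``int`` counter at its maximum, and ``conditionAlwaysHolds`` states when a comparison ``i < l`` between an ``int`` ``i`` and a ``long`` ``l`` can never become false.

.. code-block:: java

   class OverflowDemo {
       // Increments an int the way a loop counter would; int arithmetic wraps on overflow.
       static int increment(int i) {
           return i + 1;
       }

       // True if `i < l` holds for every possible int value of i, so a loop such as
       // `for (int i = 0; i < l; i++)` cannot terminate through its condition.
       static boolean conditionAlwaysHolds(long l) {
           return l > Integer.MAX_VALUE;
       }

       public static void main(String[] args) {
           System.out.println(Integer.MAX_VALUE);
           System.out.println(increment(Integer.MAX_VALUE));
           System.out.println(conditionAlwaysHolds(1L << 31));
       }
   }

Incrementing ``Integer.MAX_VALUE`` yields ``Integer.MIN_VALUE``, so once ``l`` exceeds 2\ :sup:`31`\ - 1 the counter cycles through the ``int`` range without ever reaching the bound.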

We will develop a query that finds code that looks like it might exhibit this kind of behavior. We will be using several of the standard library classes for representing statements and functions, a full list of which can be found in the :doc:`AST class reference <ast-class-reference>`.
We're going to develop a query that finds code that looks like it might exhibit this kind of behavior. We'll be using several of the standard library classes for representing statements and functions. For a full list, see :doc:`Classes for working with Java code <ast-class-reference>`.

Initial query
-------------

We start out by writing a query that finds less-than expressions (CodeQL class ``LTExpr``) where the left operand is of type ``int`` and the right operand is of type ``long``:
We'll start by writing a query that finds less-than expressions (CodeQL class ``LTExpr``) where the left operand is of type ``int`` and the right operand is of type ``long``:

.. code-block:: ql

@@ -42,7 +44,7 @@ We start out by writing a query that finds less-than expressions (CodeQL class `

➤ `See this in the query console <https://lgtm.com/query/672320008/>`__. This query finds results on most projects.

Notice that we use the predicate ``getType`` (available on all subclasses of ``Expr``) to determine the type of the operands. Types, in turn, define the ``hasName`` predicate, which allows us to identify the primitive types ``int`` and ``long``. As it stands, this query finds *all* less-than expressions comparing ``int`` and ``long``, but in fact we are only interested in comparisons that are part of a loop condition. Also, we want to filter out comparisons where either operand is constant, since these are less likely to be real bugs. The revised query looks as follows:
Notice that we use the predicate ``getType`` (available on all subclasses of ``Expr``) to determine the type of the operands. Types, in turn, define the ``hasName`` predicate, which allows us to identify the primitive types ``int`` and ``long``. As it stands, this query finds *all* less-than expressions comparing ``int`` and ``long``, but in fact we are only interested in comparisons that are part of a loop condition. Also, we want to filter out comparisons where either operand is constant, since these are less likely to be real bugs. The revised query looks like this:

.. code-block:: ql

@@ -78,7 +80,7 @@ In order to compare the ranges of types, we define a predicate that returns the
 (pt.hasName("long") and result=64)
}

We now want to generalize our query to apply to any comparison where the width of the type on the smaller end of the comparison is less than the width of the type on the greater end. Let us call such a comparison *overflow prone*, and introduce an abstract class to model it:
We now want to generalize our query to apply to any comparison where the width of the type on the smaller end of the comparison is less than the width of the type on the greater end. Let's call such a comparison *overflow prone*, and introduce an abstract class to model it:

.. code-block:: ql

@@ -120,9 +122,9 @@ Now we rewrite our query to make use of these new classes:

➤ `See the full query in the query console <https://lgtm.com/query/1951710018/lang:java/>`__.

What next?
----------
Further reading
---------------

- Have a look at some of the other tutorials: :doc:`Tutorial: Types and the class hierarchy <types-class-hierarchy>`, :doc:`Tutorial: Navigating the call graph <call-graph>`, :doc:`Tutorial: Annotations <annotations>`, :doc:`Tutorial: Javadoc <javadoc>`, and :doc:`Tutorial: Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
- Have a look at some of the other articles in this section: :doc:`Java types <types-class-hierarchy>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>`, and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.

@@ -1,8 +1,10 @@
|
||||
Introducing the CodeQL libraries for Java
|
||||
=========================================
|
||||
CodeQL library for Java
|
||||
=======================
|
||||
|
||||
Overview
|
||||
--------
|
||||
When you're analyzing a Java program in {{ site.data.variables.product.prodname_dotcom }}, you can make use of the large collection of classes in the CodeQL library for Java.
|
||||
|
||||
About the CodeQL library for Java
|
||||
---------------------------------
|
||||
|
||||
There is an extensive library for analyzing CodeQL databases extracted from Java projects. The classes in this library present the data from a database in an object-oriented form and provide abstractions and predicates to help you with common analysis tasks.
|
||||
|
||||
@@ -12,13 +14,13 @@ The library is implemented as a set of QL modules, that is, files with the exten
|
||||
|
||||
import java
|
||||
|
||||
The rest of this topic briefly summarizes the most important classes and predicates provided by this library.
|
||||
The rest of this article briefly summarizes the most important classes and predicates provided by this library.
|
||||
|
||||
.. pull-quote::
|
||||
|
||||
Note
|
||||
|
||||
The example queries in this topic illustrate the types of results returned by different library classes. The results themselves are not interesting but can be used as the basis for developing a more complex query. The tutorial topics show how you can take a simple query and fine-tune it to find precisely the results you're interested in.
|
||||
The example queries in this article illustrate the types of results returned by different library classes. The results themselves are not interesting but can be used as the basis for developing a more complex query. The other articles in this section of the help show how you can take a simple query and fine-tune it to find precisely the results you're interested in.
|
||||
|
||||
Summary of the library classes
|
||||
------------------------------
|
||||
@@ -40,7 +42,7 @@ These classes represent named program elements: packages (``Package``), compilat
|
||||
|
||||
Their common superclass is ``Element``, which provides general member predicates for determining the name of a program element and checking whether two elements are nested inside each other.
|
||||
|
||||
It is often convenient to refer to an element that might either be a method or a constructor; the class ``Callable``, which is a common superclass of ``Method`` and ``Constructor``, can be used for this purpose.
|
||||
It's often convenient to refer to an element that might either be a method or a constructor; the class ``Callable``, which is a common superclass of ``Method`` and ``Constructor``, can be used for this purpose.
|
||||
|
||||
Types
|
||||
~~~~~
|
||||
@@ -66,9 +68,9 @@ For example, the following query finds all variables of type ``int`` in the prog
|
||||
pt.hasName("int")
|
||||
select v
|
||||
|
||||
➤ `See this in the query console <https://lgtm.com/query/660700018/>`__. You are likely to get many results when you run this query because most projects contain many variables of type ``int``.
|
||||
➤ `See this in the query console <https://lgtm.com/query/660700018/>`__. You're likely to get many results when you run this query because most projects contain many variables of type ``int``.
|
||||
|
||||
Reference types can also be categorized according to their declaration scope:
|
||||
Reference types are also categorized according to their declaration scope:
|
||||
|
||||
- ``TopLevelType`` represents a reference type declared at the top level of a compilation unit.
|
||||
- ``NestedType`` is a type declared inside another type.
|
||||
@@ -105,7 +107,7 @@ As an example, we can write a query that finds all nested classes that directly
|
||||
where nc.getASupertype() instanceof TypeObject
|
||||
select nc
|
||||
|
||||
➤ `See this in the query console <https://lgtm.com/query/672230026/>`__. You are likely to get many results when you run this query because many projects include nested classes that extend ``Object`` directly.
|
||||
➤ `See this in the query console <https://lgtm.com/query/672230026/>`__. You're likely to get many results when you run this query because many projects include nested classes that extend ``Object`` directly.
|
||||
|
||||
Generics
|
||||
~~~~~~~~
|
||||
@@ -194,7 +196,7 @@ The wildcards ``? extends Number`` and ``? super Float`` are represented by clas
|
||||
|
||||
For dealing with generic methods, there are classes ``GenericMethod``, ``ParameterizedMethod`` and ``RawMethod``, which are entirely analogous to the like-named classes for representing generic types.
|
||||
|
||||
More information on working with types can be found in the :doc:`tutorial on types and the class hierarchy <types-class-hierarchy>`.
|
||||
For more information on working with types, see the :doc:`article on Java types <types-class-hierarchy>`.
|
||||
|
||||
Variables
|
||||
~~~~~~~~~
|
||||
@@ -208,7 +210,7 @@ Class ``Variable`` represents a variable `in the Java sense <http://docs.oracle.
|
||||
Abstract syntax tree
|
||||
--------------------
|
||||
|
||||
Classes in this category represent abstract syntax tree (AST) nodes, that is, statements (class ``Stmt``) and expressions (class ``Expr``). See the :doc:`AST class reference <ast-class-reference>` for an exhaustive list of all expression and statement types available in the standard QL library.
|
||||
Classes in this category represent abstract syntax tree (AST) nodes, that is, statements (class ``Stmt``) and expressions (class ``Expr``). For a full list of expression and statement types available in the standard QL library, see :doc:`Classes for working with Java code <ast-class-reference>`.
|
||||
|
||||
Both ``Expr`` and ``Stmt`` provide member predicates for exploring the abstract syntax tree of a program:
|
||||
|
||||
@@ -258,7 +260,7 @@ Finally, here is a query that finds method bodies:
|
||||
|
||||
As these examples show, the parent node of an expression is not always an expression: it may also be a statement, for example, an ``IfStmt``. Similarly, the parent node of a statement is not always a statement: it may also be a method or a constructor. To capture this, the QL Java library provides two abstract classes, ``ExprParent`` and ``StmtParent``, the former representing any node that may be the parent node of an expression, and the latter any node that may be the parent node of a statement.
|
||||
|
||||
For more information on working with AST classes, see the :doc:`tutorial on expressions and statements <expressions-statements>`.
|
||||
For more information on working with AST classes, see the :doc:`article on overflow-prone comparisons in Java <expressions-statements>`.
|
||||
|
||||
Metadata
|
||||
--------
|
||||
@@ -290,7 +292,7 @@ These annotations are represented by class ``Annotation``. An annotation is simp
|
||||
|
||||
➤ `See this in the query console <https://lgtm.com/query/659662167/>`__. Only constructors with the ``@Deprecated`` annotation are reported this time.
|
||||
|
||||
For more information on working with annotations, see the :doc:`tutorial on annotations <annotations>`.
|
||||
For more information on working with annotations, see the :doc:`article on annotations <annotations>`.
|
||||
|
||||
For Javadoc, class ``Element`` has a member predicate ``getDoc`` that returns a delegate ``Documentable`` object, which can then be queried for its attached Javadoc comments. For example, the following query finds Javadoc comments on private fields:
|
||||
|
||||
@@ -325,7 +327,7 @@ Class ``Javadoc`` represents an entire Javadoc comment as a tree of ``JavadocEle
|
||||
|
||||
On line 5 we used ``getParent+`` to capture tags that are nested at any depth within the Javadoc comment.
|
||||
|
||||
For more information on working with Javadoc, see the :doc:`tutorial on Javadoc <javadoc>`.
|
||||
For more information on working with Javadoc, see the :doc:`article on Javadoc <javadoc>`.
|
||||
|
||||
Metrics
|
||||
-------
|
||||
@@ -379,11 +381,11 @@ Conversely, ``Callable.getAReference`` returns a ``Call`` that refers to it. So
|
||||
|
||||
➤ `See this in the query console <https://lgtm.com/query/666680036/>`__. The LGTM.com demo projects all appear to have many methods that are not called directly, but this is unlikely to be the whole story. To explore this area further, see :doc:`Navigating the call graph <call-graph>`.
|
||||
|
||||
For more information about callables and calls, see the :doc:`call graph tutorial <call-graph>`.
|
||||
For more information about callables and calls, see the :doc:`article on the call graph <call-graph>`.
|
||||
|
||||
What next?
|
||||
----------
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- Experiment with the worked examples in the CodeQL for Java tutorial topics: :doc:`Types and the class hierarchy <types-class-hierarchy>`, :doc:`Expressions and statements <expressions-statements>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations <annotations>`, :doc:`Javadoc <javadoc>` and :doc:`Working with source locations <source-locations>`.
|
||||
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
|
||||
- Experiment with the worked examples in the CodeQL for Java articles: :doc:`Java types <types-class-hierarchy>`, :doc:`Overflow-prone comparisons in Java <expressions-statements>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>` and :doc:`Working with source locations <source-locations>`.
|
||||
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
|
||||
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.
|
||||
|
||||
@@ -1,8 +1,10 @@
|
||||
Tutorial: Javadoc
|
||||
=================
|
||||
Javadoc
|
||||
=======
|
||||
|
||||
Overview
|
||||
--------
|
||||
You can use CodeQL to find errors in Javadoc comments in Java code.
|
||||
|
||||
About analyzing Javadoc
|
||||
-----------------------
|
||||
|
||||
To access Javadoc associated with a program element, we use member predicate ``getDoc`` of class ``Element``, which returns a ``Documentable``. Class ``Documentable``, in turn, offers a member predicate ``getJavadoc`` to retrieve the Javadoc attached to the element in question, if any.
|
||||
|
||||
@@ -49,9 +51,9 @@ The ``JavadocTag`` has several subclasses representing specific kinds of Javadoc
|
||||
Example: Finding spurious @param tags
|
||||
-------------------------------------
|
||||
|
||||
As an example of using the CodeQL Javadoc API, let us write a query that finds ``@param`` tags that refer to a non-existent parameter.
|
||||
As an example of using the CodeQL Javadoc API, let's write a query that finds ``@param`` tags that refer to a non-existent parameter.
|
||||
|
||||
For example, consider the following program:
|
||||
For example, consider this program:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
@@ -76,7 +78,7 @@ To begin with, we write a query that finds all callables (that is, methods or co
|
||||
where c.getDoc().getJavadoc() = pt.getParent()
|
||||
select c, pt
|
||||
|
||||
It is now easy to add another conjunct to the ``where`` clause, restricting the query to ``@param`` tags that refer to a non-existent parameter: we simply need to require that no parameter of ``c`` has the name ``pt.getParamName()``.
|
||||
It's now easy to add another conjunct to the ``where`` clause, restricting the query to ``@param`` tags that refer to a non-existent parameter: we simply need to require that no parameter of ``c`` has the name ``pt.getParamName()``.
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
@@ -92,7 +94,7 @@ Example: Finding spurious @throws tags
|
||||
|
||||
A related, but somewhat more involved, problem is finding ``@throws`` tags that refer to an exception that the method in question cannot actually throw.
|
||||
|
||||
For example, consider the following Java program:
|
||||
For example, consider this Java program:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
@@ -108,9 +110,9 @@ For example, consider the following Java program:
|
||||
}
|
||||
}
|
||||
|
||||
Notice that the Javadoc comment of ``A.foo`` documents two thrown exceptions: ``IOException`` and ``RuntimeException``. The former is clearly spurious: ``A.foo`` does not have a ``throws IOException`` clause, and thus cannot throw this kind of exception. On the other hand, ``RuntimeException`` is an unchecked exception, so it can be thrown even if there is no explicit ``throws`` clause listing it. Therefore, our query should flag the ``@throws`` tag for ``IOException``, but not the one for ``RuntimeException``.
|
||||
Notice that the Javadoc comment of ``A.foo`` documents two thrown exceptions: ``IOException`` and ``RuntimeException``. The former is clearly spurious: ``A.foo`` doesn't have a ``throws IOException`` clause, and therefore can't throw this kind of exception. On the other hand, ``RuntimeException`` is an unchecked exception, so it can be thrown even if there is no explicit ``throws`` clause listing it. So our query should flag the ``@throws`` tag for ``IOException``, but not the one for ``RuntimeException``.
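The distinction between declared and unchecked exceptions can be observed from plain Java using reflection. The following is a sketch under assumptions: ``ThrowsClauseDemo``, ``foo`` and ``couldThrow`` are hypothetical names, and ``foo`` is only a stand-in for ``A.foo``; none of this is part of the CodeQL library.

```java
// Hypothetical demo class; uses java.lang.reflect, not the CodeQL library.
final class ThrowsClauseDemo {
    // Stand-in for A.foo: no throws clause, yet it may throw an unchecked exception.
    static void foo() {
        throw new RuntimeException("unchecked, so no declaration is needed");
    }

    // True if a method may throw `ex`: either `ex` is unchecked, or it is
    // covered by one of the types listed in the method's throws clause.
    static boolean couldThrow(java.lang.reflect.Method m, Class<? extends Throwable> ex) {
        if (RuntimeException.class.isAssignableFrom(ex) || Error.class.isAssignableFrom(ex)) {
            return true;
        }
        for (Class<?> declared : m.getExceptionTypes()) {
            if (declared.isAssignableFrom(ex)) {
                return true;
            }
        }
        return false;
    }

    // Convenience wrapper so callers do not have to handle reflection errors.
    static boolean fooCouldThrow(Class<? extends Throwable> ex) {
        try {
            return couldThrow(ThrowsClauseDemo.class.getDeclaredMethod("foo"), ex);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fooCouldThrow(java.io.IOException.class)); // checked and undeclared
        System.out.println(fooCouldThrow(RuntimeException.class));    // unchecked
    }
}
```

A ``@throws`` tag is therefore only a candidate for being spurious when it names a checked exception that the ``throws`` clause does not cover.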
|
||||
|
||||
Recall from above that the CodeQL library represents ``@throws`` tags using class ``ThrowsTag``. This class does not provide a member predicate for determining the exception type that is being documented, so we first need to implement our own version. A simple version might look as follows:
|
||||
Remember that the CodeQL library represents ``@throws`` tags using class ``ThrowsTag``. This class doesn't provide a member predicate for determining the exception type that is being documented, so we first need to implement our own version. A simple version might look like this:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
@@ -118,7 +120,7 @@ Recall from above that the CodeQL library represents ``@throws`` tags using clas
|
||||
result.hasName(tt.getExceptionName())
|
||||
}
|
||||
|
||||
Similarly, ``Callable`` does not come with a member predicate for querying all exceptions that the method or constructor may possibly throw. We can, however, implement this ourselves by using ``getAnException`` to find all ``throws`` clauses of the callable, and then use ``getType`` to resolve the corresponding exception types:
|
||||
Similarly, ``Callable`` doesn't come with a member predicate for querying all exceptions that the method or constructor may possibly throw. We can, however, implement this ourselves by using ``getAnException`` to find all ``throws`` clauses of the callable, and then use ``getType`` to resolve the corresponding exception types:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
@@ -131,7 +133,7 @@ Note the use of ``getASupertype*`` to find both exceptions declared in a ``throw
|
||||
Now we can write a query for finding all callables ``c`` and ``@throws`` tags ``tt`` such that:
|
||||
|
||||
- ``tt`` belongs to a Javadoc comment attached to ``c``.
|
||||
- ``c`` cannot throw the exception documented by ``tt``.
|
||||
- ``c`` can't throw the exception documented by ``tt``.
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
@@ -152,10 +154,10 @@ Improvements
|
||||
|
||||
Currently, there are two problems with this query:
|
||||
|
||||
#. ``getDocumentedException`` is too liberal: it will return *any* reference type with the right name, even if it is in a different package and not actually visible in the current compilation unit.
|
||||
#. ``mayThrow`` is too restrictive: it does not account for unchecked exceptions, which do not need to be declared.
|
||||
#. ``getDocumentedException`` is too liberal: it will return *any* reference type with the right name, even if it's in a different package and not actually visible in the current compilation unit.
|
||||
#. ``mayThrow`` is too restrictive: it doesn't account for unchecked exceptions, which do not need to be declared.
|
||||
|
||||
To see why the former is a problem, consider the following program:
|
||||
To see why the former is a problem, consider this program:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
@@ -166,9 +168,9 @@ To see why the former is a problem, consider the following program:
|
||||
void bar() throws IOException {}
|
||||
}
|
||||
|
||||
This program defines its own class ``IOException``, which is unrelated to the class ``java.io.IOException`` in the standard library: they are in different packages. Our ``getDocumentedException`` predicate does not check packages, however, so it will consider the ``@throws`` clause to refer to both ``IOException`` classes, and thus flag the ``@throws`` tag as spurious, since ``B.bar`` cannot actually throw ``java.io.IOException``.
|
||||
This program defines its own class ``IOException``, which is unrelated to the class ``java.io.IOException`` in the standard library: they are in different packages. Our ``getDocumentedException`` predicate doesn't check packages, however, so it will consider the ``@throws`` clause to refer to both ``IOException`` classes, and thus flag the ``@throws`` tag as spurious, since ``B.bar`` can't actually throw ``java.io.IOException``.
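The name clash is easy to reproduce in plain Java. In this sketch, ``SameSimpleName`` and its nested ``IOException`` class are hypothetical stand-ins for a user-defined class like the one described here; the point is only that comparing simple names conflates two unrelated types, while comparing qualified names does not.

```java
// Hypothetical demo class; the nested IOException only shares its simple name
// with java.io.IOException and is otherwise unrelated to it.
final class SameSimpleName {
    static class IOException extends Exception {}

    // Name-only matching, comparable to looking a type up by its simple name.
    static boolean sameSimpleName(Class<?> a, Class<?> b) {
        return a.getSimpleName().equals(b.getSimpleName());
    }

    // Matching on the fully qualified (binary) name tells the two types apart.
    static boolean sameQualifiedName(Class<?> a, Class<?> b) {
        return a.getName().equals(b.getName());
    }

    public static void main(String[] args) {
        System.out.println(sameSimpleName(IOException.class, java.io.IOException.class));
        System.out.println(sameQualifiedName(IOException.class, java.io.IOException.class));
    }
}
```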
|
||||
|
||||
As an example of the second problem, method ``A.foo`` from our previous example was annotated with a ``@throws RuntimeException`` tag. Our current version of ``mayThrow``, however, would think that ``A.foo`` cannot throw a ``RuntimeException``, and thus flag the tag as spurious.
|
||||
As an example of the second problem, method ``A.foo`` from our previous example was annotated with a ``@throws RuntimeException`` tag. Our current version of ``mayThrow``, however, would think that ``A.foo`` can't throw a ``RuntimeException``, and thus flag the tag as spurious.
|
||||
|
||||
We can make ``mayThrow`` less restrictive by introducing a new class to represent unchecked exceptions, which are just the subtypes of ``java.lang.RuntimeException`` and ``java.lang.Error``:
|
||||
|
||||
@@ -196,7 +198,7 @@ Fixing ``getDocumentedException`` is more complicated, but we can easily cover t
|
||||
#. The ``@throws`` tag refers to a type in the same package.
|
||||
#. The ``@throws`` tag refers to a type that is imported by the current compilation unit.
|
||||
|
||||
The first case can be covered by changing ``getDocumentedException`` to use the qualified name of the ``@throws`` tag. To handle the second and the third case, we can introduce a new predicate ``visibleIn`` that checks whether a reference type is visible in a compilation unit, either by virtue of belonging to the same package or by being explicitly imported. We then rewrite ``getDocumentedException`` as follows:
|
||||
The first case can be covered by changing ``getDocumentedException`` to use the qualified name of the ``@throws`` tag. To handle the second and the third case, we can introduce a new predicate ``visibleIn`` that checks whether a reference type is visible in a compilation unit, either by virtue of belonging to the same package or by being explicitly imported. We then rewrite ``getDocumentedException`` as:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
@@ -214,11 +216,11 @@ The first case can be covered by changing ``getDocumentedException`` to use the
|
||||
|
||||
➤ `See this in the query console <https://lgtm.com/query/1505751136101/>`__. This finds many fewer, more interesting results in the LGTM.com demo projects.
|
||||
|
||||
Currently, ``visibleIn`` only considers single-type imports, but it would be possible to extend it with support for other kinds of imports.
|
||||
Currently, ``visibleIn`` only considers single-type imports, but you could extend it with support for other kinds of imports.
|
||||
|
||||
What next?
|
||||
----------
|
||||
Further reading
|
||||
---------------
|
||||
|
||||
- Find out how you can use the location API to define queries on whitespace: :doc:`Tutorial: Working with source locations <source-locations>`.
|
||||
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
|
||||
- Find out how you can use the location API to define queries on whitespace: :doc:`Working with source locations <source-locations>`.
|
||||
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
|
||||
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.
|
||||
|
||||
@@ -1,6 +1,8 @@
|
||||
CodeQL for Java
|
||||
===============
|
||||
|
||||
You can use CodeQL to explore Java programs and quickly find variants of security vulnerabilities and bugs.
|
||||
|
||||
.. toctree::
|
||||
:glob:
|
||||
:hidden:
|
||||
|
||||
@@ -1,10 +1,12 @@
|
||||
Tutorial: Working with source locations
|
||||
=======================================
|
||||
Working with source locations
|
||||
=============================
|
||||
|
||||
Overview
|
||||
--------
|
||||
You can use the location of entities within Java code to look for potential errors. Locations allow you to deduce the presence, or absence, of white space which, in some cases, may indicate a problem.
|
||||
|
||||
Java offers a rich set of operators with complex precedence rules, which are sometimes confusing to developers. For instance, the class ``ByteBufferCache`` in the OpenJDK Java compiler (which is a member class of ``com.sun.tools.javac.util.BaseFileManager``) contains the following code for allocating a buffer:
|
||||
About source locations
|
||||
----------------------
|
||||
|
||||
Java offers a rich set of operators with complex precedence rules, which are sometimes confusing to developers. For instance, the class ``ByteBufferCache`` in the OpenJDK Java compiler (which is a member class of ``com.sun.tools.javac.util.BaseFileManager``) contains this code for allocating a buffer:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
@@ -14,14 +16,14 @@ Presumably, the author meant to allocate a buffer that is 1.5 times the size ind
|
||||
|
||||
Note that the source layout gives a fairly clear indication of the intended meaning: there is more white space around ``+`` than around ``>>``, suggesting that the latter is meant to bind more tightly.
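The effect of the precedence rules can be checked directly. This is a small sketch with hypothetical names (``ShiftPrecedence``, ``size``), not the OpenJDK code itself: in Java, ``+`` binds more tightly than ``>>``, so the spacing is misleading.

```java
// Hypothetical demo class: compares what the compiler parses with what the layout suggests.
final class ShiftPrecedence {
    // Spaced as in the suspicious pattern; Java parses this as (size + size) >> 1.
    static int asWritten(int size) {
        return size + size>>1;
    }

    // What the layout suggests: size plus half of size.
    static int asSuggested(int size) {
        return size + (size >> 1);
    }

    public static void main(String[] args) {
        System.out.println(asWritten(100));
        System.out.println(asSuggested(100));
    }
}
```

For an input of 100, the first method yields 100 and the second yields 150, so the expression as written does not enlarge the value at all.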
|
||||
|
||||
We will now develop a query that finds this kind of suspicious nesting, where the operator of the inner expression has more white space around it than the operator of the outer expression. This pattern may not necessarily indicate a bug, but at the very least it makes the code hard to read and prone to misinterpretation.
|
||||
We're going to develop a query that finds this kind of suspicious nesting, where the operator of the inner expression has more white space around it than the operator of the outer expression. This pattern may not necessarily indicate a bug, but at the very least it makes the code hard to read and prone to misinterpretation.
|
||||
|
||||
White space is not directly represented in the CodeQL database, but we can deduce its presence from the location information associated with program elements and AST nodes. So we will start by providing an overview of source location management in the standard library for Java.
|
||||
White space is not directly represented in the CodeQL database, but we can deduce its presence from the location information associated with program elements and AST nodes. So, before we write our query, we need an understanding of source location management in the standard library for Java.
|
||||
|
||||
Location API
|
||||
------------
|
||||
|
||||
For every entity that has a representation in Java source code (including, in particular, program elements and AST nodes), the standard CodeQL library provides the following predicates for accessing source location information:
|
||||
For every entity that has a representation in Java source code (including, in particular, program elements and AST nodes), the standard CodeQL library provides these predicates for accessing source location information:
|
||||
|
||||
- ``getLocation`` returns a ``Location`` object describing the start and end position of the entity.
|
||||
- ``getFile`` returns a ``File`` object representing the file containing the entity.
|
||||
@@ -29,7 +31,7 @@ For every entity that has a representation in Java source code (including, in pa
|
||||
- ``getNumberOfCommentLines`` returns the number of comment lines.
|
||||
- ``getNumberOfLinesOfCode`` returns the number of non-comment lines.
|
||||
|
||||
For example, assume the following Java class is defined in compilation unit ``SayHello.java``:
|
||||
For example, let's assume this Java class is defined in the compilation unit ``SayHello.java``:
|
||||
|
||||
.. code-block:: java
|
||||
|
||||
@@ -44,20 +46,20 @@ For example, assume the following Java class is defined in compilation unit ``Sa
|
||||
}
|
||||
}
|
||||
|
||||
Invoking ``getFile`` on the expression statement in the body of ``main`` will return a ``File`` object representing the file ``SayHello.java``. The statement spans four lines in total (``getTotalNumberOfLines``), of which one is a comment line (``getNumberOfCommentLines``), while three lines contain code (``getNumberOfLinesOfCode``).
|
||||
Invoking ``getFile`` on the expression statement in the body of ``main`` returns a ``File`` object representing the file ``SayHello.java``. The statement spans four lines in total (``getTotalNumberOfLines``), of which one is a comment line (``getNumberOfCommentLines``), while three lines contain code (``getNumberOfLinesOfCode``).
|
||||
|
||||
Class ``Location`` defines member predicates ``getStartLine``, ``getEndLine``, ``getStartColumn`` and ``getEndColumn`` to retrieve the line and column number an entity starts and ends at, respectively. Both lines and columns are counted starting from 1 (not 0), and the end position is inclusive, that is, it is the position of the last character belonging to the source code of the entity.
|
||||
|
||||
In our example, the expression statement starts at line 5, column 3 (the first two characters on the line are tabs, which each count as one character), and it ends at line 8, column 4.
|
||||
|
||||
Class ``File`` defines the following member predicates:
|
||||
Class ``File`` defines these member predicates:
|
||||
|
||||
- ``getFullName`` returns the fully qualified name of the file.
|
||||
- ``getRelativePath`` returns the path of the file relative to the base directory of the source code.
|
||||
- ``getExtension`` returns the extension of the file.
|
||||
- ``getShortName`` returns the base name of the file, without its extension.
|
||||
|
||||
In our example, assume file ``A.java`` is located in directory ``/home/testuser/code/pkg``, where ``/home/testuser/code`` is the base directory of the program being analyzed. Then, a ``File`` object for ``A.java`` returns the following:
|
||||
In our example, assume file ``A.java`` is located in directory ``/home/testuser/code/pkg``, where ``/home/testuser/code`` is the base directory of the program being analyzed. Then, a ``File`` object for ``A.java`` returns:
|
||||
|
||||
- ``getFullName`` is ``/home/testuser/code/pkg/A.java``.
|
||||
- ``getRelativePath`` is ``pkg/A.java``.
|
||||
@@ -67,7 +69,7 @@ In our example, assume file ``A.java`` is located in directory ``/home/testuser/
|
||||
Determining white space around an operator
|
||||
------------------------------------------
|
||||
|
||||
Let us start by considering how to write a predicate that computes the total amount of white space surrounding the operator of a given binary expression. If ``rcol`` is the start column of the expression's right operand and ``lcol`` is the end column of its left operand, then ``rcol - (lcol+1)`` gives us the total number of characters in between the two operands (note that we have to use ``lcol+1`` instead of ``lcol`` because end positions are inclusive).
|
||||
Let's start by considering how to write a predicate that computes the total amount of white space surrounding the operator of a given binary expression. If ``rcol`` is the start column of the expression's right operand and ``lcol`` is the end column of its left operand, then ``rcol - (lcol+1)`` gives us the total number of characters in between the two operands (note that we have to use ``lcol+1`` instead of ``lcol`` because end positions are inclusive).
|
||||
|
||||
This number includes the length of the operator itself, which we need to subtract out. For this, we can use predicate ``getOp``, which returns the operator string, surrounded by a single white space character on each side. Overall, the expression for computing the amount of white space around the operator of a binary expression ``expr`` is:
|
||||
|
||||
@@ -88,12 +90,12 @@ Clearly, however, this only works if the entire expression is on a single line,
|
||||
)
|
||||
}
|
||||
|
||||
Notice that we use an ``exists`` to introduce our temporary variables ``lcol`` and ``rcol``. The predicate could be written without them by just inlining ``lcol`` and ``rcol`` into their use, at some cost in readability.
|
||||
Notice that we use an ``exists`` to introduce our temporary variables ``lcol`` and ``rcol``. You could write the predicate without them by just inlining ``lcol`` and ``rcol`` into their use, at some cost in readability.
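The arithmetic behind this predicate can be tried out in plain Java. The sketch below uses hypothetical names (``OperatorWhiteSpace``, ``whiteSpaceAround``) and takes the columns as plain numbers; it assumes, as described above, 1-based columns, inclusive end positions, and an operator string padded with one space on each side.

```java
// Hypothetical demo class: reproduces the column arithmetic outside of QL.
final class OperatorWhiteSpace {
    // lcol: end column of the left operand (inclusive, 1-based)
    // rcol: start column of the right operand (1-based)
    // paddedOp: operator string with one space on each side, for example " + "
    static int whiteSpaceAround(int lcol, int rcol, String paddedOp) {
        int between = rcol - (lcol + 1);            // characters strictly between the operands
        int operatorLength = paddedOp.length() - 2; // drop the two padding spaces
        return between - operatorLength;
    }

    public static void main(String[] args) {
        // Columns for a single-line expression such as "i< start + 100":
        // `i` ends at column 1 and `start + 100` starts at column 4.
        System.out.println(whiteSpaceAround(1, 4, " < "));
        // `start` ends at column 8 and `100` starts at column 12.
        System.out.println(whiteSpaceAround(8, 12, " + "));
    }
}
```

For these columns the two calls give 1 and 2: one white space character around ``<`` and two around ``+``.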
|
||||
|
||||
Find suspicious nesting
|
||||
-----------------------
|
||||
|
||||
A first version of our query can now be written:
|
||||
Here's a first version of our query:
|
||||
|
||||
.. code-block:: ql
|
||||
|
||||
@@ -123,7 +125,7 @@ If we run this initial query, we might notice some false positives arising from
|
||||
|
||||
i< start + 100
|
||||
|
||||
Note that our predicate ``operatorWS`` computes the **total** amount of white space around the operator, which, in this case, is one for the ``<`` and two for the ``+``. Ideally, we would like to exclude cases where the amount of white space before and after the operator are not the same. Currently, CodeQL databases do not record enough information to figure this out, but as an approximation we could require that the total number of white space characters is even:
|
||||
Note that our predicate ``operatorWS`` computes the **total** amount of white space around the operator, which, in this case, is one for the ``<`` and two for the ``+``. Ideally, we would like to exclude cases where the amounts of white space before and after the operator differ. Currently, CodeQL databases don't record enough information to figure this out, but as an approximation we could require that the total number of white space characters is even:
.. code-block:: ql
@@ -181,8 +183,8 @@ Notice that we again use ``getOp``, this time to determine whether two binary ex
The white space suggests that the programmer meant to toggle ``i`` between zero and one, but in fact the expression is parsed as ``i + (1%2)``, which is the same as ``i + 1``, so ``i`` is simply incremented.
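
The difference between the two readings can be checked directly in Java. The following is a small sketch; the class ``PrecedenceDemo`` and its method names are made up for illustration and do not come from any analyzed project.

```java
public class PrecedenceDemo {
    // What the misleading spacing actually computes: % binds tighter than +,
    // so this is i + (1 % 2), which equals i + 1.
    static int asWritten(int i) {
        return i + 1%2;
    }

    // What the spacing suggests the programmer intended: toggle between 0 and 1.
    static int asIntended(int i) {
        return (i + 1) % 2;
    }

    public static void main(String[] args) {
        int i = 0;
        for (int step = 0; step < 3; step++) {
            i = asWritten(i);
        }
        System.out.println("as written after 3 steps: " + i);

        int j = 0;
        for (int step = 0; step < 3; step++) {
            j = asIntended(j);
        }
        System.out.println("as intended after 3 steps: " + j);
    }
}
```

Starting from zero, the first version keeps counting upwards, while the parenthesized version alternates between one and zero.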
What next?
----------
Further reading
---------------
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.
@@ -1,8 +1,10 @@
Tutorial: Types and the class hierarchy
=======================================
Java types
==========
Overview
--------
You can use CodeQL to find out information about data types used in Java code. This allows you to write queries to identify specific type-related issues.
About working with Java types
-----------------------------
The standard CodeQL library represents Java types by means of the ``Type`` class and its various subclasses.
@@ -59,7 +61,7 @@ If the expression ``e`` happens to actually evaluate to a ``B[]`` array, on the
Object[] o = new String[] { "Hello", "world" };
String[] s = (String[])o;
In this tutorial, we do not try to distinguish these two cases. Our query should simply look for cast expressions ``ce`` that cast from some type ``source`` to another type ``target``, such that:
In this tutorial, we don't try to distinguish these two cases. Our query should simply look for cast expressions ``ce`` that cast from some type ``source`` to another type ``target``, such that:
- Both ``source`` and ``target`` are array types.
- The element type of ``source`` is a transitive super type of the element type of ``target``.
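
The two cases that the query does not distinguish can be seen in plain Java. This is a minimal sketch with a hypothetical class ``ArrayDowncast``: the same cast from ``Object[]`` to ``String[]`` succeeds or fails depending on the runtime type of the array, not on the type of its elements.

```java
public class ArrayDowncast {
    // Returns true if the downcast to String[] succeeds at run time.
    static boolean castSucceeds(Object[] array) {
        try {
            String[] strings = (String[]) array;
            return strings.length >= 0; // always true once the cast has succeeded
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The array really is a String[], so the cast is fine.
        Object[] reallyStrings = new String[] { "Hello", "world" };
        System.out.println(castSucceeds(reallyStrings));

        // The array is an Object[] that merely holds strings, so the cast fails.
        Object[] onlyObjects = new Object[] { "Hello", "world" };
        System.out.println(castSucceeds(onlyObjects));
    }
}
```

Both calls compile without complaint, which is why a query that looks only at static types flags both.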
@@ -144,7 +146,7 @@ Using these new classes we can extend our query to exclude calls to ``toArray``
Example: Finding mismatched contains checks
-------------------------------------------
As another example, we develop a query that finds uses of ``Collection.contains`` where the type of the queried element is unrelated to the element type of the collection, thus guaranteeing that the test will always return ``false``.
We'll now develop a query that finds uses of ``Collection.contains`` where the type of the queried element is unrelated to the element type of the collection, which guarantees that the test will always return ``false``.
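
As a minimal illustration of the problem, consider the following Java sketch. The class ``ContainsMismatch`` and its helper are hypothetical. Because ``Collection.contains`` takes an ``Object``, the compiler accepts an argument of any type, even one that can never be equal to an element of the collection.

```java
import java.util.Arrays;
import java.util.List;

public class ContainsMismatch {
    // Compiles without error, although a String can never equal an Integer.
    static boolean containsText(List<Integer> ids, String text) {
        return ids.contains(text);
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3);
        // The list holds the Integer 1, but the String "1" is never found.
        System.out.println(containsText(ids, "1"));
        // Querying with an Integer works as expected.
        System.out.println(ids.contains(1));
    }
}
```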
For example, `Apache Zookeeper <http://zookeeper.apache.org/>`__ used to have a snippet of code similar to the following in class ``QuorumPeerConfig``:
@@ -272,7 +274,7 @@ Improvements
For many programs, this query yields a large number of false positive results due to type variables and wildcards: if the collection element type is some type variable ``E`` and the argument type is ``String``, for example, CodeQL will consider that the two have no common subtype, and our query will flag the call. An easy way to exclude such false positive results is to simply require that neither ``collEltType`` nor ``argType`` is an instance of ``TypeVariable``.
Another source of false positives is autoboxing of primitive types: if, for example, the collection's element type is ``Integer`` and the argument is of type ``int``, predicate ``haveCommonDescendant`` will fail, since ``int`` is not a ``RefType``. Thus, our query should check that ``collEltType`` is not the boxed type of ``argType``.
Another source of false positives is autoboxing of primitive types: if, for example, the collection's element type is ``Integer`` and the argument is of type ``int``, predicate ``haveCommonDescendant`` will fail, since ``int`` is not a ``RefType``. To account for this, our query should check that ``collEltType`` is not the boxed type of ``argType``.
Finally, ``null`` is special because its type (known as ``<nulltype>`` in the CodeQL library) is compatible with every reference type, so we should exclude it from consideration.
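
The three sources of false positives described above correspond to perfectly legitimate Java code. The sketch below uses a hypothetical class ``ContainsFalsePositives`` to show one harmless call of each kind: a type variable, an autoboxed primitive, and ``null``.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class ContainsFalsePositives {
    // Type variable: E is only fixed at each call site, so comparing it
    // with a concrete type such as String says nothing about the call.
    static <E> boolean has(Collection<E> coll, E elt) {
        return coll.contains(elt);
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(42);

        // Type variable: E is inferred as Integer here.
        System.out.println(has(numbers, 42));

        // Autoboxing: the int argument is boxed to Integer before the call.
        int primitive = 42;
        System.out.println(numbers.contains(primitive));

        // null is accepted where any reference type is expected.
        System.out.println(numbers.contains(null));
    }
}
```

None of these calls is a bug, so a useful query should not report them.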
@@ -294,9 +296,9 @@ Adding these three improvements, our final query becomes:
➤ `See the full query in the query console <https://lgtm.com/query/1505753056300/>`__.
What next?
----------
Further reading
---------------
- Take a look at some of the other tutorials: :doc:`Tutorial: Expressions and statements <expressions-statements>`, :doc:`Tutorial: Navigating the call graph <call-graph>`, :doc:`Tutorial: Annotations <annotations>`, :doc:`Tutorial: Javadoc <javadoc>`, and :doc:`Tutorial: Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`AST class reference <ast-class-reference>`.
- Take a look at some of the other articles in this section: :doc:`Overflow-prone comparisons in Java <expressions-statements>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>`, and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language handbook <https://help.semmle.com/QL/ql-handbook/index.html>`__ and `QL language specification <https://help.semmle.com/QL/ql-spec/language.html>`__.