Apply suggestions from code review

Many thanks for the review suggestions.

Co-Authored-By: Shati Patel <42641846+shati-patel@users.noreply.github.com>
This commit is contained in:
Felicity Chapman
2020-02-19 16:31:15 +00:00
committed by GitHub
parent 2a5ac2e829
commit f8c876176a
6 changed files with 17 additions and 17 deletions

View File

@@ -3,7 +3,7 @@ Analyzing control flow in Python
You can write CodeQL queries to explore the control flow graph of a Python program, for example, to discover unreachable code or mutually exclusive blocks of code.
To analyze the control-flow graph of a ``Scope`` we can use the two CodeQL classes ``ControlFlowNode`` and ``BasicBlock``. These classes allow you to ask such questions as "can you reach point A from point B?" or "Is it possible to reach point B *without* going through point A?". To report results we use the class ``AstNode``, which represents a syntactic element and corresponds to the source code - allowing the results of the query to be more easily understood. For more information, see `Control-flow graph <http://en.wikipedia.org/wiki/Control_flow_graph>`__ in Wikipedia.
To analyze the control-flow graph of a ``Scope`` we can use the two CodeQL classes ``ControlFlowNode`` and ``BasicBlock``. These classes allow you to ask such questions as "can you reach point A from point B?" or "Is it possible to reach point B *without* going through point A?". To report results we use the class ``AstNode``, which represents a syntactic element and corresponds to the source code - allowing the results of the query to be more easily understood. For more information, see `Control-flow graph <http://en.wikipedia.org/wiki/Control_flow_graph>`__ on Wikipedia.
The ``ControlFlowNode`` class
-----------------------------
@@ -42,7 +42,7 @@ Example finding unreachable AST nodes
where not exists(node.getAFlowNode())
select node
`See this in the query console <https://lgtm.com/query/669220024/>`__. The demo projects on LGTM.com all have some code that has no control flow node, and is therefore unreachable. However, since the ``Module`` class is also a subclass of the ``AstNode`` class, the query also finds any modules implemented in C or with no source code. Therefore, it is better to find all unreachable statements:
`See this in the query console <https://lgtm.com/query/669220024/>`__. The demo projects on LGTM.com all have some code that has no control flow node, and is therefore unreachable. However, since the ``Module`` class is also a subclass of the ``AstNode`` class, the query also finds any modules implemented in C or with no source code. Therefore, it is better to find all unreachable statements.
Example finding unreachable statements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -55,12 +55,12 @@ Example finding unreachable statements
where not exists(s.getAFlowNode())
select s
`See this in the query console <https://lgtm.com/query/670720181/>`__. This query gives fewer results, but most of the projects have some unreachable nodes. These are also highlighted by the standard query: unreachable code. For more information, see `Unreachable code <https://lgtm.com/rules/3980095>`__ on LGTM.com.
`See this in the query console <https://lgtm.com/query/670720181/>`__. This query gives fewer results, but most of the projects have some unreachable nodes. These are also highlighted by the standard "Unreachable code" query. For more information, see `Unreachable code <https://lgtm.com/rules/3980095>`__ on LGTM.com.
The ``BasicBlock`` class
------------------------
The ``BasicBlock`` class represents a basic block of control flow nodes. The ``BasicBlock`` class is not that useful for writing queries directly, but is very useful for building complex analyses, such as data flow. The reason it is useful is that it shares many of the interesting properties of control flow nodes, such as, what can reach what, and what dominates what, but there are fewer basic blocks than control flow nodes - resulting in queries that are faster and use less memory. For more information, see `basic block <http://en.wikipedia.org/wiki/Basic_block>`__ and `dominates <http://en.wikipedia.org/wiki/Dominator_%28graph_theory%29>`__ on Wikipedia.
The ``BasicBlock`` class represents a basic block of control flow nodes. The ``BasicBlock`` class is not that useful for writing queries directly, but is very useful for building complex analyses, such as data flow. The reason it is useful is that it shares many of the interesting properties of control flow nodes, such as, what can reach what, and what dominates what, but there are fewer basic blocks than control flow nodes - resulting in queries that are faster and use less memory. For more information, see `Basic block <http://en.wikipedia.org/wiki/Basic_block>`__ and `Dominator <http://en.wikipedia.org/wiki/Dominator_%28graph_theory%29>`__ on Wikipedia.
Example finding mutually exclusive basic blocks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

View File

@@ -3,7 +3,7 @@ Functions in Python
Functions are key building blocks of Python code bases. You can find functions and identify calls to them using syntactic classes from the standard CodeQL library.
These examples use the standard CodeQL class `Function <https://help.semmle.com/qldoc/python/semmle/python/Function.qll/type.Function$Function.html>`__. For more information, see :doc:`Introducing the Python libraries <introduce-libraries-python>`.
These examples use the standard CodeQL class `Function <https://help.semmle.com/qldoc/python/semmle/python/Function.qll/type.Function$Function.html>`__. For more information, see ":doc:`Introducing the Python libraries <introduce-libraries-python>`."
Finding all functions called "get..."
-------------------------------------
@@ -57,7 +57,7 @@ We can modify the query further to include only methods whose body consists of a
and count(f.getAStmt()) = 1
select f, "This function is (probably) a getter."
`See this in the query console <https://lgtm.com/query/667290044/>`__. This query returns fewer results, but if you examine the results you can see that there are still refinements to be made. This is refined further in ":doc:`Tutorial: Statements and expressions <statements-expressions>`."
`See this in the query console <https://lgtm.com/query/667290044/>`__. This query returns fewer results, but if you examine the results you can see that there are still refinements to be made. This is refined further in ":doc:`Expressions and statements in Python <statements-expressions>`."
Finding a call to a specific function
-------------------------------------

View File

@@ -22,12 +22,12 @@ The CodeQL library for Python incorporates a large number of classes. Each class
Syntactic classes
-----------------
This part of the library represents the Python source code. The ``Module``, ``Class``, and ``Function`` classes correspond to Python modules, classes, and functions respectively, collectively these are known as ``Scope`` classes. Each ``Scope`` contains a list of statements each of which is represented by a subclass of the class ``Stmt``. Statements themselves can contain other statements or expressions which are represented by subclasses of ``Expr``. Finally, there are a few additional classes for the parts of more complex expressions such as list comprehensions. Collectively these classes are subclasses of ``AstNode`` and form an Abstract syntax tree (AST). The root of each AST is a ``Module``. Symbolic information is attached to the AST in the form of variables (represented by the class ``Variable``). For more information, see `Abstract syntax tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`__ and `Symbolic information <http://en.wikipedia.org/wiki/Symbol_table>`__ in Wikipedia.
This part of the library represents the Python source code. The ``Module``, ``Class``, and ``Function`` classes correspond to Python modules, classes, and functions respectively, collectively these are known as ``Scope`` classes. Each ``Scope`` contains a list of statements each of which is represented by a subclass of the class ``Stmt``. Statements themselves can contain other statements or expressions which are represented by subclasses of ``Expr``. Finally, there are a few additional classes for the parts of more complex expressions such as list comprehensions. Collectively these classes are subclasses of ``AstNode`` and form an Abstract syntax tree (AST). The root of each AST is a ``Module``. Symbolic information is attached to the AST in the form of variables (represented by the class ``Variable``). For more information, see `Abstract syntax tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`__ and `Symbolic information <http://en.wikipedia.org/wiki/Symbol_table>`__ on Wikipedia.
Scope
^^^^^
A Python program is a group of modules. Technically a module is just a list of statements, but we often think of it as composed of classes and functions. These top-level entities, the module, class, and function are represented by the three CodeQL classes (`Module <https://help.semmle.com/qldoc/python/semmle/python/Module.qll/type.Module$Module.html>`__, `Class <https://help.semmle.com/qldoc/python/semmle/python/Class.qll/type.Class$Class.html>`__ and `Function <https://help.semmle.com/qldoc/python/semmle/python/Function.qll/type.Function$Function.html>`__ which are all subclasses of ``Scope``).
A Python program is a group of modules. Technically a module is just a list of statements, but we often think of it as composed of classes and functions. These top-level entities, the module, class, and function are represented by the three CodeQL classes `Module <https://help.semmle.com/qldoc/python/semmle/python/Module.qll/type.Module$Module.html>`__, `Class <https://help.semmle.com/qldoc/python/semmle/python/Class.qll/type.Class$Class.html>`__ and `Function <https://help.semmle.com/qldoc/python/semmle/python/Function.qll/type.Function$Function.html>`__ which are all subclasses of ``Scope``.
- ``Scope``
@@ -151,7 +151,7 @@ Both forms are equivalent. Using the positive expression, the whole query looks
`See this in the query console <https://lgtm.com/query/690010036/>`__. Many projects include pass-only ``except`` blocks.
Summary of syntactic classes
Summary
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The most commonly used standard classes in the syntactic part of the library are organized as follows:
@@ -237,7 +237,7 @@ Other
Control flow classes
--------------------
This part of the library represents the control flow graph of each ``Scope`` (classes, functions, and modules). Each ``Scope`` contains a graph of ``ControlFlowNode`` elements. Each scope has a single entry point and at least one (potentially many) exit points. To speed up control and data flow analysis, control flow nodes are grouped into basic blocks. For more information, see `basic blocks <http://en.wikipedia.org/wiki/Basic_block>`__ in Wikipedia.
This part of the library represents the control flow graph of each ``Scope`` (classes, functions, and modules). Each ``Scope`` contains a graph of ``ControlFlowNode`` elements. Each scope has a single entry point and at least one (potentially many) exit points. To speed up control and data flow analysis, control flow nodes are grouped into basic blocks. For more information, see `Basic block <http://en.wikipedia.org/wiki/Basic_block>`__ on Wikipedia.
Example
^^^^^^^
@@ -332,7 +332,7 @@ Summary
- `TaintKind <https://help.semmle.com/qldoc/python/semmle/python/dataflow/TaintTracking.qll/type.TaintTracking$TaintKind.html>`__
- `Configuration <https://help.semmle.com/qldoc/python/semmle/python/dataflow/Configuration.qll/type.Configuration$TaintTracking$Configuration.html>`__
For more information about these classes, see ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`".
For more information about these classes, see ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`."
Further reading

View File

@@ -1,7 +1,7 @@
Pointer analysis and type inference in Python
=============================================
At run time, each Python expression has a value with an associated type. You can learn how an expression behaves at run time using type-inference classes from the standard CodeQL library.
At runtime, each Python expression has a value with an associated type. You can learn how an expression behaves at runtime using type-inference classes from the standard CodeQL library.
This topic contains worked examples of how to write queries using the standard CodeQL library classes for Python type inference.
@@ -25,7 +25,7 @@ Class hierarchy for ``Value``:
Points-to analysis and type inference
-------------------------------------
Points-to analysis, sometimes known as pointer analysis, allows us to determine which objects an expression may "point to" at runtime. Type inference allows us to infer what the types (classes) of an expression may be at runtime. For more information, see `pointer analysis <http://en.wikipedia.org/wiki/Pointer_analysis>`__ and `Type inference <http://en.wikipedia.org/wiki/Type_inference>`__ on Wikipedia.
Points-to analysis, sometimes known as pointer analysis, allows us to determine which objects an expression may "point to" at runtime. Type inference allows us to infer what the types (classes) of an expression may be at runtime. For more information, see `Pointer analysis <http://en.wikipedia.org/wiki/Pointer_analysis>`__ and `Type inference <http://en.wikipedia.org/wiki/Type_inference>`__ on Wikipedia.
The predicate ``ControlFlowNode.pointsTo(...)`` shows which object a control flow node may "point to" at runtime.
@@ -124,7 +124,7 @@ Combining the parts of the query we get this:
)
select t, ex1, ex2
`See this in the query console <https://lgtm.com/query/669950027/>`__. This query finds only one result in the demo projects on LGTM.com (`youtube-dl <https://lgtm.com/projects/g/ytdl-org/youtube-dl/rev/39e9d524e5fe289936160d4c599a77f10f6e9061/files/devscripts/buildserver.py?sort=name&dir=ASC&mode=heatmap#L413>`__). The result is also highlighted by the standard query: Unreachable 'except' block. For more information, see `Unreachable 'except' block <https://lgtm.com/rules/7900089>`__ on LGTM.com.
`See this in the query console <https://lgtm.com/query/669950027/>`__. This query finds only one result in the demo projects on LGTM.com (`youtube-dl <https://lgtm.com/projects/g/ytdl-org/youtube-dl/rev/39e9d524e5fe289936160d4c599a77f10f6e9061/files/devscripts/buildserver.py?sort=name&dir=ASC&mode=heatmap#L413>`__). The result is also highlighted by the standard "Unreachable 'except' block" query. For more information, see `Unreachable 'except' block <https://lgtm.com/rules/7900089>`__ on LGTM.com.
.. pull-quote::
@@ -184,7 +184,7 @@ The ``Value`` class has a method ``getACall()`` which allows us to find calls to
If we wish to restrict the callables to actual functions we can use the ``FunctionValue`` class, which is a subclass of ``Value`` and corresponds to function objects in Python, in much the same way as the ``ClassValue`` class corresponds to class objects in Python.
Returning to an example from ":doc:`Tutorial: Functions <functions>`," we wish to find calls to the ``eval`` function.
Returning to an example from ":doc:`Functions in Python <functions>`," we wish to find calls to the ``eval`` function.
The original query looked this:

View File

@@ -197,7 +197,7 @@ The short version is usually used as this is easier to read.
Example finding Java-style getters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Returning to the example from ":doc:`Tutorial: Functions <functions>`," the query identified all methods with a single line of code and a name starting with ``get``.
Returning to the example from ":doc:`Functions in Python <functions>`," the query identified all methods with a single line of code and a name starting with ``get``.
.. code-block:: ql

View File

@@ -1,7 +1,7 @@
Analyzing data flow and tracking tainted data in Python
=======================================================
You can use CodeQL to track the flow of data through a Python program to its use. Tracking user-controlled, or tainted, data is a key technique for security researchers.
You can use CodeQL to track the flow of data through a Python program. Tracking user-controlled, or tainted, data is a key technique for security researchers.
About data flow and taint tracking
----------------------------------