From 55bda34a45f50e1b35dc060511f6cdb9b7c0653c Mon Sep 17 00:00:00 2001 From: Arthur Baars Date: Tue, 18 Oct 2022 15:07:35 +0200 Subject: [PATCH 01/27] Ruby: drop beta notice --- docs/codeql/codeql-language-guides/codeql-for-ruby.rst | 1 - docs/codeql/query-help/codeql-cwe-coverage.rst | 1 - docs/codeql/query-help/index.rst | 2 -- docs/codeql/reusables/ruby-beta-note.rst | 4 ---- 4 files changed, 8 deletions(-) delete mode 100644 docs/codeql/reusables/ruby-beta-note.rst diff --git a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst index bfb29a012ef..b19f8abe230 100644 --- a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst +++ b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst @@ -15,4 +15,3 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat - :doc:`CodeQL library for Ruby `: When you're analyzing a Ruby program, you can make use of the large collection of classes in the CodeQL library for Ruby. -.. include:: ../reusables/ruby-beta-note.rst diff --git a/docs/codeql/query-help/codeql-cwe-coverage.rst b/docs/codeql/query-help/codeql-cwe-coverage.rst index 30e7b569184..c0b36646df8 100644 --- a/docs/codeql/query-help/codeql-cwe-coverage.rst +++ b/docs/codeql/query-help/codeql-cwe-coverage.rst @@ -35,4 +35,3 @@ Note that the CWE coverage includes both "`supported queries `." -.. include:: ../reusables/ruby-beta-note.rst - .. toctree:: :hidden: :titlesonly: diff --git a/docs/codeql/reusables/ruby-beta-note.rst b/docs/codeql/reusables/ruby-beta-note.rst deleted file mode 100644 index 761381777c0..00000000000 --- a/docs/codeql/reusables/ruby-beta-note.rst +++ /dev/null @@ -1,4 +0,0 @@ - .. pull-quote:: Note - - CodeQL analysis for Ruby is currently in beta. During the beta, analysis of Ruby code, - and the accompanying documentation, will not be as comprehensive as for other languages. 
From 6f646be733495ad9c78e09e7ae1c55d1cae8580b Mon Sep 17 00:00:00 2001 From: Arthur Baars Date: Tue, 18 Oct 2022 14:06:59 +0200 Subject: [PATCH 02/27] Ruby: document API graphs --- .../codeql-for-ruby.rst | 3 + .../using-api-graphs-in-ruby.rst | 183 ++++++++++++++++++ 2 files changed, 186 insertions(+) create mode 100644 docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst diff --git a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst index bfb29a012ef..7066c108200 100644 --- a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst +++ b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst @@ -10,9 +10,12 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat basic-query-for-ruby-code codeql-library-for-ruby + using-api-graphs-in-ruby - :doc:`Basic query for Ruby code `: Learn to write and run a simple CodeQL query using LGTM. - :doc:`CodeQL library for Ruby `: When you're analyzing a Ruby program, you can make use of the large collection of classes in the CodeQL library for Ruby. +- :doc:`Using API graphs in Ruby `: API graphs are a uniform interface for referring to functions, classes, and methods defined in external libraries. + .. include:: ../reusables/ruby-beta-note.rst diff --git a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst new file mode 100644 index 00000000000..af3b1ecfdec --- /dev/null +++ b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst @@ -0,0 +1,183 @@ +.. _using-api-graphs-in-ruby: + +Using API graphs in Ruby +========================== + +API graphs are a uniform interface for referring to functions, classes, and methods defined in +external libraries. + +About this article +------------------ + +This article describes how to use API graphs to reference classes and functions defined in library +code. 
You can use API graphs to conveniently refer to external library functions when defining things like +remote flow sources. + + +Module and class references +--------------------------- + +The most common entry point into the API graph will be the point where a top-level module or class is +accessed. For example, you can access the API graph node corresponding to the ``::Regexp`` class +by using the ``API::getTopLevelMember`` method defined in the ``codeql.ruby.ApiGraphs`` module, as the +following snippet demonstrates. + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("Regexp") + +This query selects the API graph nodes corresponding to references to the ``Regexp`` class. For nested +modules and classes, you can use the ``getMember`` method. For example, the following query selects +references to the ``Net::HTTP`` class. + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("Net").getMember("HTTP") + +Note that the given module name *must not* contain any ``::`` symbols. Thus, something like +``API::getTopLevelMember("Net::HTTP")`` will not do what you expect. Instead, this should be decomposed +into an access of the ``HTTP`` member of the API graph node for ``Net``, as in the example above. + +Calls and class instantiations +------------------------------ + +To track the calls of externally defined functions, you can use the ``getMethod`` method. The +following snippet finds all calls of ``Regexp.compile``: + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("Regexp").getMethod("compile") + +The example above is for a call to a class method. Tracking calls to instance methods is a two-step +process: first you need to find instances of the class, then you can find the calls +to methods on those instances. The following snippet finds instantiations of the ``Regexp`` class: + +.. 
code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("Regexp").getInstance() + +Note that the ``getInstance`` method also includes subclasses. For example if there is a +``class SpecialRegexp < Regexp`` then ``getInstance`` also finds ``SpecialRegexp.new``. + +The following snippet builds on the above to find calls of the ``Regexp#match?`` instance method: + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("Regexp").getInstance().getMethod("match?") + +Subclasses +---------- + +For many libraries, the main mode of usage is to extend one or more library classes. To track this +in the API graph, you can use the ``getASubclass`` method to get the API graph node corresponding to +all the immediate subclasses of this node. To find *all* subclasses, use ``*`` or ``+`` to apply the +method repeatedly, as in ``getASubclass*``. + +Note that ``getASubclass`` does not account for any subclassing that takes place in library code +that has not been extracted. Thus, it may be necessary to account for this in the models you write. +For example, the ``ActionController::Base`` class has a predefined subclass ``Rails::ApplicationController``. To find +all subclasses of ``ActionController::Base``, you must explicitly include the subclasses of ``Rails::ApplicationController`` as well. + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + + API::Node actionController() { + result = + [ + API::getTopLevelMember("ActionController").getMember("Base"), + API::getTopLevelMember("Rails").getMember("ApplicationController") + ].getASubclass*() + } + + select actionController() + + +Using the API graph in dataflow queries +--------------------------------------- + +Dataflow queries often search for points where data from external sources enters the code base +as well as places where data leaves the code base. 
API graphs provide a convenient way to refer +to external API components such as library functions and their inputs and outputs. API graph nodes +cannot be used directly in dataflow queries, because they model entities that are defined externally, +while dataflow nodes correspond to entities defined in the current code base. To bridge this gap, +the API node classes provide the ``asSource()`` and ``asSink()`` methods. + +The ``asSource()`` method is used to select dataflow nodes where a value from an external source +enters the current code base. A typical example is the return value of a library function such as +``File.read(path)``: + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("File").getMethod("read").getReturn().asSource() + + +The ``asSink()`` method is used to select dataflow nodes where a value leaves the +current code base and flows into an external library. An example is the second parameter +of the ``File.write(path, value)`` method. + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("File").getMethod("write").getParameter(1).asSink() + +A more complex example is a call to ``File.open`` with a block argument. This function creates a ``File`` instance +and passes it to the supplied block. In this case, the first parameter of the block is the place where an +externally created value enters the code base, i.e. the ``|file|`` in the example below: + +.. code-block:: ruby + + File.open("/my/file.txt", "w") { |file| file << "Hello world" } + +The following snippet finds parameters of blocks of ``File.open`` method calls: + +.. code-block:: ql + + import codeql.ruby.ApiGraphs + + select API::getTopLevelMember("File").getMethod("open").getBlock().getParameter(0).asSource() + +The following example is a dataflow query that uses API graphs to find cases where data that +is read flows into a call to ``File.write``. + +.. 
code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + class Configuration extends DataFlow::Configuration { + Configuration() { this = "File read/write Configuration" } + + override predicate isSource(DataFlow::Node source) { + source = API::getTopLevelMember("File").getMethod("read").getReturn().asSource() + } + + override predicate isSink(DataFlow::Node sink) { + sink = API::getTopLevelMember("File").getMethod("write").getParameter(1).asSink() + } + } + + from DataFlow::Node src, DataFlow::Node sink, Configuration config + where config.hasFlow(src, sink) + select src, "The data read here flows into a $@ call.", sink, "File.write" + +Further reading +--------------- + + +.. include:: ../reusables/ruby-further-reading.rst +.. include:: ../reusables/codeql-ref-tools-further-reading.rst From b1da636be0cab23ce4fca168b77d730c022af52b Mon Sep 17 00:00:00 2001 From: Nick Rolfe Date: Fri, 21 Oct 2022 15:11:43 +0100 Subject: [PATCH 03/27] Ruby: first draft of data flow docs --- .../analyzing-data-flow-in-ruby.rst | 390 ++++++++++++++++++ .../codeql-for-ruby.rst | 2 + 2 files changed, 392 insertions(+) create mode 100644 docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst diff --git a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst new file mode 100644 index 00000000000..feaa6415486 --- /dev/null +++ b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst @@ -0,0 +1,390 @@ +.. _analyzing-data-flow-in-ruby: + +Analyzing data flow in Ruby +============================= + +You can use CodeQL to track the flow of data through a Ruby program to places where the data is used. + +About this article +------------------ + +This article describes how data flow analysis is implemented in the CodeQL libraries for Ruby and includes examples to help you write your own data flow queries. 
+The following sections describe how to use the libraries for local data flow, global data flow, and taint tracking. +For a more general introduction to modeling data flow, see ":ref:`About data flow analysis `." + +Local data flow +--------------- + +Local data flow is data flow within a single method or callable. Local data flow is easier, faster, and more precise than global data flow, and is sufficient for many queries. + +Using local data flow +~~~~~~~~~~~~~~~~~~~~~ + +The local data flow library is in the module ``DataFlow`` and it defines the class ``Node``, representing any element through which data can flow. +``Node``\ s are divided into expression nodes (``ExprNode``) and parameter nodes (``ParameterNode``). +You can map between a data flow ``ParameterNode`` and its corresponding ``Parameter`` AST node using the ``asParameter`` member predicate. +Meanwhile, the ``asExpr`` member predicate maps between a data flow ``ExprNode`` and its corresponding ``ExprCfgNode`` in the control-flow library. + +.. code-block:: ql + + class Node { + /** Gets the expression corresponding to this node, if any. */ + CfgNodes::ExprCfgNode asExpr() { ... } + + /** Gets the parameter corresponding to this node, if any. */ + Parameter asParameter() { ... } + + ... + } + +You can also use the predicates ``exprNode`` and ``parameterNode``: + +.. code-block:: ql + + /** + * Gets a node corresponding to expression `e`. + */ + ExprNode exprNode(CfgNodes::ExprCfgNode e) { ... } + + /** + * Gets the node corresponding to the value of parameter `p` at function entry. + */ + ParameterNode parameterNode(Parameter p) { ... } + +Note that since ``asExpr`` and ``exprNode`` map between data-flow and control-flow nodes, you then need to call the ``getExpr`` member predicate on the control-flow node to map to the corresponding AST node, +e.g. by writing ``node.asExpr().getExpr()``. 
+Due to the control-flow graph being split, there can be multiple data-flow and control-flow nodes associated with a single expression AST node. + +The predicate ``localFlowStep(Node nodeFrom, Node nodeTo)`` holds if there is an immediate data flow edge from the node ``nodeFrom`` to the node ``nodeTo``. +You can apply the predicate recursively, by using the ``+`` and ``*`` operators, or you can use the predefined recursive predicate ``localFlow``. + +For example, you can find flow from an expression ``source`` to an expression ``sink`` in zero or more local steps: + +.. code-block:: ql + + DataFlow::localFlow(source, sink) + +Using local taint tracking +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Local taint tracking extends local data flow by including non-value-preserving flow steps. +For example: + +.. code-block:: ruby + + temp = x + y = temp + ", " + temp + +If ``x`` is a tainted string then ``y`` is also tainted. + +The local taint tracking library is in the module ``TaintTracking``. +Like local data flow, a predicate ``localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo)`` holds if there is an immediate taint propagation edge from the node ``nodeFrom`` to the node ``nodeTo``. +You can apply the predicate recursively, by using the ``+`` and ``*`` operators, or you can use the predefined recursive predicate ``localTaint``. + +For example, you can find taint propagation from an expression ``source`` to an expression ``sink`` in zero or more local steps: + +.. code-block:: ql + + TaintTracking::localTaint(source, sink) + + +Using local sources +~~~~~~~~~~~~~~~~~~~ + +When asking for local data flow or taint propagation between two expressions as above, you would normally constrain the expressions to be relevant to a certain investigation. +The next section will give some concrete examples, but there is a more abstract concept that we should call out explicitly, namely that of a local source. + +A local source is a data-flow node with no local data flow into it. 
+As such, it is a local origin of data flow, a place where a new value is created. +This includes parameters (which only receive global data flow) and most expressions (because they are not value-preserving). +Restricting attention to such local sources gives a much lighter and more performant data-flow graph and in most cases also a more suitable abstraction for the investigation of interest. +The class ``LocalSourceNode`` represents data-flow nodes that are also local sources. +It comes with a useful member predicate ``flowsTo(DataFlow::Node node)``, which holds if there is local data flow from the local source to ``node``. + +Examples +~~~~~~~~ + +This query finds the filename argument passed in each call to ``File.open``: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call + where call = API::getTopLevelMember("File").getAMethodCall("open") + select call.getArgument(0) + +Notice the use of the ``API`` module for referring to library methods. +For more information, see ":doc:`Using API graphs in Ruby `." + +Unfortunately this will only give the expression in the argument, not the values which could be passed to it. +So we use local data flow to find all expressions that flow into the argument: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call, DataFlow::ExprNode expr + where + call = API::getTopLevelMember("File").getAMethodCall("open") and + DataFlow::localFlow(expr, call.getArgument(0)) + select call, expr + +Many expressions flow to the same call. +If you run this query, you may notice that you get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). +We are mostly interested in the "first" of these, what might be called the local source for the file name. 
+To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``. +We could demand that ``expr`` is such a node: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call, DataFlow::ExprNode expr + where + call = API::getTopLevelMember("File").getAMethodCall("open") and + DataFlow::localFlow(expr, call.getArgument(0)) and + expr instanceof DataFlow::LocalSourceNode + select call, expr + +However, we could also enforce this by casting. +That would allow us to use the member predicate ``flowsTo`` on ``LocalSourceNode`` like so: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call, DataFlow::ExprNode expr + where + call = API::getTopLevelMember("File").getAMethodCall("open") and + expr.(DataFlow::LocalSourceNode).flowsTo(call.getArgument(0)) + select call, expr + +As an alternative, we can ask more directly that ``expr`` is a local source of the first argument, via the predicate ``getALocalSource``: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call, DataFlow::ExprNode expr + where + call = API::getTopLevelMember("File").getAMethodCall("open") and + expr = call.getArgument(0).getALocalSource() + select call, expr + +All these three queries give identical results. +We now mostly have one expression per call. + +We may still have cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting). + +We might want to make the source more specific, for example a parameter to a method or block. +This query finds instances where a parameter is used as the name when opening a file: + +.. 
code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call, DataFlow::ParameterNode p + where + call = API::getTopLevelMember("File").getAMethodCall("open") and + DataFlow::localFlow(p, call.getArgument(0)) + select call, p + +Using the exact name supplied via the parameter may be too strict. +If we want to know if the parameter influences the file name, we can use taint tracking instead of data flow. +This query finds calls to ``File.open`` where the filename is derived from a parameter: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.TaintTracking + import codeql.ruby.ApiGraphs + + from DataFlow::CallNode call, DataFlow::ParameterNode p + where + call = API::getTopLevelMember("File").getAMethodCall("open") and + TaintTracking::localTaint(p, call.getArgument(0)) + select call, p + +Global data flow +---------------- + +Global data flow tracks data flow throughout the entire program, and is therefore more powerful than local data flow. +However, global data flow is less precise than local data flow, and the analysis typically requires significantly more time and memory to perform. + +.. pull-quote:: Note + + .. include:: ../reusables/path-problem.rst + +Using global data flow +~~~~~~~~~~~~~~~~~~~~~~ + +The global data flow library is used by extending the class ``DataFlow::Configuration``: + +.. code-block:: ql + + import codeql.ruby.DataFlow + + class MyDataFlowConfiguration extends DataFlow::Configuration { + MyDataFlowConfiguration() { this = "..." } + + override predicate isSource(DataFlow::Node source) { + ... + } + + override predicate isSink(DataFlow::Node sink) { + ... + } + } + +These predicates are defined in the configuration: + +- ``isSource`` - defines where data may flow from. +- ``isSink`` - defines where data may flow to. +- ``isBarrier`` - optionally, restricts the data flow. +- ``isAdditionalFlowStep`` - optionally, adds additional flow steps. 
+ +The characteristic predicate (``MyDataFlowConfiguration()``) defines the name of the configuration, so ``"..."`` must be replaced with a unique name (for instance the class name). + +The data flow analysis is performed using the predicate ``hasFlow(DataFlow::Node source, DataFlow::Node sink)``: + +.. code-block:: ql + + from MyDataFlowConfiguration dataflow, DataFlow::Node source, DataFlow::Node sink + where dataflow.hasFlow(source, sink) + select source, "Dataflow to $@.", sink, sink.toString() + +Using global taint tracking +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Global taint tracking is to global data flow what local taint tracking is to local data flow. +That is, global taint tracking extends global data flow with additional non-value-preserving steps. +The global taint tracking library is used by extending the class ``TaintTracking::Configuration``: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.TaintTracking + + class MyTaintTrackingConfiguration extends TaintTracking::Configuration { + MyTaintTrackingConfiguration() { this = "..." } + + override predicate isSource(DataFlow::Node source) { + ... + } + + override predicate isSink(DataFlow::Node sink) { + ... + } + } + +These predicates are defined in the configuration: + +- ``isSource`` - defines where taint may flow from. +- ``isSink`` - defines where taint may flow to. +- ``isSanitizer`` - optionally, restricts the taint flow. +- ``isAdditionalTaintStep`` - optionally, adds additional taint steps. + +Similar to global data flow, the characteristic predicate (``MyTaintTrackingConfiguration()``) defines the unique name of the configuration and the taint analysis is performed using the predicate ``hasFlow(DataFlow::Node source, DataFlow::Node sink)``. + +Predefined sources and sinks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The data flow library contains a number of predefined sources and sinks, providing a good starting point for defining data flow based security queries. 
+ +- The class ``RemoteFlowSource`` (defined in module ``codeql.ruby.dataflow.RemoteFlowSources``) represents data flow from remote network inputs. This is useful for finding security problems in networked services. +- The library ``Concepts`` (defined in module ``codeql.ruby.Concepts``) contains several subclasses of ``DataFlow::Node`` that are security relevant, such as ``FileSystemAccess`` and ``SqlExecution``. + +For global flow, it is also useful to restrict sources to instances of ``LocalSourceNode``. +The predefined sources generally do that. + +Class hierarchy +~~~~~~~~~~~~~~~ + +- ``DataFlow::Configuration`` - base class for custom global data flow analysis. +- ``DataFlow::Node`` - an element behaving as a data-flow node. + + - ``DataFlow::CfgNode`` - a control-flow node behaving as a data-flow node. + + - ``DataFlow::ExprNode`` - an expression behaving as a data-flow node. + - ``DataFlow::ParameterNode`` - a parameter data-flow node representing the value of a parameter at method/block entry. + + - ``RemoteFlowSource`` - data flow from network/remote input. + - ``Concepts::SystemCommandExecution`` - a data-flow node that executes an operating system command, for instance by spawning a new process. + - ``Concepts::FileSystemAccess`` - a data-flow node that performs a file system access, including reading and writing data, creating and deleting files and folders, checking and updating permissions, and so on. + - ``Concepts::Path::PathNormalization`` - a data-flow node that performs path normalization. This is often needed in order to safely access paths. + - ``Concepts::CodeExecution`` - a data-flow node that dynamically executes Ruby code. + - ``Concepts::SqlExecution`` - a data-flow node that executes SQL statements. + - ``Concepts::HTTP::Server::RouteSetup`` - a data-flow node that sets up a route on a server. + - ``Concepts::HTTP::Server::HttpResponse`` - a data-flow node that creates an HTTP response on a server. 
+ +- ``TaintTracking::Configuration`` - base class for custom global taint tracking analysis. + +Examples +~~~~~~~~ + +This query shows a data flow configuration that uses all network input as data sources: + +.. code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.TaintTracking + import codeql.ruby.Concepts + import codeql.ruby.dataflow.RemoteFlowSources + + class RemoteToFileConfiguration extends TaintTracking::Configuration { + RemoteToFileConfiguration() { this = "RemoteToFileConfiguration" } + + override predicate isSource(DataFlow::Node source) { source instanceof RemoteFlowSource } + + override predicate isSink(DataFlow::Node sink) { + sink = any(FileSystemAccess fa).getAPathArgument() + } + } + + from DataFlow::Node input, DataFlow::Node fileAccess, RemoteToFileConfiguration config + where config.hasFlow(input, fileAccess) + select fileAccess, "This file access uses data from $@.", input, "user-controllable input." + +This data flow configuration tracks data flow from environment variables to opening files: + +.. 
code-block:: ql + + import codeql.ruby.DataFlow + import codeql.ruby.controlflow.CfgNodes + import codeql.ruby.ApiGraphs + + class EnvironmentToFileConfiguration extends DataFlow::Configuration { + EnvironmentToFileConfiguration() { this = "EnvironmentToFileConfiguration" } + + override predicate isSource(DataFlow::Node source) { + exists(ExprNodes::ConstantReadAccessCfgNode env | + env.getExpr().getName() = "ENV" and + env = source.asExpr().(ExprNodes::ElementReferenceCfgNode).getReceiver() + ) + } + + override predicate isSink(DataFlow::Node sink) { + sink = API::getTopLevelMember("File").getAMethodCall("open").getArgument(0) + } + } + + from EnvironmentToFileConfiguration config, DataFlow::Node environment, DataFlow::Node fileOpen + where config.hasFlow(environment, fileOpen) + select fileOpen, "This call to 'File.open' uses data from $@.", environment, + "an environment variable" + +Further reading +--------------- + +- ":ref:`Exploring data flow with path queries `" + + +.. include:: ../reusables/ruby-further-reading.rst +.. include:: ../reusables/codeql-ref-tools-further-reading.rst diff --git a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst index bfb29a012ef..8e2dfe267e3 100644 --- a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst +++ b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst @@ -15,4 +15,6 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat - :doc:`CodeQL library for Ruby `: When you're analyzing a Ruby program, you can make use of the large collection of classes in the CodeQL library for Ruby. +- :doc:`Analyzing data flow in Ruby `: You can use CodeQL to track the flow of data through a Ruby program to places where the data is used. + .. 
include:: ../reusables/ruby-beta-note.rst From bb9205226a6d012eafafbe2478c00cee165b621d Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Fri, 28 Oct 2022 13:36:45 +0100 Subject: [PATCH 04/27] Ruby: fix whitespace in basic query doc table --- .../codeql/codeql-language-guides/basic-query-for-ruby-code.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/codeql/codeql-language-guides/basic-query-for-ruby-code.rst b/docs/codeql/codeql-language-guides/basic-query-for-ruby-code.rst index 1881dfc71a7..4acc85e6a85 100644 --- a/docs/codeql/codeql-language-guides/basic-query-for-ruby-code.rst +++ b/docs/codeql/codeql-language-guides/basic-query-for-ruby-code.rst @@ -80,7 +80,7 @@ After the initial ``import`` statement, this simple query comprises three parts +---------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+ | Query part | Purpose | Details | +===============================================================+===================================================================================================================+========================================================================================================================+ -| ``import codeql.ruby.AST`` | Imports the standard CodeQL AST libraries for Ruby. | Every query begins with one or more ``import`` statements. | +| ``import codeql.ruby.AST`` | Imports the standard CodeQL AST libraries for Ruby. | Every query begins with one or more ``import`` statements. 
| +---------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+ | ``from IfExpr ifexpr`` | Defines the variables for the query. | We use: an ``IfExpr`` variable for ``if`` expressions. | | | Declarations are of the form: | | From 5369ba1d832e39013ef1d82581c2582239856290 Mon Sep 17 00:00:00 2001 From: Nick Rolfe Date: Mon, 31 Oct 2022 11:24:30 +0000 Subject: [PATCH 05/27] ruby docs: remove distracting sentence --- .../codeql-language-guides/analyzing-data-flow-in-ruby.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst index feaa6415486..49a633ba2a7 100644 --- a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst @@ -97,7 +97,6 @@ The next section will give some concrete examples, but there is a more abstract A local source is a data-flow node with no local data flow into it. As such, it is a local origin of data flow, a place where a new value is created. This includes parameters (which only receive global data flow) and most expressions (because they are not value-preserving). -Restricting attention to such local sources gives a much lighter and more performant data-flow graph and in most cases also a more suitable abstraction for the investigation of interest. The class ``LocalSourceNode`` represents data-flow nodes that are also local sources. It comes with a useful member predicate ``flowsTo(DataFlow::Node node)``, which holds if there is local data flow from the local source to ``node``. 
From 23db9c573f2e7793160814074412660c42beb728 Mon Sep 17 00:00:00 2001 From: Nick Rolfe Date: Mon, 31 Oct 2022 16:25:34 +0000 Subject: [PATCH 06/27] Ruby docs: add LocalSourceNode and remove CfgNode from class list --- .../analyzing-data-flow-in-ruby.rst | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst index 49a633ba2a7..5d6b8c90ac4 100644 --- a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst @@ -308,11 +308,9 @@ Class hierarchy - ``DataFlow::Configuration`` - base class for custom global data flow analysis. - ``DataFlow::Node`` - an element behaving as a data-flow node. - - - ``DataFlow::CfgNode`` - a control-flow node behaving as a data-flow node. - - - ``DataFlow::ExprNode`` - an expression behaving as a data-flow node. - - ``DataFlow::ParameterNode`` - a parameter data-flow node representing the value of a parameter at method/block entry. + - ``DataFlow::LocalSourceNode`` - a local origin of data, as a data-flow node. + - ``DataFlow::ExprNode`` - an expression behaving as a data-flow node. + - ``DataFlow::ParameterNode`` - a parameter data-flow node representing the value of a parameter at method/block entry. - ``RemoteFlowSource`` - data flow from network/remote input. - ``Concepts::SystemCommandExecution`` - a data-flow node that executes an operating system command, for instance by spawning a new process. 
From a7ebbfb139b604d4074f24d271cb3f7203edd67b Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Sun, 30 Oct 2022 20:40:57 +0000 Subject: [PATCH 07/27] Ruby: WIP AST reference guide --- ...classes-for-working-with-ruby-programs.rst | 474 ++++++++++++++++++ 1 file changed, 474 insertions(+) create mode 100644 docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst new file mode 100644 index 00000000000..58b3c8eb5c2 --- /dev/null +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -0,0 +1,474 @@ +.. _abstract-syntax-tree-classes-for-working-with-ruby-programs: + +Abstract syntax tree classes for working with Ruby programs +=========================================================== + +CodeQL has a large selection of classes for representing the abstract syntax tree of Ruby programs. + +.. include:: ../reusables/abstract-syntax-tree.rst + +An ``IDENTIFIER`` should match the regular expression ``/[a-zA-Z_][a-zA-Z0-9_]*/``. A ``CNAME`` should match ``/[A-Z][a-zA-Z0-9_]*/``. + +Statement classes +~~~~~~~~~~~~~~~~~ + +This table lists subclasses of Stmt_ representing Ruby statements. +.. 
+ TODO: FNAME definition + ++------------------------------------------------+--------------+----------------+---------+ +| Statement syntax | CodeQL class | Superclasses | Remarks | ++================================================+==============+================+=========+ +| ``alias`` FNAME FNAME | AliasStmt_ | Stmt_ | | +| ``BEGIN {`` StmtSequence_ ``}`` | BeginBlock_ | StmtSequence_ | | +| ``begin`` StmtSequence_ ``end`` | BeginExpr_ | StmtSequence_ | | +| ``break`` [Expr_] | BreakStmt_ | ReturningStmt_ | | +| ``;`` | EmptyStmt_ | Stmt_ | | +| ``END {`` StmtSequence_ ``}`` | EndBlock_ | StmtSequence_ | | +| ``next`` [Expr_] | NextStmt_ | ReturningStmt_ | | +| ``redo`` | RedoStmt_ | Stmt_ | | +| ``retry`` | RetryStmt_ | Stmt_ | | +| ``return`` [Expr_] | ReturnStmt_ | ReturningStmt_ | | +| ``undef`` FNAME (, FNAME) | UndefStmt_ | Stmt_ | | ++------------------------------------------------+--------------+----------------+---------+ + +Calls +~~~~~ + ++-------------------------+---------------------+----------------+-------------------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++=========================+=====================+================+===============================+ +| Expr_ ``[`` Expr_ ``]`` | ElementReference_ | MethodCall_ | | +| MethodName_ (, Expr_) | MethodCall_ | Call_ | | +| LhsExpr_ ``=`` Expr_ | SetterMethodCall_ | MethodCall_ | | +| ``super`` | SuperCall_ | MethodCall_ | | +| ``yield`` (, Expr_) | YieldCall_ | Call_ | | +| ``&IDENTIFIER`` | BlockArgument_ | Expr_ | Used as an argument to a call | +| ``...`` | ForwardedArguments_ | Expr_ | Used as an argument to a call | ++-------------------------+---------------------+----------------+-------------------------------+ + +Constant accesses +~~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of ConstantAccess_. 
+ ++----------------------------------------+----------------------+----------------------+-------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++========================================+======================+======================+===================+ +| CNAME | ConstantReadAccess_ | ConstantAccess_ | | +| ``class`` CNAME StmtSequence_ ``end`` | ConstantWriteAccess_ | ConstantAccess_ | class definition | +| ``module`` CNAME StmtSequence_ ``end`` | ConstantWriteAccess_ | ConstantAccess_ | module definition | +| CNAME ``=`` Expr_ | ConstantAssignment_ | ConstantWriteAccess_ | | ++----------------------------------------+----------------------+----------------------+-------------------+ + +Control expressions +~~~~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of ControlExpr_. + ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++===============================================================================================================================+=====================+================================+=========+ +| ``if`` Expr_ ``then`` StmtSequence_ {``elsif`` Expr_ ``then`` StmtSequence_} [``else`` StmtSequence_] ``end`` | IfExpr_ | ConditionalExpr_, ControlExpr_ | | +| ``while`` Expr_ ``do`` StmtSequence_ ``end`` | WhileExpr_ | ConditionalLoop_ | | +| ``until`` Expr_ ``do`` StmtSequence_ ``end`` | UntilExpr_ | ConditionalLoop_ | | +| ``for`` LhsExpr_ ``in`` Expr_ ``do`` StmtSequence_ ``end`` | ForExpr_ | Loop_ | | +| Stmt_ ``while`` Expr_ | WhileModifierExpr_ | ConditionalLoop_ | | +| Stmt_ ``until`` Expr_ | UntilModifierExpr_ | ConditionalLoop_ | | +| Stmt_ ``if`` Expr_ | IfModifierExpr_ | ConditionalExpr_, ControlExpr_ | | +| Stmt_ ``unless`` Expr_ | UnlessModifierExpr_ | ConditionalExpr_, 
ControlExpr_ | | +| Expr_ ``?`` Stmt_ ``:`` Stmt_ | TernaryIfExpr_ | ConditionalExpr_, ControlExpr_ | | +| ``case`` Expr_ ``when`` Expr_ ``then`` StmtSequence_ {``when`` Expr_ ``then`` StmtSequence_} [``else`` StmtSequence_] ``end`` | CaseExpr_ | ControlExpr_ | | +| ``case when`` Expr_ ``then`` StmtSequence_ [``else`` StmtSequence_] ``end`` | CaseExpr_ | ControlExpr_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ + + +Unary operations +~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of UnaryOperation_. + ++--------------------+----------------+----------------------------+-------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++====================+================+============================+===================+ +| ``~`` Expr_ | ComplementExpr_ | UnaryBitwiseOperation_ | | +| ``defined?`` Expr_ | DefinedExpr_ | UnaryOperation_ | | +| ``**`` Expr_ | HashSplatExpr_ | UnaryOperation_ | | +| ``!`` Expr_ | NotExpr_ | UnaryOperation_ | | +| ``not`` Expr_ | NotExpr_ | UnaryOperation_ | | +| ``*`` Expr_ | SplatExpr_ | UnaryOperation_ | | +| ``-`` Expr_ | UnaryMinusExpr_ | UnaryArithmeticOperation_ | | +| ``+`` Expr_ | UnaryPlusExpr_ | UnaryArithmeticOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ + +Binary operations +~~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of BinaryOperation_. 
+ ++------------------------+--------------------------+----------------------------+-------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++========================+==========================+============================+===================+ +| Expr_ ``+`` Expr_ | AddExpr_ | BinaryArithmeticOperation_ | | +| LhsExpr_ ``+=`` Expr_ | AssignAddExpr_ | AssignArithmeticOperation_ | | +| LhsExpr_ ``&=`` Expr_ | AssignBitwiseAndExpr_ | AssignBitwiseOperation_ | | +| LhsExpr_ ``|=`` Expr_ | AssignBitwiseOrExpr_ | AssignBitwiseOperation_ | | +| LhsExpr_ ``^=`` Expr_ | AssignBitwiseXorExpr_ | AssignBitwiseOperation_ | | +| LhsExpr_ ``/=`` Expr_ | AssignDivExpr_ | AssignArithmeticOperation_ | | +| LhsExpr_ ``**=`` Expr_ | AssignExponentExpr_ | AssignArithmeticOperation_ | | +| LhsExpr_ ``<<=`` Expr_ | AssignBitwiseLShiftExpr_ | AssignBitwiseOperation_ | | +| LhsExpr_ ``&&=`` Expr_ | AssignLogicalAndExpr_ | BinaryLogicalOperation_ | | +| LhsExpr_ ``||=`` Expr_ | AssignLogicalOrExpr_ | BinaryLogicalOperation_ | | +| LhsExpr_ ``%=`` Expr_ | AssignModuloExpr_ | AssignArithmeticOperation_ | | +| LhsExpr_ ``*=`` Expr_ | AssignMulExpr_ | AssignArithmeticOperation_ | | +| LhsExpr_ ``>>=`` Expr_ | AssignBitwiseRShiftExpr_ | AssignBitwiseOperation_ | | +| LhsExpr_ ``-=`` Expr_ | AssignSubExpr_ | AssignArithmeticOperation_ | | +| Expr_ ``&`` Expr_ | BitwiseAndExpr_ | BinaryBitwiseOperation_ | | +| Expr_ ``|`` Expr_ | BitwiseOrExpr_ | BinaryBitwiseOperation_ | | +| Expr_ ``^`` Expr_ | BitwiseXorExpr_ | BinaryBitwiseOperation_ | | +| Expr_ ``===`` Expr_ | CaseEqExpr_ | EqualityOperation_ | | +| Expr_ ``/`` Expr_ | DivExpr_ | BinaryArithmeticOperation_ | | +| Expr_ ``==`` Expr_ | EqExpr_ | EqualityOperation_ | | +| Expr_ ``**`` Expr_ | ExponentExpr_ | BinaryArithmeticOperation_ | | +| Expr_ ``>=`` Expr_ | GEExpr_ | RelationalOperation_ | | +| Expr_ ``>`` Expr_ | GTExpr_ | RelationalOperation_ | | +| Expr_ ``<=`` Expr_ | LEExpr_ | RelationalOperation_ | | +| 
Expr_ ``<<`` Expr_ | LShiftExpr_ | BinaryBitwiseOperation_ | | +| Expr_ ``<`` Expr_ | LTExpr_ | RelationalOperation_ | | +| Expr_ ``&&`` Expr_ | LogicalAndExpr_ | BinaryLogicalOperation_ | | +| Expr_ ``and`` Expr_ | LogicalAndExpr_ | BinaryLogicalOperation_ | | +| Expr_ ``||`` Expr_ | LogicalOrExpr_ | BinaryLogicalOperation_ | | +| Expr_ ``or`` Expr_ | LogicalOrExpr_ | BinaryLogicalOperation_ | | +| Expr_ ``%`` Expr_ | ModuloExpr_ | BinaryArithmeticOperation_ | | +| Expr_ ``*`` Expr_ | MulExpr_ | BinaryArithmeticOperation_ | | +| Expr_ ``!=`` Expr_ | NEExpr_ | RelationalOperation_ | | +| Expr_ ``!~`` Expr_ | NoRegExpMatchExpr_ | BinaryOperation_ | | +| Expr_ ``>>`` Expr_ | RShiftExpr_ | BinaryBitwiseOperation_ | | +| Expr_ ``=~`` Expr_ | RegExpMatchExpr_ | BinaryOperation_ | | +| Expr_ ``<=>`` Expr_ | SpaceshipExpr_ | BinaryOperation_ | | +| Expr_ ``-`` Expr_ | SubExpr_ | BinaryArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ + +Literals +~~~~~~~~ + +All classes in this subsection are subclasses of Literal_. 
+ ++----------------------------+-------------------+----------------------------+-------------------+ +| Example expression syntax | CodeQL class | Superclasses | Remarks | ++============================+===================+============================+===================+ +| ``[1, 2]`` | ArrayLiteral_ | Literal_ | | +| ``true`` | BooleanLiteral_ | Literal_ | | +| ``?a`` | CharacterLiteral_ | Literal_ | | +| ``__ENCODING__`` | EncodingLiteral_ | Literal_ | | +| ``__FILE__`` | FileLiteral_ | Literal_ | | +| ``{ foo: 123, bar: 456 }`` | HashLiteral_ | Literal_ | | +| ``< Date: Mon, 31 Oct 2022 22:35:00 +0000 Subject: [PATCH 08/27] Ruby: AST ref docs - fix table formatting and some misnamed classes --- ...classes-for-working-with-ruby-programs.rst | 102 +++++++++++++++++- 1 file changed, 97 insertions(+), 5 deletions(-) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index 58b3c8eb5c2..dec7d0293b0 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -13,6 +13,7 @@ Statement classes ~~~~~~~~~~~~~~~~~ This table lists subclasses of Stmt_ representing Ruby statements. + .. TODO: FNAME definition @@ -20,15 +21,25 @@ This table lists subclasses of Stmt_ representing Ruby statements. 
| Statement syntax | CodeQL class | Superclasses | Remarks | +================================================+==============+================+=========+ | ``alias`` FNAME FNAME | AliasStmt_ | Stmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``BEGIN {`` StmtSequence_ ``}`` | BeginBlock_ | StmtSequence_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``begin`` StmtSequence_ ``end`` | BeginExpr_ | StmtSequence_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``break`` [Expr_] | BreakStmt_ | ReturningStmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``;`` | EmptyStmt_ | Stmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``END {`` StmtSequence_ ``}`` | EndBlock_ | StmtSequence_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``next`` [Expr_] | NextStmt_ | ReturningStmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``redo`` | RedoStmt_ | Stmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``retry`` | RetryStmt_ | Stmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``return`` [Expr_] | ReturnStmt_ | ReturningStmt_ | | ++------------------------------------------------+--------------+----------------+---------+ | ``undef`` FNAME (, FNAME) | UndefStmt_ | Stmt_ | | +------------------------------------------------+--------------+----------------+---------+ @@ -39,11 +50,17 @@ Calls | Expression syntax | CodeQL class | Superclasses | Remarks | +=========================+=====================+================+===============================+ | Expr_ ``[`` Expr_ ``]`` | ElementReference_ | 
MethodCall_ | | ++-------------------------+---------------------+----------------+-------------------------------+ | MethodName_ (, Expr_) | MethodCall_ | Call_ | | ++-------------------------+---------------------+----------------+-------------------------------+ | LhsExpr_ ``=`` Expr_ | SetterMethodCall_ | MethodCall_ | | ++-------------------------+---------------------+----------------+-------------------------------+ | ``super`` | SuperCall_ | MethodCall_ | | ++-------------------------+---------------------+----------------+-------------------------------+ | ``yield`` (, Expr_) | YieldCall_ | Call_ | | ++-------------------------+---------------------+----------------+-------------------------------+ | ``&IDENTIFIER`` | BlockArgument_ | Expr_ | Used as an argument to a call | ++-------------------------+---------------------+----------------+-------------------------------+ | ``...`` | ForwardedArguments_ | Expr_ | Used as an argument to a call | +-------------------------+---------------------+----------------+-------------------------------+ @@ -56,8 +73,11 @@ All classes in this subsection are subclasses of ConstantAccess_. 
| Expression syntax | CodeQL class | Superclasses | Remarks | +========================================+======================+======================+===================+ | CNAME | ConstantReadAccess_ | ConstantAccess_ | | ++----------------------------------------+----------------------+----------------------+-------------------+ | ``class`` CNAME StmtSequence_ ``end`` | ConstantWriteAccess_ | ConstantAccess_ | class definition | ++----------------------------------------+----------------------+----------------------+-------------------+ | ``module`` CNAME StmtSequence_ ``end`` | ConstantWriteAccess_ | ConstantAccess_ | module definition | ++----------------------------------------+----------------------+----------------------+-------------------+ | CNAME ``=`` Expr_ | ConstantAssignment_ | ConstantWriteAccess_ | | +----------------------------------------+----------------------+----------------------+-------------------+ @@ -70,15 +90,25 @@ All classes in this subsection are subclasses of ControlExpr_. 
| Expression syntax | CodeQL class | Superclasses | Remarks | +===============================================================================================================================+=====================+================================+=========+ | ``if`` Expr_ ``then`` StmtSequence_ {``elsif`` Expr_ ``then`` StmtSequence_} [``else`` StmtSequence_] ``end`` | IfExpr_ | ConditionalExpr_, ControlExpr_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | ``while`` Expr_ ``do`` StmtSequence_ ``end`` | WhileExpr_ | ConditionalLoop_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | ``until`` Expr_ ``do`` StmtSequence_ ``end`` | UntilExpr_ | ConditionalLoop_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | ``for`` LhsExpr_ ``in`` Expr_ ``do`` StmtSequence_ ``end`` | ForExpr_ | Loop_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | Stmt_ ``while`` Expr_ | WhileModifierExpr_ | ConditionalLoop_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | Stmt_ ``until`` Expr_ | UntilModifierExpr_ | ConditionalLoop_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | Stmt_ ``if`` Expr_ 
| IfModifierExpr_ | ConditionalExpr_, ControlExpr_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | Stmt_ ``unless`` Expr_ | UnlessModifierExpr_ | ConditionalExpr_, ControlExpr_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | Expr_ ``?`` Stmt_ ``:`` Stmt_ | TernaryIfExpr_ | ConditionalExpr_, ControlExpr_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | ``case`` Expr_ ``when`` Expr_ ``then`` StmtSequence_ {``when`` Expr_ ``then`` StmtSequence_} [``else`` StmtSequence_] ``end`` | CaseExpr_ | ControlExpr_ | | ++-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ | ``case when`` Expr_ ``then`` StmtSequence_ [``else`` StmtSequence_] ``end`` | CaseExpr_ | ControlExpr_ | | +-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ @@ -92,12 +122,19 @@ All classes in this subsection are subclasses of UnaryOperation_. 
| Expression syntax | CodeQL class | Superclasses | Remarks | +====================+================+============================+===================+ | ``~`` Expr_ | ComplementExpr_ | UnaryBitwiseOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``defined?`` Expr_ | DefinedExpr_ | UnaryOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``**`` Expr_ | HashSplatExpr_ | UnaryOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``!`` Expr_ | NotExpr_ | UnaryOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``not`` Expr_ | NotExpr_ | UnaryOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``*`` Expr_ | SplatExpr_ | UnaryOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``-`` Expr_ | UnaryMinusExpr_ | UnaryArithmeticOperation_ | | ++--------------------+----------------+----------------------------+-------------------+ | ``+`` Expr_ | UnaryPlusExpr_ | UnaryArithmeticOperation_ | | +--------------------+----------------+----------------------------+-------------------+ @@ -110,42 +147,79 @@ All classes in this subsection are subclasses of BinaryOperation_. 
| Expression syntax | CodeQL class | Superclasses | Remarks | +========================+==========================+============================+===================+ | Expr_ ``+`` Expr_ | AddExpr_ | BinaryArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``+=`` Expr_ | AssignAddExpr_ | AssignArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``&=`` Expr_ | AssignBitwiseAndExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``|=`` Expr_ | AssignBitwiseOrExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``^=`` Expr_ | AssignBitwiseXorExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``/=`` Expr_ | AssignDivExpr_ | AssignArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``**=`` Expr_ | AssignExponentExpr_ | AssignArithmeticOperation_ | | -| LhsExpr_ ``<<=`` Expr_ | AssignBitwiseLShiftExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ +| LhsExpr_ ``<<=`` Expr_ | AssignLShiftExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``&&=`` Expr_ | AssignLogicalAndExpr_ | BinaryLogicalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``||=`` Expr_ | AssignLogicalOrExpr_ | BinaryLogicalOperation_ | | 
++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``%=`` Expr_ | AssignModuloExpr_ | AssignArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``*=`` Expr_ | AssignMulExpr_ | AssignArithmeticOperation_ | | -| LhsExpr_ ``>>=`` Expr_ | AssignBitwiseRShiftExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ +| LhsExpr_ ``>>=`` Expr_ | AssignRShiftExpr_ | AssignBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | LhsExpr_ ``-=`` Expr_ | AssignSubExpr_ | AssignArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``&`` Expr_ | BitwiseAndExpr_ | BinaryBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``|`` Expr_ | BitwiseOrExpr_ | BinaryBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``^`` Expr_ | BitwiseXorExpr_ | BinaryBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``===`` Expr_ | CaseEqExpr_ | EqualityOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``/`` Expr_ | DivExpr_ | BinaryArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``==`` Expr_ | EqExpr_ | EqualityOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``**`` Expr_ | ExponentExpr_ | BinaryArithmeticOperation_ | | 
++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``>=`` Expr_ | GEExpr_ | RelationalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``>`` Expr_ | GTExpr_ | RelationalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``<=`` Expr_ | LEExpr_ | RelationalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``<<`` Expr_ | LShiftExpr_ | BinaryBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``<`` Expr_ | LTExpr_ | RelationalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``&&`` Expr_ | LogicalAndExpr_ | BinaryLogicalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``and`` Expr_ | LogicalAndExpr_ | BinaryLogicalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``||`` Expr_ | LogicalOrExpr_ | BinaryLogicalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``or`` Expr_ | LogicalOrExpr_ | BinaryLogicalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``%`` Expr_ | ModuloExpr_ | BinaryArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``*`` Expr_ | MulExpr_ | BinaryArithmeticOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``!=`` Expr_ | NEExpr_ | 
RelationalOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``!~`` Expr_ | NoRegExpMatchExpr_ | BinaryOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``>>`` Expr_ | RShiftExpr_ | BinaryBitwiseOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``=~`` Expr_ | RegExpMatchExpr_ | BinaryOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``<=>`` Expr_ | SpaceshipExpr_ | BinaryOperation_ | | ++------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``-`` Expr_ | SubExpr_ | BinaryArithmeticOperation_ | | +------------------------+--------------------------+----------------------------+-------------------+ @@ -158,25 +232,43 @@ All classes in this subsection are subclasses of Literal_. 
| Example expression syntax | CodeQL class | Superclasses | Remarks | +============================+===================+============================+===================+ | ``[1, 2]`` | ArrayLiteral_ | Literal_ | | ++----------------------------+-------------------+----------------------------+-------------------+ | ``true`` | BooleanLiteral_ | Literal_ | | ++----------------------------+-------------------+----------------------------+-------------------+ | ``?a`` | CharacterLiteral_ | Literal_ | | ++----------------------------+-------------------+----------------------------+-------------------+ | ``__ENCODING__`` | EncodingLiteral_ | Literal_ | | ++----------------------------+-------------------+----------------------------+-------------------+ | ``__FILE__`` | FileLiteral_ | Literal_ | | ++----------------------------+-------------------+----------------------------+-------------------+ | ``{ foo: 123, bar: 456 }`` | HashLiteral_ | Literal_ | | -| ``< Date: Tue, 1 Nov 2022 11:35:26 +0100 Subject: [PATCH 09/27] Apply suggestions from code review Co-authored-by: Felicity Chapman --- .../using-api-graphs-in-ruby.rst | 48 ++++++++++--------- 1 file changed, 25 insertions(+), 23 deletions(-) diff --git a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst index af3b1ecfdec..96145c51422 100644 --- a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst @@ -9,16 +9,15 @@ external libraries. About this article ------------------ -This article describes how to use API graphs to reference classes and functions defined in library -code. You can use API graphs to conveniently refer to external library functions when defining things like -remote flow sources. +This article describes how you can use API graphs to reference classes and functions defined in library +code. 
API graphs are particularly useful when you want to model the remote flow sources available from external library functions. Module and class references --------------------------- -The most common entry point into the API graph will be the point where a toplevel module or class is -accessed. For example, you can access the API graph node corresponding to the ``::Regexp`` class +The most common entry point into the API graph is when a top-level module or class is accessed. +For example, you can access the API graph node corresponding to the ``::Regexp`` class by using the ``API::getTopLevelMember`` method defined in the ``codeql.ruby.ApiGraphs`` module, as the following snippet demonstrates. @@ -29,7 +28,7 @@ following snippet demonstrates. select API::getTopLevelMember("Regexp") This query selects the API graph nodes corresponding to references to the ``Regexp`` class. For nested -modules and classes, you can use the ``getMember` method. For example the following query selects +modules and classes, you can use the ``getMember`` method. For example the following query selects references to the ``Net::HTTP`` class. .. code-block:: ql @@ -38,9 +37,8 @@ references to the ``Net::HTTP`` class. select API::getTopLevelMember("Net").getMember("HTTP") -Note that the given module name *must not* contain any ```::`` symbols. Thus, something like -`API::getTopLevelMember("Net::HTTP")`` will not do what you expect. Instead, this should be decomposed -into an access of the ``HTTP`` member of the API graph node for ``Net``, as in the example above. +Note that you should specify module names without ``::`` symbols. If you write ``API::getTopLevelMember("Net::HTTP")``, it will not do what you expect. Instead, you need to decompose this name +into an access of the ``HTTP`` member of the API graph node for ``Net``, as shown in the example above. 
Calls and class instantiations ------------------------------ @@ -78,13 +76,13 @@ The following snippet builds on the above to find calls of the ``Regexp#match?`` Subclasses ---------- -For many libraries, the main mode of usage is to extend one or more library classes. To track this +Many libraries are used by extending one or more library classes. To track this in the API graph, you can use the ``getASubclass`` method to get the API graph node corresponding to -all the immediate subclasses of this node. To find *all* subclasses, use ``*`` or ``+`` to apply the -method repeatedly, as in ``getASubclass*``. +the immediate subclasses of a node. To find *all* subclasses, use ``*`` or ``+`` to apply the +method repeatedly. You can see an example where all subclasses are identified using ``getASubclass*`` below. -Note that ``getASubclass`` does not account for any subclassing that takes place in library code -that has not been extracted. Thus, it may be necessary to account for this in the models you write. +Note that ``getASubclass`` can only return subclasses that are extracted as part of the CodeQL database +that you are analyzing. When libraries have predefined subclasses, you will need to explicitly include them in your model. For example, the ``ActionController::Base`` class has a predefined subclass ``Rails::ApplicationController``. To find all subclasses of ``ActionController::Base``, you must explicitly include the subclasses of ``Rails::ApplicationController`` as well. @@ -109,10 +107,14 @@ Using the API graph in dataflow queries Dataflow queries often search for points where data from external sources enters the code base as well as places where data leaves the code base. API graphs provide a convenient way to refer -to external API components such as library functions and their inputs and outputs. 
API graph nodes -cannot be used directly in dataflow queries they model entities that are defined externally, -while dataflow nodes correspond to entities defined in the current code base. To brigde this gap -the API node classes provide the ``asSource()`` and ``asSink()`` methods. +to external API components such as library functions and their inputs and outputs. +However, you do not use API graph nodes directly in dataflow queries. + +- API graph nodes model entities that are defined outside your code base. +- Dataflow nodes model entities defined within the current code base. + +You bridge the gap between the entities outside and inside your code base using +the API node class methods: ``asSource()`` and ``asSink()``. The ``asSource()`` method is used to select dataflow nodes where a value from an external source enters the current code base. A typical example is the return value of a library function such as @@ -135,15 +137,15 @@ of the ``File.write(path, value)`` method. select API::getTopLevelMember("File").getMethod("write").getParameter(1).asSink() -A more complex example is a call to ``File.open`` with a block argument. This function creates a ```File`` instance -and passes it to the supplied block. In this case the first parameter of the block is the place where an -externally created value enters the code base, i.e. the ``|file|`` in the example below: +A more complex example is a call to ``File.open`` with a block argument. This function creates a ``File`` instance +and passes it to the supplied block. In this case, we are interested in the first parameter of the block because this is where an +externally created value enters the code base, that is, the ``|file|`` in the Ruby example below: .. code-block:: ruby File.open("/my/file.txt", "w") { |file| file << "Hello world" } -The following snippet finds parameters of blocks of ``File.open`` method calls: +The following snippet of CodeQL finds parameters of blocks of ``File.open`` method calls: .. 
code-block:: ql @@ -152,7 +154,7 @@ The following snippet finds parameters of blocks of ``File.open`` method calls: select API::getTopLevelMember("File").getMethod("open").getBlock().getParameter(0).asSource() The following example is a dataflow query that that uses API graphs to find cases where data that -is read flows into a call to ```File.write``. +is read flows into a call to ``File.write``. .. code-block:: ql From d061df2e124a787fcfb14ed718350668be405472 Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Tue, 1 Nov 2022 15:24:23 +0000 Subject: [PATCH 10/27] Ruby: AST ref docs - Module.qll --- ...classes-for-working-with-ruby-programs.rst | 31 +++++++++---------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index dec7d0293b0..65877e0f72c 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -8,6 +8,7 @@ CodeQL has a large selection of classes for representing the abstract syntax tre .. include:: ../reusables/abstract-syntax-tree.rst An ``IDENTIFIER`` should match the regular expression ``/[a-zA-Z_][a-zA-Z0-9_]*/``. A ``CNAME`` should match ``/[A-Z][a-zA-Z0-9_]*/``. +``TERM`` is either a semicolon or a newline used to terminate a statement. Statement classes ~~~~~~~~~~~~~~~~~ @@ -74,10 +75,6 @@ All classes in this subsection are subclasses of ConstantAccess_. 
+========================================+======================+======================+===================+ | CNAME | ConstantReadAccess_ | ConstantAccess_ | | +----------------------------------------+----------------------+----------------------+-------------------+ -| ``class`` CNAME StmtSequence_ ``end`` | ConstantWriteAccess_ | ConstantAccess_ | class definition | -+----------------------------------------+----------------------+----------------------+-------------------+ -| ``module`` CNAME StmtSequence_ ``end`` | ConstantWriteAccess_ | ConstantAccess_ | module definition | -+----------------------------------------+----------------------+----------------------+-------------------+ | CNAME ``=`` Expr_ | ConstantAssignment_ | ConstantWriteAccess_ | | +----------------------------------------+----------------------+----------------------+-------------------+ @@ -272,6 +269,19 @@ All classes in this subsection are subclasses of Literal_. | ``:foo`` | SymbolLiteral_ | StringlikeLiteral_ | | +----------------------------+-------------------+----------------------------+-------------------+ +Modules and Ruby classes +~~~~~~~~~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of BodyStmt_ and Scope_. 
+ ++----------------------------------------------------------------+--------------------+----------------------------------+-------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++================================================================+====================+==================================+===================+ +| ``class`` CNAME [``<`` Expr_] TERM StmtSequence_ TERM ``end`` | ClassDeclaration_ | Namespace_, ConstantWriteAccess_ | | +| ``module`` CNAME TERM StmtSequence_ TERM ``end`` | ModuleDeclaration_ | Namespace_, ConstantWriteAccess_ | | +| ``class <<`` Expr_ TERM StmtSequence_ TERM ``end`` | SingletonClass_ | ModuleBase_ | | ++----------------------------------------------------------------+--------------------+----------------------------------+-------------------+ + Method classes ~~~~~~~~~~~~~~ @@ -297,20 +307,9 @@ All classes in this subsection are subclasses of Callable_. | ``(`` StmtSequence_ ``)`` | ParenthesizedExpr_ | StmtSequence_ | | | ``rescue`` TODO | RescueClause_ | Expr_ | | | Stmt_ ``rescue`` Stmt_ | RescueModifierExpr_ | Expr_ | | -| StmtSequence_ ``;`` Stmt_ | StmtSequence_ | Expr_ | A sequence of 0 or more statements, separated by semicolons or newlines | +| StmtSequence_ TERM Stmt_ | StmtSequence_ | Expr_ | A sequence of 0 or more statements, separated by semicolons or newlines | | StringLiteral_ StringLiteral_ | StringConcatenation_ | Expr_ | Implicit concatenation of consecutive string literals | - -.. - Module.qll -| | ClassDeclaration_ | | | -| | Module_ | | | -| | ModuleBase_ | | | -| | ModuleDeclaration_ | | | -| | Namespace_ | | | -| | SingletonClass_ | | | -| | Toplevel_ | | | - .. 
Parameter.qll | | BlockParameter_ | | | From 1a702bfd5015e8e4e6de3c09fa0cd5f853585b89 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Tue, 1 Nov 2022 17:26:36 +0000 Subject: [PATCH 11/27] Add new article to `toctree` to fix test --- docs/codeql/codeql-language-guides/codeql-for-ruby.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst index 8e2dfe267e3..17bb8749120 100644 --- a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst +++ b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst @@ -10,6 +10,7 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat basic-query-for-ruby-code codeql-library-for-ruby + analyzing-data-flow-in-ruby - :doc:`Basic query for Ruby code `: Learn to write and run a simple CodeQL query using LGTM. From e6f91b91e0320d09150a2c0f99495cc35f935799 Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Tue, 1 Nov 2022 23:48:23 +0000 Subject: [PATCH 12/27] Ruby: AST ref docs - initial draft --- ...classes-for-working-with-ruby-programs.rst | 343 ++++++++++-------- 1 file changed, 197 insertions(+), 146 deletions(-) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index 65877e0f72c..08ebadd81aa 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -7,63 +7,63 @@ CodeQL has a large selection of classes for representing the abstract syntax tre .. include:: ../reusables/abstract-syntax-tree.rst -An ``IDENTIFIER`` should match the regular expression ``/[a-zA-Z_][a-zA-Z0-9_]*/``. A ``CNAME`` should match ``/[A-Z][a-zA-Z0-9_]*/``. 
-``TERM`` is either a semicolon or a newline used to terminate a statement. +* An ``IDENTIFIER`` denotes an arbitrary identifier. +* A ``CNAME`` denotes a class or module name. +* An ``FNAME`` denotes a method name. +* A ``TERM`` is either a semicolon or a newline used to terminate a statement. +* Elements enclosed in ``« »`` are grouped and may be suffixed by ``?``, ``*``, or ``+`` to denote 0 or 1 occurrences, 0 or more occurrences, and 1 or more occurrences respectively. Statement classes ~~~~~~~~~~~~~~~~~ This table lists subclasses of Stmt_ representing Ruby statements. -.. - TODO: FNAME definition - -+------------------------------------------------+--------------+----------------+---------+ -| Statement syntax | CodeQL class | Superclasses | Remarks | -+================================================+==============+================+=========+ -| ``alias`` FNAME FNAME | AliasStmt_ | Stmt_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``BEGIN {`` StmtSequence_ ``}`` | BeginBlock_ | StmtSequence_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``begin`` StmtSequence_ ``end`` | BeginExpr_ | StmtSequence_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``break`` [Expr_] | BreakStmt_ | ReturningStmt_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``;`` | EmptyStmt_ | Stmt_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``END {`` StmtSequence_ ``}`` | EndBlock_ | StmtSequence_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``next`` [Expr_] | NextStmt_ | ReturningStmt_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``redo`` | RedoStmt_ | Stmt_ | | 
-+------------------------------------------------+--------------+----------------+---------+ -| ``retry`` | RetryStmt_ | Stmt_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``return`` [Expr_] | ReturnStmt_ | ReturningStmt_ | | -+------------------------------------------------+--------------+----------------+---------+ -| ``undef`` FNAME (, FNAME) | UndefStmt_ | Stmt_ | | -+------------------------------------------------+--------------+----------------+---------+ ++---------------------------------+--------------+----------------+---------+ +| Statement syntax | CodeQL class | Superclasses | Remarks | ++=================================+==============+================+=========+ +| ``alias`` FNAME FNAME | AliasStmt_ | Stmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``BEGIN {`` StmtSequence_ ``}`` | BeginBlock_ | StmtSequence_ | | ++---------------------------------+--------------+----------------+---------+ +| ``begin`` StmtSequence_ ``end`` | BeginExpr_ | StmtSequence_ | | ++---------------------------------+--------------+----------------+---------+ +| ``break`` «Expr_»? | BreakStmt_ | ReturningStmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``;`` | EmptyStmt_ | Stmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``END {`` StmtSequence_ ``}`` | EndBlock_ | StmtSequence_ | | ++---------------------------------+--------------+----------------+---------+ +| ``next`` «Expr_»? | NextStmt_ | ReturningStmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``redo`` | RedoStmt_ | Stmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``retry`` | RetryStmt_ | Stmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``return`` «Expr_»? 
| ReturnStmt_ | ReturningStmt_ | | ++---------------------------------+--------------+----------------+---------+ +| ``undef`` «FNAME ``,``»+ | UndefStmt_ | Stmt_ | | ++---------------------------------+--------------+----------------+---------+ Calls ~~~~~ -+-------------------------+---------------------+----------------+-------------------------------+ -| Expression syntax | CodeQL class | Superclasses | Remarks | -+=========================+=====================+================+===============================+ -| Expr_ ``[`` Expr_ ``]`` | ElementReference_ | MethodCall_ | | -+-------------------------+---------------------+----------------+-------------------------------+ -| MethodName_ (, Expr_) | MethodCall_ | Call_ | | -+-------------------------+---------------------+----------------+-------------------------------+ -| LhsExpr_ ``=`` Expr_ | SetterMethodCall_ | MethodCall_ | | -+-------------------------+---------------------+----------------+-------------------------------+ -| ``super`` | SuperCall_ | MethodCall_ | | -+-------------------------+---------------------+----------------+-------------------------------+ -| ``yield`` (, Expr_) | YieldCall_ | Call_ | | -+-------------------------+---------------------+----------------+-------------------------------+ -| ``&IDENTIFIER`` | BlockArgument_ | Expr_ | Used as an argument to a call | -+-------------------------+---------------------+----------------+-------------------------------+ -| ``...`` | ForwardedArguments_ | Expr_ | Used as an argument to a call | -+-------------------------+---------------------+----------------+-------------------------------+ ++----------------------------+---------------------+----------------+-------------------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++============================+=====================+================+===============================+ +| Expr_ ``[`` Expr_ ``]`` | ElementReference_ | MethodCall_ | | 
++----------------------------+---------------------+----------------+-------------------------------+ +| MethodName_ «Expr_ ``,``»* | MethodCall_ | Call_ | | ++----------------------------+---------------------+----------------+-------------------------------+ +| LhsExpr_ ``=`` Expr_ | SetterMethodCall_ | MethodCall_ | | ++----------------------------+---------------------+----------------+-------------------------------+ +| ``super`` | SuperCall_ | MethodCall_ | | ++----------------------------+---------------------+----------------+-------------------------------+ +| ``yield`` «Expr_ ``,``»* | YieldCall_ | Call_ | | ++----------------------------+---------------------+----------------+-------------------------------+ +| ``&IDENTIFIER`` | BlockArgument_ | Expr_ | Used as an argument to a call | ++----------------------------+---------------------+----------------+-------------------------------+ +| ``...`` | ForwardedArguments_ | Expr_ | Used as an argument to a call | ++----------------------------+---------------------+----------------+-------------------------------+ Constant accesses ~~~~~~~~~~~~~~~~~ @@ -83,32 +83,33 @@ Control expressions All classes in this subsection are subclasses of ControlExpr_. 
-+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| Expression syntax | CodeQL class | Superclasses | Remarks | -+===============================================================================================================================+=====================+================================+=========+ -| ``if`` Expr_ ``then`` StmtSequence_ {``elsif`` Expr_ ``then`` StmtSequence_} [``else`` StmtSequence_] ``end`` | IfExpr_ | ConditionalExpr_, ControlExpr_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| ``while`` Expr_ ``do`` StmtSequence_ ``end`` | WhileExpr_ | ConditionalLoop_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| ``until`` Expr_ ``do`` StmtSequence_ ``end`` | UntilExpr_ | ConditionalLoop_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| ``for`` LhsExpr_ ``in`` Expr_ ``do`` StmtSequence_ ``end`` | ForExpr_ | Loop_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| Stmt_ ``while`` Expr_ | WhileModifierExpr_ | ConditionalLoop_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| Stmt_ ``until`` Expr_ | UntilModifierExpr_ | ConditionalLoop_ | | 
-+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| Stmt_ ``if`` Expr_ | IfModifierExpr_ | ConditionalExpr_, ControlExpr_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| Stmt_ ``unless`` Expr_ | UnlessModifierExpr_ | ConditionalExpr_, ControlExpr_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| Expr_ ``?`` Stmt_ ``:`` Stmt_ | TernaryIfExpr_ | ConditionalExpr_, ControlExpr_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| ``case`` Expr_ ``when`` Expr_ ``then`` StmtSequence_ {``when`` Expr_ ``then`` StmtSequence_} [``else`` StmtSequence_] ``end`` | CaseExpr_ | ControlExpr_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ -| ``case when`` Expr_ ``then`` StmtSequence_ [``else`` StmtSequence_] ``end`` | CaseExpr_ | ControlExpr_ | | -+-------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ - ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | 
++=================================================================================================================================+=====================+================================+=========+ +| ``if`` Expr_ ``then`` StmtSequence_ «``elsif`` Expr_ ``then`` StmtSequence_»* «``else`` StmtSequence_»? ``end`` | IfExpr_ | ConditionalExpr_, ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| ``while`` Expr_ ``do`` StmtSequence_ ``end`` | WhileExpr_ | ConditionalLoop_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| ``until`` Expr_ ``do`` StmtSequence_ ``end`` | UntilExpr_ | ConditionalLoop_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| ``for`` LhsExpr_ ``in`` Expr_ ``do`` StmtSequence_ ``end`` | ForExpr_ | Loop_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Stmt_ ``while`` Expr_ | WhileModifierExpr_ | ConditionalLoop_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Stmt_ ``until`` Expr_ | UntilModifierExpr_ | ConditionalLoop_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Stmt_ ``if`` Expr_ | IfModifierExpr_ | ConditionalExpr_, 
ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Stmt_ ``unless`` Expr_ | UnlessModifierExpr_ | ConditionalExpr_, ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| Expr_ ``?`` Stmt_ ``:`` Stmt_ | TernaryIfExpr_ | ConditionalExpr_, ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| ``case`` Expr_ ``when`` Expr_ ``then`` StmtSequence_ «``when`` Expr_ ``then`` StmtSequence_»* «``else`` StmtSequence_»? ``end`` | CaseExpr_ | ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| ``case when`` Expr_ ``then`` StmtSequence_ «``else`` StmtSequence_»? ``end`` | CaseExpr_ | ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ +| ``case`` Expr_ ``in`` «TERM CaseExpr_»+ ``end`` | CaseExpr_ | ControlExpr_ | | ++---------------------------------------------------------------------------------------------------------------------------------+---------------------+--------------------------------+---------+ Unary operations ~~~~~~~~~~~~~~~~ @@ -277,89 +278,139 @@ All classes in this subsection are subclasses of BodyStmt_ and Scope_. 
+----------------------------------------------------------------+--------------------+----------------------------------+-------------------+ | Expression syntax | CodeQL class | Superclasses | Remarks | +================================================================+====================+==================================+===================+ -| ``class`` CNAME [``<`` Expr_] TERM StmtSequence_ TERM ``end`` | ClassDeclaration_ | Namespace_, ConstantWriteAccess_ | | +| ``class`` CNAME «``<`` Expr_»? TERM StmtSequence_ TERM ``end`` | ClassDeclaration_ | Namespace_, ConstantWriteAccess_ | | ++----------------------------------------------------------------+--------------------+----------------------------------+-------------------+ | ``module`` CNAME TERM StmtSequence_ TERM ``end`` | ModuleDeclaration_ | Namespace_, ConstantWriteAccess_ | | ++----------------------------------------------------------------+--------------------+----------------------------------+-------------------+ | ``class <<`` Expr_ TERM StmtSequence_ TERM ``end`` | SingletonClass_ | ModuleBase_ | | +----------------------------------------------------------------+--------------------+----------------------------------+-------------------+ -Method classes -~~~~~~~~~~~~~~ +Callable classes +~~~~~~~~~~~~~~~~ All classes in this subsection are subclasses of Callable_. -+----------------------------------------+----------------------+----------------------+-------------------+ -| Expression syntax | CodeQL class | Superclasses | Remarks | -+========================================+======================+======================+===================+ -| | BraceBlock_ | Block_ | | -| | Callable_ | | | -| | DoBlock_ | Block_ | | -| | Lambda_ | | | -| | Method_ | | | -| | MethodBase_ | | | -| | SingletonMethod_ | | | -.. 
- Expr.qll -| TODO | ArgumentList_ | Expr_ | The right-hand side of an assignment or a ``return``, ``break``, or ``next`` statement | -| StmtSequence_ TODO | BodyStmt_ | StmtSequence_ | | -| TODO | DestructuredLhsExpr_ | LhsExpr_ | | -| TODO | LhsExpr_ | Expr_ | | -| ``IDENTIFIER:`` Expr_ | Pair_ | Expr_ | Such as in a hash or as a keyword argument | -| ``(`` StmtSequence_ ``)`` | ParenthesizedExpr_ | StmtSequence_ | | -| ``rescue`` TODO | RescueClause_ | Expr_ | | -| Stmt_ ``rescue`` Stmt_ | RescueModifierExpr_ | Expr_ | | -| StmtSequence_ TERM Stmt_ | StmtSequence_ | Expr_ | A sequence of 0 or more statements, separated by semicolons or newlines | -| StringLiteral_ StringLiteral_ | StringConcatenation_ | Expr_ | Implicit concatenation of consecutive string literals | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++======================================================================+======================+======================+===================+ +| ``{`` «``|`` «Parameter_ ``,``»* ``|``»? StmtSequence_ ``}`` | BraceBlock_ | Block_ | | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ +| ``do`` «``|`` «Parameter_ ``,``»* ``|``»? 
BodyStmt_ ``end`` | DoBlock_ | Block_, BodyStmt_ | | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ +| ``-> (`` «Parameter_ ``,``»* ``)`` ``{`` StmtSequence_ ``}`` | Lambda_ | Callable_, BodyStmt_ | | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ +| ``-> (`` «Parameter_ ``,``»* ``)`` ``do`` BodyStmt_ ``end`` | Lambda_ | Callable_, BodyStmt_ | | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ +| ``def`` FNAME «Parameter_ ``,``»* TERM BodyStmt_ TERM ``end`` | Method_ | MethodBase_ | | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ +| ``def self.`` FNAME «Parameter_ ``,``»* TERM BodyStmt_ TERM ``end`` | SingletonMethod_ | MethodBase_ | | ++----------------------------------------------------------------------+----------------------+----------------------+-------------------+ -.. - Parameter.qll -| | BlockParameter_ | | | -| | DestructuredParameter_ | | | -| | ForwardParameter_ | | | -| | HashSplatNilParameter_ | | | -| | HashSplatParameter_ | | | -| | KeywordParameter_ | | | -| | NamedParameter_ | | | -| | OptionalParameter_ | | | -| | Parameter_ | | | -| | SimpleParameter_ | | | -| | SplatParameter_ | | | +Parameter classes +~~~~~~~~~~~~~~~~~ -.. - Pattern.qll -| | AlternativePattern_ | | | -| | ArrayPattern_ | | | -| | AsPattern_ | | | -| | CasePattern_ | | | -| | FindPattern_ | | | -| | HashPattern_ | | | -| | ParenthesizedPattern_ | | | -| | ReferencePattern_ | | | +All classes in this subsection are subclasses of Parameter_. -.. 
- Variable.qll -| | ClassVariable_ | | | -| | ClassVariableAccess_ | | | -| | ClassVariableReadAccess_ | | | -| | ClassVariableWriteAccess_ | | | -| | GlobalVariable_ | | | -| | GlobalVariableAccess_ | | | -| | GlobalVariableReadAccess_ | | | -| | GlobalVariableWriteAccess_ | | | -| | InstanceVariable_ | | | -| | InstanceVariableAccess_ | | | -| | InstanceVariableReadAccess_ | | | -| | InstanceVariableWriteAccess_ | | | -| | LocalVariable_ | | | -| | LocalVariableAccess_ | | | -| | LocalVariableReadAccess_ | | | -| | LocalVariableWriteAccess_ | | | -| | SelfVariable_ | | | -| | SelfVariableAccess_ | | | -| | SelfVariableReadAccess_ | | | -| | Variable_ | | | -| | VariableAccess_ | | | -| | VariableReadAccess_ | | | -| | VariableWriteAccess_ | | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++============================================================+========================+======================+====================================================================+ +| ``&`` IDENTIFIER | BlockParameter_ | NamedParameter_ | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| ``(`` «IDENTIFIER ``,``»+ ``)`` | DestructuredParameter_ | | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| ``...`` | ForwardParameter_ | | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| ``**nil`` | HashSplatNilParameter_ | | Indicates that there are no keyword parameters or keyword patterns | 
++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| ``**`` IDENTIFIER | HashSplatParameter_ | NamedParameter_ | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| IDENTIFIER ``:`` «Expr_»? | KeywordParameter_ | NamedParameter_ | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| IDENTIFIER ``=`` Expr_ | OptionalParameter_ | NamedParameter_ | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| IDENTIFIER | SimpleParameter_ | NamedParameter_ | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ +| ``*`` IDENTIFIER | SplatParameter_ | NamedParameter_ | | ++------------------------------------------------------------+------------------------+----------------------+--------------------------------------------------------------------+ + +Pattern classes +~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of CasePattern_. These expressions typically occur in the context of a ``case`` using pattern matching syntax. 
+ ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++================================================================================+=======================+==============+===================+ +| CasePattern_ «``|`` CasePattern_»+ | AlternativePattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| ``[`` «CasePattern ``,``»* «``*`` IDENTIFIER»? ``]`` | ArrayPattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| CasePattern_ ``=>`` IDENTIFIER | AsPattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| ``[`` ``*`` «IDENTIFIER»? (``,`` CasePattern)* ``,`` ``*`` «IDENTIFIER»? ``]`` | FindPattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| ``{`` «StringlikeLiteral_ ``:`` CasePattern ``,``»* «``**`` IDENTIFIER»? 
``}`` | HashPattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| ``(`` CasePattern_ ``)`` | ParenthesizedPattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ +| ``^`` Expr_ | ReferencePattern_ | CasePattern_ | | ++--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ + +Expression classes +~~~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of Expr_. + ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| Expression syntax | CodeQL class | Superclasses | Remarks | ++======================================================================================+======================+===============+========================================================================================+ +| «Expr_ ``,``»+ | ArgumentList_ | Expr_ | The right-hand side of an assignment or a ``return``, ``break``, or ``next`` statement | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| StmtSequence_ «RescueClause_»? «``else`` StmtSequence_»? «``ensure`` StmtSequence_»? 
| BodyStmt_ | StmtSequence_ | | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| Expr_ «``,`` Expr_»+ | DestructuredLhsExpr_ | LhsExpr_ | | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| Expr_ | LhsExpr_ | Expr_ | An Expr_ appearing on the left-hand side of various operations. Can take many forms. | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| Expr_ ``:`` Expr_ | Pair_ | Expr_ | Such as in a hash or as a keyword argument | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| ``(`` StmtSequence_ ``)`` | ParenthesizedExpr_ | StmtSequence_ | | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| ``rescue`` StmtSequence_ | RescueClause_ | Expr_ | | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| Stmt_ ``rescue`` Stmt_ | RescueModifierExpr_ | Expr_ | | 
++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| StmtSequence_ TERM Stmt_ | StmtSequence_ | Expr_ | A sequence of 0 or more statements, separated by semicolons or newlines | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ +| StringLiteral_ StringLiteral_ | StringConcatenation_ | Expr_ | Implicit concatenation of consecutive string literals | ++--------------------------------------------------------------------------------------+----------------------+---------------+----------------------------------------------------------------------------------------+ + +Variable classes +~~~~~~~~~~~~~~~~ + +All classes in this subsection are subclasses of VariableAccess_. 
+ ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| Example expression syntax | CodeQL class | Superclasses | Remarks | ++===========================+==============================+================================================+==================+ +| ``@@foo`` | ClassVariableReadAccess_ | VariableReadAccess_, ClassVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``@@foo = 'str'`` | ClassVariableWriteAccess_ | VariableWriteAccess_, ClassVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``$foo`` | GlobalVariableReadAccess_ | VariableReadAccess_, GlobalVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``$foo = 'str'`` | GlobalVariableWriteAccess_ | VariableWriteAccess_, GlobalVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``@foo`` | InstanceVariableReadAccess_ | VariableReadAccess_, InstanceVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``@foo = 'str'`` | InstanceVariableWriteAccess_ | VariableWriteAccess_, InstanceVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``foo`` | LocalVariableReadAccess_ | VariableReadAccess_, LocalVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``foo = 'str'`` | LocalVariableWriteAccess_ | VariableWriteAccess_, LocalVariableAccess_ | | 
++----------------------------+------------------------------+-----------------------------------------------+------------------+ +| ``self`` | SelfVariableReadAccess_ | VariableReadAccess_, SelfVariableAccess_ | | ++----------------------------+------------------------------+-----------------------------------------------+------------------+ .. _BlockArgument: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$BlockArgument.html .. _Call: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$Call.html From 9998752147f23be8ffcbdd6f0ce23907a3cdaa5d Mon Sep 17 00:00:00 2001 From: Nick Rolfe Date: Wed, 2 Nov 2022 10:53:21 +0000 Subject: [PATCH 13/27] Accept suggested wording improvements Co-authored-by: Felicity Chapman --- .../analyzing-data-flow-in-ruby.rst | 44 +++++++++---------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst index 5d6b8c90ac4..bec5bc79ee6 100644 --- a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst @@ -15,15 +15,15 @@ For a more general introduction to modeling data flow, see ":ref:`About data flo Local data flow --------------- -Local data flow is data flow within a single method or callable. Local data flow is easier, faster, and more precise than global data flow, and is sufficient for many queries. +Local data flow tracks the flow of data within a single method or callable. Local data flow is easier, faster, and more precise than global data flow. Before looking at more complex tracking, you should always consider local tracking because it is sufficient for many queries. 
Using local data flow ~~~~~~~~~~~~~~~~~~~~~ -The local data flow library is in the module ``DataFlow`` and it defines the class ``Node``, representing any element through which data can flow. +You can use the local data flow library by importing the ``DataFlow`` module. The library uses the class ``Node`` to represent any element through which data can flow. ``Node``\ s are divided into expression nodes (``ExprNode``) and parameter nodes (``ParameterNode``). -You can map between a data flow ``ParameterNode`` and its corresponding ``Parameter`` AST node using the ``asParameter`` member predicate. -Meanwhile, the ``asExpr`` member predicate maps between a data flow ``ExprNode`` and its corresponding ``ExprCfgNode`` in the control-flow library. +You can map a data flow ``ParameterNode`` to its corresponding ``Parameter`` AST node using the ``asParameter`` member predicate. +Similarly, you can use the ``asExpr`` member predicate to map a data flow ``ExprNode`` to its corresponding ``ExprCfgNode`` in the control-flow library. .. code-block:: ql @@ -37,7 +37,7 @@ Meanwhile, the ``asExpr`` member predicate maps between a data flow ``ExprNode`` ... } -You can also use the predicates ``exprNode`` and ``parameterNode``: +You can use the predicates ``exprNode`` and ``parameterNode`` to map from expressions and parameters to their data-flow node: .. code-block:: ql @@ -52,8 +52,8 @@ You can also use the predicates ``exprNode`` and ``parameterNode``: ParameterNode parameterNode(Parameter p) { ... } Note that since ``asExpr`` and ``exprNode`` map between data-flow and control-flow nodes, you then need to call the ``getExpr`` member predicate on the control-flow node to map to the corresponding AST node, -e.g. by writing ``node.asExpr().getExpr()``. -Due to the control-flow graph being split, there can be multiple data-flow and control-flow nodes associated with a single expression AST node. +for example, by writing ``node.asExpr().getExpr()``. 
+A control-flow graph considers every way control can flow through code; consequently, there can be multiple data-flow and control-flow nodes associated with a single expression node in the AST. The predicate ``localFlowStep(Node nodeFrom, Node nodeTo)`` holds if there is an immediate data flow edge from the node ``nodeFrom`` to the node ``nodeTo``. You can apply the predicate recursively, by using the ``+`` and ``*`` operators, or you can use the predefined recursive predicate ``localFlow``. @@ -67,7 +67,7 @@ For example, you can find flow from an expression ``source`` to an expression `` Using local taint tracking ~~~~~~~~~~~~~~~~~~~~~~~~~~ -Local taint tracking extends local data flow by including non-value-preserving flow steps. +Local taint tracking extends local data flow to include flow steps where values are not preserved, such as string manipulation. For example: .. code-block:: ruby @@ -91,17 +91,17 @@ For example, you can find taint propagation from an expression ``source`` to an Using local sources ~~~~~~~~~~~~~~~~~~~ -When asking for local data flow or taint propagation between two expressions as above, you would normally constrain the expressions to be relevant to a certain investigation. -The next section will give some concrete examples, but there is a more abstract concept that we should call out explicitly, namely that of a local source. +When exploring local data flow or taint propagation between two expressions as above, you would normally constrain the expressions to be relevant to your investigation. +The next section gives some concrete examples, but first it's helpful to introduce the concept of a local source. A local source is a data-flow node with no local data flow into it. As such, it is a local origin of data flow, a place where a new value is created. -This includes parameters (which only receive global data flow) and most expressions (because they are not value-preserving).
+This includes parameters (which only receive values from global data flow) and most expressions (because they are not value-preserving). The class ``LocalSourceNode`` represents data-flow nodes that are also local sources. It comes with a useful member predicate ``flowsTo(DataFlow::Node node)``, which holds if there is local data flow from the local source to ``node``. -Examples -~~~~~~~~ +Examples of local data flow +~~~~~~~~~~~~~~~~~~~~~~~~~~~ This query finds the filename argument passed in each call to ``File.open``: @@ -134,8 +134,8 @@ So we use local data flow to find all expressions that flow into the argument: Many expressions flow to the same call. If you run this query, you may notice that you get several data-flow nodes for an expression as it flows towards a call (notice repeated locations in the ``call`` column). We are mostly interested in the "first" of these, what might be called the local source for the file name. -To restrict attention to such local sources, and to simultaneously make the analysis more performant, we have the QL class ``LocalSourceNode``. -We could demand that ``expr`` is such a node: +To restrict the results to local sources for the file name, and to simultaneously make the analysis more efficient, we can use the CodeQL class ``LocalSourceNode``. +We can update the query to specify that ``expr`` is an instance of a ``LocalSourceNode``. .. code-block:: ql @@ -149,7 +149,7 @@ We could demand that ``expr`` is such a node: expr instanceof DataFlow::LocalSourceNode select call, expr -However, we could also enforce this by casting. +An alternative approach to limit the results to local sources for the file name is to enforce this by casting. That would allow us to use the member predicate ``flowsTo`` on ``LocalSourceNode`` like so: .. code-block:: ql @@ -181,7 +181,7 @@ We now mostly have one expression per call. 
We may still have cases of more than one expression flowing to a call, but then they flow through different code paths (possibly due to control-flow splitting). -We might want to make the source more specific, for example a parameter to a method or block. +We might want to make the source more specific, for example, a parameter to a method or block. This query finds instances where a parameter is used as the name when opening a file: .. code-block:: ql @@ -197,7 +197,7 @@ This query finds instances where a parameter is used as the name when opening a Using the exact name supplied via the parameter may be too strict. If we want to know if the parameter influences the file name, we can use taint tracking instead of data flow. -This query finds calls to ``File.open`` where the filename is derived from a parameter: +This query finds calls to ``File.open`` where the file name is derived from a parameter: .. code-block:: ql @@ -224,7 +224,7 @@ However, global data flow is less precise than local data flow, and the analysis Using global data flow ~~~~~~~~~~~~~~~~~~~~~~ -The global data flow library is used by extending the class ``DataFlow::Configuration``: +You can use the global data flow library by extending the class ``DataFlow::Configuration``: .. code-block:: ql @@ -316,15 +316,15 @@ Class hierarchy - ``Concepts::SystemCommandExecution`` - a data-flow node that executes an operating system command, for instance by spawning a new process. - ``Concepts::FileSystemAccess`` - a data-flow node that performs a file system access, including reading and writing data, creating and deleting files and folders, checking and updating permissions, and so on. - ``Concepts::Path::PathNormalization`` - a data-flow node that performs path normalization. This is often needed in order to safely access paths. - - ``Concepts::CodeExecution`` - a data-flow node that dynamically executes Python code. + - ``Concepts::CodeExecution`` - a data-flow node that dynamically executes Ruby code. 
- ``Concepts::SqlExecution`` - a data-flow node that executes SQL statements. - ``Concepts::HTTP::Server::RouteSetup`` - a data-flow node that sets up a route on a server. - ``Concepts::HTTP::Server::HttpResponse`` - a data-flow node that creates an HTTP response on a server. - ``TaintTracking::Configuration`` - base class for custom global taint tracking analysis. -Examples -~~~~~~~~ +Examples of global data flow +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This query shows a data flow configuration that uses all network input as data sources: From 8786c700c2b804cbbd12ae1bb6c0d5ddea3b6948 Mon Sep 17 00:00:00 2001 From: Nick Rolfe Date: Wed, 2 Nov 2022 11:30:37 +0000 Subject: [PATCH 14/27] Expand explanations of example global data-flow queries --- .../analyzing-data-flow-in-ruby.rst | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst index bec5bc79ee6..b326bfa59aa 100644 --- a/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/analyzing-data-flow-in-ruby.rst @@ -326,7 +326,10 @@ Class hierarchy Examples of global data flow ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This query shows a data flow configuration that uses all network input as data sources: +The following global taint-tracking query finds path arguments in filesystem accesses that can be controlled by a remote user. + - Since this is a taint-tracking query, the configuration class extends ``TaintTracking::Configuration``. + - The ``isSource`` predicate defines sources as any data-flow nodes that are instances of ``RemoteFlowSource``. + - The ``isSink`` predicate defines sinks as path arguments in any filesystem access, using ``FileSystemAccess`` from the ``Concepts`` library. .. 
code-block:: ql @@ -349,7 +352,10 @@ This query shows a data flow configuration that uses all network input as data s where config.hasFlow(input, fileAccess) select fileAccess, "This file access uses data from $@.", input, "user-controllable input." -This data flow configuration tracks data flow from environment variables to opening files: +The following global data-flow query finds calls to ``File.open`` where the filename argument comes from an environment variable. + - Since this is a data-flow query, the configuration class extends ``DataFlow::Configuration``. + - The ``isSource`` predicate defines sources as expression nodes representing lookups on the ``ENV`` hash. + - The ``isSink`` predicate defines sinks as the first argument in any call to ``File.open``. .. code-block:: ql From 727b5aebd1dfe7434a5ac00778e70a580676fbe9 Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Wed, 2 Nov 2022 12:36:52 +0000 Subject: [PATCH 15/27] Ruby: AST ref docs - add too toctree --- docs/codeql/codeql-language-guides/codeql-for-ruby.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst index bfb29a012ef..83da5142b5d 100644 --- a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst +++ b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst @@ -10,6 +10,7 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat basic-query-for-ruby-code codeql-library-for-ruby + abstract-syntax-tree-classes-for-working-with-ruby-programs - :doc:`Basic query for Ruby code `: Learn to write and run a simple CodeQL query using LGTM. 
From 7c577ae1d1a9c16020aa5f3de3fbb0c8a9da907a Mon Sep 17 00:00:00 2001 From: Arthur Baars Date: Thu, 3 Nov 2022 11:32:39 +0100 Subject: [PATCH 16/27] Address review feedback --- docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst index 96145c51422..1f552ee922d 100644 --- a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst @@ -27,7 +27,7 @@ following snippet demonstrates. select API::getTopLevelMember("Regexp") -This query selects the API graph nodes corresponding to references to the ``Regexp`` class. For nested +The example above finds references to a top-level class. For nested modules and classes, you can use the ``getMember`` method. For example the following query selects references to the ``Net::HTTP`` class. From d218572c726d23dceb82b2af0f4d14d889b578e3 Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Fri, 4 Nov 2022 14:42:33 +0000 Subject: [PATCH 17/27] Ruby: Apply review suggestions for AST reference guide Co-authored-by: Felicity Chapman --- ...classes-for-working-with-ruby-programs.rst | 30 ++++++++++--------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index 08ebadd81aa..f64bf3d591c 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -7,6 +7,8 @@ CodeQL has a large selection of classes for representing the abstract syntax tre .. 
include:: ../reusables/abstract-syntax-tree.rst +The descriptions below use the following conventions and placeholders. + * An ``IDENTIFIER`` denotes an arbitrary identifier. * A ``CNAME`` denotes a class or module name. * An ``FNAME`` denotes a method name. @@ -16,7 +18,7 @@ CodeQL has a large selection of classes for representing the abstract syntax tre Statement classes ~~~~~~~~~~~~~~~~~ -This table lists subclasses of Stmt_ representing Ruby statements. +This table lists subclasses of Stmt_ that represent Ruby statements. +---------------------------------+--------------+----------------+---------+ | Statement syntax | CodeQL class | Superclasses | Remarks | @@ -60,7 +62,7 @@ Calls +----------------------------+---------------------+----------------+-------------------------------+ | ``yield`` «Expr_ ``,``»* | YieldCall_ | Call_ | | +----------------------------+---------------------+----------------+-------------------------------+ -| ``&IDENTIFIER`` | BlockArgument_ | Expr_ | Used as an argument to a call | +| ``&``IDENTIFIER | BlockArgument_ | Expr_ | Used as an argument to a call | +----------------------------+---------------------+----------------+-------------------------------+ | ``...`` | ForwardedArguments_ | Expr_ | Used as an argument to a call | +----------------------------+---------------------+----------------+-------------------------------+ @@ -116,25 +118,25 @@ Unary operations All classes in this subsection are subclasses of UnaryOperation_. 
-+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | Expression syntax | CodeQL class | Superclasses | Remarks | -+====================+================+============================+===================+ ++====================+=================+===========================+===================+ | ``~`` Expr_ | ComplementExpr_ | UnaryBitwiseOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | ``defined?`` Expr_ | DefinedExpr_ | UnaryOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | ``**`` Expr_ | HashSplatExpr_ | UnaryOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | ``!`` Expr_ | NotExpr_ | UnaryOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | ``not`` Expr_ | NotExpr_ | UnaryOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | ``*`` Expr_ | SplatExpr_ | UnaryOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ | ``-`` Expr_ | UnaryMinusExpr_ | UnaryArithmeticOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ 
++--------------------+-----------------+---------------------------+-------------------+ | ``+`` Expr_ | UnaryPlusExpr_ | UnaryArithmeticOperation_ | | -+--------------------+----------------+----------------------------+-------------------+ ++--------------------+-----------------+---------------------------+-------------------+ Binary operations ~~~~~~~~~~~~~~~~~ @@ -336,7 +338,7 @@ All classes in this subsection are subclasses of Parameter_. Pattern classes ~~~~~~~~~~~~~~~ -All classes in this subsection are subclasses of CasePattern_. These expressions typically occur in the context of a ``case`` using pattern matching syntax. +All classes in this subsection are subclasses of CasePattern_. These expressions typically occur when a ``case`` uses pattern matching syntax. +--------------------------------------------------------------------------------+-----------------------+--------------+-------------------+ | Expression syntax | CodeQL class | Superclasses | Remarks | @@ -392,7 +394,7 @@ All classes in this subsection are subclasses of VariableAccess_. 
+----------------------------+------------------------------+-----------------------------------------------+------------------+ | Example expression syntax | CodeQL class | Superclasses | Remarks | -+===========================+==============================+================================================+==================+ ++============================+==============================+===============================================+==================+ | ``@@foo`` | ClassVariableReadAccess_ | VariableReadAccess_, ClassVariableAccess_ | | +----------------------------+------------------------------+-----------------------------------------------+------------------+ | ``@@foo = 'str'`` | ClassVariableWriteAccess_ | VariableWriteAccess_, ClassVariableAccess_ | | From 610bbeee977f79e7c76a4ec82fa36c14c123699d Mon Sep 17 00:00:00 2001 From: Arthur Baars Date: Fri, 4 Nov 2022 16:21:57 +0100 Subject: [PATCH 18/27] Update docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst Co-authored-by: Erik Krogh Kristensen --- docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst index 1f552ee922d..7ac699b61c2 100644 --- a/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst +++ b/docs/codeql/codeql-language-guides/using-api-graphs-in-ruby.rst @@ -124,7 +124,7 @@ enters the current code base. 
A typical example is the return value of a library import codeql.ruby.ApiGraphs - select API::getTopLevelMember("File").getMethod("read").getParameter(1).asSource() + select API::getTopLevelMember("File").getMethod("read").getReturn().asSource() The ``asSink()`` method is used to select dataflow nodes where a value leaves the From 9cf32843715e6e5620f91ae1bb8d72bf5e9001ca Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Fri, 4 Nov 2022 15:19:13 +0000 Subject: [PATCH 19/27] Ruby: AST ref docs - add a missing space --- ...tract-syntax-tree-classes-for-working-with-ruby-programs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index f64bf3d591c..8b64a82fbc7 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -62,7 +62,7 @@ Calls +----------------------------+---------------------+----------------+-------------------------------+ | ``yield`` «Expr_ ``,``»* | YieldCall_ | Call_ | | +----------------------------+---------------------+----------------+-------------------------------+ -| ``&``IDENTIFIER | BlockArgument_ | Expr_ | Used as an argument to a call | +| ``&`` IDENTIFIER | BlockArgument_ | Expr_ | Used as an argument to a call | +----------------------------+---------------------+----------------+-------------------------------+ | ``...`` | ForwardedArguments_ | Expr_ | Used as an argument to a call | +----------------------------+---------------------+----------------+-------------------------------+ From 530b29ccdfb42bcb8173f270e2f89d76bc20c2bc Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Fri, 4 Nov 2022 15:23:27 +0000 Subject: [PATCH 20/27] Ruby: AST ref docs - note AssignExpr 
--- ...tract-syntax-tree-classes-for-working-with-ruby-programs.rst | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index 8b64a82fbc7..f36771fa0c4 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -222,6 +222,8 @@ All classes in this subsection are subclasses of BinaryOperation_. +------------------------+--------------------------+----------------------------+-------------------+ | Expr_ ``-`` Expr_ | SubExpr_ | BinaryArithmeticOperation_ | | +------------------------+--------------------------+----------------------------+-------------------+ +| LhsExpr_ ``=`` Expr_ | AssignExpr_ | Assignment_ | | ++------------------------+--------------------------+----------------------------+-------------------+ Literals ~~~~~~~~ From a77fc96067c033f5e335e004d27c88a1507da02f Mon Sep 17 00:00:00 2001 From: Alex Ford Date: Fri, 4 Nov 2022 15:36:18 +0000 Subject: [PATCH 21/27] Ruby: AST ref docs - note about desugaring and synthesized AstNodes --- ...classes-for-working-with-ruby-programs.rst | 23 +++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst index f36771fa0c4..d85077dd356 100644 --- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst +++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst @@ -416,6 +416,27 @@ All classes in this subsection are subclasses of VariableAccess_. 
 | ``self``                   | SelfVariableReadAccess_      | VariableReadAccess_, SelfVariableAccess_      |                  |
 +----------------------------+------------------------------+-----------------------------------------------+------------------+

+Desugaring
+~~~~~~~~~~
+
+Certain Ruby language features are implemented using syntactic sugar. For example, supposing that ``x`` is an object with an attribute ``foo``, the assignment::
+
+   x.foo = y
+
+is desugared to code similar to::
+
+   x.foo=(__synth_0 = y);
+   __synth_0;
+
+In other words, there is effectively a call to the SetterMethodCall_ ``foo=`` on ``x`` with argument ``__synth_0 = y``, followed by a read of the ``__synth_0`` variable.
+
+In CodeQL, this is implemented by syntheisizing AstNode_ instances corresponding to this desugared version of the code.
+
+Note that both the original AssignExpr_ and the desugared SetterMethodCall_ versions are both available to CodeQL queries, and it is usually not necessary to be aware of any desugaring that may take place. However, if a codebase explicitly uses ``x.foo=(y)`` SetterMethodCall_ syntax, then this will not be found by a query for AssignExpr_ instances.
+
+Other synthesized AstNode_ instances exist, see the isSynthesized_ and getDesugared_ predicates for details.
+
+
 .. _BlockArgument: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$BlockArgument.html
 .. _Call: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$Call.html
 .. _ElementReference: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$ElementReference.html
@@ -618,3 +639,5 @@ All classes in this subsection are subclasses of VariableAccess_.
 .. _VariableAccess: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Variable.qll/type.Variable$VariableAccess.html
 .. _VariableReadAccess: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Variable.qll/type.Variable$VariableReadAccess.html
 .. _VariableWriteAccess: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Variable.qll/type.Variable$VariableWriteAccess.html
+.. _isSynthesized: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/AST.qll/predicate.AST$AstNode$isSynthesized.0.html
+.. _getDesugared: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/AST.qll/predicate.AST$AstNode$getDesugared.0.html

From 13aad99194aabf110c88aece7793ae96be140791 Mon Sep 17 00:00:00 2001
From: Alex Ford
Date: Fri, 4 Nov 2022 15:56:13 +0000
Subject: [PATCH 22/27] Ruby: AST ref docs - add Calls section intro

---
 ...tract-syntax-tree-classes-for-working-with-ruby-programs.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
index d85077dd356..98297d7b8e6 100644
--- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
+++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
@@ -49,6 +49,8 @@ This table lists subclasses of Stmt_ that represent Ruby statements.
 Calls
 ~~~~~

+This table lists subclasses of Call_ as well as some expressions that appear as call arguments.
+
 +----------------------------+---------------------+----------------+-------------------------------+
 | Expression syntax          | CodeQL class        | Superclasses   | Remarks                       |
 +============================+=====================+================+===============================+

From 53e83ff048c3e6c4838b1a9d6f9844a0bf2cf2c7 Mon Sep 17 00:00:00 2001
From: Alex Ford
Date: Fri, 4 Nov 2022 16:00:22 +0000
Subject: [PATCH 23/27] Ruby: AST ref docs - add futher reading section

---
 ...ct-syntax-tree-classes-for-working-with-ruby-programs.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
index 98297d7b8e6..3132f18a2cf 100644
--- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
+++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
@@ -438,6 +438,11 @@ Note that both the original AssignExpr_ and the desugared SetterMethodCall_ vers

 Other synthesized AstNode_ instances exist, see the isSynthesized_ and getDesugared_ predicates for details.

+Further reading
+---------------
+
+.. include:: ../reusables/ruby-further-reading.rst
+.. include:: ../reusables/codeql-ref-tools-further-reading.rst

 .. _BlockArgument: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$BlockArgument.html
 .. _Call: https://codeql.github.com/codeql-standard-libraries/ruby/codeql/ruby/ast/Call.qll/type.Call$Call.html

From 63dc0445a8d9235e825028c487ae0f6dc6214104 Mon Sep 17 00:00:00 2001
From: Arthur Baars
Date: Mon, 7 Nov 2022 11:54:37 +0100
Subject: [PATCH 24/27] Ruby: docs add missing entry

---
 docs/codeql/codeql-language-guides/codeql-for-ruby.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst
index 12e40f9f35b..82d13cf410e 100644
--- a/docs/codeql/codeql-language-guides/codeql-for-ruby.rst
+++ b/docs/codeql/codeql-language-guides/codeql-for-ruby.rst
@@ -21,3 +21,5 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
 - :doc:`Analyzing data flow in Ruby `: You can use CodeQL to track the flow of data through a Ruby program to places where the data is used.

 - :doc:`Using API graphs in Ruby `: API graphs are a uniform interface for referring to functions, classes, and methods defined in external libraries.
+
+- :doc:`Abstract syntax tree classes for working with Ruby programs `: CodeQL has a large selection of classes for representing the abstract syntax tree of Ruby programs.
From 6a0a81b3bee5f71622d255dd44ca97d63a4cac73 Mon Sep 17 00:00:00 2001
From: Arthur Baars
Date: Mon, 7 Nov 2022 12:57:01 +0100
Subject: [PATCH 25/27] Ruby: expand explanation of desugaring

---
 ...tax-tree-classes-for-working-with-ruby-programs.rst | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
index 3132f18a2cf..47263808a78 100644
--- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
+++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
@@ -421,7 +421,13 @@ All classes in this subsection are subclasses of VariableAccess_.
 Desugaring
 ~~~~~~~~~~

-Certain Ruby language features are implemented using syntactic sugar. For example, supposing that ``x`` is an object with an attribute ``foo``, the assignment::
+Certain Ruby language features are shorthands for a common operations that could also be expressed in an alternate, more verbose, forms.
+Such language features are typically referred to as "syntactic sugar", and make it easier for programmers to write and read code. This is
+great for programmers. Source code analyzers on the other hand, this lead to additional work as they need to understand the short
+hand notation as well as the long form. To make analysis easier, CodeQL automatically "desugars" Ruby code, effectively rewriting
+rich syntactic constructs into equivalent code that uses simpler syntactic contructs.
+
+For example, supposing that ``x`` is an object with an attribute ``foo``, the assignment::

    x.foo = y

@@ -432,7 +438,7 @@ is desugared to code similar to::

 In other words, there is effectively a call to the SetterMethodCall_ ``foo=`` on ``x`` with argument ``__synth_0 = y``, followed by a read of the ``__synth_0`` variable.

-In CodeQL, this is implemented by syntheisizing AstNode_ instances corresponding to this desugared version of the code.
+In CodeQL, this is implemented by synthesizing AstNode_ instances corresponding to this desugared version of the code.

 Note that both the original AssignExpr_ and the desugared SetterMethodCall_ versions are both available to CodeQL queries, and it is usually not necessary to be aware of any desugaring that may take place. However, if a codebase explicitly uses ``x.foo=(y)`` SetterMethodCall_ syntax, then this will not be found by a query for AssignExpr_ instances.

From aad3e06027ec8dd3e9794c73e0db2b2dc0dabd5d Mon Sep 17 00:00:00 2001
From: Arthur Baars
Date: Mon, 7 Nov 2022 13:08:57 +0100
Subject: [PATCH 26/27] Apply suggestions from code review

Co-authored-by: Nick Rolfe
---
 ...ct-syntax-tree-classes-for-working-with-ruby-programs.rst | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
index 47263808a78..8987f5c0578 100644
--- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
+++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
@@ -421,7 +421,6 @@ All classes in this subsection are subclasses of VariableAccess_.
 Desugaring
 ~~~~~~~~~~

-Certain Ruby language features are shorthands for a common operations that could also be expressed in an alternate, more verbose, forms.
+Certain Ruby language features are shorthands for common operations that could also be expressed in other, more verbose, forms.
 Such language features are typically referred to as "syntactic sugar", and make it easier for programmers to write and read code. This is
-great for programmers. Source code analyzers on the other hand, this lead to additional work as they need to understand the short
-hand notation as well as the long form. To make analysis easier, CodeQL automatically "desugars" Ruby code, effectively rewriting
+great for programmers. For source code analyzers, however, this leads to additional work as they need to understand the short-hand notation as well as the long form. To make analysis easier, CodeQL automatically "desugars" Ruby code, effectively rewriting
 rich syntactic constructs into equivalent code that uses simpler syntactic contructs.

 For example, supposing that ``x`` is an object with an attribute ``foo``, the assignment::

From 33b1c8471c36f3e4f7490a2879986040db8000fd Mon Sep 17 00:00:00 2001
From: Arthur Baars
Date: Mon, 7 Nov 2022 13:35:58 +0100
Subject: [PATCH 27/27] Apply suggestions from code review

Co-authored-by: Felicity Chapman
---
 ...tax-tree-classes-for-working-with-ruby-programs.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
index 8987f5c0578..3e9f98e3ae9 100644
--- a/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
+++ b/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-ruby-programs.rst
@@ -418,13 +418,13 @@ All classes in this subsection are subclasses of VariableAccess_.
 | ``self``                   | SelfVariableReadAccess_      | VariableReadAccess_, SelfVariableAccess_      |                  |
 +----------------------------+------------------------------+-----------------------------------------------+------------------+

-Desugaring
-~~~~~~~~~~
+Syntactic sugar and desugaring
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Certain Ruby language features are shorthands for common operations that could also be expressed in other, more verbose, forms.
 Such language features are typically referred to as "syntactic sugar", and make it easier for programmers to write and read code. This is
-great for programmers. For source code analyzers, however, this leads to additional work as they need to understand the short-hand notation as well as the long form. To make analysis easier, CodeQL automatically "desugars" Ruby code, effectively rewriting
-rich syntactic constructs into equivalent code that uses simpler syntactic contructs.
+great for programmers. For source code analyzers, however, this leads to additional work as they need to understand the shorthand notation as well as the long form. To make analysis easier, CodeQL automatically "desugars" Ruby code, effectively rewriting
+rich syntactic constructs into equivalent code that uses simpler syntactic constructs.

 For example, supposing that ``x`` is an object with an attribute ``foo``, the assignment::

@@ -439,7 +439,7 @@ In other words, there is effectively a call to the SetterMethodCall_ ``foo=`` on

 In CodeQL, this is implemented by synthesizing AstNode_ instances corresponding to this desugared version of the code.

-Note that both the original AssignExpr_ and the desugared SetterMethodCall_ versions are both available to CodeQL queries, and it is usually not necessary to be aware of any desugaring that may take place. However, if a codebase explicitly uses ``x.foo=(y)`` SetterMethodCall_ syntax, then this will not be found by a query for AssignExpr_ instances.
+Note that the original AssignExpr_ and the desugared SetterMethodCall_ versions are both available to use in CodeQL queries, and you do not usually need to be aware of any desugaring that may take place. However, if a codebase explicitly uses ``x.foo=(y)`` SetterMethodCall_ syntax, you cannot find this syntax by searching for instances of AssignExpr_ .

 Other synthesized AstNode_ instances exist, see the isSynthesized_ and getDesugared_ predicates for details.
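
One way to see why the desugared form of ``x.foo = y`` ends with a read of ``__synth_0`` is to look at what the assignment evaluates to in plain Ruby. The sketch below is only an illustration and not part of the patch series: the ``Box`` class and its deliberately odd setter are hypothetical, chosen so that the setter's own return value differs from the assigned value.

```ruby
# Hypothetical class for illustration only; it does not appear in the CodeQL docs.
class Box
  attr_reader :foo

  # A setter whose own return value differs from its argument.
  def foo=(value)
    @foo = value
    :setter_return_value
  end
end

x = Box.new
y = 42

# An attribute assignment evaluates to its right-hand side,
# regardless of what the setter method returns.
assignment_result = (x.foo = y)

# Invoking the setter as an ordinary method exposes its return value instead.
send_result = x.public_send(:foo=, y)

puts assignment_result.inspect
puts send_result.inspect
```

Because the assignment expression yields the right-hand side rather than the setter's return value, a faithful rewrite has to keep that value in a temporary and read it back afterwards, which is the role ``__synth_0`` plays in the desugared code shown in the patch.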