codeql-lab/codeql-sqlite/README.org
Michael Hohn fe1baf7dc1 wip
2025-07-30 14:37:54 -07:00

Using sqlite to illustrate models-as-data

This description reuses material from a CodeQL workshop.

Build the codeql database

To get started, build the codeql database (adjust paths to your setup):

  # Build the db with source commit id.
  SRCDIR=$(pwd)
  DB="$SRCDIR/java-sqlite-$(cd "$SRCDIR" && git rev-parse --short HEAD).db"

  echo "$DB"
  test -d "$DB" && rm -fR "$DB"
  mkdir -p "$DB"

  # Use the correct codeql
  export PATH="$(cd ../codeql && pwd):$PATH"
  codeql database create --language=java -s . -j 8 -v "$DB" --command='./build.sh'

  # Check for AddUser in the db
  unzip -v "$DB/src.zip" | grep AddUser

Then add this database directory to your VS Code DATABASES tab.

Tests using a default query

You can run the standard library query ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql, but it returns no results. It does point at the classes to inspect, in particular the source and sink classes. Run ./Illustrations.ql, either from the command line or from VS Code. Via the CLI:

  # run query
  codeql query run                                \
         -v                                       \
         --database java-sqlite-e2e555c.db        \
         --output result.bqrs                     \
         --threads=12                             \
         --ram=14000                              \
         Illustrations.ql

  # format results
  codeql bqrs decode --format=text result.bqrs | sed -n '/^Result set: #select/,$p'

This shows

  Result set: #select
  |  ui  |  qsi  |
  +------+-------+
  | args | query |

In the editor, these link to

  1. main(ARGS) and
  2. conn.createStatement().executeUpdate(QUERY);

The second is correct, but the call System.console().readLine() is not found as a source. Thus, SqlTainted.ql will not find anything.

TODO supplement sources via the model editor

  • We have no flow

    • check source, sink
    • we have a sink
    • but ActiveThreatModelSource finds no source
  • We can supplement in different ways

supplement codeql: Write full manual query: already in workshop

TODO supplement codeql: Add to FlowSource or a subclass

Note: this is one area that simply has to be known; browsing the source will not lead you there.

CodeQL reading hint:

class ActiveThreatModelSource extends DataFlow::Node

uses

this.(SourceNode).getThreatModel()

So following the cast (SourceNode) may be useful:

  /**
   * A data flow source.
   */
  abstract class SourceNode extends DataFlow::Node

Following the abstract class is promising:

  abstract class RemoteFlowSource extends SourceNode

and others.

In ../ql/java/ql/lib/Customizations.qll, notice the comments mentioning RemoteFlowSource. Use the imports from ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql as a starting point, but note that some of them conflict. You will use

private import semmle.code.java.dataflow.FlowSources

Follow this import to FlowSources and find the RemoteFlowSource mentioned above:

abstract class RemoteFlowSource extends SourceNode

Add the custom source. The modified ../ql/java/ql/lib/Customizations.qll is

  import java
  private import semmle.code.java.dataflow.FlowSources

  class ReadLine extends RemoteFlowSource {
    ReadLine() {
      exists(Call read |
        read.getCallee().getName() = "readLine" and
        read = this.asExpr()
      )
    }

    override string getSourceType() { result = "Console readline" }
  }

Note that the predicate

  module QueryInjectionFlowConfig implements DataFlow::ConfigSig {
    predicate isSource(DataFlow::Node src) { src instanceof ActiveThreatModelSource }
        ...;
  }

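now also holds for the readLine() result, although we extended RemoteFlowSource, not ActiveThreatModelSource. This works because ActiveThreatModelSource selects SourceNode instances by their threat model (see the reading hint above), and RemoteFlowSource is a SourceNode.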

TODO supplement codeql: Add to models-as-data

In the model editor, we see an entry for java.io.Console.readLine (using the "show already modeled" option). Searching the library shows where it comes from:

  $ rg -i 'java.io.*Console.*readline' ql/java
  ql/java/ql/lib/ext/generated/java.io.model.yml
  16:      - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
  17:      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
  18:      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
  19:      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]

Note: this file is in the generated/ tree.
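The rg output shows matching rows but not which extensible predicate they belong to. A minimal sketch that prints each matching row together with the nearest preceding extensible: header; rows_by_extensible is a hypothetical helper, and it assumes one row per line as in the file above:

```shell
# Hypothetical helper: print "<extensible>: <row>" for every data row in a
# model file whose text matches an awk regular expression.
rows_by_extensible() {
  pattern=$1
  file=$2
  awk -v pat="$pattern" '
    # Remember the most recent "extensible: <name>" header.
    $1 == "extensible:" { ext = $2; next }
    # Report matching rows, stripped of the leading list dash.
    $0 ~ pat {
      sub(/^[ \t]*-[ \t]*/, "")
      print ext ": " $0
    }
  ' "$file"
}

# Example, assuming the same ql checkout as the rg command above:
#   rows_by_extensible 'Console.*readLine' ql/java/ql/lib/ext/generated/java.io.model.yml
```

Each output line starts with the predicate name, so summary rows and source rows are easy to tell apart.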

The current readLine modeling is in the summaryModel section; we need it in a sourceModel:

  extensions:
    - addsTo:
        pack: codeql/java-all
        extensible: summaryModel
      data:
        ...
        - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]

The model editor will not show this because it is already modeled. To illustrate text-based additions, we'll edit the model as plain text. Starting from

  extensions:
    - addsTo:
        pack: codeql/java-all
        extensible: summaryModel
      data:
        ...
        - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]

and the field information

  extensible predicate sourceModel(
    string package, string type, boolean subtypes, string name, string signature, string ext,
    string output, string kind, string provenance, QlBuiltins::ExtensionId madId
  );
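Compared with the summaryModel rows, this signature has no input column. A minimal shell sketch of that mapping, assuming each row sits on one line with ", " between fields: summary_to_source is a hypothetical helper, and the kind "remote" and provenance "manual" are the values used for the constructed model later in this section. It only makes sense for rows whose output is the value that should become a source, such as ReturnValue.

```shell
# Hypothetical helper: rewrite summaryModel rows (read from stdin) as
# sourceModel rows. Fields are split on ", "; signatures such as
# "(String,Object[])" have no space after the comma, so they stay intact.
summary_to_source() {
  kind=${1:-remote}
  awk -F', ' -v kind="$kind" '
    NF == 10 {
      # Keep package..ext ($1-$6) and output ($8); drop input ($7);
      # replace kind ($9) and provenance ($10).
      printf "%s, %s, %s, %s, %s, %s, %s, \"%s\", \"manual\"]\n",
             $1, $2, $3, $4, $5, $6, $8, kind
    }
  '
}

# Usage with the row from the generated model:
echo '- ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]' |
  summary_to_source
# prints: - ["java.io", "Console", False, "readLine", "()", "", "ReturnValue", "remote", "manual"]
```

Lines that do not have exactly ten fields are skipped, so YAML headers and comments pass through silently rather than being mangled.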

Starting from summaryModel

  # summaryModel
  # string package, string type, boolean subtypes, string name, string signature, string ext, string input,     string output, string kind,  string provenance, QlBuiltins::ExtensionId madId
  - ["java.io",     "Console",   False,            "readLine",  "()",             "",         "Argument[this]", "ReturnValue", "taint",      "df-generated"]

we can construct the sourceModel

  extensions:
    - addsTo:
        pack: codeql/java-all
        extensible: sourceModel
      data: 
        # sourceModel
        # string package, string type, boolean subtypes, string name, string signature, string ext,                   string output,    string kind,   string provenance, QlBuiltins::ExtensionId madId
        - ["java.io",     "Console",   False,            "readLine",  "()",             "",                           "ReturnValue",    "remote",      "manual"]

        # # from original
        # # summaryModel
        # # string package, string type, boolean subtypes, string name, string signature, string ext, string input,     string output, string kind,  string provenance, QlBuiltins::ExtensionId madId
        # - ["java.io",     "Console",   False,            "readLine",  "()",             "",         "Argument[this]", "ReturnValue", "taint",      "df-generated"]

and move this into ../.github/codeql/extensions/sqlite-db/models/sqlite.model.yml
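For the model file to be picked up as an extension pack, its directory also needs a pack manifest. The sketch below lays out such a pack in a scratch directory; the pack name example/sqlite-db is hypothetical, and whether this repository already contains a manifest is not stated here, so treat it as an illustration of the layout rather than the actual files.

```shell
# PACK_DIR defaults to a scratch directory so nothing in the repo is touched;
# set it to ../.github/codeql/extensions/sqlite-db to use the path from the text.
PACK_DIR=${PACK_DIR:-$(mktemp -d)/sqlite-db}
mkdir -p "$PACK_DIR/models"

# Pack manifest: targets codeql/java-all and loads every model file under models/.
# The pack name is a placeholder (assumption).
cat > "$PACK_DIR/codeql-pack.yml" <<'EOF'
name: example/sqlite-db
version: 0.0.1
library: true
extensionTargets:
  codeql/java-all: '*'
dataExtensions:
  - models/**/*.yml
EOF

# The sourceModel row constructed above.
cat > "$PACK_DIR/models/sqlite.model.yml" <<'EOF'
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      - ["java.io", "Console", False, "readLine", "()", "", "ReturnValue", "remote", "manual"]
EOF

# Show the resulting layout.
find "$PACK_DIR" -type f | sort
```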

To ensure that these model extensions are applied during query runs, include this setting

  {
      ...,
      "settings": {
          ...,
          "codeQL.runningQueries.useExtensionPacks": "all"
      }
  }

in the workspace configuration file ../qllab.code-workspace

In some environments (e.g., older VS Code versions), you may also need to replicate this setting in ../.vscode/settings.json; there it simplifies to

  "codeQL.runningQueries.useExtensionPacks": "all"

Now we can run ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql again.