Michael Hohn 07c9d15a76 minor
2025-07-30 21:56:54 -07:00

https://imgs.xkcd.com/comics/exploits_of_a_mom.png

(from https://xkcd.com/327/)

Using sqlite to illustrate models-as-data

Build codeql database

To get started, build the codeql database (adjust paths to your setup):

  # Build the db with source commit id.
  # export PATH=$HOME/local/vmsync/codeql250:"$PATH"
  SRCDIR=$(pwd)
  DB=$SRCDIR/cpp-sqli-$(cd "$SRCDIR" && git rev-parse --short HEAD)

  echo "$DB"
  # Remove any stale database left over from a previous run.
  test -d "$DB" && rm -fR "$DB"
  mkdir -p "$DB"

  cd "$SRCDIR" && codeql database create --language=cpp -s . -j 8 -v "$DB" --command='./build.sh'

Then add this database directory to your VS Code DATABASES tab.

Tests using a default query

TODO supplement sources via the model editor

TODO supplement codeql: Add to FlowSource or a subclass

Note: this is one area that just has to be known. Browsing the source will not help you.

CodeQL reading hint:

class ActiveThreatModelSource extends DataFlow::Node

uses

this.(SourceNode).getThreatModel()

So following the cast (SourceNode) may be useful:

  /**
   * A data flow source.
   */
  abstract class SourceNode extends DataFlow::Node

Following the abstract class is promising:

  abstract class RemoteFlowSource extends SourceNode

and others.

XX: no java, use C.

In ../ql/java/ql/lib/Customizations.qll, notice the comments mentioning RemoteFlowSource. Use the imports from ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql, but note that there are conflicts. You will use

private import semmle.code.java.dataflow.FlowSources

Follow this to FlowSources and find the RemoteFlowSource mentioned there:

abstract class RemoteFlowSource extends SourceNode

For C, FlowSources.qll has

  abstract class FlowSource extends DataFlow::Node

Add the custom source. The modified ../ql/java/ql/lib/Customizations.qll is

  import java
  private import semmle.code.java.dataflow.FlowSources

  class ReadLine extends RemoteFlowSource {
    ReadLine() {
      exists(Call read |
        read.getCallee().getName() = "readLine" and
        read = this.asExpr()
      )
    }

    override string getSourceType() { result = "Console readline" }
  }

Note that the predicate

  module QueryInjectionFlowConfig implements DataFlow::ConfigSig {
    predicate isSource(DataFlow::Node src) { src instanceof ActiveThreatModelSource }
        ...;
  }

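now also returns the readLine() result, although we extended RemoteFlowSource, not ActiveThreatModelSource. This works because ActiveThreatModelSource is defined through the cast to SourceNode, and RemoteFlowSource extends SourceNode.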
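
The inheritance chain behind this can be sketched outside QL. The following Python analogy is only an illustration, not how CodeQL is implemented. It assumes that RemoteFlowSource reports the threat model "remote" and that "remote" is among the active threat models; ACTIVE_THREAT_MODELS and is_active_threat_model_source are hypothetical names.

```python
from abc import ABC, abstractmethod

# Assumption for this sketch: "remote" is enabled in the active threat models.
ACTIVE_THREAT_MODELS = {"remote"}


class SourceNode(ABC):
    """Analogue of `abstract class SourceNode extends DataFlow::Node`."""

    @abstractmethod
    def get_threat_model(self) -> str: ...


class RemoteFlowSource(SourceNode):
    """Analogue of `abstract class RemoteFlowSource extends SourceNode`."""

    def get_threat_model(self) -> str:
        # Assumption: remote flow sources report the "remote" threat model.
        return "remote"


class ReadLine(RemoteFlowSource):
    """Analogue of the custom ReadLine class added in Customizations.qll."""

    def get_source_type(self) -> str:
        return "Console readline"


def is_active_threat_model_source(node: object) -> bool:
    # Mirrors `this.(SourceNode).getThreatModel()`: the check goes through
    # SourceNode, so anything that is a SourceNode is considered.
    return isinstance(node, SourceNode) and node.get_threat_model() in ACTIVE_THREAT_MODELS
```

ReadLine never mentions the active-threat-model check, yet it passes it, because the check only asks whether the node is a SourceNode with an enabled threat model.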

TODO supplement codeql: Add to models-as-data

  • schema in codeql: ../ql/cpp/ql/lib/semmle/code/cpp/dataflow/internal/ExternalFlowExtensions.qll

      extensible predicate sourceModel(
        string namespace, string type, boolean subtypes, string name, string signature, string ext,
        string output, string kind, string provenance, QlBuiltins::ExtensionId madId
      );
  • schema in json: ../tmp.bundle/codeql/qlpacks/codeql/cpp-queries/1.3.0/.codeql/libraries/codeql/cpp-all/3.0.0/.packinfo

      ../bin/hovjson < ../tmp.bundle/codeql/qlpacks/codeql/cpp-queries/1.3.0/.codeql/libraries/codeql/cpp-all/3.0.0/.packinfo
      {
        "extensible_predicate_metadata": {
          "extensible_predicates": [
            {
              "name": "sourceModel",
              "parameters": [
                {"name": "namespace","type": "string"},
                {"name": "type","type": "string"},
                {"name": "subtypes","type": "boolean"},
                {"name": "name","type": "string"},
                {"name": "signature","type": "string"},
                {"name": "ext","type": "string"},
                {"name": "output","type": "string"},
                {"name": "kind","type": "string"},
                {"name": "provenance","type": "string"}
              ],
              "has_origin": true,
              "path": "semmle/code/cpp/dataflow/internal/ExternalFlowExtensions.qll",
              "start_line": 8,
              "start_column": 1,
              "end_line": 11,
              "end_column": 3
            },
            ....
          ]
        }
      }
  • note: QlBuiltins::ExtensionId madId is only in ql, not json.
  • file format sample: ../ql/cpp/ql/lib/ext/empty.model.yml
  • data sample:

      # partial model of windows system calls
      extensions:
        - addsTo:
            pack: codeql/cpp-all
            extensible: sourceModel
          data: # namespace, type, subtypes, name, signature, ext, output, kind, provenance
            # processenv.h
            - ["", "", False, "GetCommandLineA", "", "", "ReturnValue[*]", "local", "manual"]
  • add a sourceModel

      extensions:
        - addsTo:
            pack: codeql/cpp-all
            extensible: sourceModel
          data:
            - [
                "",
                "",
                False,
                "get_user_info",
                "",
                "",
                "ReturnValue[*]",
                "remote",
                "manual",
              ]
        - addsTo:
            pack: codeql/cpp-all
            extensible: sinkModel
          data: []
        - addsTo:
            pack: codeql/cpp-all
            extensible: summaryModel
          data: []

      # Run from the repository root (here ~/work-gh/codeql-lab).
      $ ls .github/codeql/extensions/
      jedis-db-local-java/ sqlite-db/
      $ cp -r .github/codeql/extensions/sqlite-db .github/codeql/extensions/sqlite-db-c
      $ pushd .github/codeql/extensions/sqlite-db-c
      $ sed -i -e 's/java-all/cpp-all/g;' codeql-pack.yml
      # TODO also replace pack name
      $ cat > models/sqlite.model.yml
      extensions:
        - addsTo:
            pack: codeql/cpp-all
            extensible: sourceModel
          data:
            - [
                "",
                "",
                False,
                "get_user_info",
                "",
                "",
                "ReturnValue[*]",
                "remote",
                "manual",
              ]
        - addsTo:
            pack: codeql/cpp-all
            extensible: sinkModel
          data: []
        - addsTo:
            pack: codeql/cpp-all
            extensible: summaryModel
          data: []
  • back to SqlTainted.ql

In the model editor, we see an entry matching java.io.*Console.*readline (using the "show already modeled" option).

  $ rg -i 'java.io.*Console.*readline' ql/java
  ql/java/ql/lib/ext/generated/java.io.model.yml
  16:      - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
  17:      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
  18:      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
  19:      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]

note: this file is in the generated/ tree.

The current readLine modeling is in the summaryModel section, but we need it in a sourceModel. The existing entries are:

  extensions:
    - addsTo:
        pack: codeql/java-all
        extensible: summaryModel
      data:
        ...
        - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]

The model editor will not show this because it is already modeled. To illustrate text-based additions, we'll use plain text. Starting from

  extensions:
    - addsTo:
        pack: codeql/java-all
        extensible: summaryModel
      data:
        ...
        - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
        - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]

and the field information

  extensible predicate sourceModel(
    string package, string type, boolean subtypes, string name, string signature, string ext,
    string output, string kind, string provenance, QlBuiltins::ExtensionId madId
  );

Starting from summaryModel

  # summaryModel
  # string package, string type, boolean subtypes, string name, string signature, string ext, string input,     string output, string kind,  string provenance, QlBuiltins::ExtensionId madId
  - ["java.io",     "Console",   False,            "readLine",  "()",             "",         "Argument[this]", "ReturnValue", "taint",      "df-generated"]

we can construct the sourceModel

  extensions:
    - addsTo:
        pack: codeql/java-all
        extensible: sourceModel
      data: 
        # sourceModel
        # string package, string type, boolean subtypes, string name, string signature, string ext,                   string output,    string kind,   string provenance, QlBuiltins::ExtensionId madId
        - ["java.io",     "Console",   False,            "readLine",  "()",             "",                           "ReturnValue",    "remote",      "manual"]

        # # from original
        # # summaryModel
        # # string package, string type, boolean subtypes, string name, string signature, string ext, string input,     string output, string kind,  string provenance, QlBuiltins::ExtensionId madId
        # - ["java.io",     "Console",   False,            "readLine",  "()",             "",         "Argument[this]", "ReturnValue", "taint",      "df-generated"]

and move this into ../.github/codeql/extensions/sqlite-db/models/sqlite.model.yml
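
The manual rewrite above follows a mechanical rule: drop the input column, then replace kind and provenance. A minimal Python sketch of that rule is shown here, assuming the column order given by the two extensible predicates; summary_to_source is a hypothetical helper, not part of CodeQL.

```python
# Column names of the data rows, taken from the extensible predicates above.
# madId exists only on the QL side and is not part of a data row.
SUMMARY_COLUMNS = ["package", "type", "subtypes", "name", "signature", "ext",
                   "input", "output", "kind", "provenance"]
SOURCE_COLUMNS = ["package", "type", "subtypes", "name", "signature", "ext",
                  "output", "kind", "provenance"]


def summary_to_source(row, kind="remote", provenance="manual"):
    """Turn one summaryModel row into a sourceModel row (hypothetical helper)."""
    if len(row) != len(SUMMARY_COLUMNS):
        raise ValueError(f"expected {len(SUMMARY_COLUMNS)} columns, got {len(row)}")
    fields = dict(zip(SUMMARY_COLUMNS, row))
    # For a source, kind names a threat model rather than "taint".
    fields["kind"] = kind
    fields["provenance"] = provenance
    # Selecting only SOURCE_COLUMNS drops the "input" column.
    return [fields[column] for column in SOURCE_COLUMNS]


summary_row = ["java.io", "Console", False, "readLine", "()", "",
               "Argument[this]", "ReturnValue", "taint", "df-generated"]
source_row = summary_to_source(summary_row)
print(source_row)
```

The length check catches rows that were copied from the wrong extensible predicate, which is an easy mistake when editing these files by hand.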

To ensure that these model extensions are applied during query runs, include this setting

  {
      ...,
      "settings": {
          ...,
          "codeQL.runningQueries.useExtensionPacks": "all"
      }
  }

in the workspace configuration file ../qllab.code-workspace

In some environments (e.g., older VS Code versions), you may also need to replicate this setting in ../.vscode/settings.json; there it simplifies to

  "codeQL.runningQueries.useExtensionPacks": "all"

Now we can run ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql again.
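
If the results do not change, one thing to check is whether the setting is actually in place. The following is a minimal Python sketch, assuming the configuration file is plain JSON without comments (Python's json module rejects comments); uses_all_extension_packs is a hypothetical helper name, and the sample strings stand in for the real files.

```python
import json

SETTING = "codeQL.runningQueries.useExtensionPacks"


def uses_all_extension_packs(config_text: str) -> bool:
    """Return True if the JSON text sets useExtensionPacks to "all"."""
    config = json.loads(config_text)
    # A .code-workspace file nests the key under "settings";
    # .vscode/settings.json has it at the top level.
    settings = config.get("settings", config)
    return settings.get(SETTING) == "all"


# Stand-ins for ../qllab.code-workspace and ../.vscode/settings.json.
workspace_text = '{"folders": [], "settings": {"codeQL.runningQueries.useExtensionPacks": "all"}}'
settings_text = '{"codeQL.runningQueries.useExtensionPacks": "all"}'

print(uses_all_extension_packs(workspace_text))
print(uses_all_extension_packs(settings_text))
```

To check the real files, read them with open(...).read() and pass the text to the same function.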