codeql-cli-end-to-end/readme.org
2023-06-19 15:59:21 -07:00

End-to-end demo of CodeQL command line usage

Run analyses

Get collection of databases (already handy)

DONE Get https://github.com/rvermeulen/codeql-workshop-vulnerable-linux-driver
  cd ~/local/codeql-cli-end-to-end
  git clone git@github.com:rvermeulen/codeql-workshop-vulnerable-linux-driver.git
  cd codeql-workshop-vulnerable-linux-driver/
  unzip vulnerable-linux-driver.zip
  tree -L 2 vulnerable-linux-driver-db/
  vulnerable-linux-driver-db/
  ├── codeql-database.yml
  ├── db-cpp
  │   ├── default
  │   ├── semmlecode.cpp.dbscheme
  │   └── semmlecode.cpp.dbscheme.stats
  └── src.zip

  3 directories, 4 files
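
A quick sanity check of the layout shown above can save a confusing error later. The following is a minimal sketch: the function name `looks_like_codeql_db` is made up for this illustration, and it only tests for the `codeql-database.yml` file and a `db-*` directory seen in the listing, not for database validity.

```shell
# Hypothetical helper: succeed if DIR has the layout shown in the tree output.
looks_like_codeql_db() {
    local dir=$1
    # The marker file at the top of the database directory
    [ -f "$dir/codeql-database.yml" ] || return 1
    # At least one language directory such as db-cpp
    for d in "$dir"/db-*/; do
        [ -d "$d" ] && return 0
    done
    return 1
}
```

For example, `looks_like_codeql_db vulnerable-linux-driver-db && echo "layout ok"`.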
DONE Quick check using VS Code. Same steps will repeat:
select DB
select query
run query
view results
DONE Install codeql
In short:
  cd ~/local/codeql-cli-end-to-end
  # Decide on version / os via browser, then:
  wget https://github.com/github/codeql-action/releases/download/codeql-bundle-v2.13.4/codeql-bundle-osx64.tar.gz

  # Fix attributes on mac
  if [ `uname` = Darwin ] ; then
      xattr -c *.tar.gz
  fi

  # Extract
  tar zxf ./codeql-bundle-osx64.tar.gz

  # Check binary
  pwd
  # /Users/hohn/local/codeql-cli-end-to-end
  ./codeql/codeql --version
  # CodeQL command-line toolchain release 2.13.4.
  # Copyright (C) 2019-2023 GitHub, Inc.
  # Unpacked in: /Users/hohn/local/codeql-cli-end-to-end/codeql
  #    Analysis results depend critically on separately distributed query and
  #    extractor modules. To list modules that are visible to the toolchain,
  #    use 'codeql resolve qlpacks' and 'codeql resolve languages'.

  # Check packs
  ./codeql/codeql resolve qlpacks | head -5
  # codeql/cpp-all (/Users/hohn/local/codeql-cli-end-to-end/codeql/qlpacks/codeql/cpp-all/0.7.3)
  # codeql/cpp-examples (/Users/hohn/local/codeql-cli-end-to-end/codeql/qlpacks/codeql/cpp-examples/0.0.0)
  # codeql/cpp-queries (/Users/hohn/local/codeql-cli-end-to-end/codeql/qlpacks/codeql/cpp-queries/0.6.3)
  # codeql/csharp-all (/Users/hohn/local/codeql-cli-end-to-end/codeql/qlpacks/codeql/csharp-all/0.6.3)
  # codeql/csharp-examples (/Users/hohn/local/codeql-cli-end-to-end/codeql/qlpacks/codeql/csharp-examples/0.0.0) 

  # Fix the path
  export PATH=$(pwd -P)/codeql:"$PATH"

  # Check languages
  codeql resolve languages | head -5
  # go (/Users/hohn/local/codeql-cli-end-to-end/codeql/go)
  # python (/Users/hohn/local/codeql-cli-end-to-end/codeql/python)
  # java (/Users/hohn/local/codeql-cli-end-to-end/codeql/java)
  # html (/Users/hohn/local/codeql-cli-end-to-end/codeql/html)
  # xml (/Users/hohn/local/codeql-cli-end-to-end/codeql/xml)
A more fancy version
  # Reference urls:
  # https://github.com/github/codeql-cli-binaries/releases/download/v2.8.0/codeql-linux64.zip
  # https://github.com/github/codeql/archive/refs/tags/codeql-cli/v2.8.0.zip
  #
  # grab -- retrieve and extract codeql cli and library
  # Usage: grab version platform prefix
  grab() {
      version=$1; shift
      platform=$1; shift
      prefix=$1; shift
      # Work in a per-version directory; stop if it cannot be created
      mkdir -p "$prefix/codeql-$version" &&
          cd "$prefix/codeql-$version" || return

      # Get cli
      wget "https://github.com/github/codeql-cli-binaries/releases/download/$version/codeql-$platform.zip"
      # Get lib
      wget "https://github.com/github/codeql/archive/refs/tags/codeql-cli/$version.zip"
      # Fix attributes
      if [ "$(uname)" = Darwin ] ; then
          xattr -c ./*.zip
      fi
      # Extract
      unzip -q "codeql-$platform.zip"
      unzip -q "$version.zip"
      # Rename library directory for VS Code
      mv "codeql-codeql-cli-$version/" ql
      # remove archives?
      # rm codeql-$platform.zip
      # rm $version.zip
  }

  grab v2.7.6 osx64 $HOME/local
  grab v2.8.3 osx64 $HOME/local
  grab v2.8.4 osx64 $HOME/local

  grab v2.6.3 linux64 /opt

  grab v2.6.3 osx64 $HOME/local
  grab v2.4.6 osx64 $HOME/local
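
Since `grab` downloads immediately, it can help to print the two URLs first and check them in a browser. A small sketch, with `grab_urls` as a hypothetical companion function; it uses the same URL patterns as `grab` above and nothing else.

```shell
# Hypothetical companion to grab: print the URLs it would fetch.
# Usage: grab_urls version platform
grab_urls() {
    local version=$1 platform=$2
    # CLI binaries
    printf '%s\n' "https://github.com/github/codeql-cli-binaries/releases/download/$version/codeql-$platform.zip"
    # Matching library sources
    printf '%s\n' "https://github.com/github/codeql/archive/refs/tags/codeql-cli/$version.zip"
}

grab_urls v2.8.0 linux64
```

With these arguments the output is the pair of reference URLs listed at the top of the `grab` listing.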
Most flexible in use, but more initial setup

gh, the GitHub command-line tool from https://github.com/cli/cli

gh api repos/{owner}/{repo}/releases
gh gist list

https://cli.github.com/manual/gh_gist_list

  0:$ gh codeql
  GitHub command-line wrapper for the CodeQL CLI.
Install pack dependencies
View installed docs via -h flag, highly recommended
  # Overview
  codeql -h

  # Sub 1
  codeql pack -h

  # Sub 2
  codeql pack install -h
In short
Create the qlpack

Create the qlpack files if not there, one per directory. In this project, that's already done:

  0:$ find codeql-workshop-vulnerable-linux-driver  -name "qlpack.yml" 
  codeql-workshop-vulnerable-linux-driver/queries/qlpack.yml
  codeql-workshop-vulnerable-linux-driver/solutions/qlpack.yml
  codeql-workshop-vulnerable-linux-driver/common/qlpack.yml

For example:

cat codeql-workshop-vulnerable-linux-driver/queries/qlpack.yml

shows

  ---
  library: false
  name: queries
  version: 0.0.1
  dependencies:
    codeql/cpp-all: ^0.7.0
    common: "*"

So the queries directory does not contain a library, but it depends on one:

cat codeql-workshop-vulnerable-linux-driver/common/qlpack.yml
  ---
  library: true
  name: common
  version: 0.0.1
  dependencies:
    codeql/cpp-all: 0.7.0
Install each pack's dependencies

The first time you install dependencies, it's a good idea to do this manually, once per qlpack.yml file, and deal with any errors that occur.

  pushd ~/local/codeql-cli-end-to-end/codeql-workshop-vulnerable-linux-driver
  codeql pack install --no-strict-mode queries/

After the initial setup, and for automation, run codeql pack install for each pack in a loop:

  pushd ~/local/codeql-cli-end-to-end/codeql-workshop-vulnerable-linux-driver
  find . -name "qlpack.yml"
  # ./queries/qlpack.yml
  # ./solutions/qlpack.yml
  # ./common/qlpack.yml

  codeql pack install --no-strict-mode queries/
  # Dependencies resolved. Installing packages...
  # Install location: /Users/hohn/.codeql/packages
  # Nothing to install.
  # Package install location: /Users/hohn/.codeql/packages
  # Nothing downloaded.

  for sub in `find . -name "qlpack.yml" | sed s@qlpack.yml@@g;`
  do
      codeql pack install --no-strict-mode $sub
  done
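
The loop above splits on whitespace, so it would break on paths containing spaces. A sketch of a variant follows; `install_all_packs` and the `DRY_RUN` variable are names invented here, and the `codeql pack install --no-strict-mode` call is the same one used above.

```shell
# Hypothetical variant of the loop: handles spaces in paths (not newlines)
# and prints the commands instead of running them when DRY_RUN=1.
install_all_packs() {
    local root=${1:-.}
    find "$root" -name qlpack.yml -print | sort | while IFS= read -r yml; do
        pack_dir=$(dirname "$yml")
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'codeql pack install --no-strict-mode %s\n' "$pack_dir"
        else
            codeql pack install --no-strict-mode "$pack_dir"
        fi
    done
}
```

Running `DRY_RUN=1 install_all_packs .` from the project directory lists one command per qlpack.yml found, which is a cheap way to review the loop before letting it run.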

Run queries

Individual: 1 database -> N sarif files
  #* Set environment
  PROJ=$HOME/local/codeql-cli-end-to-end/codeql-workshop-vulnerable-linux-driver
  DB=$PROJ/vulnerable-linux-driver-db
  QLQUERY=$PROJ/solutions/BufferOverflow.ql
  QUERY_RES_SARIF=$PROJ/$(cd $PROJ && git rev-parse --short HEAD).sarif

  #* Run query
  pushd $PROJ
  codeql database analyze --format=sarif-latest --rerun   \
         --output $QUERY_RES_SARIF                        \
         -j6                                              \
         --ram=24000                                      \
         --                                               \
         $DB                                              \
         $QLQUERY

  # If you get
  #     fatal error occurred: Error initializing the IMB disk cache: the cache
  #     directory is already locked by another running process. Only one instance of
  #     the IMB can access a cache directory at a time. The lock file is located at
  #     /Users/hohn/local/codeql-cli-end-to-end/codeql-workshop-vulnerable-linux-driver/vulnerable-linux-driver-db/db-cpp/default/cache/.lock
  # then exit VS Code (which holds the lock on the database) and try again.

And after some time:

  BufferOverflow.ql: [1/1 eval 1.8s] Results written to solutions/BufferOverfl
  Shutting down query evaluator.
  Interpreting results.
  echo The query $QLQUERY
  echo run on $DB
  echo produced output in $QUERY_RES_SARIF:
  head -5 $QUERY_RES_SARIF
  # {
  #   "$schema" : "https://json.schemastore.org/sarif-2.1.0.json",
  #   "version" : "2.1.0",
  #   "runs" : [ {
  #     "tool" : {
  # ...
Use directory of queries: 1 database -> 1 sarif file (least effort)
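
A minimal sketch of this case: `codeql database analyze` accepts a directory in place of a single .ql file and runs the queries it finds there. The wrapper name `analyze_dir` is hypothetical, the flags are the ones from the single-query run, and it is an assumption here that every query in the directory carries the metadata needed to interpret its results as SARIF.

```shell
# Hypothetical wrapper: analyze DB with every query under QUERY_DIR,
# writing a single SARIF file. Flags match the single-query run.
analyze_dir() {
    local db=$1 query_dir=$2 out=$3
    codeql database analyze --format=sarif-latest --rerun \
           --output "$out" \
           -- "$db" "$query_dir"
}
```

With the environment from the single-query run, this would be called as `analyze_dir "$DB" "$PROJ/solutions" "$QUERY_RES_SARIF"`.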
Use suite: 1 database -> 1 sarif file (more flexible, more effort)
Include versioning:
codeql cli
query set version
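
One way to record both versions is to put them in the SARIF file name. The sketch below is an assumption about how that could look, not something this project does yet. `sarif_name` is a hypothetical helper, and the caller supplies the version strings, for example from `codeql --version` and `git rev-parse --short HEAD` as used earlier. The revision `abc1234` is a placeholder.

```shell
# Hypothetical helper: build an output name that records
# the CLI version and the query set revision.
# Usage: sarif_name db-name cli-version query-rev
sarif_name() {
    local db=$1 cli=$2 qrev=$3
    printf '%s_cli-%s_q-%s.sarif\n' "$db" "$cli" "$qrev"
}

sarif_name vulnerable-linux-driver-db 2.13.4 abc1234
# vulnerable-linux-driver-db_cli-2.13.4_q-abc1234.sarif
```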

Checks:

Will include e.g.,
  codeql database analyze --format=sarif-latest --rerun   \
         --output $QUERY_RES_SARIF                        \
         --search-path $QLGIT                             \
         -j6                                              \
         --ram=24000                                      \
         --                                               \
         $DB                                              \
         $QLQUERY
Will include recommendations, e.g., 32 GB RAM, 4-6 cores.
For building DBs: in a common case, a parallel C++ compilation that takes 15 minutes can take 2 h with CodeQL.

Review results

sarif viewer plugin

raw sarif with jq
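
For raw inspection, `jq` can flatten a SARIF file into compiler-style lines. This is a sketch that assumes `jq` is installed and that every result carries at least one location with a `region`, which SARIF 2.1.0 allows but does not require; missing fields print as `null`. `sarif_dump` is a name chosen for this example.

```shell
# Hypothetical helper: print "file:line:col: message" for each result.
sarif_dump() {
    jq -r '
      .runs[].results[]
      | .locations[0].physicalLocation as $p
      | "\($p.artifactLocation.uri):\($p.region.startLine):\($p.region.startColumn): \(.message.text)"
    ' "$1"
}
```

For the run above, `sarif_dump "$QUERY_RES_SARIF" | sort` gives a listing that can be read in a terminal or diffed between runs.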

sarif-cli

dump
sql conversion

Running sequence

Smallest query suite (security suite).

Check results.

Lots of results (> 5000) -> cli review via compiler-style dump.

Medium result sets (~ 2000) (sarif review plugin, can only load 5000 results)

Few results (sarif review plugin, can only load 5000 results)
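
The choice between the two review routes can be scripted. A minimal sketch, assuming the result count is already known (for instance from `jq '[.runs[].results[]] | length'` on the SARIF file); `review_route` and its two output labels are hypothetical names, and the 5000 threshold is the plugin limit stated above.

```shell
# Hypothetical helper: choose a review route from the result count.
# More than 5000 results cannot be loaded by the sarif review plugin.
review_route() {
    local count=$1
    if [ "$count" -gt 5000 ]; then
        echo "cli-dump"
    else
        echo "sarif-viewer"
    fi
}
```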

Expand query

Compare results.

sarif-cli using compiler-style dump.

Short end-to-end illustration

  1. Overall procedure
  2. Command-line use

    1. For 3.2 also using sarif-cli
  3. sarif viewer plugin https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer (Sarif Viewer v3.3.7, Microsoft DevLabs)
  4. Details on query suite use (3. Use suite: 1 database -> 1 sarif file (more flexible, more effort))