Introduction to CodeQL

The document ./CodeQL-workshop-overview-only.pdf gives a very short overview just to highlight the language capabilities.

This document is intended to support CodeQL workshops and presentations; it focuses on the section labeled 'CodeQL Running Sequence', in grids C2 through E5 of the full CodeQL and GHAS integration diagram shown here. The section 'CodeQL query development sequence, using CI artifacts', in grids H0 through J4, is a subset without database building.

There are two identifiable tracks for CodeQL users: devops and query writers. The first focuses on setup, deployment, and query selection; the second on query writing. There is significant overlap; the CodeQL CLI Setup is needed by both.

CodeQL CLI Setup

After you have installed the CodeQL CLI, proceed with setting up this repository:

  # Clone repository
  cd && mkdir -p work-gh && cd work-gh
  git clone https://github.com/hohn/codeql-intro-csharp.git

  # Initialize CodeQL
  cd ~/work-gh/codeql-intro-csharp
  codeql resolve packs
  codeql pack install

Using the file qlpack.yml, codeql pack install installs the packs matching this CodeQL version, then creates codeql-pack.lock.yml, which pins the pack versions.

Setup Test Problems

Hello World Sample

  # Install the .NET SDK (macOS, via Homebrew)
  brew install --cask dotnet-sdk
  dotnet --version

  # Create template project
  cd ~/work-gh/codeql-intro-csharp
  mkdir HelloWorld
  cd HelloWorld
  dotnet new console

  # Compile template project
  cd ~/work-gh/codeql-intro-csharp/HelloWorld/
  dotnet build

  # Run template project
  dotnet run
  # or
  ./bin/Debug/net9.0/HelloWorld

SQL Injection Sample

  # Project Setup
  cd ~/work-gh/codeql-intro-csharp/
  dotnet new console -n SqliDemo
  cd SqliDemo

  dotnet add package Microsoft.Data.Sqlite

  # Database Init
  cd ~/work-gh/codeql-intro-csharp/SqliDemo
  sqlite3 users.sqlite
  CREATE TABLE users (id INTEGER, info TEXT);
  .exit

  # Build
  cd ~/work-gh/codeql-intro-csharp/SqliDemo
  dotnet build

  # Run; the line after the command is typed as input
  dotnet run
  First User

  # Check db
  echo '
      SELECT * FROM users;
  ' | sqlite3 users.sqlite 

  # Add Johnny Droptable 
  dotnet run
  Johnny'); DROP TABLE users; --

  # Check db
  echo '
      SELECT * FROM users;
  ' | sqlite3 users.sqlite 
  # Parse error near line 2: no such table: users
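
The second run destroys the table because the input text is spliced directly into the SQL statement. A minimal shell sketch of that kind of string construction follows. The function name build_query and the INSERT template are assumptions made for illustration; they are not taken from the SqliDemo source.

```shell
#!/usr/bin/env bash
# Illustrative only: build_query and the INSERT template are assumptions,
# standing in for a program that builds SQL by string concatenation.
build_query() {
    local id="$1" info="$2"
    # The user-controlled text is placed between quotes with no escaping.
    printf "INSERT INTO users VALUES (%s, '%s');\n" "$id" "$info"
}

build_query 1 "First User"
build_query 2 "Johnny'); DROP TABLE users; --"
```

For the second input, the quote and parenthesis inside the name close the INSERT early, so DROP TABLE users becomes a separate statement and the trailing -- comments out the leftover characters.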

SQL Injection Code Sample Run

  # All run in pwsh, typical prompt is
  # PS /Users/hohn/work-gh/codeql-intro-csharp> 

  # Build
  cd $HOME/work-gh/codeql-intro-csharp
  ./build.ps1

  # Prepare db
  ./admin.ps1 -r
  ./admin.ps1 -c
  ./admin.ps1 -s

  # Add regular user interactively
  ./build.ps1
  ./SqliDemo/bin/Debug/net9.0/SqliDemo
  hello user

  # Check
  ./admin.ps1 -s

  # Add Johnny Droptable 
  ./SqliDemo/bin/Debug/net9.0/SqliDemo
  Johnny'); DROP TABLE users; --

  # And the problem:
  ./admin.ps1 -s
  # Parse error near line 1: no such table: users

Build CodeQL Database

To get started, build the CodeQL database (adjust paths to your setup).

The bash version

  # Build the db with source commit id.
  cd "$HOME/work-gh/codeql-intro-csharp"
  SRCDIR=$(pwd)
  DB="$SRCDIR/csharp-sqli-$(cd "$SRCDIR" && git rev-parse --short HEAD)"

  echo "preparing database directory $DB"
  test -d "$DB" && rm -fR "$DB"
  mkdir -p "$DB"

  # Run the build under codeql
  cd "$SRCDIR" && codeql database create --language=csharp -s . -j 8 -v "$DB" --command='./build.sh'
  # ...
  # Successfully created database at /Users/hohn/work-gh/codeql-intro-csharp/csharp-sqli-c89fbf8.

NEXT Run analysis using given script and database

The bash version

  # The setup information from before
  echo $DB
  echo $SRCDIR

  # To see the help
  codeql database analyze -h

  # Run a query
  codeql database analyze                                 \
         -v                                               \
         --ram=14000                                      \
         -j12                                             \
         --rerun                                          \
         --format=sarif-latest                            \
         --output csharp-sqli.sarif                       \
         --                                               \
         "$DB"                                            \
         "$SRCDIR/FindFunction.ql"

  # optional: pretty-print (sponge is part of the moreutils package)
  jq . < csharp-sqli.sarif | sponge csharp-sqli.sarif

  # Examine the file in an editor
  edit csharp-sqli.sarif
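
The pretty-print step relies on sponge, which may not be installed everywhere. As a hedged alternative, the sketch below rewrites a file in place through a temporary file. The helper name filter_in_place is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a filter command over FILE and replace FILE
# with the filter's output, but only if the filter succeeds.
filter_in_place() {
    local file="$1"; shift
    local tmp
    tmp=$(mktemp) || return 1
    if "$@" < "$file" > "$tmp"; then
        mv -- "$tmp" "$file"
    else
        # Leave the original file untouched when the filter fails.
        rm -f -- "$tmp"
        return 1
    fi
}

# Usage, assuming jq is installed:
#   filter_in_place csharp-sqli.sarif jq .
```

Note that the rewritten file takes on the permissions of the temporary file created by mktemp.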

An example of using the SARIF data is in the jq script ./sarif-summary.jq. When run against the SARIF input via

  jq --raw-output --join-output  -f sarif-summary.jq < csharp-sqli.sarif > csharp-sqli.txt

it produces output in a form close to that of compiler error messages:

  query-id: message line 
      Path
         ...

Here, that is

  csharp/intro/FindFunction: Method found [0 more]
          SqliDemo/Injectable.cs:8:
  csharp/intro/FindFunction: Method found [0 more]
          SqliDemo/Injectable.cs:17:
  csharp/intro/FindFunction: Method found [0 more]
          SqliDemo/Injectable.cs:22:
  csharp/intro/FindFunction: Method found [0 more]
          SqliDemo/Injectable.cs:47:
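
Because the summary is plain text, it can be post-processed with standard tools. The following sketch counts results per query id. The helper name count_findings is hypothetical, and the sketch assumes that header lines have the form "query-id: message" while location lines consist of a single "path:line:" field.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count results per query id in the text summary.
# Assumes header lines look like "query-id: message ..." and that
# location lines are a single "path:line:" field.
count_findings() {
    awk '
        NF > 1 && $1 ~ /:$/ {
            id = $1
            sub(/:$/, "", id)   # drop the trailing colon from the query id
            count[id]++
        }
        END { for (id in count) printf "%s %d\n", id, count[id] }
    ' "$@" | sort
}

# Usage:
#   count_findings csharp-sqli.txt
```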

TODO Optional: Multiple Builds

  dotnet sln codeql-intro-csharp.sln list
  dotnet build codeql-intro-csharp.sln

TODO CodeQL for Devops and Administrators

TODO CodeQL for Query Writers

Identify the problem

The SqliDemo program reads from STDIN and writes to a database; looking at the code in ./SqliDemo leads to

Console.ReadLine()?.Trim() ?? string.Empty;

for the read and

new SqliteCommand(query, connection)

for the write.

This problem is thus a dataflow problem; in CodeQL terminology we have

  • a source at the Console.ReadLine() call;
  • a sink at the query argument of new SqliteCommand(query, connection).

We write CodeQL to identify these two, and then connect them via

  • a dataflow configuration; for this problem, the more general taintflow configuration is needed.

Develop the query bottom-up

  1. Identify the source part of the

    Console.ReadLine()?.Trim() ?? string.Empty;
    

    expression, the Console.ReadLine() call. Start from a from..where..select then convert to a predicate or class. The from..where..select is found in ./SqlInjection-source.ql

  2. Identify the sink part of the

    var command = new SqliteCommand(query, connection)
    

    expression, the query argument. Again start from from..where..select, then convert to a predicate or class. There is a subtlety here; the docs mention 'The Expr class represents all C# expressions in the program. An expression is something producing a value such as a+b or new List<int>().' Use the 'view AST' option from the results of step 1 to see what is needed here. It's not obvious. The from..where..select is found in ./SqlInjection-sink.ql

  3. Fill in the taintflow configuration boilerplate. The documentation explains in detail. For this example, use

      import csharp

      module MyFlowConfiguration implements DataFlow::ConfigSig {
        predicate isSource(DataFlow::Node source) {
          ...
        }
    
        predicate isSink(DataFlow::Node sink) {
          ...
        }
      }
    
      module MyFlow = TaintTracking::Global<MyFlowConfiguration>;
    
      from DataFlow::Node source, DataFlow::Node sink
      where MyFlow::flow(source, sink)
      select source, "Dataflow to $@.", sink, sink.toString()

    Note the different CodeQL classes used here and their connections: Node, ExprNode, ParameterNode are part of the DFG (data flow graph), Expr and Parameter are part of the AST (abstract syntax tree). Here, this means using

    source.asExpr() = call
    

    for the source and

    sink.asExpr() = queryArg
    

    for the sink.

  4. Also, note that we want the flow path. So the query changes from

    * @kind problem
    

    to

    * @kind path-problem
    

    There are other changes, such as importing the flow module's PathGraph and selecting its PathNode values; see ./SqlInjection-flow-with-path.ql

  5. Try this with dataflow instead of taintflow (DataFlow::Global in place of TaintTracking::Global), and notice that there are no results.
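
The steps above can be sketched as one path query. The shell helper below writes such a sketch to a file of your choice. The helper name write_sketch_query, the module names SketchConfig and SketchFlow, the query id, and the name-based matching of Console.ReadLine and SqliteCommand are all assumptions made for illustration; this is not the content of ./SqlInjection-flow-with-path.ql.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a sketch of the taintflow path query to FILE.
# All names inside the query are illustrative, not taken from the repository.
write_sketch_query() {
    cat > "$1" <<'EOF'
/**
 * @name SQL injection sketch
 * @description Sketch: user input from Console.ReadLine reaches a SqliteCommand.
 * @kind path-problem
 * @id csharp/intro/sqli-sketch
 * @problem.severity warning
 */

import csharp

module SketchConfig implements DataFlow::ConfigSig {
  // Step 1: the Console.ReadLine() call is the source.
  predicate isSource(DataFlow::Node source) {
    exists(MethodCall call |
      call.getTarget().hasName("ReadLine") and
      call.getTarget().getDeclaringType().hasName("Console") and
      source.asExpr() = call
    )
  }

  // Step 2: the first argument of new SqliteCommand(...) is the sink.
  predicate isSink(DataFlow::Node sink) {
    exists(ObjectCreation oc |
      oc.getObjectType().hasName("SqliteCommand") and
      sink.asExpr() = oc.getArgument(0)
    )
  }
}

// Step 3: taintflow rather than plain dataflow.
module SketchFlow = TaintTracking::Global<SketchConfig>;

// Step 4: path-problem queries need the path graph and path nodes.
import SketchFlow::PathGraph

from SketchFlow::PathNode source, SketchFlow::PathNode sink
where SketchFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "SQL query depends on $@.", source.getNode(), "user input"
EOF
}

# Usage: write the sketch next to qlpack.yml, then pass it to
# codeql database analyze in place of FindFunction.ql.
#   write_sketch_query "$SRCDIR/sqli-sketch.ql"
```

Replacing TaintTracking::Global with DataFlow::Global in the written file gives the experiment described in step 5.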