mirror of
https://github.com/hohn/codeql-operational-view.git
synced 2025-12-16 12:43:04 +01:00
252 lines
10 KiB
Org Mode
252 lines
10 KiB
Org Mode
* An Operational View of CodeQL
|
|
These are notes used to develop the slides in [[./operational-view.key]]
|
|
and its [[./operational-view.pdf][pdf version]]. They may be handy for running the examples, and are a
|
|
simpler read than the slides.
|
|
|
|
A diagram view of the use of codeql is in [[./notes/codeql-build.drawio]]. To edit
|
|
/ view / print these, use the open-source version of [[https://www.drawio.com][drawio]]. It can be used in
|
|
the browser or downloaded. For simpler viewing, a [[./notes/codeql-build.drawio.pdf][pdf]] is provided.
|
|
|
|
* The codeql / C compiler comparison
|
|
*** sql injection problem, compiler view
|
|
Think Compiler (C) with library:
|
|
#+BEGIN_SRC sh
|
|
# Prepare System
|
|
./admin -c
|
|
|
|
# Convert data if needed
|
|
cat users.txt
|
|
|
|
# Edit your code
|
|
edit add-user.c
|
|
|
|
# Compile & run your code
|
|
clang -Wall add-user.c -lsqlite3 -o add-user
|
|
for user in `cat input.txt` ; do echo "$user" | ./add-user 2>> users.log ; done
|
|
|
|
# Examine results
|
|
./admin -s
|
|
|
|
#+END_SRC
|
|
|
|
*** sql injection problem, codeql view
|
|
Think Compiler (CodeQL) with library:
|
|
#+BEGIN_SRC sh
|
|
# Prepare System
|
|
export PATH=$HOME/local/vmsync/codeql250:"$PATH"
|
|
|
|
# Convert data if needed
|
|
SRCDIR=.
|
|
DB=add-user.db
|
|
cd $SRCDIR && \
|
|
codeql database create --language=cpp \
|
|
-s . -j 8 -v \
|
|
$DB \
|
|
--command='clang -Wall add-user.c -lsqlite3 -o add-user'
|
|
|
|
# Edit your code
|
|
edit SqlInjection.ql
|
|
|
|
# Compile & run your code
|
|
RESULTS=cpp-sqli.sarif
|
|
codeql database analyze \
|
|
-v --ram=14000 -j12 --rerun \
|
|
--search-path ~/local/vmsync/ql \
|
|
--format=sarif-latest \
|
|
--output=$RESULTS \
|
|
-- \
|
|
$DB \
|
|
$SRCDIR/SqlInjection.ql
|
|
|
|
# Examine results
|
|
# Plain text, look for
|
|
# "results" : [ {
|
|
# and
|
|
# "codeFlows" : [ {
|
|
edit $RESULTS
|
|
# Or
|
|
jq --raw-output --join-output -f sarif-summary.jq < cpp-sqli.sarif | less
|
|
# Or use vs code's sarif viewer
|
|
# Or use the GHAS integration via actions
|
|
|
|
#+END_SRC
|
|
|
|
** Connecting to the compiler core
|
|
- IDEs: use vs code for full functionality, any lsp-using editor for
|
|
completion/jump to source
|
|
- choose a repository layout that best fits your custom queries' development
|
|
model
|
|
- check for library X support in the ql/ library. Better yet, check for
|
|
particular function names.
|
|
** Best practice CodeQL
|
|
|
|
** Key Ideas
|
|
All of following ideas follow from one simple observation: *The CodeQL CLI
|
|
is a compiler and you should treat it as such*
|
|
|
|
- ghas setup and integration are almost completely independent of query
|
|
customization; take advantage of this.
|
|
|
|
If you can build your code on your desktop/laptop/own server, you don't have
|
|
to wait for GHAS integration to produce codeql databases.
|
|
|
|
In fact, you should *start on your desktop/laptop/server* to find issues
|
|
around the build: memory / thread requirements, ensuring the build system
|
|
runs correctly when invoked from codeql, etc.
|
|
|
|
- use desktop-based code scanning earlier in workflow
|
|
|
|
- *cli setup / analysis should be done as prototype* for your github admins to
|
|
work off
|
|
|
|
- *customize scanning tools to actually get results:*
|
|
- bug bounty programs
|
|
- known entry / exit points for services
|
|
|
|
-
|
|
Just like your CI/CD pipeline encapsulates your compiler cli tools,
|
|
|
|
github and GHAS encapsulate the codeql cli tools.
|
|
|
|
So you can always think about what makes sense for the cli, try it there, and
|
|
then update your GHAS workflow.
|
|
|
|
|
|
** Some Q&A via compiler analogy
|
|
***
|
|
Q: Is the C standard library supported?
|
|
|
|
A: Much of it, typically from a conceptual level.
|
|
|
|
To find the supported APIs, search the [[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/][=ql/=]] library source tree.
|
|
|
|
For example, for a top-down search start with =cpp.qll= and notice the import
|
|
=import semmle.code.cpp.commons.Printf=. Follow this to find the
|
|
[[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/semmle/code/cpp/commons/][=cpp.commons=]] module and see what it models:
|
|
# /Users/hohn/local/vmsync/ql/cpp/ql/src/semmle/code/cpp/commons:
|
|
#+BEGIN_SRC text
|
|
Alloc.qll Dependency.qll NullTermination.qll StringAnalysis.qll
|
|
Assertions.qll Environment.qll PolymorphicClass.qll StructLikeClass.qll
|
|
Buffer.qll Exclusions.qll Printf.qll Synchronization.qll
|
|
CommonType.qll File.qll Scanf.qll VoidContext.qll
|
|
DateTime.qll NULL.qll Strcat.qll unix/
|
|
#+END_SRC
|
|
|
|
***
|
|
Q: Is library X supported?
|
|
|
|
A: If it is, you'll find it in the [[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/][=ql/=]] library source tree. A whole-tree
|
|
search, =grep=-style, is easiest.
|
|
# /Users/hohn/local/vmsync/ql/cpp/ql/src:
|
|
|
|
For example, to check support for sqlite:
|
|
#+BEGIN_SRC text
|
|
hohn@gh-hohn ~/local/vmsync/ql/cpp/ql/src
|
|
0:$ grep -l -R sqlite *
|
|
Security/CWE/CWE-313/CleartextSqliteDatabase.ql
|
|
Security/CWE/CWE-313/CleartextSqliteDatabase.c
|
|
semmle/code/cpp/security/Security.qll
|
|
#+END_SRC
|
|
So we have a query (=.ql=) and a library (=.qll=); look at both to get
|
|
some ideas:
|
|
|
|
**** =Security/CWE/CWE-313/CleartextSqliteDatabase.ql= has some info [[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/Security/CWE/CWE-313/CleartextSqliteDatabase.ql#L2][in the header]]
|
|
#+begin_src javascript
|
|
/**
|
|
,* @name Cleartext storage of sensitive information in an SQLite database
|
|
,* @description Storing sensitive information in a non-encrypted
|
|
,* database can expose it to an attacker.
|
|
,*/
|
|
#+end_src
|
|
and [[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/Security/CWE/CWE-313/CleartextSqliteDatabase.ql#L25][a promising class]]:
|
|
#+begin_src javascript
|
|
class SqliteFunctionCall extends FunctionCall {
|
|
SqliteFunctionCall() { this.getTarget().getName().matches("sqlite%") }
|
|
|
|
Expr getASource() { result = this.getAnArgument() }
|
|
}
|
|
#+end_src
|
|
**** =semmle/code/cpp/security/Security.qll= has [[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/semmle/code/cpp/security/Security.qll#L12][some very promising entries]]
|
|
#+begin_src javascript
|
|
/**
|
|
,* Extend this class to customize the security queries for
|
|
,* a particular code base. Provide no constructor in the
|
|
,* subclass, and override any methods that need customizing.
|
|
,*/
|
|
class SecurityOptions extends string {
|
|
;;
|
|
predicate sqlArgument(string function, int arg) {
|
|
;;
|
|
// SQLite3 C API
|
|
function = "sqlite3_exec" and arg = 1
|
|
}
|
|
;;
|
|
/**
|
|
,* The argument of the given function is filled in from user input.
|
|
,*/
|
|
predicate userInputArgument(FunctionCall functionCall, int arg) {
|
|
;;
|
|
fname = "scanf" and arg >= 1
|
|
;;
|
|
}
|
|
;;
|
|
}
|
|
#+end_src
|
|
|
|
This is a library, so some sample uses would be nice. Another search via
|
|
: grep -nH -R SecurityOptions *
|
|
|
|
[[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/docs/codeql/ql-training/cpp/global-data-flow-cpp.rst#L59][finds documentation]]:
|
|
#+begin_src text
|
|
docs/codeql/ql-training/cpp/global-data-flow-cpp.rst:59:The library class ``SecurityOptions`` provides a (configurable) model of what counts as user-controlled data:
|
|
#+end_src
|
|
and an [[https://github.com/github/codeql/blob/87ee7849a929fff00343071315fa8108976d5c70/cpp/ql/src/semmle/code/cpp/security/SecurityOptions.qll#L16][extension point]]:
|
|
#+begin_src text
|
|
cpp/ql/src/semmle/code/cpp/security/SecurityOptions.qll:16:class CustomSecurityOptions extends SecurityOptions
|
|
#+end_src
|
|
#+begin_src javascript
|
|
/**
|
|
,* This class overrides `SecurityOptions` and can be used to add project
|
|
,* specific customization.
|
|
,*/
|
|
class CustomSecurityOptions extends SecurityOptions {...}
|
|
#+end_src
|
|
|
|
***
|
|
Q: How should we go about modeling our libraries with CodeQL?
|
|
|
|
A: Follow the way you use a C library, say =sqlite3=. Your code includes only
|
|
=sqlite3.h=; you use, but don't care about, =libsqlite3.a=.
|
|
|
|
Thus for CodeQL: don't try to model the library internals, only model the
|
|
parts of the API you actually use.
|
|
|
|
For other languages, you need also only model the exposed API.
|
|
|
|
***
|
|
Q: Should we use the most recent version of codeql at all times?
|
|
|
|
A: Follow the way you use your compiler. Do you use the most recent version
|
|
of compiler at all times, or do you use a rolling release cycle?
|
|
|
|
To get your current version's info:
|
|
#+BEGIN_SRC sh
|
|
hohn@gh-hohn ~/local/vmsync/ql/cpp/ql/src
|
|
0:$ codeql --version
|
|
CodeQL command-line toolchain release 2.5.0.
|
|
Copyright (C) 2019-2021 GitHub, Inc.
|
|
Unpacked in: /Users/hohn/local/vmsync/codeql250
|
|
Analysis results depend critically on separately distributed query and
|
|
extractor modules. To list modules that are visible to the toolchain,
|
|
use 'codeql resolve qlpacks' and 'codeql resolve languages'.
|
|
#+END_SRC
|
|
|
|
You should match the CodeQL cli version to the CodeQL library version;
|
|
the [[https://github.com/github/codeql/releases][library releases]] have =codeql-cli/<VERSION>= tags to allow matching with
|
|
the [[https://github.com/github/codeql-cli-binaries/releases/tag/v2.6.2][binaries]].
|
|
|
|
When using git for the library, you should check out the appropriate version
|
|
via, e.g.,
|
|
: cd $HOME/local/vmsync/ql && git checkout codeql-cli/v2.5.9
|
|
|