Collection of CLI tools for SARIF processing
THIS IS A WORK IN PROGRESS
Each of these tools presents a high-level command-line interface to extract a specific subset of information from a SARIF file. The format of each tool's output is versioned and, as much as possible, independent of the input.
It is the intent of these tools to
- hide the internals of SARIF from the user
- provide examples of extracting information from SARIF files, for use when writing your own tools or extending these
Setup for development
This repository uses git lfs for some larger files; installation steps are at
git-lfs; on a Mac with Homebrew, install it via
brew install git-lfs
git lfs install
Set up the virtual environment and install the packages:
python3 -m venv .venv
. .venv/bin/activate
python3 -m pip install -r requirements.txt
# Or separately:
pip install --upgrade pip
pip install ipython pyyaml
"Install" for local development:
pip install -e .
Examples
In Git parlance, the porcelain tool is sarif-results-summary, while the
plumbing tools are sarif-digest, sarif-labeled, and sarif-list-files.
Short summaries of each follow.
sarif-results-summary
Display the SARIF results in human-readable plain text form. Taking the warning around
src/stc/scintilla/lexers/LexMySQL.cxx:153:24:153:30:
as an example, there are two options using only the SARIF file, and one more when source code is available.
- Display only main result. Using
  sarif-results-summary -s data/wxWidgets-small -r data/wxWidgets_wxWidgets__2021-11-21_16_06_30__export.sarif 2>&1 |less -p LexMySQL.cxx
  only displays
  RESULT: src/stc/scintilla/lexers/LexMySQL.cxx:153:24:153:30: Local variable 'length' hides a [parameter of the same name](1).
- Display the related information. Using
  sarif-results-summary -r data/wxWidgets_wxWidgets__2021-11-21_16_06_30__export.sarif 2>&1 | less -p LexMySQL.cxx
  displays
  RESULT: src/stc/scintilla/lexers/LexMySQL.cxx:153:24:153:30: Local variable 'length' hides a [parameter of the same name](1).
  REFERENCE: src/stc/scintilla/lexers/LexMySQL.cxx:108:68:108:74: parameter of the same name
- Either display can be supplemented by source code snippets if the source is available. Using
  sarif-results-summary -s data/wxWidgets-small -r data/wxWidgets_wxWidgets__2021-11-21_16_06_30__export.sarif 2>&1 |less
  displays the source code with underlines
  RESULT: src/stc/scintilla/lexers/LexMySQL.cxx:153:24:153:30: Local variable 'length' hides a [parameter of the same name](1).
  Sci_Position length = sc.LengthCurrent() + 1;
               ^^^^^^
  REFERENCE: src/stc/scintilla/lexers/LexMySQL.cxx:108:68:108:74: parameter of the same name
  static void ColouriseMySQLDoc(Sci_PositionU startPos, Sci_Position length, int initStyle, WordList *keywordlists[],
                                                                     ^^^^^^
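The RESULT and REFERENCE lines above follow the pattern uri:startLine:startColumn:endLine:endColumn: message. The following is a minimal sketch of how such lines could be derived from the standard SARIF fields (results, locations, relatedLocations, physicalLocation, region). It is not the tool's actual implementation; the function names format_location and summarize are hypothetical, and the sample dictionary is hand-built from the LexMySQL.cxx example rather than read from the data/ files.

```python
def format_location(physical_location):
    """Render 'uri:startLine:startColumn:endLine:endColumn' for one location."""
    uri = physical_location["artifactLocation"]["uri"]
    region = physical_location.get("region", {})
    start_line = region.get("startLine", 1)
    start_col = region.get("startColumn", 1)
    # A SARIF producer may omit endLine when the region stays on one line.
    end_line = region.get("endLine", start_line)
    end_col = region.get("endColumn", start_col)
    return f"{uri}:{start_line}:{start_col}:{end_line}:{end_col}"


def summarize(sarif):
    """Return RESULT / REFERENCE lines for every result in every run."""
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            message = result.get("message", {}).get("text", "")
            for loc in result.get("locations", []):
                where = format_location(loc["physicalLocation"])
                lines.append(f"RESULT: {where}: {message}")
            for rel in result.get("relatedLocations", []):
                where = format_location(rel["physicalLocation"])
                text = rel.get("message", {}).get("text", "")
                lines.append(f"REFERENCE: {where}: {text}")
    return lines


# Hand-built sample (an assumption for illustration, not copied from data/).
FILE = "src/stc/scintilla/lexers/LexMySQL.cxx"
sample = {"runs": [{"results": [{
    "message": {"text": "Local variable 'length' hides a "
                        "[parameter of the same name](1)."},
    "locations": [{"physicalLocation": {
        "artifactLocation": {"uri": FILE},
        "region": {"startLine": 153, "startColumn": 24, "endColumn": 30}}}],
    "relatedLocations": [{
        "message": {"text": "parameter of the same name"},
        "physicalLocation": {
            "artifactLocation": {"uri": FILE},
            "region": {"startLine": 108, "startColumn": 68, "endColumn": 74}}}],
}]}]}

for line in summarize(sample):
    print(line)
```

To try this on a real file, load it with json.load and pass the resulting dictionary to summarize.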
sarif-digest
Get an idea of the SARIF file structure by showing only the first and last entries of each array.
sarif-digest data/torvalds_linux__2021-10-21_10_07_00__export.sarif |less
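The idea of keeping only the first and last array entries can be sketched in a few lines of Python. This is an illustration of the technique, not the code behind sarif-digest: the function name digest, the keep parameter, and the "..." placeholder marking dropped entries are all assumptions.

```python
def digest(value, keep=1):
    """Recursively shorten every list to its first and last `keep` entries."""
    if isinstance(value, dict):
        return {key: digest(item, keep) for key, item in value.items()}
    if isinstance(value, list):
        if len(value) > 2 * keep:
            # "..." is a hypothetical placeholder for the dropped middle part.
            value = value[:keep] + ["..."] + value[-keep:]
        return [digest(item, keep) for item in value]
    return value  # strings, numbers, booleans and None pass through unchanged


sample = {"runs": [{"results": [{"ruleId": "a"}, {"ruleId": "b"},
                                {"ruleId": "c"}, {"ruleId": "d"}]}]}
print(digest(sample))
```

Short arrays (two entries or fewer with keep=1) are left untouched, so the overall nesting of the file stays visible.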
sarif-labeled
Display the SARIF file with explicit paths inserted before JSON objects and selected array entries. Handy when reverse-engineering the format by searching for results.
sarif-labeled data/torvalds_linux__2021-10-21_10_07_00__export.sarif |less
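The path labels can be thought of as the Python subscript expression needed to reach each JSON object. The sketch below generates such paths for a parsed SARIF dictionary; it assumes the sarif_struct[...] notation used in the tool's output, and the names labeled_paths and find_key are hypothetical helpers, not part of the tool.

```python
def labeled_paths(value, path="sarif_struct"):
    """Yield (path, obj) for every JSON object reachable from `value`."""
    if isinstance(value, dict):
        yield path, value
        for key, item in value.items():
            # {key!r} quotes the key, giving e.g. sarif_struct['runs']
            yield from labeled_paths(item, f"{path}[{key!r}]")
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from labeled_paths(item, f"{path}[{index}]")


def find_key(sarif, key):
    """Return the paths of all objects that directly contain `key`."""
    return [path for path, obj in labeled_paths(sarif) if key in obj]


# Hand-built sample with a single nested uri (an assumption for illustration).
sample = {"runs": [{"results": [{"locations": [{"physicalLocation": {
    "artifactLocation": {"uri": "drivers/gpu/drm/i915/gt/uc/intel_guc.c"}}}]}]}]}

for path in find_key(sample, "uri"):
    print(path)
```

Because each path is a valid Python expression, it can be pasted into an interactive session to reach the same object in the loaded file.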
For example, the
"uri": "drivers/gpu/drm/i915/gt/uc/intel_guc.c",
is nested; the labeled display shows where:
"sarif_struct['runs'][1]['results'][4]['locations'][0]['physicalLocation']['artifactLocation']": "----path----",
"artifactLocation": {
"uri": "drivers/gpu/drm/i915/gt/uc/intel_guc.c",
sarif-list-files
Display the list of files referenced by a SARIF file. This is the tool used to
get file names that ultimately went into data/linux-small/ and
data/wxWidgets-small/.
sarif-list-files data/wxWidgets_wxWidgets__2021-11-21_16_06_30__export.sarif
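As a rough illustration of what listing referenced files involves, the sketch below walks a parsed SARIF dictionary and collects the uri of every artifactLocation object it finds. It is a simplification under stated assumptions: the function name list_files is hypothetical, only objects stored under the key artifactLocation are considered, and the real tool may select files differently.

```python
def list_files(sarif):
    """Return the sorted, de-duplicated uris of all artifactLocation objects."""
    uris = set()

    def walk(value):
        if isinstance(value, dict):
            location = value.get("artifactLocation")
            if isinstance(location, dict) and "uri" in location:
                uris.add(location["uri"])
            for item in value.values():
                walk(item)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(sarif)
    return sorted(uris)


# Hand-built sample (an assumption for illustration, not copied from data/).
sample = {"runs": [{"results": [{
    "locations": [{"physicalLocation": {
        "artifactLocation": {"uri": "src/b.cxx"}}}],
    "relatedLocations": [
        {"physicalLocation": {"artifactLocation": {"uri": "src/a.cxx"}}},
        {"physicalLocation": {"artifactLocation": {"uri": "src/b.cxx"}}}],
}]}]}

print(list_files(sample))
```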
Sample Data
The query results in data/ are taken from lgtm.com, which ran the
ql/$LANG/ql/src/codeql-suites/$LANG-lgtm.qls
queries.
The Linux kernel has both single-location results ("kind": "problem") and path
results ("kind": "path-problem"). It also has results for multiple source
languages.
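One way to see both kinds of result in a loaded file is to count results by shape. The sketch below assumes that path results carry a non-empty codeFlows array, which SARIF 2.1.0 provides for this purpose, and that results without one are single-location results. The function name count_result_shapes and the labels it returns are hypothetical.

```python
from collections import Counter


def count_result_shapes(sarif):
    """Count results with and without a codeFlows array across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a non-empty codeFlows array marks a path result.
            shape = "path" if result.get("codeFlows") else "single-location"
            counts[shape] += 1
    return counts


# Hand-built sample (an assumption for illustration, not copied from data/).
sample = {"runs": [{"results": [
    {"ruleId": "r1"},
    {"ruleId": "r2", "codeFlows": [{"threadFlows": []}]},
    {"ruleId": "r3"},
]}]}

print(count_result_shapes(sample))
```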
The subset of files referenced by the SARIF results is in data/linux-small/
and is taken from
"versionControlProvenance": [
{
"repositoryUri": "https://github.com/torvalds/linux.git",
"revisionId": "d9abdee5fd5abffd0e763e52fbfa3116de167822"
}
]
The wxWidgets library has both single-location results ("kind": "problem") and path
results ("kind": "path-problem").
The subset of files referenced by the SARIF results is in data/wxWidgets-small/
and is taken from
"repositoryUri": "https://github.com/wxWidgets/wxWidgets.git",
"revisionId": "7a03d5fe9bca2d2a2cd81fc0620bcbd2cbc4c7b0"