This command introduces a new tree structure that pulls in a collection
of sarif files. In yaml format, an example is
    - creation_date: '2021-12-09'                 # Repository creation date
      primary_language: javascript                # By lines of code
      project_name: treeio/treeio                 # Repo name-short name
      query_commit_id: fa9571646c                 # Commit id for custom (non-library) queries
      sarif_content: {}                           # The sarif content will be attached here
      sarif_file_name: 2021-12-09/results.sarif   # Path to sarif file
      scan_start_date: '2021-12-09'               # Beginning date/time of scan
      scan_stop_date: '2021-12-10'                # End date/time of scan
      tool_name: codeql
      tool_version: v1.27
    - creation_date: '2022-02-25'
      primary_language: javascript
      ...
At run time,
    cd ~/local/sarif-cli/data/treeio
    sarif-extract-multi multi-sarif-01.json test-multi-table
will load the specified sarif files and put them in place of
`sarif_content`, then build tables against the new signature found in
sarif_cli/signature_multi.py, and merge those into 6 larger tables. The
exported tables are
    artifacts.csv  path-problem.csv  project.csv
    codeflows.csv  problem.csv       related-locations.csv
and they have join keys for further operations.
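The loading step described above can be sketched roughly as follows. This is
only an illustration, not the code in sarif-extract-multi: the function name
attach_sarif_content and the base_dir parameter are hypothetical, and the
sketch assumes each entry's sarif_file_name is a path relative to one common
directory.

```python
import json
import tempfile
from pathlib import Path


def attach_sarif_content(entries, base_dir):
    """Return copies of the entries with sarif_content loaded from disk.

    Hypothetical helper: assumes sarif_file_name is relative to base_dir.
    """
    loaded = []
    for entry in entries:
        sarif_path = Path(base_dir) / entry["sarif_file_name"]
        with open(sarif_path, encoding="utf-8") as fh:
            content = json.load(fh)
        # Copy the entry so the caller's list is left untouched.
        loaded.append({**entry, "sarif_content": content})
    return loaded


# Small demonstration using a throwaway directory and a minimal sarif stub.
with tempfile.TemporaryDirectory() as tmp:
    sarif_path = Path(tmp) / "2021-12-09" / "results.sarif"
    sarif_path.parent.mkdir(parents=True)
    sarif_path.write_text(json.dumps({"version": "2.1.0", "runs": []}),
                          encoding="utf-8")
    entries = [{
        "project_name": "treeio/treeio",
        "sarif_file_name": "2021-12-09/results.sarif",
        "sarif_content": {},
    }]
    loaded = attach_sarif_content(entries, tmp)

print(loaded[0]["sarif_content"]["version"])
```

Returning copies rather than filling the entries in place keeps the original
index reusable; whether the actual tool does this is not stated here.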
The new typegraph is rendered in
    notes/typegraph-multi.pdf
using the instructions in
    sarif_cli/signature_multi.py
With --related-locations,
    ../../bin/sarif-results-summary -r results.sarif
produces the details
    RESULT: static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js:722:72:722:73: Character ''' is repeated [here](1) in the same character class.
    Character ''' is repeated [here](2) in the same character class.
    Character ''' is repeated [here](3) in the same character class.
    REFERENCE: static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js:722:74:722:75: here
    REFERENCE: static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js:722:76:722:77: here
    REFERENCE: static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js:722:78:722:79: here
Via
    ../../bin/sarif-extract-tables results.sarif tables
sarif-extract-tables now produces two output tables,
    tables/
    ├── messages.csv
    └── relatedLocations.csv
that contain the relevant information and can be joined or otherwise combined on
the struct_id_4055 key.
For example, adding
    import IPython
    IPython.embed()
to the end of sarif-extract-tables opens an interactive session in which the
message row and its related locations can be looked up:
    # d2: messages table; dr3: relatedLocations table (as the output below suggests)
    msg = d2[d2.message.str.startswith("Character ''' is repeated [here]")]
    dr3[dr3.struct_id_4055 == msg.struct_id_4055.values[0]]
The session output is
    In [24]: msg
    Out[24]:
         struct_id_4055  ...                                            message
    180      4796917312  ...  Character ''' is repeated [here](1) in the sam...
    [1 rows x 7 columns]
    In [25]: dr3[dr3.struct_id_4055 == msg.struct_id_4055.values[0]]
    Out[25]:
         struct_id_4055                                                uri  startLine  startColumn  endLine  endColumn message
    180      4796917312  static/js/tinymce/jscripts/tiny_mce/plugins/pa...        722           74      722         75    here
    181      4796917312  static/js/tinymce/jscripts/tiny_mce/plugins/pa...        722           76      722         77    here
    182      4796917312  static/js/tinymce/jscripts/tiny_mce/plugins/pa...        722           78      722         79    here
or manually from the shell:
    # pick up the struct_id_4055:
    0:$ grep "static.*Character ''' is repeated \[here\]" tables/messages.csv
    180,4927448704,static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js,722,72,722,73,"Character ''' is repeated [here](1) in the same character class.
    # and find relatedLocations:
    0:$ grep 4927448704 tables/relatedLocations.csv
    180,4927448704,static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js,722,74,722,75,here
    181,4927448704,static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js,722,76,722,77,here
    182,4927448704,static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js,722,78,722,79,here
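The same lookup can also be written as a join on struct_id_4055 with pandas.
The sketch below builds small in-memory frames instead of reading the csv
files. The three relatedLocations rows are taken from the output above; the
column names of the messages frame beyond struct_id_4055 and message are an
assumption, and the second message row is a made-up example of a result
without related locations.

```python
import pandas as pd

URI = "static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js"

# Stand-in for tables/messages.csv (column names partly assumed).
messages = pd.DataFrame({
    "struct_id_4055": [4927448704, 1],          # second id is hypothetical
    "uri": [URI, "hypothetical/other.js"],
    "startLine": [722, 10],
    "startColumn": [72, 1],
    "endLine": [722, 10],
    "endColumn": [73, 5],
    "message": [
        "Character ''' is repeated [here](1) in the same character class.",
        "hypothetical message without related locations",
    ],
})

# Stand-in for tables/relatedLocations.csv, using the rows shown above.
related = pd.DataFrame({
    "struct_id_4055": [4927448704] * 3,
    "uri": [URI] * 3,
    "startLine": [722] * 3,
    "startColumn": [74, 76, 78],
    "endLine": [722] * 3,
    "endColumn": [75, 77, 79],
    "message": ["here"] * 3,
})

# Inner join on the shared key; suffixes separate the overlapping columns.
joined = messages.merge(related, on="struct_id_4055",
                        suffixes=("_msg", "_rel"))
print(joined[["struct_id_4055", "startColumn_msg", "startColumn_rel"]])
```

With the default inner join, messages that have no related locations drop out
of the result; passing how="left" to merge would keep them, with empty values
in the _rel columns.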
Changes:
- Introduce scli-dyys, a random id string that marks dummy table rows so they
  can be identified and removed later.
- Keep the struct_id_4055 column to join tables as needed.
- Output is now written to a directory as there are always multiple files.
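As an illustration of the later removal step, dummy rows could be filtered out
by checking for the scli-dyys marker. This is a minimal sketch, not the
tool's own code: the helper drop_dummy_rows is hypothetical, and which columns
carry the marker is an assumption.

```python
DUMMY_MARKER = "scli-dyys"  # id string named in the change notes


def drop_dummy_rows(rows, columns):
    """Keep only rows in which none of the given columns holds the marker."""
    return [row for row in rows
            if all(row.get(col) != DUMMY_MARKER for col in columns)]


rows = [
    # A real row, with values taken from the relatedLocations output.
    {"struct_id_4055": 4927448704,
     "uri": "static/js/tinymce/jscripts/tiny_mce/plugins/paste/editor_plugin_src.js",
     "message": "here"},
    # A hypothetical dummy row; the marker placement is assumed.
    {"struct_id_4055": 1, "uri": DUMMY_MARKER, "message": DUMMY_MARKER},
]

real_rows = drop_dummy_rows(rows, ["uri", "message"])
print(len(real_rows))
```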
Common to all:
    | ('locations', 'Array008')           |
    | ('message', 'Struct009')            |
    | ('partialFingerprints', 'Struct010') |
    | ('rule', 'Struct011')               |
    | ('ruleId', 'String')                |
    | ('ruleIndex', 'Int')                |
Only some problems and flow problems have
    | ('relatedLocations', 'Array014')    |
Add a dummy value for relatedLocations to the results that lack it, which
reduces the results to two categories, @kind flow problem and @kind problem.
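The fill step might look roughly like the sketch below. It is an assumption
about the approach, not the code in sarif_cli: the helper
fill_related_locations is hypothetical, the use of scli-dyys and -1 as
placeholder values is assumed, and the dummy entry only mirrors the general
layout of a SARIF location object.

```python
import copy

DUMMY_MARKER = "scli-dyys"  # assumed placeholder for string fields


def fill_related_locations(result):
    """Return a copy of a SARIF result that always has relatedLocations."""
    filled = copy.deepcopy(result)
    if "relatedLocations" not in filled:
        # Assumed dummy entry; -1 stands in for the numeric fields.
        filled["relatedLocations"] = [{
            "id": -1,
            "message": {"text": DUMMY_MARKER},
            "physicalLocation": {
                "artifactLocation": {"uri": DUMMY_MARKER},
                "region": {"startLine": -1, "startColumn": -1,
                           "endLine": -1, "endColumn": -1},
            },
        }]
    return filled


# Hypothetical result lacking relatedLocations.
bare = {"ruleId": "hypothetical/rule", "message": {"text": "example"}}
filled = fill_related_locations(bare)
print("relatedLocations" in filled)
```

After this step every result has the same set of top-level fields, so the
results differ only in whether they are flow problems or problems.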