Commit Graph

12 Commits

Author SHA1 Message Date
Kristen Newbury
2ba9593d70 Add CLI support
Enabled by the -f flag with the CLI value.
Tested on SARIF files from CodeQL CLI versions
2.6.3, 2.9.4, and 2.11.4.
The input MUST contain the versionControlProvenance property, however.
2022-12-13 12:14:32 -05:00
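The requirement in commit 2ba9593d70 that the input carry a versionControlProvenance property can be checked before running the tools. The following is only a minimal sketch, not the check used in this repository: the function name `runs_missing_provenance` is hypothetical, and it assumes the SARIF 2.1.0 layout in which `versionControlProvenance` is an array on each entry of `runs`.

```python
def runs_missing_provenance(sarif: dict) -> list[int]:
    """Return the indices of runs that lack a non-empty
    versionControlProvenance array (hypothetical helper)."""
    missing = []
    for index, run in enumerate(sarif.get("runs", [])):
        provenance = run.get("versionControlProvenance")
        # Treat an absent, non-list, or empty value as missing.
        if not isinstance(provenance, list) or not provenance:
            missing.append(index)
    return missing


# Illustrative input: the first run has provenance, the second does not.
example = {
    "runs": [
        {"versionControlProvenance": [{"repositoryUri": "https://example.com/repo"}]},
        {"tool": {}},
    ]
}
print(runs_missing_provenance(example))  # prints [1]
```

An empty result means every run carries the property; any listed index points at a run that would likely need to be regenerated or patched first.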
Kristen Newbury
2bda917a4e Improve error handling on signature mismatch cases
and clean up old TODOs that have been addressed
2022-11-23 14:06:23 -05:00
Kristen Newbury
15aa9573e2 Adjust extra properties status from error to warning 2022-11-15 13:35:52 -05:00
Kristen Newbury
066fcb8248 Add error handling csv writer
The writer generates one status CSV per SARIF file.
2022-11-14 13:02:36 -05:00
Michael Hohn
0e7a941be3 Include all typegraph samples, from raw to refined 2022-07-14 18:29:21 -07:00
Michael Hohn
0fc6eb3cce Improve error reporting in sarif destructuring routines 2022-05-30 00:09:13 -07:00
Michael Hohn
b212423907 WIP: sarif-extract-scans: back to single sarif file handling, incorporate multi-file libraries 2022-05-10 19:01:38 -07:00
Michael Hohn
db00f17137 Some cleanup based on pyflakes output 2022-03-17 17:23:53 -07:00
Michael Hohn
b82c620a1e Add overview of the base tables derived from multi-sarif input; add rules.csv
The table overview is in the jupyter notebook
scripts/multi-table-overview.ipynb and makes use of some formatting
customizations to actually get an overview.

The initial `projects` table had far too many entries; the `rules` part
is now in a separate `rules` table.
2022-03-16 16:54:14 -07:00
Michael Hohn
0f070a6ae4 sarif-extract-multi: extract combined tables from multiple sarif files
This command introduces a new tree structure that pulls in a collection
of sarif files.  In yaml format, an example is

    - creation_date: '2021-12-09'   # Repository creation date
      primary_language: javascript  # By lines of code
      project_name: treeio/treeio   # Repo name-short name
      query_commit_id: fa9571646c   # Commit id for custom (non-library) queries
      sarif_content: {}             # The sarif content will be attached here
      sarif_file_name: 2021-12-09/results.sarif # Path to sarif file
      scan_start_date: '2021-12-09'             # Beginning date/time of scan
      scan_stop_date:  '2021-12-10'             # End date/time of scan
      tool_name: codeql
      tool_version: v1.27

    - creation_date: '2022-02-25'
      primary_language: javascript
      ...

At run time,

    cd ~/local/sarif-cli/data/treeio
    sarif-extract-multi multi-sarif-01.json test-multi-table

will load the specified sarif files and put them in place of
`sarif_content`, then build tables against the new signature found in
sarif_cli/signature_multi.py, and merge those into 6 larger tables.  The
exported tables are

    artifacts.csv  path-problem.csv  project.csv
    codeflows.csv  problem.csv       related-locations.csv

and they have join keys for further operations.
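As a rough illustration of the loading step described above (not the code in sarif-extract-multi itself), the sketch below fills `sarif_content` for each entry. It assumes the input is a list of entries shaped like the yaml example and that `sarif_file_name` is resolved relative to a base directory; the function name `attach_sarif_content` and the `base_dir` parameter are hypothetical.

```python
import json
from pathlib import Path


def attach_sarif_content(entries, base_dir):
    """Return copies of the entries with `sarif_content` replaced by the
    parsed JSON of the file named in `sarif_file_name` (hypothetical helper)."""
    base = Path(base_dir)
    loaded = []
    for entry in entries:
        # Assumption: paths in the entry are relative to base_dir.
        sarif_path = base / entry["sarif_file_name"]
        with sarif_path.open(encoding="utf-8") as handle:
            content = json.load(handle)
        # Copy the entry so the caller's input is left unchanged.
        loaded.append({**entry, "sarif_content": content})
    return loaded
```

Returning copies rather than mutating the input keeps the original project list reusable, for example when the same list is loaded against several data directories.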

The new typegraph is rendered in

    notes/typegraph-multi.pdf

using the instructions in

    sarif_cli/signature_multi.py
2022-03-11 23:00:53 -08:00
Michael Hohn
f246f06d4e sarif-extract-tables: interim commit: form tables
Tables are now formed and kept in the Typegraph instance.
These will be tested using pandas operations to form one of the previous outputs.
2022-02-04 23:56:01 -08:00
Michael Hohn
7a517fa06c sarif-extract-tables: interim commit
Internal destructuring and array aggregation run, but need to be tested.
Tables need to be formed, and pandas selections/joins/etc. used for custom table output.
2022-02-04 14:44:55 -08:00