Commit Graph

24 Commits

Author SHA1 Message Date
8741e12860 wip: sarif-to-table: full table output in parallel to text 2025-10-20 18:57:34 -07:00
8977273e94 remove stale log notes/update.org 2025-10-19 13:37:40 -07:00
c15dc6d4bc Fix subtle type problem: M8 is required for early steps, datetime64[ns] later 2025-10-19 13:35:02 -07:00
Michael Hohn
1ee2dae8d7 Simplify org headline 2023-12-06 14:12:43 -08:00
Michael Hohn
95a6aaed6a Add 'SARIF and Signatures' section 2023-12-06 14:09:51 -08:00
Michael Hohn
68b43e0514 wip: debug and get automationDetails into CSV output 2023-07-12 17:04:23 -07:00
Michael Hohn
dc8a4929fa wip: notes cleanup 2023-07-11 20:26:40 -07:00
Michael Hohn
62ec56948e WIP: debug missing field propagation for automationDetails.id
Create SARIF files with and without automationDetails.id for examination.
2023-07-11 10:45:15 -07:00
Michael Hohn
2b42a7d306 scan table change: the results.query_id is the @id from the CodeQL query
Before, the query_id was
	==> results.csv <==
	query_id STRING,         -- git commit id of the ql query set

Now it is
	query_id STRING,         -- @id from the CodeQL query
2022-08-11 16:56:20 -07:00
Michael Hohn
0e7a941be3 Include all typegraph samples, from raw to refined 2022-07-14 18:29:21 -07:00
Michael Hohn
154b0bdc56 WIP: assemble derived 'results' table 2022-05-13 17:01:18 -07:00
Michael Hohn
b212423907 WIP: sarif-extract-scans: back to single sarif file handling, incorporate multi-file libraries 2022-05-10 19:01:38 -07:00
Michael Hohn
675a5a4008 Add svg snapshot of derived-tables.drawio 2022-05-02 10:45:26 -07:00
Michael Hohn
44f1d2f179 Description of current and upcoming tables and their information sources 2022-04-20 15:22:20 -07:00
Michael Hohn
046a152ae2 Expand current and planned table description 2022-04-19 12:00:54 -07:00
Michael Hohn
6cef65338a Explore parts of the GitHub API via distinct connection layers 2022-04-18 21:20:43 -07:00
Michael Hohn
8e5d9c464b Add snowflake implementation 2022-04-11 19:24:12 -07:00
Michael Hohn
8b3710a51b interim: sarif-extract-multi table outputs and future table diagrams 2022-04-08 14:13:24 -07:00
Michael Hohn
d5390bb87e Full revision of the base tables derived from multiple sarif input files
The new base tables produced by `sarif-extract-multi` are
    artifacts
    codeflows
    kind_pathproblem
    kind_problem
    project
    relatedLocations
    rules

The revised table overview is in the Jupyter notebook
scripts/multi-table-overview.ipynb

The file notes/typegraph-multi-with-tables.pdf illustrates which original (sarif)
tables are used to form the base (derived) tables.
2022-03-23 16:37:41 -07:00
Michael Hohn
926e083991 Added field to multi-file signature; the steps are documented in adding-to-typegraph.org 2022-03-15 12:30:05 -07:00
Michael Hohn
0f070a6ae4 sarif-extract-multi: extract combined tables from multiple sarif files
This command introduces a new tree structure that pulls in a collection
of sarif files.  An example, in YAML format, is

    - creation_date: '2021-12-09'   # Repository creation date
      primary_language: javascript  # By lines of code
      project_name: treeio/treeio   # Repo name-short name
      query_commit_id: fa9571646c   # Commit id for custom (non-library) queries
      sarif_content: {}             # The sarif content will be attached here
      sarif_file_name: 2021-12-09/results.sarif # Path to sarif file
      scan_start_date: '2021-12-09'             # Beginning date/time of scan
      scan_stop_date:  '2021-12-10'             # End date/time of scan
      tool_name: codeql
      tool_version: v1.27

    - creation_date: '2022-02-25'
      primary_language: javascript
      ...

At run time,

    cd ~/local/sarif-cli/data/treeio
    sarif-extract-multi multi-sarif-01.json test-multi-table

will load the specified sarif files and put them in place of
`sarif_content`, then build tables against the new signature found in
sarif_cli/signature_multi.py, and merge those into 6 larger tables.  The
exported tables are

    artifacts.csv  path-problem.csv  project.csv
    codeflows.csv  problem.csv       related-locations.csv

and they have join keys for further operations.

The new typegraph is rendered in

    notes/typegraph-multi.pdf

using the instructions in

    sarif_cli/signature_multi.py
2022-03-11 23:00:53 -08:00
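The loading step described in commit 0f070a6ae4 (each listed sarif file is read and put in place of `sarif_content`) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in sarif-cli: the function name `attach_sarif_content` and the `base_dir` parameter are hypothetical, and the entries are assumed to be already parsed into a list of dictionaries with the field names shown in the example above.

```python
import json
from pathlib import Path


def attach_sarif_content(entries, base_dir="."):
    """Return copies of the entries with `sarif_content` filled in.

    Hypothetical helper: `entries` is a list of dicts shaped like the
    example in the commit message; `sarif_file_name` is resolved
    relative to `base_dir` (an assumption, not documented behavior).
    """
    base = Path(base_dir)
    loaded = []
    for entry in entries:
        filled = dict(entry)  # shallow copy, the input list is left untouched
        with open(base / entry["sarif_file_name"], encoding="utf-8") as fh:
            filled["sarif_content"] = json.load(fh)
        loaded.append(filled)
    return loaded
```

The sketch returns new dictionaries instead of modifying the input, so the original description of the scan collection can still be written back out without the (large) sarif payloads attached.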
Michael Hohn
ec9a0b5590 sarif-extract-tables: initial version, reproduces known output as table
Reproduce the

    file:line:col:line:col: message

output from

    ../../bin/sarif-results-summary results.sarif | grep size

as test/example.

Original sample output is

    RESULT: static/js/fileuploader.js:1214:13:1214:17: Unused variable size.
    RESULT: static/js/tinymce/jscripts/tiny_mce/plugins/media/js/media.js:438:30:438:34: Unused variable size.

The table result here is

    0:$ ../../bin/sarif-extract-tables results.sarif | grep size
    0,static/js/fileuploader.js,1214,13,1214,17,Unused variable size.
    34,static/js/tinymce/jscripts/tiny_mce/plugins/media/js/media.js,438,30,438,34,Unused variable size.
2022-02-08 20:04:28 -08:00
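The correspondence between the two outputs quoted in commit ec9a0b5590 can be sketched as a small text-to-row conversion. This is an illustration only: `sarif-extract-tables` reads the sarif file itself, not the text summary, and the names `RESULT_RE`, `result_line_to_row`, and `rows_to_csv` are hypothetical. The leading number in the table output (0, 34) is left out because it cannot be recovered from the text form.

```python
import csv
import io
import re

# Matches lines such as
#   RESULT: path/to/file.js:1214:13:1214:17: Unused variable size.
RESULT_RE = re.compile(
    r"^RESULT: (?P<path>.+?):(?P<sl>\d+):(?P<sc>\d+):(?P<el>\d+):(?P<ec>\d+): (?P<msg>.*)$"
)


def result_line_to_row(line):
    """Split one RESULT line into [path, line, col, line, col, message]."""
    m = RESULT_RE.match(line)
    if m is None:
        return None  # not a RESULT line
    return [m["path"], int(m["sl"]), int(m["sc"]),
            int(m["el"]), int(m["ec"]), m["msg"]]


def rows_to_csv(rows):
    """Render rows as CSV text, one row per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Using the `csv` module instead of joining on commas keeps messages that contain commas or quotes in a single field.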
Michael Hohn
cf8096446b sarif-to-dot: cleanup and preparation for sarif table extraction 2022-02-01 22:42:25 -08:00
Michael Hohn
3e5d3ff5de Added interesting sarif structure diagram to notes/ 2022-01-26 23:25:30 -08:00