Commit Graph

24 Commits

Author SHA1 Message Date
Michael Hohn
ee11214aee Add support for external timestamps
This allows external files containing

    timestamps = {
        "db_create_start"      : pd.Timestamp(0.0, unit='s'),
        "db_create_stop"       : pd.Timestamp(0.0, unit='s'),
        "scan_start_date"      : pd.Timestamp(0.0, unit='s'),
        "scan_stop_date"       : pd.Timestamp(0.0, unit='s'),
    }

to be used to provide those values, instead of the above defaults.

This patch changes the top-level scripts
        bin/sarif-extract-scans
        bin/sarif-extract-scans-runner
and provides
        scripts/test-timestamps.sh
for verification.

The following keys are also accepted:
    {
      "db_create_start": ...,
      "db_create_stop": ...,
      "scan_start": ...
      "scan_stop": ...
    }
2023-08-18 17:06:58 -07:00
Michael Hohn
8820186152 Add sample output for test-vcp 2023-07-13 16:46:24 -07:00
Michael Hohn
1d85d13efb Execute test-vcp with tracing 2023-07-13 16:35:33 -07:00
Michael Hohn
c299321ab8 Remove repls; add scripts/test-vcp.sh 2023-07-13 16:03:01 -07:00
Michael Hohn
62ec56948e WIP: debug missing field propagation for automationDetails.id
Create SARIF files with and without automationDetails.id for examination.
2023-07-11 10:45:15 -07:00
Michael Hohn
203343df07 Add sarif-pad-aggregate to fill scan values
Fills the scans table's db_create_start/stop and scan_start/stop_date
columns with realistic random values.
2022-08-31 21:19:02 -07:00
Michael Hohn
03a9ef0477 Rewrite sarif-combine-tables.py as full tool, bin/sarif-aggregate-scans 2022-08-10 17:34:35 -07:00
Michael Hohn
7e996e746c Rewrite sarif-runner as full tool, sarif-extract-scans-runner 2022-08-08 14:47:25 -07:00
Michael Hohn
ef00559408 Bring sarif-extract-tables up to date with sarif-extract-scans 2022-07-19 15:42:26 -07:00
Michael Hohn
5cce2ed4d1 Better status updates for sarif-combine-tables 2022-06-03 00:08:23 -07:00
Michael Hohn
69f02cf99a Add sarif-combine-tables to combine output from sarif-runner 2022-06-02 18:55:22 -07:00
Michael Hohn
b7cd96ea72 Add sarif-runner.py to drive sarif-extract-scans for sarif file collections
The input file format is just a list of  organization/project entries
2022-05-30 00:04:40 -07:00
Michael Hohn
3dd8522b7f Add simple timing run information 2022-05-16 11:43:05 -07:00
Michael Hohn
154b0bdc56 WIP: assemble derived 'results' table 2022-05-13 17:01:18 -07:00
Michael Hohn
1f2daab51e Re-run of table overview 2022-04-20 15:13:46 -07:00
Michael Hohn
8b3710a51b interim: sarif-extract-multi table outputs and future table diagrams 2022-04-08 14:13:24 -07:00
Michael Hohn
d5390bb87e Full revision of the base tables derived from multiple sarif input files
The new base tables produced by `sarif-extract-multi` are
    artifacts
    codeflows
    kind_pathproblem
    kind_problem
    project
    relatedLocations
    rules

The revised table overview is in the jupyter notebook
scripts/multi-table-overview.ipynb

The file notes/typegraph-multi-with-tables.pdf illustrates what original (sarif)
tables are used to form the base (derived) tables.
2022-03-23 16:37:41 -07:00
Michael Hohn
bdf85eafc8 Add a collection of commands to run static python checkers 2022-03-17 17:21:58 -07:00
Michael Hohn
b82c620a1e Add overview of the base tables derived from multi-sarif input; add rules.csv
The table overview is in the jupyter notebook
scripts/multi-table-overview.ipynb and makes use of some formatting
customizations to actually get an overview.

The initial `projects` table had far too many entries; the `rules` part
is now in a separate `rules` table.
2022-03-16 16:54:14 -07:00
Michael Hohn
926e083991 Added field to multi-file signature; the steps are documented in adding-to-typegraph.org 2022-03-15 12:30:05 -07:00
Michael Hohn
0f070a6ae4 sarif-extract-multi: extract combined tables from multiple sarif files
This command introduces a new tree structure that pulls in a collection
of sarif files.  In yaml format, an example is

    - creation_date: '2021-12-09'   # Repository creation date
      primary_language: javascript  # By lines of code
      project_name: treeio/treeio   # Repo name-short name
      query_commit_id: fa9571646c   # Commit id for custom (non-library) queries
      sarif_content: {}             # The sarif content will be attached here
      sarif_file_name: 2021-12-09/results.sarif # Path to sarif file
      scan_start_date: '2021-12-09'             # Beginning date/time of scan
      scan_stop_date:  '2021-12-10'             # End date/time of scan
      tool_name: codeql
      tool_version: v1.27

    - creation_date: '2022-02-25'
      primary_language: javascript
      ...

At run time,

    cd ~/local/sarif-cli/data/treeio
    sarif-extract-multi multi-sarif-01.json test-multi-table

will load the specified sarif files and put them in place of
`sarif_content`, then build tables against the new signature found in
sarif_cli/signature_multi.py, and merge those into 6 larger tables.  The
exported tables are

    artifacts.csv  path-problem.csv  project.csv
    codeflows.csv  problem.csv       related-locations.csv

and they have join keys for further operations.

The new typegraph is rendered in

    notes/typegraph-multi.pdf

using the instructions in

    sarif_cli/signature_multi.py
2022-03-11 23:00:53 -08:00
Michael Hohn
558e218d3b Add endpoints-only option for path output and a collection of usage samples 2021-12-21 14:05:27 -08:00
Michael Hohn
f0e52753f6 Illustration of the steps needed to pull in used source files only 2021-12-18 14:56:39 -08:00
Michael Hohn
780def7063 Add utility scripts to retrieve sarif files from lgtm 2021-12-10 11:25:03 -08:00