Commit Graph

46 Commits

Author SHA1 Message Date
Michael Hohn
ef08825b43 Processing in stages: Move the initial sarif_cli code to sarif_cli/traverse 2021-12-22 18:03:34 -08:00
Michael Hohn
7d49c3bd08 Update the sarif-results-summary examples 2021-12-22 17:48:24 -08:00
Michael Hohn
558e218d3b Add endpoints-only option for path output and a collection of usage samples 2021-12-21 14:05:27 -08:00
Michael Hohn
79649a6226 Add treeio/ files referenced in sarif 2021-12-18 14:58:51 -08:00
Michael Hohn
979042ff5c Add a 3 =relatedLocations= and 3 =threadFlows= example 2021-12-18 14:58:10 -08:00
Michael Hohn
f0e52753f6 Illustration of the steps needed to pull in used source files only 2021-12-18 14:56:39 -08:00
Michael Hohn
9590d0a677 Add newline after dbg(message) output 2021-12-18 14:19:38 -08:00
Michael Hohn
291726dd58 Add smaller sarif test files 2021-12-18 13:19:11 -08:00
Michael Hohn
68a661fffb Added notes on more thorough examination of multiple results 2021-12-18 00:33:38 -08:00
Michael Hohn
7e66e29f53 Fix editing error 2021-12-15 14:02:27 -08:00
Michael Hohn
62ae8dca4a Correct the =sarif-results-summary= commands 2021-12-10 11:56:10 -08:00
Michael Hohn
780def7063 Add utility scripts to retrieve sarif files from lgtm 2021-12-10 11:25:03 -08:00
Michael Hohn
5386310b1b Prepend path index to data flow results; use single newlines 2021-12-08 16:28:32 -08:00
Michael Hohn
f1d21e4a43 Fix missing 'region' key in relatedLocations: use whole-file output
The goal is fixed-structure output formatting, so whole-file output uses
-1,-1,-1,-1 for line, column information.
2021-12-08 16:02:31 -08:00
Michael Hohn
1271589bc4 Fix class NoFile: comment 2021-12-06 15:34:03 -08:00
Michael Hohn
92d904ee10 Add quick check to verify that input is sarif
An occasional output from LGTM is
    {"code":404,"error":"The specified analysis could not be found"}

With this patch, the csv output is now
    "ERROR","invalid json contents %s","some-file.json"

and the plain text output becomes
    ERROR: invalid json contents in some-file.json
2021-12-06 14:24:08 -08:00
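A quick check of this kind could look roughly like the sketch below. The function name `looks_like_sarif` is hypothetical and not taken from the repository; the test relies only on the fact that a SARIF log has top-level `version` and `runs` properties, which the LGTM error object lacks.

```python
import json


def looks_like_sarif(data):
    """Return True if parsed JSON has the top-level shape of a SARIF log.

    Hypothetical helper: checks only for the 'version' and 'runs' keys.
    """
    return (
        isinstance(data, dict)
        and "version" in data
        and isinstance(data.get("runs"), list)
    )


# The occasional LGTM error response quoted in the commit message.
lgtm_error = json.loads(
    '{"code":404,"error":"The specified analysis could not be found"}'
)
# A minimal object with the expected top-level keys.
minimal_sarif = json.loads('{"version": "2.1.0", "runs": []}')

print(looks_like_sarif(lgtm_error))
print(looks_like_sarif(minimal_sarif))
```

A shape check like this runs before any traversal, so the error can be reported once per file instead of surfacing later as a KeyError.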
Michael Hohn
120e673424 Fix: handle relatedLocations without physicalLocations (files)
Problem:
    The
        artifact = get(related_location, 'physicalLocation', 'artifactLocation')
    requested by
        message, artifact, region = S.get_relatedlocation_message_info(location)
    is incomplete:
        ipdb> p related_location
        {'message': {'text': 'request'}}

Fix:
    Introduce the NoFile class to propagate this and handle it where needed.

Now simply report <NoFile> as appropriate.
    For plain text output:

        RESULT: src/optionsparser/ ..
        FLOW STEP 0: <NoFile>: request
        FLOW STEP 1: <NoFile>: request_mp
        FLOW STEP 2: src/....

    For csv output:

        "result","src/optionsparser/...","116","26","116","34","`& ...` used as ..."
        "flow_step","0","<NoFile>","-1","-1","-1","-1","request"
        "flow_step","1","<NoFile>","-1","-1","-1","-1","request_mp"
        "flow_step","2","src/foo.cpp","119","97","119","104","request"
2021-12-06 12:37:35 -08:00
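The sentinel approach described in this commit can be sketched as follows. Only the name `NoFile` and the `<NoFile>` output text come from the commit message; the helpers `artifact_or_nofile` and `flow_step_text` are hypothetical and may differ from the actual sarif_cli code.

```python
class NoFile:
    """Sentinel for a location without physicalLocation/artifactLocation."""

    def __str__(self):
        return "<NoFile>"


def artifact_or_nofile(related_location):
    """Return the artifactLocation dict, or a NoFile instance if absent."""
    physical = related_location.get("physicalLocation")
    if physical is None or "artifactLocation" not in physical:
        return NoFile()
    return physical["artifactLocation"]


def flow_step_text(index, related_location):
    """Format one flow step in the plain text style shown above."""
    artifact = artifact_or_nofile(related_location)
    message = related_location["message"]["text"]
    if isinstance(artifact, NoFile):
        return f"FLOW STEP {index}: {artifact}: {message}"
    return f"FLOW STEP {index}: {artifact['uri']}: {message}"


# The incomplete location from the ipdb session in the commit message.
print(flow_step_text(0, {"message": {"text": "request"}}))
```

Returning a sentinel object instead of raising lets the caller decide how to render the missing file, which suits output that must keep a fixed structure.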
Michael Hohn
2c3ca3c0eb Fix for KeyError: 'region', caused by result without region
Region / line / column information is present in most messages.  The one that
caused this error refers to the whole file:

    ipdb> p sarif_struct

    {'ruleId': 'com.lgtm/cpp-queries:cpp/missing-header-guard', 'ruleIndex': 12,
    'message': {'text': 'This header file should contain a header guard to prevent
    multiple inclusion.'}, 'locations': [{'physicalLocation': {'artifactLocation':
    {'uri': 'diff/cmpbuf.h', 'uriBaseId': '%SRCROOT%', 'index': 13}}}],
    'partialFingerprints': {'primaryLocationLineHash': 'd04cb834fa64727d:1',
    'primaryLocationStartColumnFingerprint': '0'}}

The goal is fixed-structure output formatting, so whole-file output uses
-1,-1,-1,-1 for line, column information.
2021-12-06 11:48:53 -08:00
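The fixed-structure fallback can be illustrated with a short sketch. The names `WHOLE_FILE` and `region_fields` are hypothetical, and the defaults used for a partially filled region (startColumn 1, endLine equal to startLine, endColumn equal to startColumn) are assumptions for illustration, not necessarily what the repository does.

```python
# Placeholder tuple for results that refer to a whole file.
WHOLE_FILE = (-1, -1, -1, -1)


def region_fields(location):
    """Return (start_line, start_column, end_line, end_column).

    A location without a 'region' key yields WHOLE_FILE instead of
    raising KeyError, so every output row has the same number of fields.
    """
    region = location.get("physicalLocation", {}).get("region")
    if region is None:
        return WHOLE_FILE
    start_line = region["startLine"]
    start_column = region.get("startColumn", 1)          # assumed default
    end_line = region.get("endLine", start_line)         # assumed default
    end_column = region.get("endColumn", start_column)   # assumed placeholder
    return (start_line, start_column, end_line, end_column)


# The location from the missing-header-guard result shown above.
whole_file_location = {
    "physicalLocation": {
        "artifactLocation": {
            "uri": "diff/cmpbuf.h",
            "uriBaseId": "%SRCROOT%",
            "index": 13,
        }
    }
}

print(region_fields(whole_file_location))
```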
Michael Hohn
ffcacec630 sarif-results-summary: add csv output option 2021-12-06 11:48:53 -08:00
Michael Hohn
f9c3e18842 Add * Examples to README 2021-12-06 11:48:53 -08:00
Michael Hohn
44f61dc70c Add wxWidget subset as test case 2021-12-06 11:48:53 -08:00
Michael Hohn
f0aa815a9a Fix encoding read error
When a read using
: with open(fname, 'r') as file:
hits the accented letter á in Vrána in the file
: data/wxWidgets-small/src/stc/scintilla/lexers/LexCSS.cxx
it results in a
: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 119: invalid continuation byte

We are reading source code, so we likely don't care about dropping non-ascii; using
: with codecs.open(fname, 'r', encoding="latin-1") as file:
ignores this problem.
2021-12-06 11:48:53 -08:00
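The behavior described in this commit can be reproduced with a small self-contained sketch. The helper name `read_source_lines` and the temporary sample file are hypothetical; the sketch uses the built-in `open` with `encoding="latin-1"` rather than `codecs.open`. Decoding as latin-1 cannot fail, because every byte value maps to a code point.

```python
import os
import tempfile


def read_source_lines(fname):
    """Read a source file as latin-1 so any byte sequence decodes."""
    with open(fname, "r", encoding="latin-1") as file:
        return file.read().splitlines()


# Write a sample line containing byte 0xe1 (latin-1 'á'), as in 'Vrána'.
with tempfile.NamedTemporaryFile(delete=False, suffix=".cxx") as tmp:
    tmp.write(b"// Copyright Vr\xe1na\n")
    sample_path = tmp.name

# Reading the same bytes as UTF-8 raises UnicodeDecodeError.
try:
    with open(sample_path, "r", encoding="utf-8") as file:
        file.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

latin1_lines = read_source_lines(sample_path)
os.remove(sample_path)

print(utf8_failed)
print(latin1_lines)
```

One trade-off of this choice: files that really are UTF-8 will show multi-byte characters as several latin-1 characters, which may shift displayed columns on those lines.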
Michael Hohn
85ddaaafe1 sarif-results-summary: add codeFlow (path-problem) output, remove meta-data
The per-language result counts are removed; they belong in a separate sarif-info script.
2021-12-06 11:48:53 -08:00
Michael Hohn
29b62b8b1a Remove unused requirements 2021-12-06 11:48:53 -08:00
Michael Hohn
303d063940 Add note on git lfs requirement 2021-12-06 11:48:53 -08:00
Michael Hohn
6147e57260 Introduce get_relatedlocation_message_info to co-locate tree information 2021-11-17 16:34:20 -08:00
Michael Hohn
1f7e78b049 refactor: introduce get_location_message_info 2021-11-17 16:28:43 -08:00
Michael Hohn
8036ea5ffc factor common result prefix 2021-11-17 16:14:36 -08:00
Michael Hohn
90758f769f factor common code into display_underlined 2021-11-17 15:56:43 -08:00
Michael Hohn
f5bb156c8c Add option to print related location info (sarif-results-summary -r) 2021-11-16 21:46:55 -08:00
Michael Hohn
9f3be7bcb0 Log missing files, but try to continue execution 2021-11-16 21:45:54 -08:00
Michael Hohn
502cb21850 Add source files for relatedLocations 2021-11-16 21:42:28 -08:00
Michael Hohn
4ca7dda579 Add TODO to sarif-list-files
TODO: list files from the relatedLocations property
2021-11-16 21:32:07 -08:00
Michael Hohn
e36874cb54 sarif-results-summary: underline affected code region
Using
    sarif-results-summary -s data/linux-small data/torvalds_linux__2021-10-21_10_07_00__export.sarif |less
now underscores the indicated regions, e.g.

tools/cgroup/iocost_monitor.py:64:5:64:27: Normal methods should have 'self', rather than 'blkcg', as their first parameter.

    def blkcg_name(blkcg):
    ^^^^^^^^^^^^^^^^^^^^^^
2021-11-15 14:16:23 -08:00
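The underlining in the example above is consistent with a 1-based start column and an exclusive end column: 64:5:64:27 gives 4 leading spaces and 22 markers. A minimal sketch under that assumption, with the hypothetical name `underline_region`:

```python
def underline_region(start_column, end_column, marker="^"):
    """Build an underline for a single-line region.

    Assumes 1-based columns with an exclusive end column.
    """
    return " " * (start_column - 1) + marker * (end_column - start_column)


source_line = "    def blkcg_name(blkcg):"
print(source_line)
print(underline_region(5, 27))
```

Regions that span several lines would need a per-line treatment, which this sketch does not attempt.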
Michael Hohn
a756abbb09 Consistency with tabs in Python source code
In load_lines, use 1 space for each tab
2021-11-15 14:00:18 -08:00
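One plausible reason for using a single space per tab is that a tab counts as one character in a column offset, so a one-for-one replacement keeps underline positions aligned with the reported columns. The sketch below shows that idea; `load_lines_sketch` is a hypothetical stand-in that takes text directly and is not the repository's `load_lines`.

```python
def load_lines_sketch(text):
    """Split text into lines, strip only the newline, map each tab to one space."""
    return [
        line.rstrip("\n").replace("\t", " ")
        for line in text.splitlines(keepends=True)
    ]


sample = "\tdef f(self):\n\t\treturn 1  \n"
for line in load_lines_sketch(sample):
    print(repr(line))
```

Trailing spaces are kept on purpose, since stripping more than the newline would change line lengths.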
Michael Hohn
912f75c52a fix load_lines: only strip newlines 2021-11-15 13:41:51 -08:00
Michael Hohn
8d1aa8f11e Include linux/ top-level files 2021-11-15 12:56:32 -08:00
Michael Hohn
b69eec404d sarif-results-summary -s: include source file lines in output 2021-11-09 16:10:12 -08:00
Michael Hohn
ab1d7c27ef Use sensible values for start/end line/columns for empty entries in the sarif 'region' structure. 2021-11-09 15:04:36 -08:00
Michael Hohn
2d1180a515 use python 3 'key in dict' idiom 2021-11-09 14:45:16 -08:00
Michael Hohn
e4cee2b6a6 fix permissions 2021-11-09 14:30:31 -08:00
Michael Hohn
a0af2c8c59 fix: traverse all languages 2021-11-09 14:29:31 -08:00
Michael Hohn
c6641019bf include cpp result files 2021-11-09 14:28:45 -08:00
Michael Hohn
d4af129033 add license 2021-11-09 12:25:37 -08:00
Michael Hohn
3032fe3fcd pre-alpha versions of bin/sarif-{digest,labeled,list-files,results-summary} 2021-11-09 12:21:12 -08:00
Michael Hohn
d180a079b0 include needed python result files 2021-11-09 11:46:14 -08:00