Commit Graph

166 Commits

Author SHA1 Message Date
Michael Hohn
b61fbf8896 Small documentation update 2024-11-19 15:25:35 -08:00
Michael Hohn
dd776e312a Add type information 2024-11-19 15:24:41 -08:00
Michael Hohn
18333bfdb1 Start hepc-init: the data collector for DBs on the file system 2024-11-19 15:23:20 -08:00
Michael Hohn
e335b6c843 Add ignore rules 2024-11-18 13:11:47 -08:00
Michael Hohn
4d52176c5a Add Publisher Confirms and Consumer Acknowledgements to RabbitMQ channels
Also updated the end-to-end workflow

The confirmation channel size is intentionally very large to prevent
blocking the server or agents.
2024-11-14 12:04:18 -08:00
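As a rough illustration of the sizing note in commit 4d52176c5a above: if nothing drains the confirmation channel fast enough, a small buffer fills and the sender would block, while a large buffer absorbs the burst. This is a standard-library simulation only, not RabbitMQ client code; `confirmation` and `deliverConfirms` are hypothetical names, and the buffer sizes are arbitrary.

```go
package main

import "fmt"

// confirmation is a hypothetical stand-in for a broker's publisher confirm.
type confirmation struct {
	DeliveryTag uint64
	Ack         bool
}

// deliverConfirms tries to hand n confirmations to the channel while no
// reader is draining it, and returns how many sends would have blocked.
func deliverConfirms(confirms chan confirmation, n int) (blocked int) {
	for tag := 1; tag <= n; tag++ {
		select {
		case confirms <- confirmation{DeliveryTag: uint64(tag), Ack: true}:
			// Buffered: the sender continues without waiting for a reader.
		default:
			// Buffer full: a plain send would block here.
			blocked++
		}
	}
	return blocked
}

func main() {
	small := make(chan confirmation, 4)
	large := make(chan confirmation, 10000)
	fmt.Println("small buffer, would-block sends:", deliverConfirms(small, 100)) // 96
	fmt.Println("large buffer, would-block sends:", deliverConfirms(large, 100)) // 0
}
```

The trade-off is memory for the buffer versus the risk of a stalled publisher; the commit message suggests the author chose the former.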
Michael Hohn
dd58a64ef7 Add summary of VS Code plugin build procedure 2024-11-06 12:43:56 -08:00
Michael Hohn
4e93929943 Code generalization: cleanup 2024-10-30 11:10:28 -07:00
Michael Hohn
e7d32861e5 Code generalization: request db info from other source: remove unused constants 2024-10-28 18:45:21 -07:00
Michael Hohn
52aafd6fc9 Code generalization: request db info from other source: remove unnecessary types 2024-10-28 14:34:07 -07:00
Michael Hohn
77ce997fbb Major changes to support cli-end-to-end demonstration. See full log
* notes/cli-end-to-end-demo.org (Database Aquisition):
  Starting description for the end-to-end demonstration workflow.
  Very simplified version of notes/cli-end-to-end.org

* docker-compose-demo.yml (services):
  Make the pre-populated ql database storage an explicit container
  to get persistent data and straightforward mount semantics.

* docker-compose-demo-build.yml (services):
  Add a docker-compose configuration for *building* the demo environment.

* demo/containers/dbsdata/Dockerfile:
  Add dbsdata Docker image to hold initialized minio database file tree

* client/containers/vscode/README.org:
  Update vscode container to use custom plugin for later mrva redirection
2024-10-15 10:18:42 -07:00
Michael Hohn
187c49688e Add temporary files to .gitignore 2024-10-11 13:38:28 -07:00
Michael Hohn
d5bcb8b981 Moved file content 2024-09-30 11:55:46 -07:00
Michael Hohn
ec0799696e Remove quotes -- they became part of the names 2024-09-30 10:01:50 -07:00
Michael Hohn
9ccea8ac80 Minor cleanup 2024-09-26 13:38:50 -07:00
Michael Hohn
080c311516 Redirect localhost via network_mode: "service:server" 2024-09-26 13:38:17 -07:00
Michael Hohn
faeb13efb1 7. [X] Download the sarif files, optionally also get databases. For the current 2024-09-26 13:33:29 -07:00
Michael Hohn
0378c4cb7f 6. [X] Check the status 2024-09-26 13:32:00 -07:00
Michael Hohn
7de3ee59ce Fixed dependencies, run 'Submit the mrva job' 2024-09-26 13:28:28 -07:00
Michael Hohn
7ae6e9a1cb Add codeql to gh-mrva container 2024-09-26 12:50:20 -07:00
Michael Hohn
2d92ad51c3 Migrate entries from global Makefile to local 2024-09-26 12:36:31 -07:00
Michael Hohn
bef8a6dc97 4. [X] Provide the CodeQL query 2024-09-24 16:28:05 -07:00
Michael Hohn
d08e32dc42 3. [X] Provide the specification files 2024-09-24 16:22:23 -07:00
Michael Hohn
64b77c5d70 Add 'Run MRVA from command line, set up the configuration' 2024-09-24 14:20:23 -07:00
Michael Hohn
71ce8c0823 Add notes on docker-compose -f docker-compose-demo.yml up -d 2024-09-24 13:23:55 -07:00
Michael Hohn
067e477f61 fix 2024-09-24 12:58:11 -07:00
Michael Hohn
8f807e0e42 4. Starting the server
4.1. Optional: Inspect the Backing Store
4.2. Optional: Inspect the MinIO DB
2024-09-24 12:51:21 -07:00
Michael Hohn
195dda9fd7 Add 'Repository Selection' 2024-09-19 11:11:38 -07:00
Michael Hohn
f60b55f181 Storage container simplification
Only one is really needed, to provide large storage for the dbstore container.
The demo containers can contain their own data -- it's small and the
containers are made for demonstration anyway.
2024-09-13 12:04:30 -07:00
Michael Hohn
727381dc5a Fix: use explicit file names instead of $@ 2024-09-13 11:55:02 -07:00
Michael Hohn
a35fc619e6 Use mk. prefix for Makefile time stamps and make git ignore them 2024-09-13 09:44:08 -07:00
Michael Hohn
8dd6c94918 Set up and push fully configured VS Code container 2024-09-12 14:05:59 -07:00
Michael Hohn
34958e4cf4 WIP: Working individual containers and docker compose demo 2024-09-12 09:49:25 -07:00
Michael Hohn
259bac55fb Add container mapping diagram 2024-09-12 09:46:28 -07:00
Michael Hohn
41f6db5de0 Add Makefile to push mrva agent container image 2024-09-06 14:42:31 -07:00
Michael Hohn
19330c3a0f Add Makefile to push mrva server container image 2024-09-06 11:40:13 -07:00
Michael Hohn
1e2df515e3 Set up and push Docker containers for demonstration purposes
These containers take the place of a desktop install
2024-09-04 15:52:18 -07:00
Michael Hohn
681fcdab8c Add new containers to streamline setup 2024-08-29 13:22:59 -07:00
Michael Hohn
5021fc824b Fix: include minio in requirements.txt 2024-08-23 08:17:27 -07:00
Michael Hohn
7d27b910cd Fix: include CID when filtering in mc-rows-from-mrva-list 2024-08-22 13:58:07 -07:00
Michael Hohn
0d3f4c5e40 Updated requirements for container 2024-08-21 16:25:13 -07:00
Michael Hohn
a86f955aab Clarify notes/cli-end-to-end.org -> notes/cli-end-to-end-detailed.org 2024-08-21 11:19:00 -07:00
Michael Hohn
c556605e44 Run containers without mitmweb proxy 2024-08-21 11:17:20 -07:00
Michael Hohn
7b06484b29 Get GitHub to render cli-end-to-end.org 2024-08-16 15:06:10 -07:00
Michael Hohn
fc751ae08f Add full walkthrough description in notes/cli-end-to-end.org 2024-08-16 14:39:44 -07:00
Michael Hohn
d956f47db3 Fix: Produce complete SARIF output in agent
The problem was missing fields in the SARIF output.  The debugging below
showed the cause to be the JSON -> Go -> JSON round trip: the Go -> JSON
conversion only outputs the fields defined in the Go struct.

Because SARIF has so many optional fields, no attempt is made to enforce
a statically-defined structure.  Instead, the JSON -> Go conversion is
now to a fully dynamic structure; unused fields are simply passed through.

Debugging:

       Comparing two SARIF files shows

           {
             "$schema" : "https://json.schemastore.org/sarif-2.1.0.json",
             "version" : "2.1.0",
             "runs" : [ {...
             } ]
           }

       and

           {
             "runs": [...
             ]
           }

       so there are missing fields.

    The Problem
     1. Problem origin

          // Modify the sarif: start by extracting
          var sarif Sarif
          if err := json.Unmarshal(sarifData, &sarif); err != nil {
              return nil, fmt.Errorf("failed to unmarshal SARIF: %v", err)
          }
          ...
          // now inject version control info
          ...
          // and write it back
          sarifBytes, err := json.Marshal(sarif)
          if err != nil {
              return nil, fmt.Errorf("failed to marshal SARIF: %v", err)
          }

     2. But the struct only has one of the needed fields

          type Sarif struct {
              Runs []SarifRun `json:"runs"`
          }

     3. From the docs:

          // To unmarshal JSON into a struct, Unmarshal matches incoming object
          // keys to the keys used by [Marshal] (either the struct field name or its tag),
          // preferring an exact match but also accepting a case-insensitive match. By
          // default, object keys which don't have a corresponding struct field are
          // ignored (see [Decoder.DisallowUnknownFields] for an alternative).
2024-08-16 14:27:46 -07:00
Michael Hohn
0a52b729cd Expand the codeql db download response
End-to-end testing contained an unhandled CodeQL database download
request.  The handlers are added in this patch.  Debugging info is below
for reference.

The mrvacommander *server* fails with the following.  The source code is
: func setupEndpoints(c CommanderAPI)
See mrvacommander/pkg/server/server.go, endpoints for getting a URL to download artifacts.

  Original

          Downloading artifacts for tdlib_telegram-bot-apictsj8529d9_2
          ...
          Downloading database tdlib/telegram-bot-apictsj8529d9 cpp mirva-session-1400 tdlib_telegram-bot-apictsj8529d9_2
          ...
          2024/08/13 12:31:38 >> GET http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          ...
          2024/08/13 12:31:38 << 404 http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          ...
          -rwxr-xr-x@  1 hohn  staff  169488 Aug 13 12:29 tdlib_telegram-bot-apictsj8529d9_2.sarif*
          -rwxr-xr-x@  1 hohn  staff      10 Aug 13 12:31 tdlib_telegram-bot-apictsj8529d9_2_db.zip*

  Server log

          server         | 2024/08/13 19:31:38 ERROR Unhandled endpoint method=GET uri=/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp

  Try a manual download from the server

          8:$ wget http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          --2024-08-13 12:56:05--  http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          Resolving localhost (localhost)... ::1, 127.0.0.1
          Connecting to localhost (localhost)|::1|:8080... connected.
          HTTP request sent, awaiting response... 404 Not Found
          2024-08-13 12:56:05 ERROR 404: Not Found.

          server         | 2024/08/13 19:56:05 ERROR Unhandled endpoint method=GET uri=/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp

  The full info for the DB
          tdlib,telegram-bot-api,8529d9,2.17.0,2024-05-09 08:02:49.545174+00:00,cpp,f95d406da67adb8ac13d9c562291aa57c65398e0,306106.0,/Users/hohn/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/tdlib/telegram-bot-api/code-scanning/codeql/databases/cpp/db.zip,cpp,C/C++,1244.0,306106.0,2024-05-13T15:54:54.749093,cpp,True,3375,373477635

The gh-mrva *client* sends the following.  The source is
gh-mrva/utils/utils.go,
    client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))

We have
  cd /Users/hohn/work-gh/mrva/gh-mrva
  0:$ rg 'repos/.*/code-scanning/codeql/databases'

          ...
          utils/utils.go
          625:	// resp, err := client.Get(fmt.Sprintf("https://api.github.com/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))
          626:	resp, err := client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))

  And
          resp, err := client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))

The original DB upload was
  cd ~/work-gh/mrva/mrvacommander/client/qldbtools && \
      ./bin/mc-db-populate-minio -n 11 < scratch/db-info-3.csv

  ...
  2024-08-14 09:29:19 [INFO] Uploaded /Users/hohn/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/tdlib/telegram-bot-api/code-scanning/codeql/databases/cpp/db.zip as tdlib$telegram-bot-apictsj8529d9.zip to bucket qldb
  ...
2024-08-14 13:01:15 -07:00
Michael Hohn
6bebf4abfc Remove interactive debug statements 2024-08-13 09:27:13 -07:00
Michael Hohn
9d60489908 wip: Handle varying CodeQL DB formats. This code contains debugging features
This patch fixes the following

     - [X] Wrong db metadata path.  Fixed via
       : globRecursively(databasePath, "codeql-database.yml")

       The log output for reference:

                 agent          | 2024/08/09 21:16:40 DEBUG XX:getDataBaseMetadata databasePath=/tmp/ce523549-a217-4b54-a118-7224ce444870/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
                 agent          | 2024/08/09 21:16:40 DEBUG XX:getDataBaseMetadata databasePath=/tmp/bc24fe72-b520-4e72-9634-a98d630cb75e/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
                 agent          | 2024/08/09 21:16:40 DEBUG Received signal: %s "user defined signal 1"=<nil>
                 agent          | 2024/08/09 21:16:40 DEBUG XX:getDataBaseMetadata databasePath=/tmp/41fcf5cc-e151-4a11-bccc-481d599aa426/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>

            From

                 func getDatabaseMetadata(databasePath string) (*DatabaseMetadata, error) {
                 data, err := os.ReadFile(filepath.Join(databasePath, "codeql-database.yml"))
                 ...}

            And some inspection:

                 root@3fa4b8013336:~# find /tmp |grep ql-datab
                 /tmp/27f09b9f-254f-4ef5-abf5-9a1a2927906b/db/cpp/codeql-database.yml
                 /tmp/d7e14cd4-8789-4176-81bc-2ac1957ed9fd/db/codeql_db/codeql-database.yml
                 /tmp/41fcf5cc-e151-4a11-bccc-481d599aa426/db/codeql_db/codeql-database.yml
                 /tmp/bc24fe72-b520-4e72-9634-a98d630cb75e/db/codeql_db/codeql-database.yml
                 /tmp/ce523549-a217-4b54-a118-7224ce444870/db/codeql_db/codeql-database.yml

     - [X] Wrong db path.  Fixed via
       : findDBDir(databasePath)

       The log output for reference:

                 agent          | 2024/08/09 21:51:09 ERROR Failed to run analysis job error="failed to run analysis: failed to run queries: exit status 2\nOutput: A fatal error occurred: /tmp/91c61e0b-dfd9-4dd3-a3ad-cb77dbc1cbfd/db is not a recognized CodeQL database.\n"
                 agent          | 2024/08/09 21:51:09 INFO Running analysis job job="{Spec:{SessionID:1 NameWithOwner:{Owner:USCiLab Repo:cerealctsj264953}} QueryPackLocation:{Key:1 Bucket:packs} QueryLanguage:cpp}"
                 agent          | 2024/08/09 21:51:09 ERROR Failed to run analysis job error="failed to run analysis: failed to run queries: exit status 2\nOutput: A fatal error occurred: /tmp/1b8ffeba-8ad1-465e-8ec7-36cda449a5f5/db is not a recognized CodeQL database.\n"
                 ...

            This is easily confirmed:

                 root@171b5417e05f:~# /opt/codeql/codeql database upgrade  /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/
                 A fatal error occurred: /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2 is not a recognized CodeQL database.

            Another try:

                 root@171b5417e05f:~# /opt/codeql/codeql database upgrade  /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/database.zip
                 A fatal error occurred: Database root /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/database.zip is not a directory.

             This one is correct:

                 root@171b5417e05f:~# /opt/codeql/codeql database upgrade /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/db/codeql_db
                 /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/db/codeql_db/db-cpp is up to date.

     - [X] Wrong database source prefix.  Also fixed via
       : findDBDir(databasePath)

       Similar log entries:

                 agent          | 2024/08/13 15:40:14 ERROR Failed to run analysis job error="failed to run analysis: failed to get source location prefix: failed to resolve database: exit status 2\nOutput: A fatal error occurred: /tmp/da420844-a284-4d82-9470-fa189a5b4ee6/db is not a recognized CodeQL database.\n"
                 agent          | 2024/08/13 15:40:14 INFO Worker stopping due to reduction in worker count
                 agent          | 2024/08/13 15:40:18 ERROR Failed to run analysis job error="failed to run analysis: failed to get source location prefix: failed to resolve database: exit status 2\nOutput: A fatal error occurred: /tmp/eebfc52c-3ecf-490d-bbf4-23c305d6ba18/db is not a recognized CodeQL database.\n"

            and
                 agent          | 2024/08/13 15:49:33 ERROR Failed to resolve database err="exit status 2" output="A fatal error occurred: /tmp/b5c4941a-5692-4640-aa79-9810bcab39f4/db is not a recognized CodeQL database.\n"
                 agent          | 2024/08/13 15:49:33 DEBUG XX: RunQuery failed to get source location prefixdatabasePath=/tmp/b5c4941a-5692-4640-aa79-9810bcab39f4/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
                 agent          | 2024/08/13 15:49:35 INFO Modifying worker count current=3 new=2
                 agent          | 2024/08/13 15:49:35 ERROR Failed to resolve database err="exit status 2" output="A fatal error occurred: /tmp/eda30582-81a3-4582-8897-65f8904e8501/db is not a recognized CodeQL database.\n"
                 agent          | 2024/08/13 15:49:35 DEBUG XX: RunQuery failed to get source location prefixdatabasePath=/tmp/eda30582-81a3-4582-8897-65f8904e8501/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>

            And this fails

                 root@51464985499f:~# /opt/codeql/codeql resolve database /tmp/eda30582-81a3-4582-8897-65f8904e8501/db/
                 A fatal error occurred: /tmp/eda30582-81a3-4582-8897-65f8904e8501/db is not a recognized CodeQL database.

            But this works:

                 root@51464985499f:~# /opt/codeql/codeql resolve database /tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/
                 {
                   "sourceLocationPrefix" : "/home/runner/work/bulk-builder/bulk-builder",
                   "columnKind" : "utf8",
                   "unicodeNewlines" : false,
                   "sourceArchiveZip" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/src.zip",
                   "sourceArchiveRoot" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/src",
                   "datasetFolder" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/db-cpp",
                   "logsFolder" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/log",
                   "languages" : [
                     "cpp"
                   ],
                   "scratchDir" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/working"
                 }
2024-08-13 09:22:24 -07:00
Michael Hohn
35100f89a7 Add html generation/view targets 2024-08-13 00:03:16 -07:00
Michael Hohn
742b059a49 Add script to list full details for a mrva-list file 2024-08-09 08:37:31 -07:00