Michael Hohn 0a52b729cd Expand the codeql db download response
End-to-end testing contained an unhandled CodeQL database download
request.  The handlers are added in this patch.  Debugging info is below
for reference.

The mrvacommander *server* fails with the following.  The relevant source is
    func setupEndpoints(c CommanderAPI)
See mrvacommander/pkg/server/server.go, in particular the endpoints for getting a URL to download artifacts.

  Original

          Downloading artifacts for tdlib_telegram-bot-apictsj8529d9_2
          ...
          Downloading database tdlib/telegram-bot-apictsj8529d9 cpp mirva-session-1400 tdlib_telegram-bot-apictsj8529d9_2
          ...
          2024/08/13 12:31:38 >> GET http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          ...
          2024/08/13 12:31:38 << 404 http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          ...
          -rwxr-xr-x@  1 hohn  staff  169488 Aug 13 12:29 tdlib_telegram-bot-apictsj8529d9_2.sarif*
          -rwxr-xr-x@  1 hohn  staff      10 Aug 13 12:31 tdlib_telegram-bot-apictsj8529d9_2_db.zip*

  Server log

          server         | 2024/08/13 19:31:38 ERROR Unhandled endpoint method=GET uri=/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp

  Try a manual download from the server

          8:$ wget http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          --2024-08-13 12:56:05--  http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
          Resolving localhost (localhost)... ::1, 127.0.0.1
          Connecting to localhost (localhost)|::1|:8080... connected.
          HTTP request sent, awaiting response... 404 Not Found
          2024-08-13 12:56:05 ERROR 404: Not Found.

          server         | 2024/08/13 19:56:05 ERROR Unhandled endpoint method=GET uri=/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp

  The full info for the DB
          tdlib,telegram-bot-api,8529d9,2.17.0,2024-05-09 08:02:49.545174+00:00,cpp,f95d406da67adb8ac13d9c562291aa57c65398e0,306106.0,/Users/hohn/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/tdlib/telegram-bot-api/code-scanning/codeql/databases/cpp/db.zip,cpp,C/C++,1244.0,306106.0,2024-05-13T15:54:54.749093,cpp,True,3375,373477635
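
The 404 responses above arise because no route matches the database download path. A minimal sketch of such a route, using only the standard library, might look as follows. The names parseDBPath and dbDownloadHandler and the placeholder response body are assumptions for illustration; this is not the code in server.go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// parseDBPath splits a path of the form
// /repos/{owner}/{repo}/code-scanning/codeql/databases/{language}
// into its variable parts. The last result is false for any other shape.
func parseDBPath(path string) (string, string, string, bool) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 7 || parts[0] != "repos" ||
		parts[3] != "code-scanning" || parts[4] != "codeql" || parts[5] != "databases" {
		return "", "", "", false
	}
	if parts[1] == "" || parts[2] == "" || parts[6] == "" {
		return "", "", "", false
	}
	return parts[1], parts[2], parts[6], true
}

// dbDownloadHandler answers GET requests for a database path.
// Placeholder body: a real handler would stream the zip from the database store.
func dbDownloadHandler(w http.ResponseWriter, r *http.Request) {
	owner, repo, language, ok := parseDBPath(r.URL.Path)
	if r.Method != http.MethodGet || !ok {
		http.NotFound(w, r)
		return
	}
	fmt.Fprintf(w, "database for %s/%s (%s)\n", owner, repo, language)
}

func main() {
	// Replay the failing request against the sketch, in process.
	req := httptest.NewRequest(http.MethodGet,
		"/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp", nil)
	rec := httptest.NewRecorder()
	dbDownloadHandler(rec, req)
	fmt.Println(rec.Code, strings.TrimSpace(rec.Body.String()))
}
```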

The gh-mrva *client* sends the following.  The source is
gh-mrva/utils/utils.go,
    client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))

We have
  cd /Users/hohn/work-gh/mrva/gh-mrva
  0:$ rg 'repos/.*/code-scanning/codeql/databases'

          ...
          utils/utils.go
          625:	// resp, err := client.Get(fmt.Sprintf("https://api.github.com/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))
          626:	resp, err := client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))

  And
          resp, err := client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))
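
The 10-byte tdlib_telegram-bot-apictsj8529d9_2_db.zip in the listing above is consistent with an error response being saved as if it were a database. A hedged sketch of a download helper that checks the status before writing anything follows. The names dbURL and downloadDB and their parameters are hypothetical; this is not the gh-mrva code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// dbURL builds the download URL in the same shape as the client call above.
func dbURL(base, nwo, language string) string {
	return fmt.Sprintf("%s/repos/%s/code-scanning/codeql/databases/%s", base, nwo, language)
}

// downloadDB fetches url into dest. On any status other than 200 it
// returns an error and does not create the destination file.
func downloadDB(client *http.Client, url, dest string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("GET %s: unexpected status %s", url, resp.Status)
	}
	out, err := os.Create(dest)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, resp.Body); err != nil {
		out.Close()
		return err
	}
	return out.Close()
}

func main() {
	// Stand-in server that, like the unpatched server, answers 404.
	srv := httptest.NewServer(http.NotFoundHandler())
	defer srv.Close()

	dest := filepath.Join(os.TempDir(), "example_db.zip")
	err := downloadDB(srv.Client(), dbURL(srv.URL, "tdlib/telegram-bot-apictsj8529d9", "cpp"), dest)
	fmt.Println("error:", err)
}
```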

The original DB upload was
  cd ~/work-gh/mrva/mrvacommander/client/qldbtools && \
      ./bin/mc-db-populate-minio -n 11 < scratch/db-info-3.csv

  ...
  2024-08-14 09:29:19 [INFO] Uploaded /Users/hohn/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/tdlib/telegram-bot-api/code-scanning/codeql/databases/cpp/db.zip as tdlib$telegram-bot-apictsj8529d9.zip to bucket qldb
  ...
2024-08-14 13:01:15 -07:00

Overview

TODO diagram

TODO Style notes

  • NO package init() functions
  • Dynamic behaviour must be explicit

Client CodeQL Database Selector

Separate from the server's downloading of databases, a client-side interface is needed to generate the databases.json file. This

  1. must be usable from the shell
  2. must be interactive (Python, Jupyter)
  3. is session based to allow iterations on selection / narrowing
  4. must be queryable. There is no need to reinvent sql / dataframes

Python with dataframes is ideal for this; the project is in client/.

Reverse proxy

For testing, replay flows using mitmweb. This is faster and simpler than using gh-mrva or the VS Code plugin.

  • Set up the virtual environment and install tools

    python3.11 -m venv venv
    source venv/bin/activate
    pip install mitmproxy
    

For intercepting requests:

  1. Start mitmproxy to listen on port 8080 and forward requests to port 8081, with web interface

    mitmweb --mode reverse:http://localhost:8081 -p 8080
    
  2. Change server ports in docker-compose.yml to

    ports:
    - "8081:8080" # host:container
    
  3. Start the containers.

  4. Submit requests.

  5. Save the flows for later replay.

One such session is in tools/mitmweb-flows; it can be loaded to replay the requests:

  1. start mitmweb --mode reverse:http://localhost:8081 -p 8080
  2. file > open > tools/mitmweb-flows
  3. replay at least the submit, status, and download requests
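
To illustrate what the reverse-proxy mode does, here is a rough Go equivalent built on net/http/httputil. It is not mitmweb and records nothing for replay; it only logs and forwards. The helper name newLoggingProxy is an assumption, and the demo uses in-process test servers in place of ports 8080 and 8081.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newLoggingProxy returns a handler that logs each request and forwards it to target.
func newLoggingProxy(target string) (http.Handler, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	proxy := httputil.NewSingleHostReverseProxy(u)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf(">> %s %s", r.Method, r.URL.Path)
		proxy.ServeHTTP(w, r)
	}), nil
}

func main() {
	// backend stands in for the server published on port 8081.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello from backend")
	}))
	defer backend.Close()

	handler, err := newLoggingProxy(backend.URL)
	if err != nil {
		log.Fatal(err)
	}
	// front stands in for the proxy listening on port 8080.
	front := httptest.NewServer(handler)
	defer front.Close()

	resp, err := http.Get(front.URL + "/status")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.StatusCode, string(body))
}
```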

Cross-compile server on host, run it in container

These are simple steps using a single container.

  1. build server on host

    GOOS=linux GOARCH=arm64 go build
    
  2. build docker image

    cd cmd/server
    docker build -t server-image .
    
  3. Start container with shared directory

    docker run -it \
           -v   /Users/hohn/work-gh/mrva/mrvacommander:/mrva/mrvacommander \
           server-image
    
  4. Run server in container

    cd /mrva/mrvacommander/cmd/server/ && ./server
    

Using docker-compose

Steps to build and run the server

Steps to build and run the server in a multi-container environment set up by docker-compose.

  1. Build the server-image as described above

  2. Build server on host

    cd ~/work-gh/mrva/mrvacommander/cmd/server/
    GOOS=linux GOARCH=arm64 go build
    
  3. Start the containers

    cd ~/work-gh/mrva/mrvacommander/
    docker-compose down
    docker-compose up -d
    
  4. Run server in its container

    cd ~/work-gh/mrva/mrvacommander/
    docker exec -it server bash
    cd /mrva/mrvacommander/cmd/server/ 
    ./server -loglevel=debug -mode=container
    
  5. Test server from the host via

    cd ~/work-gh/mrva/mrvacommander/tools
    sh ./request_16-Jun-2024_11-33-16.curl
    
  6. Follow server logging via

    cd ~/work-gh/mrva/mrvacommander
    docker-compose up -d
    docker-compose logs -f server
    
  7. Completely rebuild all containers. Useful when running into docker errors

    cd ~/work-gh/mrva/mrvacommander
    docker-compose up --build
    
  8. Test server via remote client by following the steps in gh-mrva
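
Step 4 passes -loglevel and -mode flags to the server. How such flags might be parsed with the standard flag and log/slog packages is sketched below. The default values, the accepted levels other than debug, and the function name parseLogLevel are assumptions, not the server's actual option handling.

```go
package main

import (
	"flag"
	"fmt"
	"log/slog"
	"os"
)

// parseLogLevel maps a flag value to a slog level.
func parseLogLevel(s string) (slog.Level, error) {
	switch s {
	case "debug":
		return slog.LevelDebug, nil
	case "info":
		return slog.LevelInfo, nil
	case "warn":
		return slog.LevelWarn, nil
	case "error":
		return slog.LevelError, nil
	}
	return slog.LevelInfo, fmt.Errorf("unknown log level %q", s)
}

func main() {
	// The defaults "info" and "standalone" are assumed for this sketch.
	loglevel := flag.String("loglevel", "info", "log level: debug, info, warn, error")
	mode := flag.String("mode", "standalone", "execution mode, e.g. container")
	flag.Parse()

	level, err := parseLogLevel(*loglevel)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	// Install a text logger that honors the requested level.
	logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{Level: level}))
	slog.SetDefault(logger)
	slog.Debug("starting server", "mode", *mode)
}
```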

Some general docker-compose commands

  1. Get service status

    docker-compose ps
    
  2. Stop services

    docker-compose down
    
  3. View all logs

    docker-compose logs
    
  4. Check other containers from the server container

    docker exec -it server bash
    curl -I http://rabbitmq:15672
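
The same reachability check can be made from Go, which is convenient inside a test. The sketch below sends a HEAD request, as curl -I does, and unlike curl it treats any status of 400 or above as a failure. The name checkService and the 2-second timeout are assumptions, and the demo targets an in-process test server, not rabbitmq:15672.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// checkService sends a HEAD request to url and returns an error when the
// request fails or the status is 400 or above.
func checkService(url string) error {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Head(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		return fmt.Errorf("HEAD %s: status %s", url, resp.Status)
	}
	return nil
}

func main() {
	// Stand-in for a reachable service inside the compose network.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	fmt.Println("check result:", checkService(srv.URL))
}
```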
    

Use the minio ql database db

  1. Web access via

    open http://localhost:9001/login
    

    username / password are in docker-compose.yml for now. The ql db listing will be at

    http://localhost:9001/browser/qldb
    
  2. Populate the database by running

    ./populate-dbstore.sh
    

    from the host.

  3. The names in the bucket use the owner_repo_db.zip format for now, e.g. google_flatbuffers_db.zip. TODO: this will be enhanced to include other data later.

  4. Test Go's access to the dbstore -- from the host -- via

    cd ./test
    go test -v
    

    This should produce

    === RUN   TestDBListing
    dbstore_test.go:44: Object Key: google_flatbuffers_db.zip
    dbstore_test.go:44: Object Key: psycopg_psycopg2_db.zip
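
The naming convention from step 3 can be captured in a small helper, sketched here with the hypothetical names dbKey and splitDBKey. Splitting a key at the first underscore assumes the owner name itself contains no underscore; the format does not say where the owner ends, so this is a limitation of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// dbSuffix is the fixed tail of every object key in the current format.
const dbSuffix = "_db.zip"

// dbKey builds the bucket object key for a repository.
func dbKey(owner, repo string) string {
	return owner + "_" + repo + dbSuffix
}

// splitDBKey recovers owner and repo from a key, splitting at the first
// underscore. The last result is false when the key has another shape.
func splitDBKey(key string) (string, string, bool) {
	if !strings.HasSuffix(key, dbSuffix) {
		return "", "", false
	}
	base := strings.TrimSuffix(key, dbSuffix)
	owner, repo, found := strings.Cut(base, "_")
	if !found || owner == "" || repo == "" {
		return "", "", false
	}
	return owner, repo, true
}

func main() {
	fmt.Println(dbKey("google", "flatbuffers"))
	owner, repo, ok := splitDBKey("psycopg_psycopg2_db.zip")
	fmt.Println(owner, repo, ok)
}
```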
    

Use the minio query pack db

  1. Web access via

    open http://localhost:19001/login
    

    username / password are in docker-compose.yml for now. The query pack listing will be at

    http://localhost:19001/browser/qpstore
    

