Previously, the refined info was collected and the CID computed before saving. This was a major development time sink, so the CID is now computed in the following step (bin/mc-db-unique). The columns previously chosen for the CID are not sufficient: if these columns are empty for any reason, the CID repeats. Just including the owner/name won't help, because those values are duplicated. Some possibilities considered and rejected:

1. Use a random number for missing columns. But this makes the CID nondeterministic.
2. Switch to the file system ctime? It is not unique across owner/repo pairs, only within one. It could also be changed externally and cause *very* subtle bugs.
3. Use the file system path? It has to be unique at ingestion time, but repo collections can move.

Instead, this patch

4. Drops rows that are missing any of the cliVersion, creationTime, language, or sha columns. There are very few of them (16 out of 6000), and their DBs are questionable.
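For illustration, a minimal sketch of the row-dropping step, assuming the refined info is held in a pandas DataFrame; the function name and the treatment of empty strings as missing values are assumptions here, only the four column names come from the note above:

    import pandas as pd

    # Columns that must be present to make the CID deterministic.
    REQUIRED = ["cliVersion", "creationTime", "language", "sha"]

    def drop_incomplete_rows(df: pd.DataFrame) -> pd.DataFrame:
        """Drop rows whose required CID columns are missing or empty."""
        # Treat empty strings the same as missing values.
        cleaned = df.replace("", pd.NA)
        before = len(cleaned)
        cleaned = cleaned.dropna(subset=REQUIRED)
        print(f"dropped {before - len(cleaned)} of {before} rows")
        return cleaned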
Overview
TODO diagram
TODO Style notes
- NO package init() functions
- Dynamic behaviour must be explicit
Client CodeQL Database Selector
Separate from the server's downloading of databases, a client-side interface is needed to generate the databases.json file. This
- must be usable from the shell
- must be interactive (Python, Jupyter)
- is session based to allow iterations on selection / narrowing
- must be queryable; there is no need to reinvent SQL / dataframes
Python with dataframes is ideal for this; the project is in client/.
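A minimal sketch of such a session, assuming pandas; the column names, the input file, and the exact databases.json schema are placeholders, not the client's actual interface:

    import json
    import pandas as pd

    # Hypothetical input: one row per CodeQL database.
    df = pd.read_csv("databases.csv")

    # Iteratively narrow the selection in a Python / Jupyter session.
    selection = df[df["language"] == "cpp"]
    selection = selection[selection["stars"] > 100]

    # Write the narrowed selection as databases.json (schema assumed here).
    records = selection[["owner", "name"]].to_dict(orient="records")
    with open("databases.json", "w") as fh:
        json.dump(records, fh, indent=2)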
Reverse proxy
For testing, replay flows using mitmweb. This is faster and simpler than using gh-mrva or the VS Code plugin.
- Set up the virtual environment and install the tools:

    python3.11 -m venv venv
    source venv/bin/activate
    pip install mitmproxy
For intercepting requests:
- Start mitmweb to listen on port 8080 and forward requests to port 8081, with the web interface:

    mitmweb --mode reverse:http://localhost:8081 -p 8080
- Change the server ports in docker-compose.yml to

    ports:
        - "8081:8080" # host:container

- Start the containers.
- Submit requests.
- Save the flows for later replay.
One such session is in tools/mitmweb-flows; it can be loaded to replay the
requests:
- Start

    mitmweb --mode reverse:http://localhost:8081 -p 8080

  and load the flows via file > open > tools/mitmweb-flows in the web interface.
- Replay at least the submit, status, and download requests.
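To check which requests a saved session contains before replaying it, the flow file can also be read with mitmproxy's Python API. A minimal sketch, using the mitmproxy installed above and the flow file from this section:

    from mitmproxy import io
    from mitmproxy.exceptions import FlowReadException

    # Print the method and URL of every request captured in the saved session.
    with open("tools/mitmweb-flows", "rb") as fh:
        reader = io.FlowReader(fh)
        try:
            for flow in reader.stream():
                if hasattr(flow, "request"):
                    print(flow.request.method, flow.request.pretty_url)
        except FlowReadException as exc:
            print(f"flow file corrupted: {exc}")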
Cross-compile server on host, run it in container
These are simple steps using a single container.
- Build the server on the host:

    GOOS=linux GOARCH=arm64 go build

- Build the docker image:

    cd cmd/server
    docker build -t server-image .

- Start the container with a shared directory:

    docker run -it \
        -v /Users/hohn/work-gh/mrva/mrvacommander:/mrva/mrvacommander \
        server-image

- Run the server in the container:

    cd /mrva/mrvacommander/cmd/server/ && ./server
Using docker-compose
Steps to build and run the server
These steps build and run the server in a multi-container environment set up by docker-compose.
- Build the server-image as described above.
- Build the server on the host:

    cd ~/work-gh/mrva/mrvacommander/cmd/server/
    GOOS=linux GOARCH=arm64 go build

- Start the containers:

    cd ~/work-gh/mrva/mrvacommander/
    docker-compose down
    docker-compose up -d

- Run the server in its container:

    cd ~/work-gh/mrva/mrvacommander/
    docker exec -it server bash
    cd /mrva/mrvacommander/cmd/server/
    ./server -loglevel=debug -mode=container

- Test the server from the host via

    cd ~/work-gh/mrva/mrvacommander-1/tools
    sh ./request_16-Jun-2024_11-33-16.curl

- Follow the server logging via

    cd ~/work-gh/mrva/mrvacommander-1
    docker-compose up -d
    docker-compose logs -f server

- Completely rebuild all containers; useful when running into docker errors:

    cd ~/work-gh/mrva/mrvacommander-1
    docker-compose up --build

- Test the server via a remote client by following the steps in gh-mrva.
Some general docker-compose commands
- Get service status:

    docker-compose ps

- Stop services:

    docker-compose down

- View all logs:

    docker-compose logs

- Check containers from the server container:

    docker exec -it server bash
    curl -I http://rabbitmq:15672
Use the minio ql database store (qldb)
- Web access via

    open http://localhost:9001/login

  The username / password are in docker-compose.yml for now. The ql db listing will be at
  http://localhost:9001/browser/qldb

- Populate the database by running

    ./populate-dbstore.sh

  from the host.

- The names in the bucket use the owner_repo format for now, e.g. google_flatbuffers_db.zip.
  TODO This will be enhanced to include other data later.

- Test Go's access to the dbstore -- from the host -- via

    cd ./test
    go test -v

  This should produce

    === RUN TestDBListing
    dbstore_test.go:44: Object Key: google_flatbuffers_db.zip
    dbstore_test.go:44: Object Key: psycopg_psycopg2_db.zip
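The same listing can also be done from Python with the minio client. A minimal sketch, assuming the dbstore's S3 API is published on localhost:9000; the credentials are placeholders for the ones in docker-compose.yml:

    from minio import Minio

    # Endpoint and credentials are assumptions; check docker-compose.yml.
    client = Minio(
        "localhost:9000",
        access_key="minioadmin",
        secret_key="minioadmin",
        secure=False,
    )

    # Should print google_flatbuffers_db.zip and psycopg_psycopg2_db.zip.
    for obj in client.list_objects("qldb"):
        print(obj.object_name)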
Use the minio query pack db
- Web access via

    open http://localhost:19001/login

  The username / password are in docker-compose.yml for now. The query pack listing will be at
  http://localhost:19001/browser/qpstore