- Introduce hepc-serve-global to serve global MRVA values from hohnlab.org/mrva/values without local DB provisioning.
- Keep schema initialization symmetric across server and agent, while serializing PostgreSQL DDL via a global advisory lock to prevent concurrent CREATE TABLE races.
- Pin the RabbitMQ image to rabbitmq:3.13.7-management to avoid credential incompatibilities introduced by upstream image changes.
- Remove pre-hashed RabbitMQ credentials and return to deterministic user/password initialization.
- Eliminate reliance on implicit container state to ensure reproducible startup.

The primary purpose of this change is the integration of global MRVA values; the remaining fixes are required to make the new startup path reliable.
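The advisory-lock serialization mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual code: `advisory_lock_key`, the lock name, and the example table are all hypothetical.

```python
import hashlib
import struct

def advisory_lock_key(name: str) -> int:
    """Derive a stable signed 64-bit key for pg_advisory_lock from a name.

    pg_advisory_lock takes a bigint; hashing a symbolic name gives every
    process the same key without coordinating on magic numbers.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()[:8]
    return struct.unpack(">q", digest)[0]  # big-endian signed 64-bit

# Sketch of serialized DDL, assuming a psycopg2-style connection `conn`
# (hypothetical wiring; the real schema and names differ):
#
#   key = advisory_lock_key("mrva-schema-init")
#   with conn.cursor() as cur:
#       cur.execute("SELECT pg_advisory_lock(%s)", (key,))
#       try:
#           cur.execute("CREATE TABLE IF NOT EXISTS jobs (id BIGSERIAL PRIMARY KEY)")
#           conn.commit()
#       finally:
#           cur.execute("SELECT pg_advisory_unlock(%s)", (key,))
```

Because both server and agent compute the same key from the same name, whichever process arrives second blocks until the first finishes its CREATE TABLE, instead of racing it.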
# Introduction to hepc – HTTP End Point for CodeQL
## Note on Global Data
hepc requires a pre-existing, complete, self-consistent metadata set. Creation and maintenance of this metadata is intentionally outside the scope of hepc itself.
There are several supported ways to obtain such a metadata set:
- The sections *Usage Sample for Containers* and *Form link to DBs from filesystem, create metadata.sql* illustrate how to scan local file systems and generate a compatible `metadata.sql`. This approach is appropriate if you maintain your own CodeQL databases. In this case, `host-hepc-init` and `host-hepc-serve` can be used without significant changes.
- `hepc-serve-global` references a curated collection of open-source projects' CodeQL databases. This requires a network connection but no local setup. The collection is currently around 70 GB and is expected to grow beyond 160 GB; it is therefore unsuitable for inclusion in a Docker image. At present, it contains approximately 3000 databases. For experimentation at medium scale, this is the recommended approach.
- `host-hepc-serve` implements a simple HTTP API used by the other MRVA containers. You may replace this service entirely by implementing a compatible server and adjusting the container wiring accordingly.
In all cases, hepc operates strictly as a reader of metadata and does not create, modify, or repair metadata sets.
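To make the replacement path concrete, here is a minimal sketch of a compatible server built on Python's standard library. The `/index` and `/api/v1/latest_results/codeql-all` paths, port 8070, and the `result_url` field are taken from the usage samples below; the in-memory collection, the `/db/` URL layout, and all other response details are assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory collection; a real server would scan a DB directory.
DBS = ["aircrack-ng-aircrack-ng-ctsj-41ebbe.zip"]

def latest_results_lines(base_url):
    # One JSON object per line, each carrying a result_url, matching the
    # `jq -r .result_url` pipeline in the usage samples.
    return [json.dumps({"result_url": f"{base_url}/db/{name}"}) for name in DBS]

class HepcHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/index":
            body = ("\n".join(DBS) + "\n").encode()
        elif self.path == "/api/v1/latest_results/codeql-all":
            body = ("\n".join(latest_results_lines("http://hepc")) + "\n").encode()
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve (blocking):
#   HTTPServer(("127.0.0.1", 8070), HepcHandler).serve_forever()
```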
## Usage Sample for Containers
This section covers container preparation; all operations produce full, standalone copies. For a host service with full storage, see *Usage Sample for Full Machines*.
```sh
cd ~/work-gh/mrva/mrvahepc
uv sync   # one-time install; the uv-run shebangs avoid manual activation
```
```sh
# Collect DBs from filesystem
cd ~/work-gh/mrva/mrvahepc && rm -fR db-collection.tmp/
export MRVA_HEPC_ENDPOINT=http://hepc
./bin/mc-hepc-init --db_collection_dir db-collection.tmp \
    --starting_path ~/work-gh/mrva/mrva-open-source-download/ \
    --max_dbs 17
```
```sh
# Serve collected DBs plus metadata
cd ~/work-gh/mrva/mrvahepc
./bin/mc-hepc-serve --codeql-db-dir db-collection.tmp
```
```sh
# Test server
curl 127.0.0.1:8070/index -o - 2>/dev/null | wc -l
curl 127.0.0.1:8070/api/v1/latest_results/codeql-all \
    -o - 2>/dev/null | wc -l
url=$(curl 127.0.0.1:8070/api/v1/latest_results/codeql-all \
    -o - 2>/dev/null | head -1 | jq -r .result_url)
echo $url
# http://hepc/db/db-collection.tmp/aircrack-ng-aircrack-ng-ctsj-41ebbe.zip
wget $(echo $url | sed 's|http://hepc|http://127.0.0.1:8070|g;')
```
## Usage Sample for Full Machines
This is for providing a DB service from a complete machine, with DBs already in place. All operations produce links to existing DBs, not copies.
### Form link to DBs from filesystem, create metadata.sql
```sh
cd ~/work-gh/mrva/mrvahepc
uv sync
# Form link to DBs from filesystem, create metadata.sql
cd ~/work-gh/mrva/mrvahepc && rm -fR db-collection-host.tmp/
workers=$(( $(nproc) * 2 ))
./bin/host-hepc-init --db_collection_dir db-collection-host.tmp \
    --starting_path ~/work-gh/mrva/mrva-open-source-download/ \
    --max_dbs 8000 \
    --max_workers=$workers \
    > db-collection-host.log 2>&1
```
### Inspect metadata.sql
```sh
cd ~/work-gh/mrva/mrvahepc
uv sync
# Inspect metadata.sql
cd ~/work-gh/mrva/mrvahepc
sqlite3 db-collection-host.tmp/metadata.sql <<eoo
.tables
select count(*) from metadata;
.mode column
.width 13 13 13 13 13 13 13 13 13 13 13
select * from metadata limit 1;
eoo
```
### Use metadata.sql to provide DB selection UI
Here, just run:

```sh
cd ~/work-gh/mrva/mrvahepc
uv sync
# Use metadata.sql to provide DB selection UI
cd ~/work-gh/mrva/mrvahepc
./bin/db-selector-gui --metadata_db_path db-collection-host.tmp/metadata.sql
```
The docker compose file mounts the host paths referenced in `db-collection-host.tmp/metadata.sql` as hepc volumes.
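As an illustration of that wiring, a hypothetical helper could derive compose volume entries from the metadata. The `db_path` column name is an assumption made for this sketch; check the real schema with the inspection step above.

```python
import sqlite3

def compose_volume_entries(metadata_db, container_root="/db"):
    """Derive docker-compose volume lines from host paths stored in metadata.

    Assumes a column holding host filesystem paths, here called `db_path`
    (hypothetical; the real column names may differ).
    """
    con = sqlite3.connect(metadata_db)
    try:
        paths = [row[0] for row in con.execute("SELECT db_path FROM metadata")]
    finally:
        con.close()
    # One read-only bind mount per distinct host path.
    return [f"{p}:{container_root}/{p.rsplit('/', 1)[-1]}:ro"
            for p in sorted(set(paths))]
```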
### TODO: Serve collected DBs plus metadata
```sh
# Serve collected DBs plus metadata
cd ~/work-gh/mrva/mrvahepc
./bin/mc-hepc-serve --codeql-db-dir db-collection.tmp
```

```sh
# Test server
curl 127.0.0.1:8070/index -o - 2>/dev/null | wc -l
curl 127.0.0.1:8070/api/v1/latest_results/codeql-all \
    -o - 2>/dev/null | wc -l
url=$(curl 127.0.0.1:8070/api/v1/latest_results/codeql-all \
    -o - 2>/dev/null | head -1 | jq -r .result_url)
echo $url
# http://hepc/db/db-collection.tmp/aircrack-ng-aircrack-ng-ctsj-41ebbe.zip
wget $(echo $url | sed 's|http://hepc|http://127.0.0.1:8070|g;')
```
## Installation
- Use uv to install dependencies without manually activating a venv:

  ```sh
  cd ~/work-gh/mrva/mrvahepc
  uv sync
  ```

  `uv sync` installs everything declared in `pyproject.toml` into a managed `.venv` and caches the wheels; the `bin/` scripts already use `uv run` in their shebangs, so you can execute them directly.

- Local development:

  ```sh
  cd ~/work-gh/mrva/mrvahepc
  uv sync
  uv pip install --editable .
  ```

  The `--editable` install is optional; use `./bin/*` directly to avoid relying on entry points in your shell.

- Full installation:

  ```sh
  pip install mrvahepc
  ```
## Use as library

The best way to examine the code is to start from the high-level scripts in `bin/`.
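As a starting point, a library-style read of a generated `metadata.sql` might look like this. This is a sketch: only the `metadata` table name comes from the inspection step above; the summary shape is invented for illustration.

```python
import sqlite3

def summarize_metadata(metadata_db):
    """Return table names and the metadata row count, mirroring the
    `sqlite3 ... .tables / select count(*)` inspection shown above."""
    con = sqlite3.connect(metadata_db)
    try:
        tables = [r[0] for r in con.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        count = con.execute("SELECT count(*) FROM metadata").fetchone()[0]
    finally:
        con.close()
    return {"tables": tables, "rows": count}
```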