Include custom id (CID) to distinguish CodeQL databases

The current API (<2024-07-26 Fri>) is keyed only on (owner,name).  This is
insufficient for distinguishing CodeQL databases.

Databases for the same (owner,name) can differ in other fields;  this patch
combines the fields
    | cliVersion   |
    | creationTime |
    | language     |
    | sha          |
into one called CID.  The CID field is a hash of these fields, so the way it is
computed can be changed in the future without affecting workflows or the server.
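
A minimal sketch of how such a hash could be computed.  The function name
`compute_cid`, the choice of SHA-256, the field separator, the 12-character
length, and the sample field values are all assumptions for illustration; the
actual scheme used in `bin/mc-db-refine-info` may differ.

```python
import hashlib


def compute_cid(cli_version, creation_time, language, sha, length=12):
    """Hash the four distinguishing fields into a short hex string.

    Hypothetical helper: hash algorithm, separator, and length are assumptions.
    """
    joined = "|".join([cli_version, creation_time, language, sha])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:length]


# Two databases of the same repository at the same commit, built with
# different CLI versions at different times, get different CIDs.
cid_a = compute_cid("2.15.5", "2024-01-10T12:00:00Z", "cpp", "abc123")
cid_b = compute_cid("2.17.0", "2024-04-29T08:30:00Z", "cpp", "abc123")
print(cid_a, cid_b, cid_a != cid_b)
```

Because callers only ever see the resulting string, the set of hashed fields
can change later without touching anything that consumes the CID.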

The CID is combined with the owner/name to form one
identifier.  This requires no changes to the server or client -- the db
selection's interface is separate from VS Code and gh-mrva in any case.
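
As an illustration of the combined identifier, the sketch below joins
owner, name, and CID into one string and splits it back.  The separator
`-cid-` and both function names are hypothetical; the format this commit
actually uses is not shown here.

```python
def full_name(owner, name, cid, sep="-cid-"):
    """Combine (owner, name, CID) into one identifier (assumed format)."""
    return f"{owner}/{name}{sep}{cid}"


def split_full_name(identifier, sep="-cid-"):
    """Recover (owner, name, CID) from an identifier built by full_name."""
    owner, rest = identifier.split("/", 1)
    name, cid = rest.rsplit(sep, 1)
    return owner, name, cid


ident = full_name("octo-org", "my-repo", "1a2b3c")
print(ident)
print(split_full_name(ident))
```

Since the result still has the shape owner/name, tools that only understand
(owner,name) pairs can pass it through unchanged.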

To test this, this version imports multiple versions of the same
owner/repo pairs from multiple directories; in this case, from
    ~/work-gh/mrva/mrva-open-source-download/repos
and
    ~/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/
The unique database count increases from 3000 to 5360 -- see README.md,
    ./bin/mc-db-view-info < db-info-3.csv &

Other code modifications:
    - Push (owner,repo,CID) names to MinIO
    - Generate databases.json for use in the VS Code extension
    - Generate list-databases.json for use by the gh-mrva client
This commit is contained in:
Michael Hohn
2024-07-30 10:47:29 -07:00
committed by =Michael Hohn
parent b4f1a2b8a6
commit 1e1daf9330
8 changed files with 322 additions and 52 deletions

@@ -6,48 +6,48 @@ qldbtools is a Python package for working with CodeQL databases
- Set up the virtual environment and install tools
    cd ~/work-gh/mrva/mrvacommander/client/qldbtools/
    python3.11 -m venv venv
    source venv/bin/activate
    pip install --upgrade pip
    # From requirements.txt
    pip install -r requirements.txt
    # Or explicitly
    pip install jupyterlab pandas ipython
    pip install lckr-jupyterlab-variableinspector
- Run jupyterlab
    cd ~/work-gh/mrva/mrvacommander/client
    source venv/bin/activate
    jupyter lab &
The variable inspector is available by right-clicking on an open console or notebook.
The `jupyter` command produces output including
    Jupyter Server 2.14.1 is running at:
    http://127.0.0.1:8888/lab?token=4c91308819786fe00a33b76e60f3321840283486457516a1
Use this to connect multiple front ends
- Local development
```bash
cd ~/work-gh/mrva/mrvacommander/client/qldbtools
source venv/bin/activate
pip install --editable .
```
The `--editable` option *should* use symlinks for all scripts; use `./bin/*` to be sure.
- Full installation
```bash
pip install qldbtools
```
## Use as library
@@ -58,15 +58,32 @@ import qldbtools as ql
## Command-line use
    cd ~/work-gh/mrva/mrvacommander/client/qldbtools
    ./bin/mc-db-initial-info ~/work-gh/mrva/mrva-open-source-download > db-info-1.csv
    ./bin/mc-db-refine-info < db-info-1.csv > db-info-2.csv
    ./bin/mc-db-populate-minio < db-info-2.csv -n 3
    ./bin/mc-db-view-info < db-info-2.csv
    ./bin/mc-db-unique < db-info-2.csv > db-info-3.csv
Initial information collection requires a unique file path so it can be run
repeatedly over DB collections with the same (owner,name) but other differences
-- namely, in one or more of
- creationTime
- sha
- cliVersion
- language
Those fields are collected and combined into a single name addendum in
`bin/mc-db-refine-info`.
XX: Add `mc-db-generate-selection`
The command sequence, grouped by data files, is

    cd ~/work-gh/mrva/mrvacommander/client/qldbtools
    ./bin/mc-db-initial-info ~/work-gh/mrva/mrva-open-source-download > db-info-1.csv

    ./bin/mc-db-refine-info < db-info-1.csv > db-info-2.csv
    ./bin/mc-db-view-info < db-info-2.csv &

    ./bin/mc-db-unique < db-info-2.csv > db-info-3.csv
    ./bin/mc-db-view-info < db-info-3.csv &
    ./bin/mc-db-populate-minio -n 23 < db-info-3.csv
    ./bin/mc-db-generate-selection -n 23 vscode-selection.json gh-mrva-selection.json < db-info-3.csv
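
To illustrate why keying on the CID raises the unique database count, the
sketch below counts distinct rows in a small CSV, first by (owner,name) and
then by (owner,name,CID).  The column names `owner`, `name`, and `CID`, the
helper `count_unique`, and the sample rows are assumptions; the real
`db-info-*.csv` files may use other columns, and this is not the
implementation of `mc-db-unique`.

```python
import csv
import io

# Hypothetical sample in the spirit of db-info-2.csv: the same repository
# appears three times, with two distinct CIDs.
SAMPLE = """owner,name,CID
octo-org,repo-a,aaa111
octo-org,repo-a,bbb222
octo-org,repo-a,aaa111
octo-org,repo-b,ccc333
"""


def count_unique(rows, keys):
    """Count distinct combinations of the given columns."""
    return len({tuple(row[k] for k in keys) for row in rows})


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
by_repo = count_unique(rows, ("owner", "name"))
by_cid = count_unique(rows, ("owner", "name", "CID"))
print(by_repo, by_cid)  # prints "2 3"
```

Two (owner,name) pairs become three distinct databases once the CID is part
of the key; only the verbatim duplicate row is dropped.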