Major changes to support cli-end-to-end demonstration. See full log

* notes/cli-end-to-end-demo.org (Database Aquisition):
  Starting description for the end-to-end demonstration workflow.
  Very simplified version of notes/cli-end-to-end.org

* docker-compose-demo.yml (services):
  Make the pre-populated ql database storage an explicit container
  to get persistent data and straightforward mount semantics.

* docker-compose-demo-build.yml (services):
  Add a docker-compose configuration for *building* the demo environment.

* demo/containers/dbsdata/Dockerfile:
  Add dbsdata Docker image to hold initialized minio database file tree

* client/containers/vscode/README.org
  Update vscode container to use custom plugin for later mrva redirection
Michael Hohn
2024-10-15 10:18:42 -07:00
committed by =Michael Hohn
parent 187c49688e
commit 77ce997fbb
8 changed files with 849 additions and 134 deletions


@@ -1,28 +0,0 @@
all: code-server-initialized
CSI_TARGET := code-server-initialized:0.1.24
csi: mk.code-server-initialized
mk.code-server-initialized:
docker build -t ${CSI_TARGET} .
touch $@
csi-serve: csi
docker run -d -p 9080:9080 ${CSI_TARGET}
clean:
-docker rmi -f ${CSI_TARGET}
-rm mk.code-server-initialized
# Targets below are used after some manual setup of the container. See README.org
# for details
csi-push: mk.csi-push
mk.csi-push: csi
docker tag ${CSI_TARGET} ghcr.io/hohn/${CSI_TARGET}
docker push ghcr.io/hohn/${CSI_TARGET}
touch $@
csi-test:
docker pull ghcr.io/hohn/${CSI_TARGET}
docker run --rm -d -p 9080:9080 --name test-code-server-codeql\
ghcr.io/hohn/${CSI_TARGET}


@@ -1,14 +1,15 @@
* MRVA VS Code server container * MRVA VS Code server container
On the host: On the host:
- Build the container via
#+BEGIN_SRC sh #+BEGIN_SRC sh
make csi # Build the container via
#+END_SRC cd ~/work-gh/mrva/mrvacommander/client/containers/vscode/
docker build -t code-server-initialized:0.1.24 .
- Run the container via # Run the container in standalone mode via
#+BEGIN_SRC sh cd ~/work-gh/mrva/mrvacommander/client/containers/vscode/
make csi-serve docker run -v ~/work-gh/mrva/vscode-codeql:/work-gh/mrva/vscode-codeql \
-d -p 9080:9080 code-server-initialized:0.1.24
#+END_SRC #+END_SRC
- Connect to it at http://localhost:9080/?folder=/home/coder, password is =mrva=. - Connect to it at http://localhost:9080/?folder=/home/coder, password is =mrva=.
@@ -24,9 +25,9 @@
codeql pack add codeql/python-all@1.0.6 codeql pack add codeql/python-all@1.0.6
#+END_SRC #+END_SRC
- Open a new file =qldemo/simple.ql= and add this this query to it. The plugin - Create a new file =qldemo/simple.ql= with this query. Open it in VS Code.
will download the CodeQL binaries (but never use them -- the configuration The plugin will download the CodeQL binaries (but never use them -- the
redirects) configuration redirects)
#+BEGIN_SRC sh #+BEGIN_SRC sh
cd cd
cat > qldemo/simple.ql <<eof cat > qldemo/simple.ql <<eof
@@ -48,29 +49,71 @@
- Set the database as default and run the query =simple.ql= - Set the database as default and run the query =simple.ql=
- Add the customized VS Code plugin
On the host
#+BEGIN_SRC sh
cd ~/work-gh/mrva/vscode-codeql
git checkout mrva-standalone
# Install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
# Install correct node version
cd ./extensions/ql-vscode
nvm install
# Build the extension
cd ~/work-gh/mrva/vscode-codeql/extensions/ql-vscode
npm install
npm run build
#+END_SRC
In the container
#+BEGIN_SRC sh
# Install extension
cd /work-gh/mrva/vscode-codeql/dist
/bin/code-server --force --install-extension vscode-codeql-*.vsix
#+END_SRC
- Capture the state of this container and create a new image from it - Capture the state of this container and create a new image from it
#+BEGIN_SRC sh #+BEGIN_SRC sh
docker ps docker ps
docker commit 3802c3e9ad8c code-server-initialized:0.1.24 # Check id column. Use it below.
# sha256:76cfebdd75a69bcaa8020d06fec52593bd5fde06a100c1962f3201af809bfac0 docker commit 2df5732c1850 code-server-initialized:0.1.24
docker kill 3802c3e9ad8c # Keep the sha
# sha256:87c8260146e28aed25b094d023a30a015a958f829c09e66cb50ccca2c4a2a000
docker kill 2df5732c1850
# Make sure the image tag matches the sha # Make sure the image tag matches the sha
docker inspect code-server-initialized:0.1.24 |grep Id docker inspect code-server-initialized:0.1.24 |grep Id
# "Id": "sha256:76cfebdd75a69bcaa8020d06fec52593bd5fde06a100c1962f3201af809bfac0",
# Run the image and check # Run the image and check
docker run -d -p 9080:9080 code-server-initialized:0.1.24 docker run --rm -d -p 9080:9080 --name test-code-server-codeql \
code-server-initialized:0.1.24
#+END_SRC #+END_SRC
Again connect to it at http://localhost:9080/?folder=/home/coder, password is =mrva=.
- Push this container - Push this container
#+BEGIN_SRC sh #+BEGIN_SRC sh
make csi-push # Common
export CSI_TARGET=code-server-initialized:0.1.24
# Push container
docker tag ${CSI_TARGET} ghcr.io/hohn/${CSI_TARGET}
docker push ghcr.io/hohn/${CSI_TARGET}
#+END_SRC #+END_SRC
- Test the registry image - Test the registry image
#+BEGIN_SRC sh #+BEGIN_SRC sh
make csi-test # Test pushed container
docker pull ghcr.io/hohn/${CSI_TARGET}
docker run --rm -d -p 9080:9080 --name test-code-server-codeql\
ghcr.io/hohn/${CSI_TARGET}
#+END_SRC #+END_SRC
In the container, inside the running vs code:
- Check the plugin version number via the command
: codeql: copy version information


@@ -0,0 +1,7 @@
# Use a minimal base image
FROM busybox
ADD dbsdata_backup.tar /
# Just run sh if this container is ever started
CMD ["sh"]


@@ -0,0 +1,70 @@
* MRVA cli tools container
Set up / run:
#+BEGIN_SRC sh
# Run the raw container assembly
cd ~/work-gh/mrva/mrvacommander/
docker-compose -f docker-compose-demo-build.yml up -d
# Use the following commands to populate the mrvacommander database storage
cd ~/work-gh/mrva/mrvacommander/client/qldbtools
mkdir -p scratch
source venv/bin/activate
./bin/mc-db-initial-info ~/work-gh/mrva/mrva-open-source-download > scratch/db-info-1.csv
./bin/mc-db-refine-info < scratch/db-info-1.csv > scratch/db-info-2.csv
./bin/mc-db-unique cpp < scratch/db-info-2.csv > scratch/db-info-3.csv
./bin/mc-db-generate-selection -n 11 \
scratch/vscode-selection.json \
scratch/gh-mrva-selection.json \
< scratch/db-info-3.csv
# Several seconds start-up time; fast db population
./bin/mc-db-populate-minio -n 11 < scratch/db-info-3.csv
# While the containers are running, this will show minio's storage. The zip files
# are split into part.* and xl.meta by minio. Use the web interface to see real
# names.
docker exec dbstore ls -R /data/mrvacommander/
# Open browser to see the file listing
open http://localhost:9001/browser/qldb
# list the volumes
docker volume ls |grep dbs
docker volume inspect mrvacommander_dbsdata
# Persist volume using container
cd ~/work-gh/mrva/mrvacommander/demo/containers/dbsdata
# Note: use mrvacommander_dbsdata, not mrvacommander-dbsdata
# Get the data as a tar file from the volume; archive only /data so that the
# tarball unpacks to /data/qldb in the image
docker run --rm \
-v mrvacommander_dbsdata:/data \
-v "$(pwd)":/backup \
busybox tar cvf /backup/dbsdata_backup.tar -C / data
# Build container with the tarball
cd ~/work-gh/mrva/mrvacommander/demo/containers/dbsdata
docker build -t dbsdata-container:0.1.24 .
docker image ls | grep dbs
# check container contents
docker run -it dbsdata-container:0.1.24 /bin/sh
docker run -it dbsdata-container:0.1.24 ls data/qldb
# Tag the dbstore backing container
docker inspect dbsdata-container:0.1.24 |grep Id
docker tag dbsdata-container:0.1.24 ghcr.io/hohn/dbsdata-container:0.1.24
# Push the pre-populated image
docker push ghcr.io/hohn/dbsdata-container:0.1.24
# Check the tagged image
docker run -it ghcr.io/hohn/dbsdata-container:0.1.24 \
ls data/qldb
# Shut down the container assembly
docker-compose -f docker-compose-demo-build.yml down
#+END_SRC


@@ -0,0 +1,129 @@
# This is the compose configuration used to build / prepopulate the containers for
# a demo.
services:
dbssvc:
## image: ghcr.io/hohn/dbsdata-container:0.1.24
build:
context: .
dockerfile: ./demo/containers/dbsdata/Dockerfile
container_name: dbssvc
volumes:
- dbsdata:/data/mrvacommander/dbstore-data
networks:
- backend
dbstore:
image: minio/minio:RELEASE.2024-06-11T03-13-30Z
container_name: dbstore
ports:
- "9000:9000"
- "9001:9001"
env_file:
- path: .env.container
required: true
command: server /data/mrvacommander/dbstore-data --console-address ":9001"
depends_on:
- dbssvc
volumes:
- dbsdata:/data/mrvacommander/dbstore-data
networks:
- backend
client-ghmrva:
## image: ghcr.io/hohn/client-ghmrva-container:0.1.24
build:
context: .
dockerfile: ./client/containers/ghmrva/Dockerfile
network_mode: "service:server" # Share the 'server' network namespace
environment:
- SERVER_URL=http://localhost:8080 # 'localhost' now refers to 'server'
code-server:
## image: ghcr.io/hohn/code-server-initialized:0.1.24
build:
context: ./client/containers/vscode
dockerfile: Dockerfile
ports:
- "9080:9080"
environment:
- PASSWORD=mrva
rabbitmq:
image: rabbitmq:3-management
hostname: rabbitmq
container_name: rabbitmq
volumes:
- ./init/rabbitmq/rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf:ro
- ./init/rabbitmq/definitions.json:/etc/rabbitmq/definitions.json:ro
ports:
- "5672:5672"
- "15672:15672"
healthcheck:
test: rabbitmq-diagnostics check_port_connectivity
interval: 30s
timeout: 30s
retries: 10
networks:
- backend
server:
build:
context: .
dockerfile: ./cmd/server/Dockerfile
command: [ '--mode=container', '--loglevel=debug' ]
container_name: server
stop_grace_period: 1s
ports:
# - "8081:8080" # host:container for proxy
- "8080:8080" # host:container
depends_on:
- rabbitmq
- dbstore
- artifactstore
env_file:
- path: ./.env.container
required: true
networks:
- backend
artifactstore:
image: minio/minio:RELEASE.2024-06-11T03-13-30Z
container_name: artifactstore
ports:
- "19000:9000" # host:container
- "19001:9001"
env_file:
- path: ./.env.container
required: true
command: server /data --console-address ":9001"
volumes:
# The artifactstore is only populated at runtime so there is no need
# for Docker storage; a directory is fine.
- ./qpstore-data:/data
networks:
- backend
agent:
## image: ghcr.io/hohn/mrva-agent:0.1.24
build:
context: .
dockerfile: ./cmd/agent/Dockerfile
command: [ '--loglevel=debug' ]
container_name: agent
depends_on:
- rabbitmq
- dbstore
- artifactstore
env_file:
- path: ./.env.container
required: true
networks:
- backend
networks:
backend:
driver: bridge
volumes:
dbsdata:


@@ -1,12 +1,16 @@
services: services:
mrvadata: dbssvc:
image: ghcr.io/hohn/mrvadata:0.1.24 # dbsdata-container:0.1.24
container_name: mrvadata image: ghcr.io/hohn/dbsdata-container:0.1.24
command: tail -f /dev/null # Keep the container running
# volumes:
# - /qldb # Directory inside the container that contains the data
volumes: volumes:
- mrvadata:/data/mrvacommander/dbstore-data - dbsdata:/data
container_name: dbssvc
networks: networks:
- backend - backend
dbstore: dbstore:
image: minio/minio:RELEASE.2024-06-11T03-13-30Z image: minio/minio:RELEASE.2024-06-11T03-13-30Z
container_name: dbstore container_name: dbstore
@@ -18,23 +22,14 @@ services:
required: true required: true
command: server /data/mrvacommander/dbstore-data --console-address ":9001" command: server /data/mrvacommander/dbstore-data --console-address ":9001"
depends_on: depends_on:
- mrvadata - dbssvc
# The mrvadata volume has content of ./dbstore-data, so the volume mount # volumes_from:
# below is equivalent of this original: # - dbsdata # Use the volumes from dbsdata container
# volumes:
# - ./dbstore-data:/data
volumes: volumes:
- mrvadata:/data - dbsdata:/data/mrvacommander/dbstore-data
networks: networks:
- backend - backend
client-qldbtools:
image: ghcr.io/hohn/client-qldbtools-container:0.1.24
# XX: Copy client/qldbtools/scratch into this container
networks:
- backend
client-ghmrva: client-ghmrva:
image: ghcr.io/hohn/client-ghmrva-container:0.1.24 image: ghcr.io/hohn/client-ghmrva-container:0.1.24
network_mode: "service:server" # Share the 'server' network namespace network_mode: "service:server" # Share the 'server' network namespace
@@ -118,4 +113,4 @@ networks:
driver: bridge driver: bridge
volumes: volumes:
mrvadata: dbsdata:


@@ -0,0 +1,471 @@
# -*- coding: utf-8 -*-
#+OPTIONS: H:2 num:t \n:nil @:t ::t |:t ^:{} f:t *:t TeX:t LaTeX:t skip:nil p:nil
* End-to-end example of CLI use
This document describes the build steps for the demo containers.
* Database Aquisition
For this demo, the data is preloaded via a container. To inspect it:
#+BEGIN_SRC sh
# On host, run
docker exec -it dbstore /bin/bash
# In the container
ls -la /data/mrvacommander/dbstore-data/
ls /data/mrvacommander/dbstore-data/qldb/ | wc -l
#+END_SRC
Here we use a small sample of open-source repositories, 23 in all.
* Repository Selection
When using the full MRVA system, you select a small subset of the repositories
made available in [[*Database Aquisition][Database Aquisition]]. For this demo we include a small
collection -- 23 repositories -- and here we further narrow the selection to 11.
The full list
#+BEGIN_SRC text
ls -1 /data/mrvacommander/dbstore-data/qldb/
'BoomingTech$Piccoloctsj6d7177.zip'
'KhronosGroup$OpenXR-SDKctsj984ee6.zip'
'OpenRCT2$OpenRCT2ctsj975d7c.zip'
'StanfordLegion$legionctsj39cbe4.zip'
'USCiLab$cerealctsj264953.zip'
'WinMerge$winmergectsj101305.zip'
'draios$sysdigctsj12c02d.zip'
'gildor2$UEViewerctsjfefdd8.zip'
'git-for-windows$gitctsjb7c2bd.zip'
'google$orbitctsj9bbeaf.zip'
'libfuse$libfusectsj7a66a4.zip'
'luigirizzo$netmapctsj6417fa.zip'
'mawww$kakounectsjc54fab.zip'
'microsoft$node-native-keymapctsj4cc9a2.zip'
'nem0$LumixEnginectsjfab756.zip'
'pocoproject$pococtsj26b932.zip'
'quickfix$quickfixctsjebfd13.zip'
'rui314$moldctsjfec16a.zip'
'swig$swigctsj78bcd3.zip'
'tdlib$telegram-bot-apictsj8529d9.zip'
'timescale$timescaledbctsjf617cf.zip'
'xoreaxeaxeax$movfuscatorctsj8f7e5b.zip'
'xrootd$xrootdctsje4b745.zip'
#+END_SRC
The selection of 11 repositories, from an initial collection of 6000, was made
using a collection of Python/pandas scripts written for the purpose, the [[https://github.com/hohn/mrvacommander/blob/hohn-0.1.21.2-improve-structure-and-docs/client/qldbtools/README.md#installation][qldbtools]]
package. The resulting selection, in the format expected by the VS Code
extension, follows.
#+BEGIN_SRC text
cat /data/qldbtools/scratch/vscode-selection.json
{
"version": 1,
"databases": {
"variantAnalysis": {
"repositoryLists": [
{
"name": "mirva-list",
"repositories": [
"xoreaxeaxeax/movfuscatorctsj8f7e5b",
"microsoft/node-native-keymapctsj4cc9a2",
"BoomingTech/Piccoloctsj6d7177",
"USCiLab/cerealctsj264953",
"KhronosGroup/OpenXR-SDKctsj984ee6",
"tdlib/telegram-bot-apictsj8529d9",
"WinMerge/winmergectsj101305",
"timescale/timescaledbctsjf617cf",
"pocoproject/pococtsj26b932",
"quickfix/quickfixctsjebfd13",
"libfuse/libfusectsj7a66a4"
]
}
],
"owners": [],
"repositories": []
}
},
"selected": {
"kind": "variantAnalysisUserDefinedList",
"listName": "mirva-list"
}
}
#+END_SRC
This selection is deceptively simple. For a full explanation, see [[file:cli-end-to-end-detailed.org::*Repository Selection][Repository
Selection]] in the detailed version of this document.
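The same repository list appears later in this document in the simpler format
read by =gh-mrva=. As a rough illustration of how the two formats relate, the
following Python sketch converts one into the other. The function name
=vscode_to_ghmrva= is hypothetical and not part of qldbtools, and the input is
an abbreviated copy of the selection above.

```python
import json


def vscode_to_ghmrva(selection):
    """Map each named repository list to its repositories (hypothetical helper)."""
    lists = selection["databases"]["variantAnalysis"]["repositoryLists"]
    return {entry["name"]: list(entry["repositories"]) for entry in lists}


# Abbreviated copy of the VS Code selection shown above
vscode_selection = {
    "version": 1,
    "databases": {
        "variantAnalysis": {
            "repositoryLists": [
                {
                    "name": "mirva-list",
                    "repositories": [
                        "xoreaxeaxeax/movfuscatorctsj8f7e5b",
                        "libfuse/libfusectsj7a66a4",
                    ],
                }
            ],
            "owners": [],
            "repositories": [],
        }
    },
    "selected": {
        "kind": "variantAnalysisUserDefinedList",
        "listName": "mirva-list",
    },
}

print(json.dumps(vscode_to_ghmrva(vscode_selection), indent=2))
```

Only the list name and its repositories carry over; the =version=, =owners=,
and =selected= entries have no counterpart in the =gh-mrva= list file.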
** Optional: The meaning of the names
The repository names all end with =ctsj= followed by 6 hex digits like
=ctsj4cc9a2=.
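A small Python sketch of how such a name can be split into its parts. The
regular expression and the function name =split_db_name= are assumptions made
for illustration, based only on the file names listed above; qldbtools may
parse names differently.

```python
import re

# owner, "$", repository name, "ctsj", 6 hex digits, ".zip" (pattern assumed
# from the file names in the listing above)
DB_FILE_RE = re.compile(
    r"^(?P<owner>[^$]+)\$(?P<repo>.+)ctsj(?P<cid>[0-9a-f]{6})\.zip$"
)


def split_db_name(filename):
    """Return (owner, repo, cid) for a database file name (hypothetical helper)."""
    match = DB_FILE_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected database file name: {filename!r}")
    return match.group("owner"), match.group("repo"), match.group("cid")


print(split_db_name("git-for-windows$gitctsjb7c2bd.zip"))
# prints ('git-for-windows', 'git', 'b7c2bd')
```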
The information critical for selecting databases is in the columns
1. owner
2. name
3. language
4. "sha"
5. "cliVersion"
6. "creationTime"
There are others that may be useful, but they are not strictly required.
The critical ones deserve more explanation:
1. "sha": The =git= commit SHA of the repository the CodeQL database was
created from. Required to distinguish query results over the evolution of
a code base.
2. "cliVersion": The version of the CodeQL CLI used to create the database.
Required to identify advances/regressions originating from the CodeQL binary.
3. "creationTime": The time the database was created. Required (or at least
very handy) for following the evolution of query results over time.
There is a computed column, CID. The CID column combines
- cliVersion
- creationTime
- language
- sha
into a single 6-character string via hashing. Together with (owner, repo) it
provides a unique index for every DB.
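As an illustration only, such an identifier could be derived as in the
following Python sketch. The choice of SHA-256, the =|= separator, and the
function names are assumptions; the actual qldbtools implementation may hash
differently and produce different values. The input values in the example are
made up.

```python
import hashlib


def compute_cid(cli_version, creation_time, language, sha, length=6):
    """Hash the four columns into a short hex string (assumed scheme)."""
    key = "|".join([cli_version, creation_time, language, sha])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:length]


def db_key(owner, repo, cid):
    """Combine (owner, repo) with the CID, following the names in this document."""
    return f"{owner}/{repo}ctsj{cid}"


# Made-up example values, not taken from a real database
cid = compute_cid("2.17.0", "2024-05-01T12:00:00", "cpp", "0123abcd")
print(db_key("example-owner", "example-repo", cid))
```

With only 6 hex digits, collisions between different databases of the same
repository are possible in principle, which is worth keeping in mind for very
large collections.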
For this document, we simply use a pseudo-random selection of 11 databases via
#+BEGIN_SRC sh
./bin/mc-db-generate-selection -n 11 \
scratch/vscode-selection.json \
scratch/gh-mrva-selection.json \
< scratch/db-info-3.csv
#+END_SRC
Note that these use pseudo-random numbers, so the selection is in fact
deterministic.
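The determinism comes from seeding the generator. A minimal Python sketch of
the idea, assuming a fixed seed; the seed value and the function
=select_databases= are hypothetical and do not reproduce the actual selection
made by =mc-db-generate-selection=.

```python
import random


def select_databases(names, n, seed=4242):
    """Pick n names reproducibly (hypothetical helper, assumed seed)."""
    rng = random.Random(seed)  # private generator with a fixed seed
    # Sort first so the result does not depend on the input order
    return rng.sample(sorted(names), n)


names = [f"owner{i}/repo{i}" for i in range(23)]
first = select_databases(names, 11)
second = select_databases(list(reversed(names)), 11)
print(first == second)
# prints True
```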
* Starting the server
Clone the full repository before continuing:
#+BEGIN_SRC sh
mkdir -p ~/work-gh/mrva/
cd ~/work-gh/mrva/
git clone git@github.com:hohn/mrvacommander.git
#+END_SRC
Make sure Docker is installed and running.
With docker-compose set up and this repository cloned, we just run
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander
docker-compose -f docker-compose-demo.yml up -d
#+END_SRC
and wait until the output no longer changes.
It should look like
#+BEGIN_SRC text
docker-compose -f docker-compose-demo.yml up -d
[+] Running 27/6
✔ dbstore Pulled 1.1s
✔ artifactstore Pulled 1.1s
✔ mrvadata 3 layers [⣿⣿⣿] 0B/0B Pulled 263.8s
✔ server 2 layers [⣿⣿] 0B/0B Pulled 25.2s
✔ agent 5 layers [⣿⣿⣿⣿⣿] 0B/0B Pulled 24.9s
✔ client-qldbtools 11 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 20.8s
[+] Running 9/9
✔ Container mrvadata Started 0.3s
✔ Container mrvacommander-client-qldbtools-1 Started 0.3s
✔ Container mrvacommander-client-ghmrva-1 Running 0.0s
✔ Container mrvacommander-code-server-1 Running 0.0s
✔ Container artifactstore Running 0.0s
✔ Container rabbitmq Running 0.0s
✔ Container dbstore Started 0.4s
✔ Container agent Started 0.5s
✔ Container server Started 0.5s
#+END_SRC
The content is prepopulated in the =dbstore= container.
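Instead of watching the output, readiness can be checked by polling the
published ports. The following Python sketch assumes the host ports from the
compose configuration in this commit (8080 for the server, 9000 for dbstore,
9080 for code-server). The helper =wait_for_port= is hypothetical, and an open
port only shows that something is listening, not that the service is fully
initialized.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def check_services(services, host="localhost", timeout=60.0):
    """Map each service name to whether its port became reachable."""
    return {name: wait_for_port(host, port, timeout) for name, port in services.items()}


# Host ports assumed from the compose file
DEMO_SERVICES = {"server": 8080, "dbstore": 9000, "code-server": 9080}
```

Calling =check_services(DEMO_SERVICES)= after =docker-compose up -d= returns a
dictionary such as ={"server": True, ...}= once the ports accept connections.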
** Optional: Inspect the Backing Store
As a completely optional step, you can inspect the backing store:
#+BEGIN_SRC sh
docker exec -it dbstore /bin/bash
ls /data/mrvacommander/dbstore-data/qldb/
# 'BoomingTech$Piccoloctsj6d7177.zip' 'mawww$kakounectsjc54fab.zip'
# 'KhronosGroup$OpenXR-SDKctsj984ee6.zip' 'microsoft$node-native-keymapctsj4cc9a2.zip'
# ...
#+END_SRC
** Optional: Inspect the MinIO DB
As another completely optional step, you can inspect the MinIO DB contents if
you have the MinIO client (=mc=) installed:
#+BEGIN_SRC sh
# Configuration
MINIO_ALIAS="qldbminio"
MINIO_URL="http://localhost:9000"
MINIO_ROOT_USER="user"
MINIO_ROOT_PASSWORD="mmusty8432"
QL_DB_BUCKET_NAME="qldb"
# Check for MinIO client
if ! command -v mc &> /dev/null
then
echo "MinIO client (mc) not found."
fi
# Configure MinIO client
mc alias set $MINIO_ALIAS $MINIO_URL $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
# Show contents
mc ls $MINIO_ALIAS/$QL_DB_BUCKET_NAME
#+END_SRC
* Running the gh-mrva command-line client
The first run uses the test query to verify basic functionality, but it returns
no results.
** Run MRVA from command line
# From ~/work-gh/mrva/gh-mrva
1. Check mrva cli
#+BEGIN_SRC sh
docker exec -it mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva -h
#+END_SRC
2. Set up the configuration
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'mkdir -p /root/.config/gh-mrva/'
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'cat > /root/.config/gh-mrva/config.yml' <<eof
codeql_path: not-used/$HOME/work-gh
controller: not-used/mirva-controller
list_file: /root/work-gh/mrva/gh-mrva/gh-mrva-selection.json
eof
# check:
docker exec -i mrvacommander-client-ghmrva-1 ls /root/.config/gh-mrva/config.yml
docker exec -i mrvacommander-client-ghmrva-1 cat /root/.config/gh-mrva/config.yml
#+END_SRC
3. Provide the repository list file
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'mkdir -p /root/work-gh/mrva/gh-mrva'
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'cat > /root/work-gh/mrva/gh-mrva/gh-mrva-selection.json' <<eof
{
"mirva-list": [
"xoreaxeaxeax/movfuscatorctsj8f7e5b",
"microsoft/node-native-keymapctsj4cc9a2",
"BoomingTech/Piccoloctsj6d7177",
"USCiLab/cerealctsj264953",
"KhronosGroup/OpenXR-SDKctsj984ee6",
"tdlib/telegram-bot-apictsj8529d9",
"WinMerge/winmergectsj101305",
"timescale/timescaledbctsjf617cf",
"pocoproject/pococtsj26b932",
"quickfix/quickfixctsjebfd13",
"libfuse/libfusectsj7a66a4"
]
}
eof
#+END_SRC
4. Provide the CodeQL query
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'cat > /root/work-gh/mrva/gh-mrva/FlatBuffersFunc.ql' <<eof
/**
,* @name pickfun
,* @description pick function from FlatBuffers
,* @kind problem
,* @id cpp-flatbuffer-func
,* @problem.severity warning
,*/
import cpp
from Function f
where
f.getName() = "MakeBinaryRegion" or
f.getName() = "microprotocols_add"
select f, "definition of MakeBinaryRegion"
eof
#+END_SRC
5. Submit the mrva job
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
submit --language cpp --session mirva-session-1360 \
--list mirva-list \
--query /root/work-gh/mrva/gh-mrva/FlatBuffersFunc.ql
#+END_SRC
6. Check the status
#+BEGIN_SRC sh
# Check the status
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
status --session mirva-session-1360
#+END_SRC
7. Download the sarif files, optionally also get databases. For the current
query / database combination there are zero results, hence no downloads.
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
download --session mirva-session-1360 \
--download-dbs \
--output-dir mirva-session-1360
#+END_SRC
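Steps 5 to 7 lend themselves to scripting. The following Python sketch only
assembles the =docker exec= command lines shown above, using the same
container name, binary path, and flags; the helper names are hypothetical.
Executing the commands is left to =subprocess.run=, which requires the
containers to be up.

```python
import shlex

CONTAINER = "mrvacommander-client-ghmrva-1"
GH_MRVA = "/usr/local/bin/gh-mrva"


def gh_mrva_argv(*args):
    """Build the docker exec argument list for one gh-mrva invocation."""
    return ["docker", "exec", "-i", CONTAINER, GH_MRVA, *args]


def submit_argv(session, list_name, query, language="cpp"):
    return gh_mrva_argv(
        "submit", "--language", language, "--session", session,
        "--list", list_name, "--query", query,
    )


def status_argv(session):
    return gh_mrva_argv("status", "--session", session)


def download_argv(session, with_dbs=True):
    args = ["download", "--session", session]
    if with_dbs:
        args.append("--download-dbs")
    args += ["--output-dir", session]
    return gh_mrva_argv(*args)


session = "mirva-session-1360"
commands = [
    submit_argv(session, "mirva-list", "/root/work-gh/mrva/gh-mrva/FlatBuffersFunc.ql"),
    status_argv(session),
    download_argv(session),
]
for argv in commands:
    print(shlex.join(argv))
```

Keeping the commands as argument lists avoids shell quoting problems when the
session name or query path is supplied by a script.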
** TODO Write query that has some results
XX:
In this case, use the trivial =alu_mul= query, which looks for calls to =alu_mul= from
https://github.com/xoreaxeaxeax/movfuscator/blob/master/movfuscator/movfuscator.c
#+BEGIN_SRC java
/**
,* @name findalu
,* @description find calls to a function
,* @kind problem
,* @id cpp-call
,* @problem.severity warning
,*/
import cpp
from FunctionCall fc
where
fc.getTarget().getName() = "alu_mul"
select fc, "call of alu_mul"
#+END_SRC
Repeat the submit steps with this query
1. [X] --
2. [X] --
3. [ ] Provide the CodeQL query
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'cat > /root/work-gh/mrva/gh-mrva/Alu_Mul.ql' <<eof
/**
,* @name findalu
,* @description find calls to a function
,* @kind problem
,* @id cpp-call
,* @problem.severity warning
,*/
import cpp
from FunctionCall fc
where
fc.getTarget().getName() = "alu_mul"
select fc, "call of alu_mul"
eof
#+END_SRC
4. [-] Submit the mrva job
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
submit --language cpp --session mirva-session-1490 \
--list mirva-list \
--query /root/work-gh/mrva/gh-mrva/Alu_Mul.ql
#+END_SRC
- [X] XX:
server | 2024/09/27 20:03:16 DEBUG Processed request info location="{Key:3 Bucket:packs}" language=cpp
server | 2024/09/27 20:03:16 WARN No repositories found for analysis
server | 2024/09/27 20:03:16 DEBUG Queueing analysis jobs count=0
server | 2024/09/27 20:03:16 DEBUG Forming and sending response for submitted analysis job id=3
NO: debug in the server container
#+BEGIN_SRC sh
docker exec -it server /bin/bash
apt-get update
apt-get install delve
replace
ENTRYPOINT ["./mrva_server"]
CMD ["--mode=container"]
#+END_SRC
- [ ] XX:
The dbstore is empty -- see http://localhost:9001/browser
must populate it properly, then save the image.
5. [ ] Check the status
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
status --session mirva-session-1490
#+END_SRC
This time we have results
#+BEGIN_SRC text
...
Run name: mirva-session-1490
Status: succeeded
Total runs: 1
Total successful scans: 11
Total failed scans: 0
Total skipped repositories: 0
Total skipped repositories due to access mismatch: 0
Total skipped repositories due to not found: 0
Total skipped repositories due to no database: 0
Total skipped repositories due to over limit: 0
Total repositories with findings: 7
Total findings: 618
Repositories with findings:
quickfix/quickfixctsjebfd13 (cpp-fprintf-call): 5
libfuse/libfusectsj7a66a4 (cpp-fprintf-call): 146
xoreaxeaxeax/movfuscatorctsj8f7e5b (cpp-fprintf-call): 80
pocoproject/pococtsj26b932 (cpp-fprintf-call): 17
BoomingTech/Piccoloctsj6d7177 (cpp-fprintf-call): 10
tdlib/telegram-bot-apictsj8529d9 (cpp-fprintf-call): 247
WinMerge/winmergectsj101305 (cpp-fprintf-call): 113
#+END_SRC
6. [ ] Download the sarif files, optionally also get databases.
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
download --session mirva-session-1490 \
--download-dbs \
--output-dir mirva-session-1490
# And list them:
\ls -la *1490*
#+END_SRC
7. [ ] Use the [[https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer][SARIF Viewer]] plugin in VS Code to open and review the results.
Prepare the source directory so the viewer can be pointed at it
#+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/mirva-session-1490
unzip -qd BoomingTech_Piccoloctsj6d7177_1_db BoomingTech_Piccoloctsj6d7177_1_db.zip
cd BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/
unzip -qd src src.zip
#+END_SRC
Use the viewer
#+BEGIN_SRC sh
code BoomingTech_Piccoloctsj6d7177_1.sarif
# For lauxlib.c, point the source viewer to
find ~/work-gh/mrva/gh-mrva/mirva-session-1490/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder -name lauxlib.c
# Here: ~/work-gh/mrva/gh-mrva/mirva-session-1490/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder/engine/3rdparty/lua-5.4.4/lauxlib.c
#+END_SRC
8. [ ] (optional) Large result sets are more easily filtered via
dataframes or spreadsheets. Convert the SARIF to CSV if needed; see [[https://github.com/hohn/sarif-cli/][sarif-cli]].
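As a rough idea of what such a conversion involves, the following Python
sketch flattens the results of a SARIF document into CSV rows. It relies only
on standard SARIF 2.1.0 fields (=runs=, =results=, =ruleId=, =message.text=,
=physicalLocation=); the function names and the column choice are assumptions
and are unrelated to how sarif-cli works. The sample input is made up.

```python
import csv
import io


def sarif_to_rows(sarif):
    """Return one [rule_id, path, start_line, message] row per result location."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            message = result.get("message", {}).get("text", "")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "")
                line = physical.get("region", {}).get("startLine", "")
                rows.append([result.get("ruleId", ""), uri, line, message])
    return rows


def rows_to_csv(rows):
    """Serialize the rows, with a header line, to a CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["rule_id", "path", "start_line", "message"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up, minimal SARIF document for illustration
sample = {
    "version": "2.1.0",
    "runs": [{
        "results": [{
            "ruleId": "cpp-call",
            "message": {"text": "call of alu_mul"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "src/example.c"},
                    "region": {"startLine": 42},
                }
            }],
        }]
    }],
}

print(rows_to_csv(sarif_to_rows(sample)), end="")
```

For a downloaded file, load it with =json.load= and pass the resulting
dictionary to =sarif_to_rows=.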
* Running the CodeQL VS Code plugin
- [ ] XX: include the *custom* codeql plugin in the container.
* Ending the session
Shut down the containers via
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander
docker-compose -f docker-compose-demo.yml down
#+END_SRC
* Footnotes
[fn:1] =csvkit= can be installed into the same Python virtual environment as
=qldbtools=.


@@ -1,8 +1,9 @@
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
#+OPTIONS: H:2 num:t \n:nil @:t ::t |:t ^:{} f:t *:t TeX:t LaTeX:t skip:nil p:nil
* End-to-end example of CLI use * End-to-end example of CLI use
This document describes a complete cycle of the MRVA workflow. The steps This document describes a complete cycle of the MRVA workflow, but using
included are pre-populated data. The steps included are
1. aquiring CodeQL databases 1. aquiring CodeQL databases
2. selection of databases 2. selection of databases
3. configuration and use of the command-line client 3. configuration and use of the command-line client
@@ -11,6 +12,14 @@
6. retrieval of the results 6. retrieval of the results
7. examination of the results 7. examination of the results
* Start the containers
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander/
docker-compose -f docker-compose-demo.yml down --volumes --remove-orphans
docker-compose -f docker-compose-demo.yml up --build
#+END_SRC
* Database Aquisition * Database Aquisition
General database aquisition is beyond the scope of this document as it is very specific General database aquisition is beyond the scope of this document as it is very specific
to an organization's environment. to an organization's environment.
@@ -22,9 +31,12 @@
docker exec -it dbstore /bin/bash docker exec -it dbstore /bin/bash
# In the container # In the container
ls -la /data/dbstore-data/ ls -la /data/mrvacommander/dbstore-data/qldb
ls /data/dbstore-data/qldb/ | wc -l
# Or in one step
docker exec -it dbstore ls -la /data/mrvacommander/dbstore-data/qldb
#+END_SRC #+END_SRC
Here we use a small sample of an example for open-source Here we use a small sample of an example for open-source
repositories, 23 in all. repositories, 23 in all.
@@ -307,11 +319,8 @@
6. Check the status 6. Check the status
#+BEGIN_SRC sh #+BEGIN_SRC sh
# Check the status # Check the status
./gh-mrva status --session mirva-session-1360
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \ docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
status --session mirva-session-1360 status --session mirva-session-1360
#+END_SRC #+END_SRC
7. Download the sarif files, optionally also get databases. For the current 7. Download the sarif files, optionally also get databases. For the current
@@ -325,27 +334,15 @@
** TODO Write query that has some results ** TODO Write query that has some results
XX: XX:
First, get the list of paths corresponding to the previously selected
databases.
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander/client/qldbtools
./bin/mc-rows-from-mrva-list scratch/gh-mrva-selection.json \
scratch/db-info-3.csv > scratch/selection-full-info
csvcut -c path scratch/selection-full-info
#+END_SRC
Use one of these databases to write a query. It need not produce results. In this case, the trivial =alu_mul=,
#+BEGIN_SRC sh alu_mul for https://github.com/xoreaxeaxeax/movfuscator/blob/master/movfuscator/movfuscator.c
cd ~/work-gh/mrva/gh-mrva/
code gh-mrva.code-workspace
#+END_SRC
In this case, the trivial =findPrintf=:
#+BEGIN_SRC java #+BEGIN_SRC java
/** /**
,* @name findPrintf ,* @name findalu
,* @description find calls to plain fprintf ,* @description find calls to a function
,* @kind problem ,* @kind problem
,* @id cpp-fprintf-call ,* @id cpp-call
,* @problem.severity warning ,* @problem.severity warning
,*/ ,*/
@@ -353,34 +350,77 @@
from FunctionCall fc from FunctionCall fc
where where
fc.getTarget().getName() = "fprintf" fc.getTarget().getName() = "alu_mul"
select fc, "call of fprintf" select fc, "call of alu_mul"
#+END_SRC #+END_SRC
Repeat the submit steps with this query Repeat the submit steps with this query
1. -- 1. [X] --
2. -- 2. [X] --
3. Submit the mrva job 3. [ ] Provide the CodeQL query
#+BEGIN_SRC sh #+BEGIN_SRC sh
cp ~/work-gh/mrva/mrvacommander/client/qldbtools/scratch/gh-mrva-selection.json \ cat | docker exec -i mrvacommander-client-ghmrva-1 \
~/work-gh/mrva/gh-mrva/gh-mrva-selection.json sh -c 'cat > /root/work-gh/mrva/gh-mrva/Alu_Mul.ql' <<eof
/**
,* @name findalu
,* @description find calls to a function
,* @kind problem
,* @id cpp-call
,* @problem.severity warning
,*/
cd ~/work-gh/mrva/gh-mrva/ import cpp
./gh-mrva submit --language cpp --session mirva-session-1480 \
--list mirva-list \ from FunctionCall fc
--query ~/work-gh/mrva/gh-mrva/Fprintf.ql where
fc.getTarget().getName() = "alu_mul"
select fc, "call of alu_mul"
eof
#+END_SRC #+END_SRC
4. Check the status
4. [-] Submit the mrva job
#+BEGIN_SRC sh #+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/ docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
./gh-mrva status --session mirva-session-1480 submit --language cpp --session mirva-session-1490 \
--list mirva-list \
--query /root/work-gh/mrva/gh-mrva/Alu_Mul.ql
#+END_SRC
- [X] XX:
server | 2024/09/27 20:03:16 DEBUG Processed request info location="{Key:3 Bucket:packs}" language=cpp
server | 2024/09/27 20:03:16 WARN No repositories found for analysis
server | 2024/09/27 20:03:16 DEBUG Queueing analysis jobs count=0
server | 2024/09/27 20:03:16 DEBUG Forming and sending response for submitted analysis job id=3
NO: debug in the server container
#+BEGIN_SRC sh
docker exec -it server /bin/bash
apt-get update
apt-get install delve
replace
ENTRYPOINT ["./mrva_server"]
CMD ["--mode=container"]
#+END_SRC
- [ ] XX:
The dbstore is empty -- see http://localhost:9001/browser
must populate it properly, then save the image.
5. [ ] Check the status
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
status --session mirva-session-1490
#+END_SRC #+END_SRC
This time we have results This time we have results
#+BEGIN_SRC text #+BEGIN_SRC text
... ...
Run name: mirva-session-1480 Run name: mirva-session-1490
Status: succeeded Status: succeeded
Total runs: 1 Total runs: 1
Total successful scans: 11 Total successful scans: 11
@@ -401,42 +441,22 @@
tdlib/telegram-bot-apictsj8529d9 (cpp-fprintf-call): 247 tdlib/telegram-bot-apictsj8529d9 (cpp-fprintf-call): 247
WinMerge/winmergectsj101305 (cpp-fprintf-call): 113 WinMerge/winmergectsj101305 (cpp-fprintf-call): 113
#+END_SRC #+END_SRC
5. Download the sarif files, optionally also get databases. 6. [ ] Download the sarif files, optionally also get databases.
#+BEGIN_SRC sh #+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/ docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
# Just download the sarif files download --session mirva-session-1490 \
./gh-mrva download --session mirva-session-1480 \ --download-dbs \
--output-dir mirva-session-1480 --output-dir mirva-session-1490
# Download the sarif files and CodeQL dbs
./gh-mrva download --session mirva-session-1480 \
--download-dbs \
--output-dir mirva-session-1480
# And list them: # And list them:
\ls -la *1480* \ls -la *1490*
-rwxr-xr-x@ 1 hohn staff 1915857 Aug 16 14:10 BoomingTech_Piccoloctsj6d7177_1.sarif
drwxr-xr-x@ 3 hohn staff 96 Aug 16 14:15 BoomingTech_Piccoloctsj6d7177_1_db
-rwxr-xr-x@ 1 hohn staff 89857056 Aug 16 14:11 BoomingTech_Piccoloctsj6d7177_1_db.zip
-rwxr-xr-x@ 1 hohn staff 3105663 Aug 16 14:10 WinMerge_winmergectsj101305_1.sarif
-rwxr-xr-x@ 1 hohn staff 227812131 Aug 16 14:12 WinMerge_winmergectsj101305_1_db.zip
-rwxr-xr-x@ 1 hohn staff 193976 Aug 16 14:10 libfuse_libfusectsj7a66a4_1.sarif
-rwxr-xr-x@ 1 hohn staff 12930693 Aug 16 14:10 libfuse_libfusectsj7a66a4_1_db.zip
-rwxr-xr-x@ 1 hohn staff 1240694 Aug 16 14:10 pocoproject_pococtsj26b932_1.sarif
-rwxr-xr-x@ 1 hohn staff 158924920 Aug 16 14:12 pocoproject_pococtsj26b932_1_db.zip
-rwxr-xr-x@ 1 hohn staff 888494 Aug 16 14:10 quickfix_quickfixctsjebfd13_1.sarif
-rwxr-xr-x@ 1 hohn staff 75023303 Aug 16 14:11 quickfix_quickfixctsjebfd13_1_db.zip
-rwxr-xr-x@ 1 hohn staff 1487363 Aug 16 14:10 tdlib_telegram-bot-apictsj8529d9_1.sarif
-rwxr-xr-x@ 1 hohn staff 373477635 Aug 16 14:14 tdlib_telegram-bot-apictsj8529d9_1_db.zip
-rwxr-xr-x@ 1 hohn staff 103657 Aug 16 14:10 xoreaxeaxeax_movfuscatorctsj8f7e5b_1.sarif
-rwxr-xr-x@ 1 hohn staff 9464225 Aug 16 14:10 xoreaxeaxeax_movfuscatorctsj8f7e5b_1_db.zip
#+END_SRC #+END_SRC
6. Use the [[https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer][SARIF Viewer]] plugin in VS Code to open and review the results. 7. [ ] Use the [[https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer][SARIF Viewer]] plugin in VS Code to open and review the results.
Prepare the source directory so the viewer can be pointed at it Prepare the source directory so the viewer can be pointed at it
#+BEGIN_SRC sh #+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/mirva-session-1480 cd ~/work-gh/mrva/gh-mrva/mirva-session-1490
unzip -qd BoomingTech_Piccoloctsj6d7177_1_db BoomingTech_Piccoloctsj6d7177_1_db.zip unzip -qd BoomingTech_Piccoloctsj6d7177_1_db BoomingTech_Piccoloctsj6d7177_1_db.zip
@@ -449,17 +469,25 @@
code BoomingTech_Piccoloctsj6d7177_1.sarif code BoomingTech_Piccoloctsj6d7177_1.sarif
# For lauxlib.c, point the source viewer to # For lauxlib.c, point the source viewer to
find ~/work-gh/mrva/gh-mrva/mirva-session-1480/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder -name lauxlib.c find ~/work-gh/mrva/gh-mrva/mirva-session-1490/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder -name lauxlib.c
# Here: ~/work-gh/mrva/gh-mrva/mirva-session-1480/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder/engine/3rdparty/lua-5.4.4/lauxlib.c # Here: ~/work-gh/mrva/gh-mrva/mirva-session-1490/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder/engine/3rdparty/lua-5.4.4/lauxlib.c
#+END_SRC #+END_SRC
7. (optional) Large result sets are more easily filtered via 8. [ ] (optional) Large result sets are more easily filtered via
dataframes or spreadsheets. Convert the SARIF to CSV if needed; see [[https://github.com/hohn/sarif-cli/][sarif-cli]]. dataframes or spreadsheets. Convert the SARIF to CSV if needed; see [[https://github.com/hohn/sarif-cli/][sarif-cli]].
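The status check in step 5 usually has to be repeated until the run finishes.
A minimal sketch of a polling helper follows. The name =wait_for_success= and
its arguments are hypothetical, not part of gh-mrva, and the helper assumes the
status output contains the line =Status: succeeded= exactly as in the sample
output above.
#+BEGIN_SRC sh
# Hypothetical helper (not part of gh-mrva): poll a status command until its
# output contains "Status: succeeded", or give up after MAX_TRIES attempts.
# Usage: wait_for_success MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for_success() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Run the given command and look for the success line.
        if "$@" 2>/dev/null | grep -q 'Status: succeeded'; then
            return 0
        fi
        # Wait before the next attempt, but not after the last one.
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}

# Example call (assumes the demo containers are running):
#   wait_for_success 30 10 \
#       docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
#       status --session mirva-session-1490
#+END_SRC
The helper returns 0 as soon as the success line appears and 1 when the
attempts run out, so it can gate the download step with =&&=.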
* Running the CodeQL VS Code plugin
- [ ] XX: include the *custom* codeql plugin in the container.
* Ending the session
Shut down docker via
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander
docker-compose -f docker-compose-demo.yml down
#+END_SRC
* Footnotes * Footnotes
[fn:1]The =csvkit= can be installed into the same Python virtual environment as [fn:1]The =csvkit= can be installed into the same Python virtual environment as
the =qldbtools=. the =qldbtools=.