Major changes to support cli-end-to-end demonstration. See full log

* notes/cli-end-to-end-demo.org (Database Acquisition):
  Starting description for the end-to-end demonstration workflow.
  Very simplified version of notes/cli-end-to-end.org

* docker-compose-demo.yml (services):
  Make the pre-populated ql database storage an explicit container
  to get persistent data and straightforward mount semantics.

* docker-compose-demo-build.yml (services):
  Add a docker-compose configuration for *building* the demo environment.

* demo/containers/dbsdata/Dockerfile:
  Add dbsdata Docker image to hold the initialized MinIO database file tree.

* client/containers/vscode/README.org:
  Update vscode container to use custom plugin for later mrva redirection.
Michael Hohn
2024-10-15 10:18:42 -07:00
committed by =Michael Hohn
parent 187c49688e
commit 77ce997fbb
8 changed files with 849 additions and 134 deletions


@@ -1,8 +1,9 @@
# -*- coding: utf-8 -*-
#+OPTIONS: H:2 num:t \n:nil @:t ::t |:t ^:{} f:t *:t TeX:t LaTeX:t skip:nil p:nil
* End-to-end example of CLI use
This document describes a complete cycle of the MRVA workflow. The steps
included are
This document describes a complete cycle of the MRVA workflow, but using
pre-populated data. The steps included are
1. acquiring CodeQL databases
2. selection of databases
3. configuration and use of the command-line client
@@ -11,6 +12,14 @@
6. retrieval of the results
7. examination of the results
* Start the containers
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander/
docker-compose -f docker-compose-demo.yml down --volumes --remove-orphans
docker-compose -f docker-compose-demo.yml up --build
#+END_SRC
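For repeated demo runs, the two commands above can be wrapped in small shell
functions. This is only a sketch: the names =demo_compose= and =demo_restart=
and the variables =DEMO_COMPOSE= and =DEMO_COMPOSE_FILE= are assumptions made
here, not part of mrvacommander, and =-d= (detached mode) is added so the
shell stays usable after startup.
#+BEGIN_SRC sh
# Hypothetical helpers; all names below are assumptions for this sketch.
DEMO_COMPOSE=${DEMO_COMPOSE:-docker-compose}
DEMO_COMPOSE_FILE=${DEMO_COMPOSE_FILE:-docker-compose-demo.yml}

# Run a compose subcommand against the demo configuration.
demo_compose() {
    "$DEMO_COMPOSE" -f "$DEMO_COMPOSE_FILE" "$@"
}

# Tear down any previous demo state, then rebuild and start detached.
demo_restart() {
    demo_compose down --volumes --remove-orphans &&
        demo_compose up --build -d
}
#+END_SRC
Run =demo_restart= from =~/work-gh/mrva/mrvacommander/=; afterwards
=demo_compose ps= lists the services that are up.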
* Database Acquisition
General database acquisition is beyond the scope of this document as it is very specific
to an organization's environment.
@@ -22,9 +31,12 @@
docker exec -it dbstore /bin/bash
# In the container
ls -la /data/dbstore-data/
ls /data/dbstore-data/qldb/ | wc -l
ls -la /data/mrvacommander/dbstore-data/qldb
# Or in one step
docker exec -it dbstore ls -la /data/mrvacommander/dbstore-data/qldb
#+END_SRC
Here we use a small sample of open-source repositories, 23 in all.
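A quick sanity check is to compare the number of entries in the database
directory against the expected count. The sketch below assumes one directory
entry per database; =count_entries= is a hypothetical helper name.
#+BEGIN_SRC sh
# count_entries DIR: print the number of entries directly inside DIR.
# Assumption: each database is stored as one entry in the directory.
count_entries() {
    find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}
#+END_SRC
Defined inside the dbstore container,
=count_entries /data/mrvacommander/dbstore-data/qldb= would be expected to
print 23 if each repository contributes exactly one entry.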
@@ -307,11 +319,8 @@
6. Check the status
#+BEGIN_SRC sh
# Check the status
./gh-mrva status --session mirva-session-1360
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
status --session mirva-session-1360
#+END_SRC
7. Download the sarif files, optionally also get databases. For the current
@@ -325,27 +334,15 @@
** TODO Write query that has some results
XX:
First, get the list of paths corresponding to the previously selected
databases.
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander/client/qldbtools
./bin/mc-rows-from-mrva-list scratch/gh-mrva-selection.json \
scratch/db-info-3.csv > scratch/selection-full-info
csvcut -c path scratch/selection-full-info
#+END_SRC
Use one of these databases to write a query. It need not produce results.
#+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/
code gh-mrva.code-workspace
#+END_SRC
In this case, the trivial =findPrintf=:
In this case, the trivial =alu_mul= query, which finds calls to alu_mul in
https://github.com/xoreaxeaxeax/movfuscator/blob/master/movfuscator/movfuscator.c:
#+BEGIN_SRC java
/**
,* @name findPrintf
,* @description find calls to plain fprintf
,* @name findalu
,* @description find calls to a function
,* @kind problem
,* @id cpp-fprintf-call
,* @id cpp-call
,* @problem.severity warning
,*/
@@ -353,34 +350,77 @@
from FunctionCall fc
where
fc.getTarget().getName() = "fprintf"
select fc, "call of fprintf"
fc.getTarget().getName() = "alu_mul"
select fc, "call of alu_mul"
#+END_SRC
Repeat the submit steps with this query
1. --
2. --
3. Submit the mrva job
1. [X] --
2. [X] --
3. [ ] Provide the CodeQL query
#+BEGIN_SRC sh
cp ~/work-gh/mrva/mrvacommander/client/qldbtools/scratch/gh-mrva-selection.json \
~/work-gh/mrva/gh-mrva/gh-mrva-selection.json
docker exec -i mrvacommander-client-ghmrva-1 \
sh -c 'cat > /root/work-gh/mrva/gh-mrva/Alu_Mul.ql' <<eof
/**
,* @name findalu
,* @description find calls to a function
,* @kind problem
,* @id cpp-call
,* @problem.severity warning
,*/
cd ~/work-gh/mrva/gh-mrva/
./gh-mrva submit --language cpp --session mirva-session-1480 \
--list mirva-list \
--query ~/work-gh/mrva/gh-mrva/Fprintf.ql
import cpp
from FunctionCall fc
where
fc.getTarget().getName() = "alu_mul"
select fc, "call of alu_mul"
eof
#+END_SRC
4. Check the status
4. [-] Submit the mrva job
#+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/
./gh-mrva status --session mirva-session-1480
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
submit --language cpp --session mirva-session-1490 \
--list mirva-list \
--query /root/work-gh/mrva/gh-mrva/Alu_Mul.ql
#+END_SRC
- [X] XX:
server | 2024/09/27 20:03:16 DEBUG Processed request info location="{Key:3 Bucket:packs}" language=cpp
server | 2024/09/27 20:03:16 WARN No repositories found for analysis
server | 2024/09/27 20:03:16 DEBUG Queueing analysis jobs count=0
server | 2024/09/27 20:03:16 DEBUG Forming and sending response for submitted analysis job id=3
NO: debug in the server container
#+BEGIN_SRC sh
docker exec -it server /bin/bash
apt-get update
apt-get install delve
# Then replace these lines in the Dockerfile:
#   ENTRYPOINT ["./mrva_server"]
#   CMD ["--mode=container"]
#+END_SRC
- [ ] XX:
The dbstore is empty (see http://localhost:9001/browser).
It must be populated properly before the image is saved.
5. [ ] Check the status
#+BEGIN_SRC sh
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
status --session mirva-session-1490
#+END_SRC
This time we have results
#+BEGIN_SRC text
...
Run name: mirva-session-1480
Run name: mirva-session-1490
Status: succeeded
Total runs: 1
Total successful scans: 11
@@ -401,42 +441,22 @@
tdlib/telegram-bot-apictsj8529d9 (cpp-fprintf-call): 247
WinMerge/winmergectsj101305 (cpp-fprintf-call): 113
#+END_SRC
5. Download the sarif files, optionally also get databases.
6. [ ] Download the sarif files, optionally also get databases.
#+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/
# Just download the sarif files
./gh-mrva download --session mirva-session-1480 \
--output-dir mirva-session-1480
# Download the sarif files and CodeQL dbs
./gh-mrva download --session mirva-session-1480 \
--download-dbs \
--output-dir mirva-session-1480
docker exec -i mrvacommander-client-ghmrva-1 /usr/local/bin/gh-mrva \
download --session mirva-session-1490 \
--download-dbs \
--output-dir mirva-session-1490
# And list them:
\ls -la *1480*
-rwxr-xr-x@ 1 hohn staff 1915857 Aug 16 14:10 BoomingTech_Piccoloctsj6d7177_1.sarif
drwxr-xr-x@ 3 hohn staff 96 Aug 16 14:15 BoomingTech_Piccoloctsj6d7177_1_db
-rwxr-xr-x@ 1 hohn staff 89857056 Aug 16 14:11 BoomingTech_Piccoloctsj6d7177_1_db.zip
-rwxr-xr-x@ 1 hohn staff 3105663 Aug 16 14:10 WinMerge_winmergectsj101305_1.sarif
-rwxr-xr-x@ 1 hohn staff 227812131 Aug 16 14:12 WinMerge_winmergectsj101305_1_db.zip
-rwxr-xr-x@ 1 hohn staff 193976 Aug 16 14:10 libfuse_libfusectsj7a66a4_1.sarif
-rwxr-xr-x@ 1 hohn staff 12930693 Aug 16 14:10 libfuse_libfusectsj7a66a4_1_db.zip
-rwxr-xr-x@ 1 hohn staff 1240694 Aug 16 14:10 pocoproject_pococtsj26b932_1.sarif
-rwxr-xr-x@ 1 hohn staff 158924920 Aug 16 14:12 pocoproject_pococtsj26b932_1_db.zip
-rwxr-xr-x@ 1 hohn staff 888494 Aug 16 14:10 quickfix_quickfixctsjebfd13_1.sarif
-rwxr-xr-x@ 1 hohn staff 75023303 Aug 16 14:11 quickfix_quickfixctsjebfd13_1_db.zip
-rwxr-xr-x@ 1 hohn staff 1487363 Aug 16 14:10 tdlib_telegram-bot-apictsj8529d9_1.sarif
-rwxr-xr-x@ 1 hohn staff 373477635 Aug 16 14:14 tdlib_telegram-bot-apictsj8529d9_1_db.zip
-rwxr-xr-x@ 1 hohn staff 103657 Aug 16 14:10 xoreaxeaxeax_movfuscatorctsj8f7e5b_1.sarif
-rwxr-xr-x@ 1 hohn staff 9464225 Aug 16 14:10 xoreaxeaxeax_movfuscatorctsj8f7e5b_1_db.zip
\ls -la *1490*
#+END_SRC
6. Use the [[https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer][SARIF Viewer]] plugin in VS Code to open and review the results.
7. [ ] Use the [[https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer][SARIF Viewer]] plugin in VS Code to open and review the results.
Prepare the source directory so the viewer can be pointed at it
#+BEGIN_SRC sh
cd ~/work-gh/mrva/gh-mrva/mirva-session-1480
cd ~/work-gh/mrva/gh-mrva/mirva-session-1490
unzip -qd BoomingTech_Piccoloctsj6d7177_1_db BoomingTech_Piccoloctsj6d7177_1_db.zip
@@ -449,17 +469,25 @@
code BoomingTech_Piccoloctsj6d7177_1.sarif
# For lauxlib.c, point the source viewer to
find ~/work-gh/mrva/gh-mrva/mirva-session-1480/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder -name lauxlib.c
find ~/work-gh/mrva/gh-mrva/mirva-session-1490/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder -name lauxlib.c
# Here: ~/work-gh/mrva/gh-mrva/mirva-session-1480/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder/engine/3rdparty/lua-5.4.4/lauxlib.c
# Here: ~/work-gh/mrva/gh-mrva/mirva-session-1490/BoomingTech_Piccoloctsj6d7177_1_db/codeql_db/src/home/runner/work/bulk-builder/bulk-builder/engine/3rdparty/lua-5.4.4/lauxlib.c
#+END_SRC
7. (optional) Large result sets are more easily filtered via
8. [ ] (optional) Large result sets are more easily filtered via
dataframes or spreadsheets. Convert the SARIF to CSV if needed; see [[https://github.com/hohn/sarif-cli/][sarif-cli]].
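Because a submitted job takes time to finish, the status check in step 5 may
need to be repeated. A minimal polling sketch follows. It assumes only that
the status output contains the line =Status: succeeded= shown above; the names
=poll_status=, =POLL_TRIES= and =POLL_DELAY= are hypothetical.
#+BEGIN_SRC sh
# poll_status CMD...: rerun CMD until its output contains "Status: succeeded".
# Returns 0 on success, 1 if POLL_TRIES attempts pass without success.
poll_status() {
    tries=${POLL_TRIES:-30}    # maximum number of attempts (assumed default)
    delay=${POLL_DELAY:-10}    # seconds between attempts (assumed default)
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" 2>&1 | grep -q 'Status: succeeded'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage, with the status command from step 5:
#   poll_status docker exec -i mrvacommander-client-ghmrva-1 \
#       /usr/local/bin/gh-mrva status --session mirva-session-1490
#+END_SRC
Passing the command as arguments keeps the helper independent of the
container name and session used.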
* Running the CodeQL VS Code plugin
- [ ] XX: include the *custom* codeql plugin in the container.
* Ending the session
Shut down docker via
#+BEGIN_SRC sh
cd ~/work-gh/mrva/mrvacommander
docker-compose -f docker-compose-demo.yml down
#+END_SRC
* Footnotes
[fn:1] =csvkit= can be installed into the same Python virtual environment as
=qldbtools=.