Files
mrva-docker/README.org

24 KiB
Raw Blame History

lima vm for running docker

  limactl  create -h
  # Create an instance of Lima
  limactl create --list-templates

  # create deb12
  limactl create                                  \
          --arch aarch64                          \
          --cpus 8                                \
          --disk 20                               \
          --memory 8.0                            \
          --name deb12                            \
          template://debian-12

  # admin
  limactl list

  # start deb12
  limactl start deb12

  # enter deb12
  limactl shell deb12


  # install docker
  # 1. Prerequisites
  sudo apt update
  sudo apt install -y ca-certificates curl gnupg lsb-release

  # 2. Add Dockers official GPG key
  sudo install -m 0755 -d /etc/apt/keyrings
  curl -fsSL https://download.docker.com/linux/debian/gpg | \
      sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
  sudo chmod a+r /etc/apt/keyrings/docker.gpg

  # 3. Add Dockers APT repo
  echo \
    "deb [arch=$(dpkg --print-architecture) \
     signed-by=/etc/apt/keyrings/docker.gpg] \
     https://download.docker.com/linux/debian \
     $(lsb_release -cs) stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

  # 4. Install Docker packages
  sudo apt update
  sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

  # enable daemons
  sudo systemctl enable docker
  sudo systemctl start docker

  # add self to docker group
  sudo usermod -aG docker $USER
  limactl stop deb12
  limactl start deb12
  limactl shell deb12
  groups  # should now show "docker"


  # Build container images
  export MAG_VERSION=0.4.0

  {
      # ghmrva
      cd /Users/hohn/work-gh/mrva/mrva-docker/containers/ghmrva/
      docker build --no-cache -t client-ghmrva-container:${MAG_VERSION} .
  }
  {
      # code-server
      cd /Users/hohn/work-gh/mrva/mrva-docker/containers/vscode/
      docker build --no-cache -t code-server-initialized:${MAG_VERSION} .
  }
  {
      # hepc
      cd /Users/hohn/work-gh/mrva/mrva-docker/containers/hepc
      rm -fR ./mrvahepc && cp -r  ../../../mrvahepc .

      # Usual build
      docker build --no-cache -t mrva-hepc-container:${MAG_VERSION} .
  }
  {
      # server
      cd /Users/hohn/work-gh/mrva/mrva-docker/containers/server
      docker build --no-cache --network host -t mrva-server:${MAG_VERSION} .
  }
  {
      # Agent
      cd /Users/hohn/work-gh/mrva/mrva-docker/containers/agent/
      MAG_TARGET=mrva-agent:0.4.0
      docker build --no-cache --network host -t ${MAG_TARGET} .
  }

  # list images
  docker image ls

  # run containers
  cd /Users/hohn/work-gh/mrva/mrva-docker
  docker compose -f docker-compose-demo.yml up

TODO lima vm

intro

When dealing with a highly stateful, evolving system, development workflows that treat containers as immutable black boxes fall apart. Docker's model is great for microservices and stateless demos — but not for real systems where:

  • Executables change frequently (still coding)
  • Data must persist (and be inspected live)
  • Containers cannot be restarted casually (because they are the system)

Inside a single, well-managed VM we can

  • Mount real filesystems (/data, /code, /state) — no awkward volume plugins
  • Recompile and make install — no need to rebuild images
  • Keep all services running — no data loss
  • Log in and debug anything interactively

For local development of a complex, stateful system like MRVA, dumping Docker in favor of chroot or systemd-nspawn-style environments gives us:

  • Full control over state, logs, mounts
  • Zero rebuild delay
  • Native process inspection, debugging, and file editing
  • Persistent state without Dockers volume opacity
  • Easy replication of logical components via shell or Make

Current Logical Structure

Using a 2x3 terminal layout in iterm, we have

Window Directory
UL mrvaagent
UM mrvacommander
UR mrvahepc
LL mrvaserver
LM mrva-docker
LR vscode-codeql

Each of these corresponds to a separate Git repo, aligned with a Docker container.

This gives

  • Logical alignment between containers and repos
  • Physical separation (Docker images/filesystems) that's painful for development
  • Fast navigation and full visibility via iTerm2 panes

vm chroots from docker

The chroot will have the same directory structure as the Docker By following standard layout with debootstrap or debian:bullseye as base:

  /bin 
  /etc 
  /lib 
  /usr 
  /opt 
  /tmp 
  /var

This aligns precisely with what a Docker image would have. The only differences might be:

Path Docker chroot via debootstrap
/root present but unused optional, often empty
/home sometimes empty in both create it if needed
/proc, /sys managed by container runtime mount manually if needed

Compare to Docker

Feature VM + chroot setup Docker
Rebuild control Full, script-driven Layer cache voodoo
File system transparency Total Hidden layers
Tool version management Shared or isolated Always isolated
Dev→debug roundtrip Instant Context build/upload
Disk efficiency Optional Layered, rigid
Mental model File tree + script "Magic image"
Debug container during run Simple chroot Unnatural UX

Rebuild Cadence

Stage Scope Frequency Cost Notes
VM base image Full VM Rare (~1 per loop) Medium Clean slate; fast via Lima
VM tweaks Apt/tools 12 per loop Low Fully scripted
Chroot setup Per component 1 per loop Fast Includes system + tool setup
Component dev Go binary 10×+ per chroot Instant Local builds, bound mount ok
Full system test All chroots After major change MedHigh Manual or scripted

lima machine creation

  limactl  create -h
  # Create an instance of Lima
  limactl create --list-templates

  # create deb12
  limactl create                                  \
          --arch aarch64                          \
          --cpus 8                                \
          --disk 20                               \
          --memory 8.0                            \
          --name deb12                            \
          template://debian-12

  # start deb12
  limactl start deb12

  # enter deb12
  limactl shell deb12

  # admin
  limactl list

In

~/.lima/deb12/lima.yaml

add

  - location: "/Users/hohn/work-gh/mrva"
    writable: true

to the

mounts:

section. Then,

  limactl stop deb12
  limactl start deb12

TODO migrating the containers to chroot

Inside the lima vm

  # enter vm
  limactl shell deb12

  # expand setup scripts
  m4 common-setup.m4 agent-setup.m4 > setup-agent-chroot.sh
  m4 common-setup.m4 server-setup.m4 > setup-server-chroot.sh     
  m4 common-setup.m4 ghmrva-setup.m4 > setup-ghmrva-chroot.sh
  m4 common-setup.m4 mrvastore-setup.m4 > setup-mrvastore-chroot.sh

Using the Containers

Running the containers

  1. Start the containers

      cd ~/work-gh/mrva/mrva-docker/
      docker-compose -f docker-compose-demo.yml down
      docker ps
      docker-compose -f docker-compose-demo.yml up
  2. View all logs

    docker-compose logs
    
  3. Follow all logs if started with -d

      docker-compose logs -f
  4. Follow single container, server, logging via

      cd ~/work-gh/mrva/mrvacommander
      docker-compose up -d
      docker-compose logs -f server
  5. Cleanup in case of obscure errors (network or other)

      docker-compose -f docker-compose-demo.yml down --volumes --remove-orphans
      docker network prune
      docker-compose -f docker-compose-demo.yml up --build

Updating binaries in running container

To update the binaries in a running container mainly during development:

  • server

      #* Cross-compile locally
      cd ~/work-gh/mrva/mrvaserver
      make msla
    
      #* check for running containers
      docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"
    
      #* Copy the new binary
      cd ~/work-gh/mrva/mrvaserver
      docker cp mrvaserver mrva-server:/usr/local/bin/mrvaserver
    
      #* Restart the binary
      docker exec mrva-server pkill mrvaserver
  • agent

      #* Cross-compile locally
      cd ~/work-gh/mrva/mrvaagent
      make mala
    
      #* Look for the agent's name in the process table
      docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"
    
      #* Copy the new binary
      cd ~/work-gh/mrva/mrvaagent
      docker cp mrvaagent mrva-agent:/usr/local/bin/mrvaagent
    
      #* Restart the binary
      docker exec mrva-agent pkill mrvaagent
  • gh-mrva

      #* Cross-compile locally
      cd ~/work-gh/mrva/gh-mrva
      go mod edit -replace="github.com/GitHubSecurityLab/gh-mrva=/Users/hohn/work-gh/mrva/gh-mrva"
      go mod tidy 
      GOOS=linux GOARCH=arm64 go build
    
      #* Look for the gh-mrva name in the process table
      docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"
    
    
    
      #* Copy the new binary
      cd ~/work-gh/mrva/gh-mrva
      docker cp mrvaagent mrva-agent:/usr/local/bin/mrvaagent
    
      #* Restart the binary
      docker exec mrva-agent pkill mrvaagent

Use gh-mrva container to send request via cli

Start container and check gh-mrva tool

  # Start an interactive bash shell inside the running Docker container
  docker exec -it mrva-ghmrva bash

  # Check if the gh-mrva tool is installed and accessible
  gh-mrva -h

Set up gh-mrva configuration

   # Create configuration directory and generate config file for gh-mrva
   mkdir -p ~/.config/gh-mrva
   cat > ~/.config/gh-mrva/config.yml <<eof
   # Configuration file for the gh-mrva tool
   # codeql_path: Path to the CodeQL distribution (not used in this setup)
   # controller: Placeholder for a controller NWO (not relevant in this setup)
   # list_file: Path to the repository selection JSON file

   codeql_path: not-used/codeql-path
   controller: not-used/mirva-controller
   list_file: $HOME/work-gh/mrva/gh-mrva/gh-mrva-selection.json
   eof

Create repository selection list

   # Create a directory and generate the JSON file specifying repositories
   mkdir -p ~/work-gh/mrva/gh-mrva
   cat > ~/work-gh/mrva/gh-mrva/gh-mrva-selection.json <<eof
   {
       "mirva-list": [
           "Serial-Studio/Serial-Studio",
           "UEFITool/UEFITool",
           "aircrack-ng/aircrack-ng",
           "bulk-builder/bulk-builder",
           "tesseract/tesseract"
       ]
   }
   eof

Create and submit the first query (FlatBuffersFunc.ql)

   # Generate a sample CodeQL query for functions of interest
   cat > ~/work-gh/mrva/gh-mrva/FlatBuffersFunc.ql <<eof
   /**
    ,* @name pickfun
    ,* @description Pick function from FlatBuffers
    ,* @kind problem
    ,* @id cpp-flatbuffer-func
    ,* @problem.severity warning
    ,*/

   import cpp

   from Function f
   where
     f.getName() = "MakeBinaryRegion" or
     f.getName() = "microprotocols_add"
   select f, "definition of MakeBinaryRegion"
   eof

   # Submit the MRVA job with the first query
   cd ~/work-gh/mrva/gh-mrva/
   gh-mrva submit --language cpp --session mirva-session-1172 \
             --list mirva-list                                \
             --query ~/work-gh/mrva/gh-mrva/FlatBuffersFunc.ql

Check status and download results for the first session

   # Check the status of the submitted session
   gh-mrva status --session mirva-session-1172

   # Download SARIF files and databases if there are results.  For the current
   # query / database combination there are zero result hence no downloads
   cd ~/work-gh/mrva/gh-mrva/
   gh-mrva download --session mirva-session-1172   \
           --download-dbs                          \
           --output-dir mirva-session-1172

Next, run a query with results

  #**  Set up QLPack for the next query
  # Create a qlpack.yml file required for the next query
  cat > ~/work-gh/mrva/gh-mrva/qlpack.yml <<eof
  library: false
  name: codeql-dataflow-ii-cpp
  version: 0.0.1
  dependencies:
    codeql/cpp-all: 0.5.3
  eof

  #**  Create and submit the second query (Fprintf.ql)
  # Generate a CodeQL query to find calls to fprintf
  cat > ~/work-gh/mrva/gh-mrva/Fprintf.ql <<eof
  /**
   ,* @name findPrintf
   ,* @description Find calls to plain fprintf
   ,* @kind problem
   ,* @id cpp-fprintf-call
   ,* @problem.severity warning
   ,*/

  import cpp

  from FunctionCall fc
  where
    fc.getTarget().getName() = "fprintf"
  select fc, "call of fprintf"
  eof

  # Submit a new MRVA job with the second query
  cd ~/work-gh/mrva/gh-mrva/
  gh-mrva submit                                      \
          --language cpp --session mirva-session-2161 \
          --list mirva-list                           \
          --query ~/work-gh/mrva/gh-mrva/Fprintf.ql

Check status and download results for the second session

  # Check the status of the second session
  gh-mrva status --session mirva-session-2161

  # Download SARIF files and databases for the second query
  cd ~/work-gh/mrva/gh-mrva/
  gh-mrva download --session mirva-session-2161   \
          --download-dbs                          \
          --output-dir mirva-session-2161

  ls -l mirva-session-2161

As shell script

Send request via gui, using vs code

The following sequence works when run from a local vs code with the custom codeql plugin.

Connect to vscode-codeql container at http://localhost:9080/?folder=/home/coder

Provide settings

The file

/home/coder/.local/share/code-server/User/settings.json
  cat > /home/coder/.local/share/code-server/User/settings.json << EOF
  {
      "codeQL.runningQueries.numberOfThreads": 2,
      "codeQL.cli.executablePath": "/opt/codeql/codeql",

      "codeQL.variantAnalysis.enableGhecDr": true,
      "github-enterprise.uri": "http://server:8080/"
  }
  EOF

Provide list of repositories to analyze

ql tab > variant analysis repositories > {}, put this into databases.json

  {
      "version": 1,
      "databases": {
          "variantAnalysis": {
              "repositoryLists": [
                  {
                      "name": "mrva-list",
                      "repositories": [
                          "Serial-Studio/Serial-Studio",
                          "UEFITool/UEFITool",
                          "aircrack-ng/aircrack-ng",
                          "bulk-builder/bulk-builder",
                          "tesseract/tesseract"
                      ]
                  }
              ],
              "owners": [],
              "repositories": []
          }
      },
      "selected": {
          "kind": "variantAnalysisUserDefinedList",
          "listName": "mirva-list"
      }
  }

Make the list current

ql tab > variant analysis repositories > 'select' mrva-list

Provide a query

Select file qldemo/simple.ql and put Fprintf.ql parallel to it:

  cat > /home/coder/qldemo/Fprintf.ql <<eof
  /**
   ,* @name findPrintf
   ,* @description find calls to plain fprintf
   ,* @kind problem
   ,* @id cpp-fprintf-call
   ,* @problem.severity warning
   ,*/

  import cpp

  from FunctionCall fc
  where
    fc.getTarget().getName() = "fprintf"
  select fc, "call of fprintf"
  eof
  /**
   ,* @name findPrintf
   ,* @description find calls to plain fprintf
   ,* @kind problem
   ,* @id cpp-fprintf-call
   ,* @problem.severity warning
   ,*/

  import cpp

  from FunctionCall fc
  where
    fc.getTarget().getName() = "fprintf"
  select fc, "call of fprintf"

Provide the qlpack specification

Create qlpack.yml for cpp:

  cat > /home/coder/qldemo/qlpack.yml <<eof
  library: false
  name: codeql-dataflow-ii-cpp
  version: 0.0.1
  dependencies:
    codeql/cpp-all: 0.5.3
  eof

Then

  1. Delete qlpack.lock file
  2. In shell,

      cd ~/qldemo
      /opt/codeql/codeql pack install
  3. In GUI, 'install pack dependencies'
  4. In GUI, 'reload windows'

Submit the analysis job

Fprintf.ql > right click > run variant analysis

Update Container Images

XX:

grep 'docker tag' containers/*/*.org containers/*/Makefile
(grep "grep --color=auto -nH --null -e 'docker tag' containers/*/*")
  # To snapshot a running Docker container and create a new image from it, use the
  # following CLI sequence: 

  #* Get the container IDs

  docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"
  # 0:$ docker ps --format "table {{.ID}}\t{{.Image}}\t{{.Names}}"
  # CONTAINER ID   IMAGE                                         NAMES
  # 99de2a875317   ghcr.io/hohn/client-ghmrva-container:0.1.24   mrva-docker-client-ghmrva-1
  # 081900278c0e   ghcr.io/hohn/mrva-server:0.1.24               server
  # a23352c009fb   ghcr.io/hohn/mrva-agent:0.1.24                agent
  # 9e9248a77957   minio/minio:RELEASE.2024-06-11T03-13-30Z      dbstore
  # cd043e5bad77   ghcr.io/hohn/code-server-initialized:0.1.24   mrva-docker-code-server-1
  # 783e30d6f9d0   rabbitmq:3-management                         rabbitmq
  # d05f606b4ea0   ghcr.io/hohn/mrva-hepc-container:0.1.24       hepc
  # 7858ccf18fad   ghcr.io/hohn/dbsdata-container:0.1.24         dbssvc
  # 85d85484849b   minio/minio:RELEASE.2024-06-11T03-13-30Z      artifactstore

  #* Commit the running containers to new images
  # Commit the running container to a new image:
  ( cd ~/work-gh/mrva/mrva-docker/ && rg 'docker (commit)' )

  docker commit 99de2a875317 mrva-client-ghmrva:0.2.0 
  # sha256:2eadb76a6b051200eaa395d2f815bad63f88473a16aa4c0a6cdebb114c556498

  docker commit 081900278c0e   mrva-server-server:0.2.0
  # sha256:0ec38b245021b0aea2c31eab8f75a9141cce8ee789e406cec4dabac484e03aff

  docker commit a23352c009fb   mrva-server-agent:0.2.0
  # sha256:75c6dee1dc57cda571482f7fdb2d3dd292f51e423c1733071927f21f3ab0cec5

  docker commit cd043e5bad77   mrva-client-vscode:0.2.0
  # sha256:b239d13f44637cac3601697dca49325faf123be8cf040c05b6dafe2b11504cc8

  docker commit d05f606b4ea0   mrva-server-hepc:0.2.0
  # sha256:238d39313590837587b7bd235bdfe749e18417b38e046553059295cf2064e0d2

  docker commit 7858ccf18fad   mrva-server-dbsdata:0.2.0
  # sha256:a283d69e6f9ba03856178149de95908dd6fa4b6a8cf407a1464d6cec5fa5fdc0

  #* Verify the newly created images
  docker images

  #* Tag the images for a registry
  ( cd ~/work-gh/mrva/mrva-docker/ && rg 'docker (tag)' )

  tagpushimg () {
      name=$1
      version=$2

      docker tag $name:$version ghcr.io/hohn/$name:$version
      docker push ghcr.io/hohn/$name:$version
  }

  tagpushimg mrva-client-ghmrva 0.2.0

  tagpushimg mrva-server-server 0.2.0

  tagpushimg mrva-server-agent 0.2.0

  tagpushimg mrva-client-vscode 0.2.0

  tagpushimg mrva-server-hepc 0.2.0

  tagpushimg mrva-server-dbsdata 0.2.0

view container image list on ghcr.io: https://github.com/hohn?tab=packages

Project Tools

This project, mrva-docker, is the highest-level part of the project as it packages all others. So it also houses simple project tools.

  # On macos

  # install uv
  curl -LsSf https://astral.sh/uv/install.sh | sh
  uv self update

  # set up mrva-env on mac
  cd ~/work-gh/mrva/mrva-docker
  uv venv mrva-env-mac

  # activate mrva-env
  source mrva-env-mac/bin/activate

  # link scripts (lazy 'install')
  cd  mrva-env-mac/bin/
  ln -s ../../bin/* .

Access minio

  • command line

      # 
      brew install minio/stable/mc  # macOS
      # or
      curl -O https://dl.min.io/client/mc/release/linux-amd64/mc && chmod +x mc && sudo mv mc /usr/local/bin/
    
      # Configuration
      MINIO_ALIAS="qldbminio"
      MINIO_URL="http://localhost:9000"
      MINIO_ROOT_USER="user"
      MINIO_ROOT_PASSWORD="mmusty8432"
      QL_DB_BUCKET_NAME="qldb"
    
      # Configure MinIO client
      mc alias set $MINIO_ALIAS $MINIO_URL $MINIO_ROOT_USER $MINIO_ROOT_PASSWORD
    
    
      # List everything uploaded under session 5
      mc ls qldbminio/mrvabucket | grep '^5-'
    
      # Drill into each expected result
      mc ls local/mrvabucket/5-{Serial-Studio Serial-Studio}
      mc ls local/mrvabucket/5-{UEFITool UEFITool}
      mc ls local/mrvabucket/5-{aircrack-ng aircrack-ng}
      mc ls local/mrvabucket/5-{bulk-builder bulk-builder}
      mc ls local/mrvabucket/5-{tesseract tesseract}
  • web console http://localhost:9001/browser