Patch 1: PostgreSQL Query Ordering
Fixed GetJobList() to return jobs ordered by job_repo_id
Added JOIN with job_repo_map table to ensure proper ordering
This ensures slice indices match the stored repository IDs
Patch 2: Updated Comments
Removed the TODO comment about hacky job IDing
Added explanation that ordering is now consistent
Patch 3: Added Validation
Added runtime validation to catch ID mismatches
Logs warnings/errors if slice index doesn't match expected job_repo_id
Helps debug issues in different state implementations
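A minimal sketch of the kind of check Patch 3 describes, not the actual implementation. The `jobEntry` type, the `checkJobOrder` name, and the `firstID` parameter are hypothetical; whether job_repo_id starts at 0 or 1 is an assumption the caller has to supply.

```go
package main

import (
	"fmt"
	"log/slog"
)

// jobEntry is a hypothetical stand-in for one row returned by GetJobList().
type jobEntry struct {
	JobRepoID int    // value of job_repo_id for this row (assumed to be an integer)
	Repo      string // repository name, used only in the log message
}

// checkJobOrder returns one error for every position whose slice index does
// not match the stored job_repo_id. firstID is the ID expected at index 0.
func checkJobOrder(jobs []jobEntry, firstID int) []error {
	var mismatches []error
	for i, job := range jobs {
		want := firstID + i
		if job.JobRepoID != want {
			mismatches = append(mismatches, fmt.Errorf(
				"index %d: job_repo_id %d, expected %d (%s)",
				i, job.JobRepoID, want, job.Repo))
		}
	}
	return mismatches
}

func main() {
	// The second and third entries are swapped on purpose.
	jobs := []jobEntry{{0, "a/b"}, {2, "c/d"}, {1, "e/f"}}
	for _, err := range checkJobOrder(jobs, 0) {
		slog.Warn("job list order mismatch", "error", err)
	}
}
```

Returning the mismatches instead of logging inside the loop keeps the check usable from tests of other state implementations.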
The problem was missing fields in the SARIF output. The debugging below
shows that the cause was the JSON -> Go -> JSON round trip: the Go ->
JSON conversion only outputs the fields defined in the Go struct.
Because SARIF has so many optional fields, no attempt is made to enforce
a statically defined structure. Instead, the JSON -> Go conversion now
targets a fully dynamic structure; unused fields are simply passed through.
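The pass-through approach can be sketched as follows. This is an illustration under assumptions, not the patched code: `injectIntoRuns` is a hypothetical helper, and using the SARIF `versionControlProvenance` property with an example.com URI is only a guess at what "version control info" looks like here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// injectIntoRuns decodes SARIF into a dynamic map, sets key=value on every
// run object, and encodes the result again. Keys the code never touches
// (such as "$schema" and "version") survive the round trip.
func injectIntoRuns(sarifData []byte, key string, value any) ([]byte, error) {
	var sarif map[string]any
	if err := json.Unmarshal(sarifData, &sarif); err != nil {
		return nil, fmt.Errorf("failed to unmarshal SARIF: %w", err)
	}
	// A missing or malformed "runs" entry leaves runs nil and the loop empty.
	runs, _ := sarif["runs"].([]any)
	for _, r := range runs {
		if run, ok := r.(map[string]any); ok {
			run[key] = value
		}
	}
	out, err := json.Marshal(sarif)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal SARIF: %w", err)
	}
	return out, nil
}

func main() {
	in := []byte(`{"$schema":"https://json.schemastore.org/sarif-2.1.0.json","version":"2.1.0","runs":[{"results":[]}]}`)
	// Assumed payload; the real fields injected by the patch are not shown in this message.
	vcs := []any{map[string]any{"repositoryUri": "https://example.com/owner/repo"}}
	out, err := injectIntoRuns(in, "versionControlProvenance", vcs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Two side effects of the dynamic route: json.Marshal writes map keys in sorted order, so the key order of the output differs from the input, and numbers pass through float64.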
Debugging:
Comparing two SARIF files shows
{
"$schema" : "https://json.schemastore.org/sarif-2.1.0.json",
"version" : "2.1.0",
"runs" : [ {...
} ]
}
and
{
"runs": [...
]
}
so there are missing fields.
The Problem
1. Problem origin
// Modify the sarif: start by extracting
var sarif Sarif
if err := json.Unmarshal(sarifData, &sarif); err != nil {
    return nil, fmt.Errorf("failed to unmarshal SARIF: %v", err)
}
...
// now inject version control info
...
// and write it back
sarifBytes, err := json.Marshal(sarif)
if err != nil {
    return nil, fmt.Errorf("failed to marshal SARIF: %v", err)
}
2. But the struct only has one of the needed fields
type Sarif struct {
    Runs []SarifRun `json:"runs"`
}
3. From the docs:
// To unmarshal JSON into a struct, Unmarshal matches incoming object
// keys to the keys used by [Marshal] (either the struct field name or its tag),
// preferring an exact match but also accepting a case-insensitive match. By
// default, object keys which don't have a corresponding struct field are
// ignored (see [Decoder.DisallowUnknownFields] for an alternative).
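The three points above can be reproduced in isolation. In this sketch the `Sarif` struct follows the one quoted in point 2, except that `SarifRun` is replaced by `json.RawMessage` to keep the example self-contained; `roundTrip` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sarif has a single field, like the struct quoted in point 2. The element
// type is json.RawMessage here instead of SarifRun (a simplification).
type Sarif struct {
	Runs []json.RawMessage `json:"runs"`
}

// roundTrip decodes JSON into the struct and encodes it again. Object keys
// without a matching struct field are dropped during Unmarshal.
func roundTrip(in []byte) (string, error) {
	var s Sarif
	if err := json.Unmarshal(in, &s); err != nil {
		return "", err
	}
	out, err := json.Marshal(s)
	return string(out), err
}

func main() {
	in := []byte(`{"$schema":"https://json.schemastore.org/sarif-2.1.0.json","version":"2.1.0","runs":[{}]}`)
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints {"runs":[{}]}
}
```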
End-to-end testing exposed a CodeQL database download request that the
server did not handle. This patch adds the handlers. Debugging info is
below for reference.
The mrvacommander *server* fails with the following. The source code is
: func setupEndpoints(c CommanderAPI)
See mrvacommander/pkg/server/server.go, endpoints for getting a URL to download artifacts.
Original
Downloading artifacts for tdlib_telegram-bot-apictsj8529d9_2
...
Downloading database tdlib/telegram-bot-apictsj8529d9 cpp mirva-session-1400 tdlib_telegram-bot-apictsj8529d9_2
...
2024/08/13 12:31:38 >> GET http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
...
2024/08/13 12:31:38 << 404 http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
...
-rwxr-xr-x@ 1 hohn staff 169488 Aug 13 12:29 tdlib_telegram-bot-apictsj8529d9_2.sarif*
-rwxr-xr-x@ 1 hohn staff 10 Aug 13 12:31 tdlib_telegram-bot-apictsj8529d9_2_db.zip*
Server log
server | 2024/08/13 19:31:38 ERROR Unhandled endpoint method=GET uri=/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
Try a manual download from the server
8:$ wget http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
--2024-08-13 12:56:05-- http://localhost:8080/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:8080... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-08-13 12:56:05 ERROR 404: Not Found.
server | 2024/08/13 19:56:05 ERROR Unhandled endpoint method=GET uri=/repos/tdlib/telegram-bot-apictsj8529d9/code-scanning/codeql/databases/cpp
The full info for the DB
tdlib,telegram-bot-api,8529d9,2.17.0,2024-05-09 08:02:49.545174+00:00,cpp,f95d406da67adb8ac13d9c562291aa57c65398e0,306106.0,/Users/hohn/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/tdlib/telegram-bot-api/code-scanning/codeql/databases/cpp/db.zip,cpp,C/C++,1244.0,306106.0,2024-05-13T15:54:54.749093,cpp,True,3375,373477635
The gh-mrva *client* sends the following. The source is
gh-mrva/utils/utils.go,
client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))
We have
cd /Users/hohn/work-gh/mrva/gh-mrva
0:$ rg 'repos/.*/code-scanning/codeql/databases'
...
utils/utils.go
625: // resp, err := client.Get(fmt.Sprintf("https://api.github.com/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))
626: resp, err := client.Get(fmt.Sprintf("http://localhost:8080/repos/%s/code-scanning/codeql/databases/%s", task.Nwo, task.Language))
The original DB upload was
cd ~/work-gh/mrva/mrvacommander/client/qldbtools && \
./bin/mc-db-populate-minio -n 11 < scratch/db-info-3.csv
...
2024-08-14 09:29:19 [INFO] Uploaded /Users/hohn/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/tdlib/telegram-bot-api/code-scanning/codeql/databases/cpp/db.zip as tdlib$telegram-bot-apictsj8529d9.zip to bucket qldb
...
This patch fixes the following
- [X] Wrong db metadata path. Fixed via
: globRecursively(databasePath, "codeql-database.yml")
The log output for reference:
agent | 2024/08/09 21:16:40 DEBUG XX:getDataBaseMetadata databasePath=/tmp/ce523549-a217-4b54-a118-7224ce444870/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
agent | 2024/08/09 21:16:40 DEBUG XX:getDataBaseMetadata databasePath=/tmp/bc24fe72-b520-4e72-9634-a98d630cb75e/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
agent | 2024/08/09 21:16:40 DEBUG Received signal: %s "user defined signal 1"=<nil>
agent | 2024/08/09 21:16:40 DEBUG XX:getDataBaseMetadata databasePath=/tmp/41fcf5cc-e151-4a11-bccc-481d599aa426/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
From
func getDatabaseMetadata(databasePath string) (*DatabaseMetadata, error) {
    data, err := os.ReadFile(filepath.Join(databasePath, "codeql-database.yml"))
    ...
}
And some inspection:
root@3fa4b8013336:~# find /tmp |grep ql-datab
/tmp/27f09b9f-254f-4ef5-abf5-9a1a2927906b/db/cpp/codeql-database.yml
/tmp/d7e14cd4-8789-4176-81bc-2ac1957ed9fd/db/codeql_db/codeql-database.yml
/tmp/41fcf5cc-e151-4a11-bccc-481d599aa426/db/codeql_db/codeql-database.yml
/tmp/bc24fe72-b520-4e72-9634-a98d630cb75e/db/codeql_db/codeql-database.yml
/tmp/ce523549-a217-4b54-a118-7224ce444870/db/codeql_db/codeql-database.yml
- [X] Wrong db path. Fixed via
: findDBDir(databasePath)
The log output for reference:
agent | 2024/08/09 21:51:09 ERROR Failed to run analysis job error="failed to run analysis: failed to run queries: exit status 2\nOutput: A fatal error occurred: /tmp/91c61e0b-dfd9-4dd3-a3ad-cb77dbc1cbfd/db is not a recognized CodeQL database.\n"
agent | 2024/08/09 21:51:09 INFO Running analysis job job="{Spec:{SessionID:1 NameWithOwner:{Owner:USCiLab Repo:cerealctsj264953}} QueryPackLocation:{Key:1 Bucket:packs} QueryLanguage:cpp}"
agent | 2024/08/09 21:51:09 ERROR Failed to run analysis job error="failed to run analysis: failed to run queries: exit status 2\nOutput: A fatal error occurred: /tmp/1b8ffeba-8ad1-465e-8ec7-36cda449a5f5/db is not a recognized CodeQL database.\n"
...
This is easily confirmed:
root@171b5417e05f:~# /opt/codeql/codeql database upgrade /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/
A fatal error occurred: /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2 is not a recognized CodeQL database.
Another try:
root@171b5417e05f:~# /opt/codeql/codeql database upgrade /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/database.zip
A fatal error occurred: Database root /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/database.zip is not a directory.
This one is correct:
root@171b5417e05f:~# /opt/codeql/codeql database upgrade /tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/db/codeql_db
/tmp/7ed27578-d7ea-42e0-902a-effbc4df05f2/db/codeql_db/db-cpp is up to date.
- [X] Wrong database source prefix. Also fixed via
: findDBDir(databasePath)
Similar log entries:
agent | 2024/08/13 15:40:14 ERROR Failed to run analysis job error="failed to run analysis: failed to get source location prefix: failed to resolve database: exit status 2\nOutput: A fatal error occurred: /tmp/da420844-a284-4d82-9470-fa189a5b4ee6/db is not a recognized CodeQL database.\n"
agent | 2024/08/13 15:40:14 INFO Worker stopping due to reduction in worker count
agent | 2024/08/13 15:40:18 ERROR Failed to run analysis job error="failed to run analysis: failed to get source location prefix: failed to resolve database: exit status 2\nOutput: A fatal error occurred: /tmp/eebfc52c-3ecf-490d-bbf4-23c305d6ba18/db is not a recognized CodeQL database.\n"
and
agent | 2024/08/13 15:49:33 ERROR Failed to resolve database err="exit status 2" output="A fatal error occurred: /tmp/b5c4941a-5692-4640-aa79-9810bcab39f4/db is not a recognized CodeQL database.\n"
agent | 2024/08/13 15:49:33 DEBUG XX: RunQuery failed to get source location prefixdatabasePath=/tmp/b5c4941a-5692-4640-aa79-9810bcab39f4/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
agent | 2024/08/13 15:49:35 INFO Modifying worker count current=3 new=2
agent | 2024/08/13 15:49:35 ERROR Failed to resolve database err="exit status 2" output="A fatal error occurred: /tmp/eda30582-81a3-4582-8897-65f8904e8501/db is not a recognized CodeQL database.\n"
agent | 2024/08/13 15:49:35 DEBUG XX: RunQuery failed to get source location prefixdatabasePath=/tmp/eda30582-81a3-4582-8897-65f8904e8501/db "Waiting for SIGUSR1 or SIGUSR2..."=<nil>
And this fails:
root@51464985499f:~# /opt/codeql/codeql resolve database /tmp/eda30582-81a3-4582-8897-65f8904e8501/db/
A fatal error occurred: /tmp/eda30582-81a3-4582-8897-65f8904e8501/db is not a recognized CodeQL database.
But this works:
root@51464985499f:~# /opt/codeql/codeql resolve database /tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/
{
"sourceLocationPrefix" : "/home/runner/work/bulk-builder/bulk-builder",
"columnKind" : "utf8",
"unicodeNewlines" : false,
"sourceArchiveZip" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/src.zip",
"sourceArchiveRoot" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/src",
"datasetFolder" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/db-cpp",
"logsFolder" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/log",
"languages" : [
"cpp"
],
"scratchDir" : "/tmp/eda30582-81a3-4582-8897-65f8904e8501/db/codeql_db/working"
}
Previously:
- There is confusion between nameWithOwner and queryLanguage. Both are strings.
Between
runResult, err := codeql.RunQuery(databasePath, job.QueryLanguage, queryPackPath, tempDir)
(agent.go l205)
and
func RunQuery(database string, nwo string, queryPackPath string, tempDir string) (*RunQueryResult, error)
the QueryLanguage argument is suddenly called nwo (name with owner) in the code.
Added some debugging; the value is the query language in both places where it is used:
server | 2024/07/03 18:30:15 DEBUG Processed request info location="{Data:map[bucket:packs key:1]}" language=cpp
...
agent | 2024/07/03 18:30:15 DEBUG XX: is nwo a name/owner, or the original callers' queryLanguage? nwo=cpp
...
agent | 2024/07/03 18:30:19 DEBUG XX: 2: is nwo a name/owner, or the original callers' queryLanguage? nwo=cpp
Changes:
- Introduce explicit type QueryLanguage = string and update the code to use it, so the parameter's meaning is clear
- inline trivial function
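A small sketch of what the type change does and does not buy. `describeQuery`, `describeStrict`, and `StrictQueryLanguage` are hypothetical and exist only for illustration. With the alias form (`=`), the compiler still treats QueryLanguage and string as the same type, so the benefit is documentation; a defined type would turn the mix-up into a compile error.

```go
package main

import "fmt"

// QueryLanguage is an alias, as in the change above. It documents intent but
// is interchangeable with string as far as the compiler is concerned.
type QueryLanguage = string

// StrictQueryLanguage is a hypothetical defined type (no "="). A string
// variable must be converted explicitly before it can be passed as one.
type StrictQueryLanguage string

// describeQuery stands in for RunQuery and only echoes its arguments.
func describeQuery(database string, language QueryLanguage, queryPackPath string) string {
	return fmt.Sprintf("db=%s language=%s pack=%s", database, language, queryPackPath)
}

// describeStrict accepts only the defined type.
func describeStrict(language StrictQueryLanguage) string {
	return "language=" + string(language)
}

func main() {
	nwo := "tdlib/telegram-bot-api"
	// Compiles even though nwo is not a language: the alias is just string.
	fmt.Println(describeQuery("/tmp/db", nwo, "/tmp/pack"))
	// describeStrict(nwo) would not compile; the conversion makes intent visible.
	fmt.Println(describeStrict(StrictQueryLanguage("cpp")))
}
```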