The current API (<2024-07-26 Fri>) identifies databases only by (owner,name). This is
insufficient for distinguishing CodeQL databases of the same repository.
Other differences must be considered; this patch combines the fields
| cliVersion |
| creationTime |
| language |
| sha |
into one field called CID. Because the CID is a hash of these fields, the set of
fields it covers can be changed in the future without affecting workflows or the server.
The CID is combined with the owner/name to form one
identifier. This requires no changes to server or client -- the db
selection's interface is separate from VS Code and gh-mrva in any case.
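A minimal sketch of how such a CID could be derived and attached to owner/name. This is an illustration only: the function names `make_cid` and `db_identifier`, the choice of SHA-256, the 6-character digest prefix, and the `owner/name:cid` layout are all assumptions, not the scheme this patch actually uses.

```python
import hashlib


def make_cid(cli_version, creation_time, language, sha, length=6):
    """Hypothetical CID: a short hex digest over the four metadata fields.

    The hash function, the field separator, and the digest length are
    assumptions made for this sketch.
    """
    joined = " ".join([cli_version, creation_time, language, sha])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:length]


def db_identifier(owner, name, cid):
    """Combine owner/name with the CID into one identifier (assumed layout)."""
    return f"{owner}/{name}:{cid}"


# Two databases of the same repository that differ only in commit sha
# (sample values invented for illustration).
cid_a = make_cid("2.17.0", "2024-04-29T10:00:00", "cpp", "abc123")
cid_b = make_cid("2.17.0", "2024-04-29T10:00:00", "cpp", "def456")
print(db_identifier("octo", "demo", cid_a))
print(db_identifier("octo", "demo", cid_b))
```

Because consumers only see the opaque CID, the fields feeding the hash could later be extended or reduced without changing the identifier's shape.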
To test this, this version imports multiple versions of the same owner/repo pairs from multiple directories. In this case, from
~/work-gh/mrva/mrva-open-source-download/repos
and
~/work-gh/mrva/mrva-open-source-download/repos-2024-04-29/
The unique database count increases from 3000 to 5360 -- see README.md,
./bin/mc-db-view-info < db-info-3.csv &
Other code modifications:
- Push (owner,repo,cid) names to minio
- Generate databases.json for use in the VS Code extension
- Generate list-databases.json for use by gh-mrva client
45 lines
1.1 KiB
Python
Executable File
#!/usr/bin/env python
""" Read a table of CodeQL DB information,
    group entries by (owner,name,CID),
    sort each group by creationTime,
    and keep only the top (newest) element.
"""
import argparse
import logging

#
#* Configure logger
#
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')
# Overwrite log level set by minio
root_logger = logging.getLogger()
root_logger.setLevel(logging.INFO)

#
#* Process command line
#
parser = argparse.ArgumentParser(
    description=""" Read a table of CodeQL DB information,
    group entries by (owner,name,CID), sort each group by
    creationTime and keep only the top (newest) element.
    """)

args = parser.parse_args()
#
#* Collect the information and select subset
#
import pandas as pd
import sys

# The table is read from stdin and must contain the columns
# owner, name, CID and creationTime.
df0 = pd.read_csv(sys.stdin)

# Sort creationTime in descending order within each (owner, name, CID)
# group, so that first() selects the newest entry rather than the oldest.
df_sorted = df0.sort_values(by=['owner', 'name', 'CID', 'creationTime'],
                            ascending=[True, True, True, False])
df_unique = df_sorted.groupby(['owner', 'name', 'CID']).first().reset_index()

# Write the reduced table to stdout as CSV.
df_unique.to_csv(sys.stdout, index=False)

# Local Variables:
# python-shell-virtualenv-root: "~/work-gh/mrva/mrvacommander/client/qldbtools/venv/"
# End:
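The selection described in the docstring (group by (owner,name,CID), keep the newest by creationTime) can be tried on a small in-memory table. This is a sketch only: the sample rows are invented for illustration, and creationTime is assumed to be an ISO-8601 string, so that plain string sorting matches chronological order.

```python
import io

import pandas as pd

# Invented sample data; the column names follow the script above.
csv_text = """owner,name,CID,creationTime,language
octo,demo,aaa111,2024-04-29T10:00:00,cpp
octo,demo,aaa111,2024-07-01T10:00:00,cpp
octo,demo,bbb222,2024-05-15T10:00:00,cpp
other,lib,ccc333,2024-06-01T10:00:00,java
"""

df = pd.read_csv(io.StringIO(csv_text))

# Newest first within each (owner, name, CID) group, then keep the top row.
newest = (
    df.sort_values(by=['owner', 'name', 'CID', 'creationTime'],
                   ascending=[True, True, True, False])
      .groupby(['owner', 'name', 'CID'])
      .first()
      .reset_index()
)

print(newest.to_csv(index=False))
```

The two `octo/demo` rows with different CIDs both survive, while the duplicate with CID `aaa111` is reduced to its newer entry.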