Fix subtle type problem: M8 is required for early steps, datetime64[ns] later

2025-10-19 13:35:02 -07:00
committed by =michael hohn
parent bed9d3e659
commit c15dc6d4bc
7 changed files with 271 additions and 29 deletions


@@ -205,9 +205,9 @@
#+BEGIN_SRC sh :session shared :results output :eval never-export
cd ~/local/sarif-cli/data/codeql-dataflow-sql-injection
sarif-extract-scans \
-sqlidb-1.1.sarif.scanspec \
-sqlidb-1.1.sarif.scantables \
-sqlidb-1.1.sarif.csv \
+sqlidb-1.sarif.scanspec \
+sqlidb-1.sarif.scantables \
+sqlidb-1.sarif.csv \
-f CLI
#+END_SRC

notes/quickstart.org (new file, 67 lines)

@@ -0,0 +1,67 @@
* sarif-cli quickstart
Set up the virtual environment and install the packages:
#+BEGIN_SRC sh
cd ~/work-gh/sarif-cli/
# set up virtual environment
python3 -m venv .venv
. .venv/bin/activate
# Use requirementsDEV.txt
python -m pip install -r requirementsDEV.txt
# install scripts
pip install -e .
# force symlinks for development
rm -f "$VIRTUAL_ENV/bin/sarif-"*
ln -sf "$PWD/bin/sarif-"* "$VIRTUAL_ENV/bin/"
#+END_SRC
Run SARIF extraction for one test file and inspect results.
This assumes you are in the above virtual environment where all =sarif-*= tools
are on =$PATH=.
#+BEGIN_SRC sh
cd ~/work-gh/sarif-cli/data/codeql-dataflow-sql-injection
# ---------------------------------------------------------------------
# 1. Set base name of the original SARIF file (without extension)
# ---------------------------------------------------------------------
orig="sqlidb-1"
# ---------------------------------------------------------------------
# 2. Remove any stale output from previous runs
# ---------------------------------------------------------------------
rm -fR -- "${orig}.1.sarif."*
# ---------------------------------------------------------------------
# 3. Ensure versionControlProvenance field is present
# ---------------------------------------------------------------------
sarif-insert-vcp "${orig}.sarif" > "${orig}.1.sarif"
# ---------------------------------------------------------------------
# 4. Run the converter (CLI input signature)
# - Logs are written only if errors occur.
# ---------------------------------------------------------------------
sarif-extract-scans-runner --input-signature CLI - > /dev/null <<EOF
${orig}.1.sarif
EOF
# ---------------------------------------------------------------------
# 5. If errors occurred, show the scan log.
# The log lists the exact commands that can be re-run manually under pdb.
# ---------------------------------------------------------------------
if [[ -f "${orig}.1.sarif.scanlog" ]]; then
echo "Conversion errors logged in ${orig}.1.sarif.scanlog"
cat "${orig}.1.sarif.scanlog"
fi
# ---------------------------------------------------------------------
# 6. Examine results (converted SARIF, logs, etc.)
# ---------------------------------------------------------------------
ls -l "${orig}.1.sarif"*
#+END_SRC
For interactive examination and debugging, see [[file:README.org::*Run using embedded repls][Run using embedded repls]].
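Step 5 above can also be done from Python, for example inside a test harness. The following is only a sketch: the helper name =read_scanlog= is hypothetical and not part of sarif-cli, and it assumes the =<orig>.1.sarif.scanlog= naming used in the shell steps.

```python
from pathlib import Path


def read_scanlog(orig, directory="."):
    """Return the scan log text for `orig`, or None if no log was written.

    Hypothetical helper, not part of sarif-cli. It assumes the
    "<orig>.1.sarif.scanlog" naming used in the quickstart steps.
    """
    log = Path(directory) / f"{orig}.1.sarif.scanlog"
    # The quickstart states that the log exists only if errors occurred.
    return log.read_text() if log.is_file() else None


if __name__ == "__main__":
    text = read_scanlog("sqlidb-1")
    if text is not None:
        print("Conversion errors logged:")
        print(text)
```

Returning None when the file is absent mirrors the =[[ -f ... ]]= test in the shell version.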

notes/update.org (new file, 76 lines)

@@ -0,0 +1,76 @@
* issues <2025-10-18 Sat>
** DONE
CLOSED: [2025-10-18 Sat 22:34]
- State "DONE" from "NEXT" [2025-10-18 Sat 22:34]
#+BEGIN_SRC text
~/work-gh/sarif-cli/data/codeql-dataflow-sql-injection]$
1:$ bat sqlidb-1.sarif.scanspec sqlidb-1.sarif.scantables sqlidb-1.sarif.csv
───────┬──────────────────────────────────────────────────────────────────────────────────────────────────
│ File: sqlidb-1.sarif.scanspec
───────┼──────────────────────────────────────────────────────────────────────────────────────────────────
1 │ {"scan_id": 12314655876769447717, "sarif_file_name": "sqlidb-1.sarif"}
───────┴──────────────────────────────────────────────────────────────────────────────────────────────────
[bat error]: 'sqlidb-1.sarif.scantables' is a directory.
───────┬──────────────────────────────────────────────────────────────────────────────────────────────────
│ File: sqlidb-1.sarif.csv
───────┼──────────────────────────────────────────────────────────────────────────────────────────────────
1 │ sarif_file,level,levelcode,message,extra_info
2 │ sqlidb-1.sarif,WARNING,2,Input sarif is missing neccesary properties.,"Missing: {'newlineSequence
│ s', 'versionControlProvenance'}, "
───────┴──────────────────────────────────────────────────────────────────────────────────────────────────
(.venv-m325) (base) [hohn@m325 ~/work-gh/sarif-cli/data/codeql-dataflow-sql-injection]$
#+END_SRC
sarif_file,level,levelcode,message,extra_info
sqlidb-1.sarif,WARNING,2,Input sarif is missing neccesary properties.,"Missing:
{'newlineSequences', 'versionControlProvenance'}
see
File: ./bin/sarif-insert-vcp
2 11 # Add the versionControlProvenance key to a SARIF file
9 6 | ( .versionControlProvenance |=
File: ./scripts/test-vcp.sh
21 15 #* Insert versionControlProvenance
o The CLI sarif **MUST** contain one additional property `versionControlProvenance` - which needs to look like:
```
"versionControlProvenance": [
{
"repositoryUri": "https://github.com/testorg/testrepo.git",
"revisionId": "testsha"
}
]
```
The script
bin/sarif-insert-vcp
[[file:~/work-gh/sarif-cli/bin/sarif-insert-vcp::uri=vcp-no-uri]]
will add that entry to a SARIF file.
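
For illustration only, the same effect can be sketched in Python. This is not the implementation of bin/sarif-insert-vcp. The function name =insert_vcp= is hypothetical. Placing the key on each entry of =runs= follows the SARIF 2.1.0 layout. Keeping an existing value rather than overwriting it is an assumption.

```python
import copy

# Placeholder values copied from the note above; real values depend on the repository.
EXAMPLE_VCP = [
    {
        "repositoryUri": "https://github.com/testorg/testrepo.git",
        "revisionId": "testsha",
    }
]


def insert_vcp(sarif, vcp=EXAMPLE_VCP):
    """Return a copy of `sarif` in which every run has versionControlProvenance.

    Hypothetical helper. An existing value is kept, which is an assumption
    about the desired behavior rather than a description of the real script.
    """
    result = copy.deepcopy(sarif)
    for run in result.get("runs", []):
        run.setdefault("versionControlProvenance", copy.deepcopy(vcp))
    return result


if __name__ == "__main__":
    doc = {"version": "2.1.0", "runs": [{"tool": {"driver": {"name": "CodeQL"}}}]}
    print(insert_vcp(doc)["runs"][0]["versionControlProvenance"])
```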
Also,
./sarif_cli/signature.py:308: # Ensure newlineSequences is present when versionControlProvenance is
./sarif_cli/signature.py:309: full_elem['newlineSequences'] = elem.get('newlineSequences', dummy_newlineSequences)
So:
- Adding versionControlProvenance first is enough: signature.py then fills in a default newlineSequences as well.
** TODO sarif-cli type error
#+BEGIN_SRC text
~/work-gh/sarif-cli/data/codeql-dataflow-sql-injection]$
0:$ less sqlidb-1.1.sarif.scanlog
...
File "/Users/hohn/work-gh/sarif-cli/.venv-m325/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py", line 734, in astype
raise TypeError(
TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
#+END_SRC
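The error can be reproduced outside sarif-cli. This is a minimal sketch, assuming pandas and NumPy are installed. The sample timestamps are invented for illustration, and the sketch does not show where sarif-cli performs the cast. Older pandas releases accepted the unit-less target, so the first cast is wrapped in =try=.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not taken from sarif-cli.
stamps = pd.to_datetime(pd.Series(["2025-10-18T22:34:00", "2025-10-19T13:35:02"]))

# NumPy treats "M8" and "datetime64" as the same unit-less dtype.
assert np.dtype("M8") == np.dtype("datetime64")

# Casting an existing datetime column to the unit-less dtype is what the
# traceback above complains about; older pandas releases accepted it.
try:
    stamps.astype("datetime64")
    print("unit-less cast accepted by this pandas version")
except TypeError as err:
    print(f"unit-less cast rejected: {err}")

# Naming the unit explicitly is accepted.
converted = stamps.astype("datetime64[ns]")
print(converted.dtype)
```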