The Code Scanning UI shows just the first paragraph of the query help
as a summary, until a user chooses to expand the help.
We decided it was more useful to display the standard query help in this
summary compared to the experimental query notice, since there is
already a notice about experimental queries on the alert show page.
We didn’t catch this because our unit tests test only library code due
to the previous difficulty of running queries with an ML model (the ML
models in packs work should fix that), and because the end-to-end
evaluation runs separate queries that have different result patterns.
Going forward we should create unit tests for the queries themselves,
which will require using the ML model in tests. We should also be able
to catch this type of error using DCA.
Query help is identical to the original query, except for a new
paragraph prepended to the overview explaining that the queries are
experimental.
We add Markdown query help since only Markdown query help is embedded in
SARIF via `--sarif-add-query-help`.
The refactoring to remove the `CodeToFeatures` AST reintroduced a
performance problem. This commit resolves it by pushing size
restrictions into intermediate predicates.
Pushing the restriction to 256 tokens into the `bodyTokens` predicate
means we avoid this predicate blowing up due to very large functions.
This results in a runtime improvement from 1800s+ to 294s as measured
on a problematic repo on my machine (I didn't wait for the query to
finish running).
Absent features are now represented implicitly by the absence of a row
in the `tokenFeatures` relation, rather than explicitly by an empty
string. This leads to improved runtime performance. To enable this
implicit representation, we pass the set of supported token features to
the `scoreEndpoints` HOP. Requires CodeQL CLI v2.7.4.
Previously heuristic sinks were always included, to avoid us filtering
them out due to not being an argument to an external library call.
In this commit we move the argument to an external library call
filtering to the query-specific endpoint filters.
This lets us filter out heuristic sinks if they match one of the other
endpoint filters, reducing FPs.