Absent features are now represented implicitly by the absence of a row
in the `tokenFeatures` relation, rather than explicitly by an empty
string. This leads to improved runtime performance. To enable this
implicit representation, we pass the set of supported token features to
the `scoreEndpoints` HOP. Requires CodeQL CLI v2.7.4.
Now that change notes are per-package, new change notes should be created in the `change-notes` folder under the affected pack (e.g., `cpp/ql/src/change-notes` for C++ query change notes. I've moved all of the change note files that were added before we started publishing them in packs to an `old-change-notes` directory under each language, to reduce the temptation to add new change notes there.
I'm working on a document to describe how and when to create change notes for packs separately.
CWE-185: Incorrect Regular Expression
The software specifies a regular expression in a way that causes data to
be improperly matched or compared.
https://cwe.mitre.org/data/definitions/185.html
CWE-186: Overly Restrictive Regular Expression
> A regular expression is overly restrictive, which prevents dangerous values from being detected.
>
> (...) [this CWE] is about a regular expression that does not match all
> values that are intended. (...)
https://cwe.mitre.org/data/definitions/186.html
From my understanding,
CWE-625: Permissive Regular Expression, is not applicable. (since this
is about accepting a regex match where there should not be a match).
Combine two regexes into a single one.
This saves up to 5s on large databases by reducing the number
of separate scans of the comments table before regex matching.
The combined regex is slightly more permissive than the
original two, since it allows a combination of the two
matched formats. A string that matches one of the original
regexes will match the combined regex.
Some subclasses of GeneratedCodeMarkerComment regex match against `getLine(_)`.
When evaluated, this results in multiple scans (one per subclass that uses it)
of all comment lines in the database, before regex matching against those lines.
To make these scans smaller, regex match against the entire comment text
without splitting them into lines.
This is achieved using `?m` (multiline) and line boundaries in the regexes.