Now that change notes are per-package, new change notes should be created in the `change-notes` folder under the affected pack (e.g., `cpp/ql/src/change-notes` for C++ query change notes. I've moved all of the change note files that were added before we started publishing them in packs to an `old-change-notes` directory under each language, to reduce the temptation to add new change notes there.
I'm working on a document to describe how and when to create change notes for packs separately.
CWE-185: Incorrect Regular Expression
The software specifies a regular expression in a way that causes data to
be improperly matched or compared.
https://cwe.mitre.org/data/definitions/185.html
CWE-186: Overly Restrictive Regular Expression
> A regular expression is overly restrictive, which prevents dangerous values from being detected.
>
> (...) [this CWE] is about a regular expression that does not match all
> values that are intended. (...)
https://cwe.mitre.org/data/definitions/186.html
From my understanding,
CWE-625: Permissive Regular Expression, is not applicable. (since this
is about accepting a regex match where there should not be a match).
Combine two regexes into a single one.
This saves up to 5s on large databases by reducing the number
of separate scans of the comments table before regex matching.
The combined regex is slightly more permissive than the
original two, since it allows a combination of the two
matched formats. A string that matches one of the original
regexes will match the combined regex.
Some subclasses of GeneratedCodeMarkerComment regex match against `getLine(_)`.
When evaluated, this results in multiple scans (one per subclass that uses it)
of all comment lines in the database, before regex matching against those lines.
To make these scans smaller, regex match against the entire comment text
without splitting them into lines.
This is achieved using `?m` (multiline) and line boundaries in the regexes.
Previously heuristic sinks were always included, to avoid us filtering
them out due to not being an argument to an external library call.
In this commit we move the argument to an external library call
filtering to the query-specific endpoint filters.
This lets us filter out heuristic sinks if they match one of the other
endpoint filters, reducing FPs.
0e31439 introduces some occasional duplicate tokens due to duplicate AST
node attributes. The long-term fix is to update `CodeToFeatures.qll`,
but for the short-term, we update the concatenation to concatenate
unique (location, token) pairs.