C++: Combine crypto blacklist regexes into one

Instead of `algorithmBlacklistRegex` having 2 * 5 results, it now has
only one result, which is a single regex that represents the union of
the previous 2 * 5 regexes. This means that `BrokenCryptoAlgorithm.ql`
has much less regex matching to do.

On https://github.com/ericniebler/range-v3, this change reduces the run
time of the two slowest predicates from

    BrokenCryptoAlgorithm::InsecureMacroSpec#class#f .... 2m21s
    BrokenCryptoAlgorithm::InsecureFunctionCall#class#f . 54.5s

to

    BrokenCryptoAlgorithm::InsecureMacroSpec#class#f .... 35.1s
    BrokenCryptoAlgorithm::InsecureFunctionCall#class#f . 12.8s
This commit is contained in:
Jonas Jensen
2019-03-18 11:51:50 +01:00
parent 69a048d102
commit f72ff37226

View File

@@ -20,14 +20,17 @@ string hashAlgorithmBlacklist() {
/** A regex for matching strings that look like they contain a blacklisted algorithm */
string algorithmBlacklistRegex() {
// algorithms usually appear in names surrounded by characters that are not
// alphabetical characters in the same case. This handles the upper and lower
// case cases
result = "(^|.*[^A-Z])" + algorithmBlacklist() + "([^A-Z].*|$)"
// for lowercase, we want to be careful to avoid being confused by camelCase
// hence we require two preceding uppercase letters to be sure of a case switch,
// or a preceding non-alphabetic character
or result = "(^|.*[A-Z]{2}|.*[^a-zA-Z])" + algorithmBlacklist().toLowerCase() + "([^a-z].*|$)"
result =
// algorithms usually appear in names surrounded by characters that are not
// alphabetical characters in the same case. This handles the upper and lower
// case cases
"(^|.*[^A-Z])(" + strictconcat(algorithmBlacklist(), "|") + ")([^A-Z].*|$)" +
"|" +
// for lowercase, we want to be careful to avoid being confused by camelCase
// hence we require two preceding uppercase letters to be sure of a case switch,
// or a preceding non-alphabetic character
"(^|.*[A-Z]{2}|.*[^a-zA-Z])(" + strictconcat(algorithmBlacklist().toLowerCase(), "|") +
")([^a-z].*|$)"
}
/** A whitelist of algorithms that are known to be secure */