Address review feedback by moving the shared method-name-based encryption/hash/digest
check into Sanitizers.qll, and reference it from both CleartextStorageQuery.qll and
SensitiveLoggingQuery.qll instead of duplicating the definition.
Replace the five-way result = ... or result = ... disjunction with a
single equality on a set literal. Addresses the CodeQL style alert
"Use a set literal in place of or" reported by the self-scan on this
PR. Pure refactor, no semantic change.
`com.ctc.wstx.stax.WstxInputFactory` overrides `createXMLStreamReader`,
`createXMLEventReader` and `setProperty` from `XMLInputFactory`, so the
existing `XmlInputFactory` model in `XmlParsers.qll` does not match calls
where the static receiver type is `WstxInputFactory` (or its supertype
`org.codehaus.stax2.XMLInputFactory2`). Woodstox is vulnerable to XXE in
its default configuration, so these missed sinks were false negatives in
`java/xxe`.
This adds a scoped framework model under
`semmle/code/java/frameworks/woodstox/WoodstoxXml.qll` (registered in the
`Frameworks` module of `XmlParsers.qll`) that recognises these calls as
XXE sinks and treats the factory as safe when both
`javax.xml.stream.supportDTD` and
`javax.xml.stream.isSupportingExternalEntities` are disabled — mirroring
the existing `XMLInputFactory` safe-configuration logic.
secretQuestion is ambiguous: it could be the question text (not
sensitive) or a security question answer. Worse, the regex
secrets?(question) also matches secretQuestionAnswer, which is
clearly sensitive. Drop it to avoid false negatives.
The sensitive-log query (CWE-532) lacked sanitizers for hashed or
encrypted data, while the sibling cleartext-storage query (CWE-312)
already recognized methods with "encrypt", "hash", or "digest" in their
names as sanitizers (CleartextStorageQuery.qll:86).
This adds an EncryptionBarrier to SensitiveLoggingQuery that applies the
same name-based heuristic, making the two queries consistent. Calls like
DigestUtils.sha256Hex(password) or hashPassword(secret) are no longer
flagged when their results are logged.
PathNormalizeSanitizer recognized Path.normalize() and
File.getCanonicalPath()/getCanonicalFile(), but not Path.toRealPath().
toRealPath() is strictly stronger than normalize() (resolves symlinks
and verifies file existence in addition to normalizing ".." components),
and is functionally equivalent to File.getCanonicalPath() for the NIO.2
API. CERT FIO16-J and OWASP both recommend it for path traversal defense.
This adds toRealPath to PathNormalizeSanitizer alongside normalize,
reducing false positives for code using idiomatic NIO.2 path handling.