Address review feedback by introducing dedicated subclasses of
TrustBoundaryValidationSanitizer for SimpleTypeSanitizer, RegexpCheckBarrier,
and the HttpServletSession type check, so isBarrier only references the
abstract class.
Replace the five-way result = ... or result = ... disjunction with a
single equality on a set literal. Addresses the CodeQL style alert
"Use a set literal in place of or" reported by the self-scan on this
PR. Pure refactor, no semantic change.
`com.ctc.wstx.stax.WstxInputFactory` overrides `createXMLStreamReader`,
`createXMLEventReader` and `setProperty` from `XMLInputFactory`, so the
existing `XmlInputFactory` model in `XmlParsers.qll` does not match calls
where the static receiver type is `WstxInputFactory` (or its supertype
`org.codehaus.stax2.XMLInputFactory2`). Woodstox is vulnerable to XXE in
its default configuration, so these missed sinks were false negatives in
`java/xxe`.
This adds a scoped framework model under
`semmle/code/java/frameworks/woodstox/WoodstoxXml.qll` (registered in the
`Frameworks` module of `XmlParsers.qll`) that recognises these calls as
XXE sinks and treats the factory as safe when both
`javax.xml.stream.supportDTD` and
`javax.xml.stream.isSupportingExternalEntities` are disabled — mirroring
the existing `XMLInputFactory` safe-configuration logic.
Adds a dedicated test verifying that fields annotated with
@javax.validation.constraints.Pattern are recognized as sanitized
by RegexpCheckBarrier, in addition to the existing String.matches()
guard test.
secretQuestion is ambiguous: it could be the question text (not
sensitive) or a security question answer. Worse, the regex
secrets?(question) also matches secretQuestionAnswer, which is
clearly sensitive. Drop it to avoid false negatives.
The trust-boundary-violation query only recognized OWASP ESAPI validators
as sanitizers. ESAPI is rarely used in modern Java projects, while regex
validation via String.matches() and @javax.validation.constraints.Pattern
is the standard approach in Spring/Jakarta applications.
RegexpCheckBarrier already exists in Sanitizers.qll and is used by other
queries (e.g., RequestForgery). This wires it into TrustBoundaryConfig,
so patterns like input.matches("[a-zA-Z0-9]+") and @Pattern annotations
are recognized as sanitizers, consistent with the existing ESAPI treatment.
PathNormalizeSanitizer recognized Path.normalize() and
File.getCanonicalPath()/getCanonicalFile(), but not Path.toRealPath().
toRealPath() is strictly stronger than normalize() (resolves symlinks
and verifies file existence in addition to normalizing ".." components),
and is functionally equivalent to File.getCanonicalPath() for the NIO.2
API. CERT FIO16-J and OWASP both recommend it for path traversal defense.
This adds toRealPath to PathNormalizeSanitizer alongside normalize,
reducing false positives for code using idiomatic NIO.2 path handling.