The `cpp/missing-case-in-switch` performed badly on some snapshots, to
the extent where it was as slow as the most expensive IR stages
(example: ChakraCore). This commit makes it faster, removing a
`pragma[noopt]` along the way.
The intermediate tuple counts on a customer codebase drop from 84M to
3M, while the content hash of `getAMissingCase` is the same.
Before:
(124s) Tuple counts for Stmt::EnumSwitch::getAMissingCase#ff#antijoin_rhs:
20867789 ~0% {3} r1 = JOIN Stmt::SwitchStmt::getASwitchCase_dispred#ff AS L WITH Stmt::EnumSwitch::getAMissingCase#ff#shared AS R ON FIRST 1 OUTPUT L.<1>, R.<0>, R.<1>
20122830 ~0% {3} r2 = JOIN r1 WITH Stmt::SwitchCase::getExpr_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r1.<1>, r1.<2>
20122830 ~0% {3} r3 = JOIN r2 WITH Expr::Expr::getValue_dispred#ff AS R ON FIRST 1 OUTPUT r2.<2>, r2.<1>, R.<1>
83961918 ~0% {4} r4 = JOIN r3 WITH Enum::EnumConstant::getInitializer_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r3.<1>, r3.<0>, r3.<2>
83961918 ~0% {4} r5 = JOIN r4 WITH initialisers AS R ON FIRST 1 OUTPUT R.<2>, r4.<3>, r4.<1>, r4.<2>
234348 ~185% {2} r6 = JOIN r5 WITH Expr::Expr::getValue_dispred#ff AS R ON FIRST 2 OUTPUT r5.<2>, r5.<3>
return r6
...
(124s) Tuple counts for Stmt::EnumSwitch::getAMissingCase#ff:
663127 ~4% {2} r1 = Stmt::EnumSwitch::getAMissingCase#ff#shared AS L AND NOT Stmt::EnumSwitch::getAMissingCase#ff#antijoin_rhs AS R(L.<0>, L.<1>)
return r1
(124s) Registering Stmt::EnumSwitch::getAMissingCase#ff + [] with content 2060ff326cvhihcsvoph6k9divuv4
(124s) >>> Wrote relation Stmt::EnumSwitch::getAMissingCase#ff with 663127 rows and 2 columns.
After:
(5s) Tuple counts for Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs:
746029 ~0% {2} r1 = JOIN Stmt::EnumSwitch::getAMissingCase_dispred#ff#shared AS L WITH Enum::Enum::getAnEnumConstant_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, L.<1>
3116197 ~2% {3} r2 = JOIN r1 WITH Enum::EnumConstant::getInitializer_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r1.<1>, r1.<0>
3116197 ~0% {3} r3 = JOIN r2 WITH initialisers AS R ON FIRST 1 OUTPUT R.<2>, r2.<1>, r2.<2>
3116197 ~311% {3} r4 = JOIN r3 WITH Expr::Expr::getValue_dispred#ff AS R ON FIRST 1 OUTPUT r3.<1>, R.<1>, r3.<2>
234348 ~185% {2} r5 = JOIN r4 WITH Stmt::EnumSwitch::matchesValue#ff AS R ON FIRST 2 OUTPUT r4.<0>, r4.<2>
return r5
(5s) Registering Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs + [] with content 173483d71508vl534mvlr1g0ehi12
(5s) >>> Wrote relation Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs with 82902 rows and 2 columns.
(5s) Starting to evaluate predicate Stmt::EnumSwitch::getAMissingCase_dispred#ff/2@ae4c0b
(5s) Tuple counts for Stmt::EnumSwitch::getAMissingCase_dispred#ff:
746029 ~2% {2} r1 = JOIN Stmt::EnumSwitch::getAMissingCase_dispred#ff#shared AS L WITH Enum::Enum::getAnEnumConstant_dispred#ff AS R ON FIRST 1 OUTPUT L.<1>, R.<1>
663127 ~4% {2} r2 = r1 AND NOT Stmt::EnumSwitch::getAMissingCase_dispred#ff#antijoin_rhs AS R(r1.<0>, r1.<1>)
return r2
(5s) Registering Stmt::EnumSwitch::getAMissingCase_dispred#ff + [] with content 2060ff326cvhihcsvoph6k9divuv4
(5s) >>> Wrote relation Stmt::EnumSwitch::getAMissingCase_dispred#ff with 663127 rows and 2 columns.
On wireshark/wireshark, `isInCycle` ran into a low-memory loop on the
`aliased_ssa` stage. It shouldn't be necessary to detect cycles after
the `raw` stage, so this commit moves cycle detection into the
`Construction` modules and makes it a no-op in `SSAConstruction.qll`.
The `loopVariant` predicate in `ComparisonWithWiderType.ql` is intended
to identify loop counters, but it was too much of a stretch to apply it
to any subexpression of the small side of the comparison.
This change fixes two false positives on arvidn/libtorrent and many
others seen in the wild (on Linux, CoreCLR, ffmpeg, ...).
This fixes a failing qltest and makes the exclusion similar to what's in
`PointerOverflow.ql`. It's possible we should exclude based on both `+`
and `<`, but we can revisit that if false positives show up.
Before this change, evaluating `cpp/constant-comparison` followed by
`cpp/signed-overflow-check` would result in re-evaluation of almost all
the cached stages they share: CFG, basic blocks, SSA, and range
analysis. The same effect could be seen on `cpp/bad-strncpy-size`, which
also uses the GVN library.
We were privately importing `semmle.code.<lang>.ir.internal.Overlap`, but `PrintSSA.qll` was depending on it being public. This is made a little more complicated by the presence of cross-langage pyrameterized modules.
The `SignedOverflowCheck.ql` query was very slow on certain snapshots
(jluttine/suitesparse and Chromium) due to bad magic in
`MacroInvocation::getAnAffectedElement_dispred#fb`. This commit doesn't
fix the bad magic but changes the exclusion mechanism to use a predicate
where we can better control the magic and optimization.
The query should also give more good results due to this new exclusion
mechanism, which is the same one used in its sibling,
`PointerOverflow.ql`.
1. The two last examples were misleading at best. The first of those two
recommended casting to non-negative `int`s to `unsigned int` and then
checking if their addition would overflow, but overflow was
impossible because their sum (on 32-bit two's complement) could be at
most 2^32 - 2. The second example could lead to the wrong condition
(unsigned overflow) being checked if taken literally. Instead of
keeping that example, I reworeded the first paragraph of the
Recommendation section.
2. The assumptions about `delta` being positive was relaxed to
non-negative.
3. There was no need to assume that an unsigned short was non-negative.
4. Some of the suggestions were missing `i >`.