mirror of
https://github.com/github/codeql.git
synced 2025-12-22 03:36:30 +01:00
On some snapshots, notably ffmpeg, the IR `ValueNumbering` recursion would generate billions of tuples and eventually run out of space.

It turns out it was fairly common for an `Instruction` to get more than one `ValueNumber` in the base cases for `VariableAddressInstruction` and `InitializeParameterInstruction`, and it could also happen in an instruction with more than one operand of the same `OperandTag`. When a binary operation was applied to an instruction with `m` value numbers and another instruction with `n` value numbers, the result would get `m * n` value numbers. This led to doubly-exponential growth in the number of value numbers in rare cases.

The underlying reason why a `VariableAddressInstruction` could get multiple value numbers is that it was keyed on the associated `IRVariable`, and the `IRVariable` is defined in part by the type of its underlying `Variable` (or other AST element). If the extractor defines a variable to have multiple types because of linker ambiguity, this leads to the creation of multiple `IRVariable`s. That should ideally be solved in `TIRVariable.qll`, but for now I've put a workaround in `ValueNumberingInternal.qll` instead.

To remove the problem with instructions having multiple operands, the construction in `Operand.qll` will now filter out any such operand. It wasn't enough to apply that filter to the `raw` stage, so I've applied it to all three stages.
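
To make the `m * n` growth concrete, here is a toy model in Python. It is only a sketch and is not the QL implementation: value numbers are modelled as nested tuples, and the names `binary_value_numbers` and `count_after_chain` are hypothetical. It assumes a chain of binary operations where each step uses the previous result as both operands.

```python
from itertools import product


def binary_value_numbers(left, right, opcode="Add"):
    """Toy rule: each pairing of operand value numbers gives a distinct result value number."""
    return {(opcode, l, r) for l, r in product(left, right)}


def count_after_chain(leaf_count, depth):
    """Return the number of value numbers after each step of a chain x = x op x.

    `leaf_count` models a base-case instruction (for example a variable address)
    that was wrongly given several value numbers.
    """
    current = {("VariableAddress", i) for i in range(leaf_count)}
    sizes = [len(current)]
    for _ in range(depth):
        current = binary_value_numbers(current, current)
        sizes.append(len(current))
    return sizes


if __name__ == "__main__":
    print(count_after_chain(2, 3))  # ambiguous base case: the count squares at every step
    print(count_after_chain(1, 3))  # one value number per instruction: the count stays at 1
```

In this model a base case with two value numbers already reaches 256 after three chained operations, while a base case with exactly one value number never grows. That is why the fixes target the sources of duplicate value numbers rather than the recursion itself.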