C++: Add more documentation to the 'cpp/invalid-pointer-deref' query.

Mathias Vorreiter Pedersen
2023-07-19 14:42:20 +01:00
parent 0a0e9bb25b
commit 922f4d5496
3 changed files with 162 additions and 18 deletions

@@ -15,6 +15,51 @@
* external/cwe/cwe-787
*/
/*
* High-level description of the query:
*
* The goal of this query is to identify issues such as:
* ```cpp
* 1. int* base = new int[size];
* 2. int* end = base + size;
* 3. for(int* p = base; p <= end; ++p) {
* 4. *p = 0; // BUG: Should have been bounded by `p < end`.
* 5. }
* ```
* In order to do this, we split the problem into three subtasks:
* 1. First, we find flow from `new int[size]` to `base + size`.
* 2. Then, we find flow from `base + size` to `end` (on line 3).
* 3. Finally, we use range analysis to find a write to (or read from) a pointer that may be equal to `end`.
*
* Step 1 is implemented in `AllocationToInvalidPointer.qll`, and step 2 is implemented by
* `InvalidPointerToDereference.qll`. See those files for the description of these.
*
* This file imports both libraries and defines a final dataflow configuration that constructs the full path from
* the allocation to the dereference of the out-of-bounds pointer. This is done for two reasons:
* 1. It means the user is able to inspect the entire path from the allocation to the dereference, which can be useful
* to understand the problem highlighted.
* 2. It ensures that the call contexts line up correctly when we transition from step 1 to step 2. See the
* `test_missing_call_context_1` and `test_missing_call_context_2` tests for how the query may flag false positives
* without this final configuration.
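*
* As a rough, hypothetical illustration of reason 2 (the names below are invented for this sketch and are not taken
* from the test files), consider a helper that performs the pointer arithmetic for two different callers:
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper shared by both callers (an assumption for this sketch).
unsigned char* advance(unsigned char* p, std::size_t n) {
  return p + n; // the single pointer-arithmetic instruction both call sites reach
}

// Caller A: passes `size`, so the helper returns the one-past-the-end pointer.
// That pointer is only used as a loop bound and is never dereferenced.
std::size_t zero_all(std::size_t size) {
  unsigned char* base = new unsigned char[size];
  unsigned char* end = advance(base, size);
  std::size_t written = 0;
  for (unsigned char* p = base; p < end; ++p) {
    *p = 0;
    ++written;
  }
  delete[] base;
  return written;
}

// Caller B: passes `size - 1`, so the returned pointer is in bounds and safe to write.
unsigned char write_last(std::size_t size) {
  assert(size >= 1); // precondition of this sketch
  unsigned char* base = new unsigned char[size];
  unsigned char* last = advance(base, size - 1);
  *last = 42;
  unsigned char value = *last;
  delete[] base;
  return value;
}
```
* If the two phases were stitched together without matching call contexts, flow could enter `advance` from
* `zero_all` (where `n == size`) and leave it through the return to `write_last`, so the safe write `*last = 42`
* might be reported.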
*
* The source of the final path is an allocation that's:
* 1. identified as flowing to an invalid pointer (by `AllocationToInvalidPointer`), and
* 2. for which the invalid pointer flows to a dereference (as identified by `InvalidPointerToDereference`).
*
* The path can be described in 3 "chunks":
* 1. One path from the allocation to the construction of the invalid pointer.
* 2. Another path from the construction of the invalid pointer to the final pointer that's about to be dereferenced.
* 3. Finally, there's a single step from the dataflow node that represents the final pointer to the dereference.
*
* Chunk 1 happens when the flow state is `TInitial`, and chunks 2 and 3 happen when the flow state is
* `TPointerArith(pai)`, where `pai` is the pointer-arithmetic instruction that generated the out-of-bounds pointer.
* This instruction is used in the construction of the alert message.
*
* The set of pointer-arithmetic instructions that define the `TPointerArith` flow state is restricted to the
* pointer-arithmetic instructions that both receive flow from the allocation (as identified by
* `AllocationToInvalidPointer.qll`) and further flow to a dereference (as identified by
* `InvalidPointerToDereference.qll`).
*/
import cpp
import semmle.code.cpp.dataflow.new.DataFlow
import semmle.code.cpp.ir.IR