From 922f4d54966f272bf204006667fac898ddf37a62 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Wed, 19 Jul 2023 14:42:20 +0100 Subject: [PATCH 01/43] C++: Add more documentation to the 'cpp/invalid-pointer-deref' query. --- .../AllocationToInvalidPointer.qll | 70 ++++++++++++++----- .../InvalidPointerToDereference.qll | 65 +++++++++++++++++ .../CWE/CWE-193/InvalidPointerDeref.ql | 45 ++++++++++++ 3 files changed, 162 insertions(+), 18 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 61821ee5bca..5f7c31d1500 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -1,6 +1,56 @@ /** * This file provides the first phase of the `cpp/invalid-pointer-deref` query that identifies flow * from an allocation to a pointer-arithmetic instruction that constructs a pointer that is out of bounds. + * + * Consider the following snippet: + * ```cpp + * 1. char* begin = (char*)malloc(size); + * 2. char* end = begin + size; + * 3. for(int *p = begin; p <= end; p++) { + * 4. use(*p); + * 5. } + * ``` + * this file identifies the flow from `new int[size]` to `base + size`. + * + * This is done using the product-flow library. The configuration tracks flow from the pair `(allocation, size of allocation)` + * to a pair `(a, b)` where there exists a pointer-arithmetic instruction `pai` such that: + * 1. `a` is equal to the left-hand side of `pai`, and + * 2. `b` is a dataflow node that represents an operand that _non-strictly_ upper bounds the right-hand side of `pai`. + * See `pointerAddInstructionHasBounds` for the implementation of this. + * + * In the above example, the pair `(a, b)` is `(base, size)` from the expression `base + size` on line 2. However, it could + * also be something more complex like `(base, size)` where `base` is from line 3 and `size` is from line 2, and the + * pointer-arithmetic instruction is `base + n` on line 3 in the following example: + * ```cpp + * 1. int* base = new int[size]; + * 2. if(n <= size) { + * 3. int* end = base + n; + * 4. for(int* p = base; p <= end; ++p) { + * 5. *p = 0; // BUG: Should have been bounded by `p < end`. + * 6. } + * 7. } + * ``` + * + * Reducing the size of `pointerAddInstructionHasBounds`: + * The `pointerAddInstructionHasBounds` can be very large since the `sink2` column is defined as anything that non-strictly + * upper bounds the right-hand side of a pointer-arithmetic instruction. In order to reduce the size of this predicate we prune + * the set of pointer-arithmetic instructions to only those instructions where the left-hand side flows from an allocation. + * + * Handling false positives: + * + * Consider a snippet such as: + * ```cpp + * 1. int* base = new int[size]; + * 2. int n = condition() ? size : 0; + * 3. if(n >= size) return; + * 4. int* end = base + n; + * 5. for(int* p = base; p <= end; ++p) { + * 6. *p = 0; // This is fine since `end < base + size` + * 7. } + * ``` + * In order to remove this false positive we define a barrier (see `Barrier2::BarrierConfig2`) that finds the possible guards + * that compares a value to the size of the allocation. In the above example, that's the `(n >= size)` guard on line 3. + * `Barrier2::getABarrierNode` then defines any node that's guarded by such a guard as a barrier in the dataflow configuration. */ private import cpp @@ -151,24 +201,8 @@ private module InterestingPointerAddInstruction { } /** - * A product-flow configuration for flow from an (allocation, size) pair to a - * pointer-arithmetic operation that is non-strictly upper-bounded by `allocation + size`. - * - * The goal of this query is to find patterns such as: - * ```cpp - * 1. char* begin = (char*)malloc(size); - * 2. char* end = begin + size; - * 3. for(int *p = begin; p <= end; p++) { - * 4. use(*p); - * 5. } - * ``` - * - * We do this by splitting the task up into two configurations: - * 1. `AllocToInvalidPointerConfig` find flow from `malloc(size)` to `begin + size`, and - * 2. `InvalidPointerToDerefConfig` finds flow from `begin + size` to an `end` (on line 3). - * - * Finally, the range-analysis library will find a load from (or store to) an address that - * is non-strictly upper-bounded by `end` (which in this case is `*p`). + * A product-flow configuration for flow from an `(allocation, size)` pair to a pointer- + * arithmetic instruction that is non-strictly upper-bounded by `allocation + size`. */ private module Config implements ProductFlow::StateConfigSig { class FlowState1 = Unit; diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 3eb0a2da670..a3420026971 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -2,6 +2,71 @@ * This file provides the second phase of the `cpp/invalid-pointer-deref` query that identifies flow * from the out-of-bounds pointer identified by the `AllocationToInvalidPointer.qll` library to * a dereference of the out-of-bounds pointer. + * + * Consider the following snippet: + * ```cpp + * 1. char* begin = (char*)malloc(size); + * 2. char* end = begin + size; + * 3. for(int *p = begin; p <= end; p++) { + * 4. use(*p); + * 5. } + * ``` + * this file identifies the flow from `base + size` to `end`. We call `base + size` the "dereference source" and `end` + * the "dereference sink" (even though `end` is not actually dereferenced - it will be used to find the correct + * dereference eventually). + * + * Merely _constructing_ a pointer that's out-of-bounds is fine if the pointer is never dereferenced (in reality, the + * standard only guarentees that it's safe to move the pointer one element past the last element. But we ignore that + * here). So this step is about identifying which of those out-of-bounds pointers identified from step 1 that are + * actually being dereferenced. We do this using a regular dataflow configuration (see `InvalidPointerToDerefConfig`). + * + * This dataflow traversal defines the set of sources as any dataflow node that is non-strictly upper-bounded by the + * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. (TOOD: I'm pretty sure this is incorrect, + * and we should define the set of sources as anything that is non-strictly _lower_ bounded by the pointer-arithmetic + * instruction). That is, the set of sources is any dataflow node `source` such that `source.asInstruction <= pai + delta1` + * for some `delta1 >= 0`. + * + * The set of sinks is defined to be any address operand `addr` that is non-strictly upper-bounded by the sink. That is, + * any dataflow node `n` such that `addr <= sink.asInstruction() + delta2` for some `delta2`. We call the instruction that + * consumes the address operand the "operation". + * + * For example, consider the flow from `begin + size` to `end` above. The sink is `end` on line 3 because that is a dataflow + * node whose underlying instruction non-strictly upper bounds the address operand `p` in `use(*p)`. The load attached to `*p` + * is the "operation". To ensure that the path makes intuitive sense, we only pick operations that are control-flow reachable + * from the dereference sink. + * + * To compute the amount of the dereference is away from the final entry of the allocation, we sum the two deltas `delta1` and + * `delta2`. This is done in the `operationIsOffBy` predicate (which is the only predicate exposed by this file). + * + * Handling false positives: + * + * Consider the following snippet: + * ```cpp + * 1. char *p = new char[size]; + * 2. char *end = p + size; // $ alloc=L363 + * 3. if (p < end) { + * 4. p += 1; + * 5. } + * 6. if (p < end) { + * 7. int val = *p; // GOOD + * 8. } + * ``` + * this is safe because `p` is guarded to be strictly less than `end` on line 6 before the dereference on line 7. However, if we + * run the query on the above without further modifications we'd see an alert on line 7. This is because range analysis infers + * that `p <= end` after the increment on line 4, and thus the result of `p += 1` is seen as a valid dereference source. This + * node then flows to `p` on line 6 (which is a valid dereference sink since it non-strictly upper bounds an address operand), and + * range analysis then infers that the address operand of `*p` (i.e., `p`) is non-strictly upper bounded by `p`, and thus reports + * an alert on line 7. + * + * In order to handle this false positive, we define a barrier that identifies guards such as `p < end` that ensures that a value + * is less than the pointer-arithmetic instruction that computed the invalid pointer. This is done in the `InvalidPointerToDerefBarrier` + * module. Since the node we're tracking isn't necessarily _equal_ to the pointer-arithmetic instruction, but rather satisfies + * `node.asInstruction() <= pai + delta`, we need to account for the delta when checking if a guard is sufficiently strong to infer + * that a future dereference is safe. To do this, we check that the guard guarantees that a node `n` satisfies `n < node + d` where + * `node` is a node we know is equal to the value of the dereference source (i.e., it satisfies `node.asInstruction() <= pai + delta`) + * and `d <= delta`. Combining this we have `n < node + d <= node + delta <= pai + 2*delta` (TODO: Oops. This math doesn't quite work + * out. This is because we need to redefine the `BarrierConfig` to start flow at the pointer-arithmetic instruction instead of at the + * dereference source. When combined with TODO above it's easy to show that this guard ensures that the dereference is safe). */ private import cpp diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index 93cac5939d4..3ff208f1d3f 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -15,6 +15,51 @@ * external/cwe/cwe-787 */ +/* + * High-level description of the query: + * + * The goal of this query is to identify issues such as: + * ```cpp + * 1. int* base = new int[size]; + * 2. int* end = base + size; + * 3. for(int* p = base; p <= end; ++p) { + * 4. *p = 0; // BUG: Should have been bounded by `p < end`. + * 5. } + * ``` + * In order to do this, we split the problem into three subtasks: + * 1. First, we find flow from `new int[size]` to `base + size`. + * 2. Then, we find flow from `base + size` to `end` (on line 3). + * 3. Finally, we use range-analysis to find a write to (or read from) a pointer that may be equal to `end`. + * + * Step 1 is implemented in `AllocationToInvalidPointer.qll`, and step 2 is implemented by + * `InvalidPointerToDereference.qll`. See those files for the description of these. + * + * This file imports both libraries and define a final dataflow configuration that constructs the full path from + * the allocation to the dereference of the out-of-bounds pointer. This is done for several reasons: + * 1. It means the user is able to inspect the entire path from the allocation to the dereference, which can be useful + * to understand the problem highlighted. + * 2. It ensures that the call-contexts line up correctly when we transition from step 1 to step 2. See the + * `test_missing_call_context_1` and `test_missing_call_context_2` tests for how this may flag false positives + * without this final configuration. + * + * The source of the final path is an allocation that's: + * 1. identified as flowing to an invalid pointer (by `AllocationToInvalidPointer`), and + * 2. for which the invalid pointer flows to a dereference (as identified by `InvalidPointerToDereference`). + * + * The path can be described in 3 "chunks": + * 1. One path from the allocation to the construction of the invalid pointer + * 2. Another path from the construction of the invalid pointer to the final pointer that's about to be dereferenced. + * 3. Finally, there's a single step from the dataflow node that represents the final pointer to the dereference. + * + * Step 1 happens when the flow state is `TInitial`, and step 2 and 3 happens when the flow state is `TPointerArith(pai)` + * where the pointer-arithmetic instruction `pai` tracks the instruction that generated the out-of-bounds pointer. This + * instruction is used in the construction of the alert message. + * + * The set of pointer-arithmetic instructions that define the `TPointerArith` flow state is restricted to be the pointer- + * arithmetic instructions that both receive flow from the allocation (as identified by `AllocationToInvalidPointer.qll`), + * and further flows to a dereference (as identified by `InvalidPointerToDereference.qll`). + */ + import cpp import semmle.code.cpp.dataflow.new.DataFlow import semmle.code.cpp.ir.IR From 369cee9ed9ebc2c4f8a74a14874b43a93ed82e46 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Fri, 21 Jul 2023 09:32:14 +0100 Subject: [PATCH 02/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index a3420026971..67695f73e81 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -21,7 +21,7 @@ * actually being dereferenced. We do this using a regular dataflow configuration (see `InvalidPointerToDerefConfig`). * * This dataflow traversal defines the set of sources as any dataflow node that is non-strictly upper-bounded by the - * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. (TOOD: I'm pretty sure this is incorrect, + * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. (TODO: I'm pretty sure this is incorrect, * and we should define the set of sources as anything that is non-strictly _lower_ bounded by the pointer-arithmetic * instruction). That is, the set of sources is any dataflow node `source` such that `source.asInstruction <= pai + delta1` * for some `delta1 >= 0`. From 4075dacd52e4fbf44e47e0fee54b4554c7c4cbc7 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Fri, 21 Jul 2023 09:55:23 +0100 Subject: [PATCH 03/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll --- .../AllocationToInvalidPointer.qll | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 5f7c31d1500..fdc18fa8adb 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -4,9 +4,9 @@ * * Consider the following snippet: * ```cpp - * 1. char* begin = (char*)malloc(size); - * 2. char* end = begin + size; - * 3. for(int *p = begin; p <= end; p++) { + * 1. char* base = (char*)malloc(size); + * 2. char* end = base + size; + * 3. for(int *p = base; p <= end; p++) { * 4. use(*p); * 5. } * ``` From 6c3c4c302e556b42a7138788bc81ad25985bfee8 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Fri, 21 Jul 2023 10:19:04 +0100 Subject: [PATCH 04/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com> --- .../InvalidPointerToDereference.qll | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 67695f73e81..30034645367 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -5,9 +5,9 @@ * * Consider the following snippet: * ```cpp - * 1. char* begin = (char*)malloc(size); - * 2. char* end = begin + size; - * 3. for(int *p = begin; p <= end; p++) { + * 1. char* base = (char*)malloc(size); + * 2. char* end = base + size; + * 3. for(int *p = base; p <= end; p++) { * 4. use(*p); * 5. } * ``` From a7ee27ec2255dd2a3601b02a63d3dabf1b34b660 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Fri, 21 Jul 2023 10:20:24 +0100 Subject: [PATCH 05/43] C++: Fix 'begin'/'base' confusion. --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 30034645367..078c8c92c57 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -30,7 +30,7 @@ * any dataflow node `n` such that `addr <= sink.asInstruction() + delta2` for some `delta2`. We call the instruction that * consumes the address operand the "operation". * - * For example, consider the flow from `begin + size` to `end` above. The sink is `end` on line 3 because that is a dataflow + * For example, consider the flow from `base + size` to `end` above. The sink is `end` on line 3 because that is a dataflow * node whose underlying instruction non-strictly upper bounds the address operand `p` in `use(*p)`. The load attached to `*p` * is the "operation". To ensure that the path makes intuitive sense, we only pick operations that are control-flow reachable * from the dereference sink. From 5270cf6c41c6a74a040fd0659af6e0a67e66d4c5 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Fri, 21 Jul 2023 11:09:01 +0100 Subject: [PATCH 06/43] C++: Update documentation based on PR feedback. --- .../AllocationToInvalidPointer.qll | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index fdc18fa8adb..c3db211fcc3 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -12,11 +12,11 @@ * ``` * this file identifies the flow from `new int[size]` to `base + size`. * - * This is done using the product-flow library. The configuration tracks flow from the pair `(allocation, size of allocation)` - * to a pair `(a, b)` where there exists a pointer-arithmetic instruction `pai` such that: - * 1. `a` is equal to the left-hand side of `pai`, and - * 2. `b` is a dataflow node that represents an operand that _non-strictly_ upper bounds the right-hand side of `pai`. - * See `pointerAddInstructionHasBounds` for the implementation of this. + * This is done using the product-flow library. The configuration tracks flow from the pair + * `(allocation, size of allocation)` to a pair `(a, b)` where there exists a pointer-arithmetic instruction + * `pai = a + r` such that `b` is a dataflow node where `b <= r`. Because there will be a dataflow-path from + * `allocation` to `a` this means that the `pai` will compute a pointer that's some number of elements away + * from the end position in the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. * * In the above example, the pair `(a, b)` is `(base, size)` from the expression `base + size` on line 2. However, it could * also be something more complex like `(base, size)` where `base` is from line 3 and `size` is from line 2, and the From 8ab2f89d5331ff7f8b1d19aaf83b58d5eed66404 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:03:52 +0200 Subject: [PATCH 07/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/AllocationToInvalidPointer.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index c3db211fcc3..a22b98b59a5 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -50,7 +50,7 @@ * ``` * In order to remove this false positive we define a barrier (see `Barrier2::BarrierConfig2`) that finds the possible guards * that compares a value to the size of the allocation. In the above example, that's the `(n >= size)` guard on line 3. - * `Barrier2::getABarrierNode` then defines any node that's guarded by such a guard as a barrier in the dataflow configuration. + * `Barrier2::getABarrierNode` then defines any node that is guarded by such a guard as a barrier in the dataflow configuration. */ private import cpp From f0ab3a3c842ba782f22fe0e086929e3faaa5f988 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:04:02 +0200 Subject: [PATCH 08/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/AllocationToInvalidPointer.qll | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index a22b98b59a5..ad652063f19 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -15,8 +15,8 @@ * This is done using the product-flow library. The configuration tracks flow from the pair * `(allocation, size of allocation)` to a pair `(a, b)` where there exists a pointer-arithmetic instruction * `pai = a + r` such that `b` is a dataflow node where `b <= r`. Because there will be a dataflow-path from - * `allocation` to `a` this means that the `pai` will compute a pointer that's some number of elements away - * from the end position in the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. + * `allocation` to `a` this means that the `pai` will compute a pointer that is some number of elements beyond + * the end position in the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. * * In the above example, the pair `(a, b)` is `(base, size)` from the expression `base + size` on line 2. However, it could * also be something more complex like `(base, size)` where `base` is from line 3 and `size` is from line 2, and the From 9cb09d6e9aef7c6969843c568a7f91fc668646c9 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:12:00 +0200 Subject: [PATCH 09/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/AllocationToInvalidPointer.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 8297a9fe6d3..0e97c5920fb 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -49,7 +49,7 @@ * 7. } * ``` * In order to remove this false positive we define a barrier (see `Barrier2::BarrierConfig2`) that finds the possible guards - * that compares a value to the size of the allocation. In the above example, that's the `(n >= size)` guard on line 3. + * that compares a value to the size of the allocation. In the above example, this is the `(n >= size)` guard on line 3. * `Barrier2::getABarrierNode` then defines any node that is guarded by such a guard as a barrier in the dataflow configuration. */ From 9108982b07b69774fbc87e899b41cbcebe8d9bd5 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:13:31 +0200 Subject: [PATCH 10/43] C++: Update example in QLDoc. --- .../InvalidPointerDereference/AllocationToInvalidPointer.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 0e97c5920fb..3208154f8a1 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -7,7 +7,7 @@ * 1. char* base = (char*)malloc(size); * 2. char* end = base + size; * 3. for(int *p = base; p <= end; p++) { - * 4. use(*p); + * 4. use(*p); // BUG: Should have been bounded by `p < end`. * 5. } * ``` * this file identifies the flow from `new int[size]` to `base + size`. From a272eb8447102721dcee2359e9df30bc1ee11327 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:14:47 +0200 Subject: [PATCH 11/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../AllocationToInvalidPointer.qll | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 3208154f8a1..c2c6b59e713 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -18,9 +18,9 @@ * `allocation` to `a` this means that the `pai` will compute a pointer that is some number of elements beyond * the end position in the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. * - * In the above example, the pair `(a, b)` is `(base, size)` from the expression `base + size` on line 2. However, it could - * also be something more complex like `(base, size)` where `base` is from line 3 and `size` is from line 2, and the - * pointer-arithmetic instruction is `base + n` on line 3 in the following example: + * In the above example, the pair `(a, b)` is `(base, size)` with `base` and `size` coming from the expression `base + size` on line 2, which is also the pointer-arithmetic instruction. In general, the pair does not necessarily correspond directly to the operands of the pointer-arithmetic instruction. + * In the following example, the pair is again `(base, size)`, but with `base` coming from line 3 and `size` from line 2, and the + * pointer-arithmetic instruction being `base + n` on line 3: * ```cpp * 1. int* base = new int[size]; * 2. if(n <= size) { From 5edc5e7c7bcad314513c88daf4adb771c0440233 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:15:57 +0200 Subject: [PATCH 12/43] C++: Reflow comments in QLDoc. --- .../AllocationToInvalidPointer.qll | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index c2c6b59e713..5b4d7d41d2b 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -18,9 +18,11 @@ * `allocation` to `a` this means that the `pai` will compute a pointer that is some number of elements beyond * the end position in the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. * - * In the above example, the pair `(a, b)` is `(base, size)` with `base` and `size` coming from the expression `base + size` on line 2, which is also the pointer-arithmetic instruction. In general, the pair does not necessarily correspond directly to the operands of the pointer-arithmetic instruction. - * In the following example, the pair is again `(base, size)`, but with `base` coming from line 3 and `size` from line 2, and the - * pointer-arithmetic instruction being `base + n` on line 3: + * In the above example, the pair `(a, b)` is `(base, size)` with `base` and `size` coming from the expression + * `base + size` on line 2, which is also the pointer-arithmetic instruction. In general, the pair does not necessarily + * correspond directly to the operands of the pointer-arithmetic instruction. + * In the following example, the pair is again `(base, size)`, but with `base` coming from line 3 and `size` from line 2, + * and the pointer-arithmetic instruction being `base + n` on line 3: * ```cpp * 1. int* base = new int[size]; * 2. if(n <= size) { From d02a1c28404fb92b4625b14cba6e794dc9322242 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:20:38 +0200 Subject: [PATCH 13/43] C++: Remove paragraph. --- .../InvalidPointerDereference/AllocationToInvalidPointer.qll | 5 ----- 1 file changed, 5 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 5b4d7d41d2b..80c62a0e0d2 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -33,11 +33,6 @@ * 7. } * ``` * - * Reducing the size of `pointerAddInstructionHasBounds`: - * The `pointerAddInstructionHasBounds` can be very large since the `sink2` column is defined as anything that non-strictly - * upper bounds the right-hand side of a pointer-arithmetic instruction. In order to reduce the size of this predicate we prune - * the set of pointer-arithmetic instructions to only those instructions where the left-hand side flows from an allocation. - * * Handling false positives: * * Consider a snippet such as: From 4345369e9b8e21b8d86dc59c25fd295fc332e001 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:22:26 +0200 Subject: [PATCH 14/43] C++: Replace 'Barrier2' with 'SizeBarrier' in QLDoc. --- .../AllocationToInvalidPointer.qll | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 80c62a0e0d2..7d1243a7943 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -45,9 +45,10 @@ * 6. *p = 0; // This is fine since `end < base + size` * 7. } * ``` - * In order to remove this false positive we define a barrier (see `Barrier2::BarrierConfig2`) that finds the possible guards - * that compares a value to the size of the allocation. In the above example, this is the `(n >= size)` guard on line 3. - * `Barrier2::getABarrierNode` then defines any node that is guarded by such a guard as a barrier in the dataflow configuration. + * In order to remove this false positive we define a barrier (see `SizeBarrier::SizeBarrierConfig`) that finds the + * possible guards that compares a value to the size of the allocation. In the above example, this is the `(n >= size)` + * guard on line 3. `SizeBarrier::getABarrierNode` then defines any node that is guarded by such a guard as a barrier + * in the dataflow configuration. */ private import cpp From 4a276c37ac03d9a03738012ead536398255e107c Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 12:24:46 +0200 Subject: [PATCH 15/43] C++: Remove 'TODO' now that the implementation has been fixed. --- .../InvalidPointerToDereference.qll | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 48a90ab28b5..845e755d87a 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -20,11 +20,9 @@ * here). So this step is about identifying which of those out-of-bounds pointers identified from step 1 that are * actually being dereferenced. We do this using a regular dataflow configuration (see `InvalidPointerToDerefConfig`). * - * This dataflow traversal defines the set of sources as any dataflow node that is non-strictly upper-bounded by the - * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. (TODO: I'm pretty sure this is incorrect, - * and we should define the set of sources as anything that is non-strictly _lower_ bounded by the pointer-arithmetic - * instruction). That is, the set of sources is any dataflow node `source` such that `source.asInstruction <= pai + delta1` - * for some `delta1 >= 0`. + * This dataflow traversal defines the set of sources as any dataflow node that is non-strictly lower-bounded by the + * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. That is, the set of sources is any + * dataflow node `source` such that `source.asInstruction() >= pai + delta1` for some `delta1 >= 0`. * * The set of sinks is defined to be any address operand `addr` that is non-strictly upper-bounded by the sink. That is, * any dataflow node `n` such that `addr <= sink.asInstruction() + delta2` for some `delta2`. We call the instruction that From 88b78284ec76ab18d68742024e9317510ad9c3f8 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 14:57:59 +0200 Subject: [PATCH 16/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 845e755d87a..cba404ec241 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -8,7 +8,7 @@ * 1. char* base = (char*)malloc(size); * 2. char* end = base + size; * 3. for(int *p = base; p <= end; p++) { - * 4. use(*p); + * 4. use(*p); // BUG: Should have been bounded by `p < end`. * 5. } * ``` * this file identifies the flow from `base + size` to `end`. We call `base + size` the "dereference source" and `end` From db8b506106502b77ccab51ceb003d4fab2518a41 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 14:58:06 +0200 Subject: [PATCH 17/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/AllocationToInvalidPointer.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll index 7d1243a7943..43120e1c7f0 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/AllocationToInvalidPointer.qll @@ -16,7 +16,7 @@ * `(allocation, size of allocation)` to a pair `(a, b)` where there exists a pointer-arithmetic instruction * `pai = a + r` such that `b` is a dataflow node where `b <= r`. Because there will be a dataflow-path from * `allocation` to `a` this means that the `pai` will compute a pointer that is some number of elements beyond - * the end position in the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. + * the end position of the allocation. See `pointerAddInstructionHasBounds` for the implementation of this. * * In the above example, the pair `(a, b)` is `(base, size)` with `base` and `size` coming from the expression * `base + size` on line 2, which is also the pointer-arithmetic instruction. In general, the pair does not necessarily From acb1310e994d5315e6b82dc7f2c8ba2689530647 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Mon, 24 Jul 2023 15:28:36 +0200 Subject: [PATCH 18/43] C++: Add more documentation. --- .../InvalidPointerToDereference.qll | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index cba404ec241..9c9a7b79261 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -7,13 +7,14 @@ * ```cpp * 1. char* base = (char*)malloc(size); * 2. char* end = base + size; - * 3. for(int *p = base; p <= end; p++) { + * 3. for(char *p = base; p <= end; p++) { * 4. use(*p); // BUG: Should have been bounded by `p < end`. * 5. } * ``` * this file identifies the flow from `base + size` to `end`. We call `base + size` the "dereference source" and `end` - * the "dereference sink" (even though `end` is not actually dereferenced - it will be used to find the correct - * dereference eventually). + * the "dereference sink" (even though `end` is not actually dereferenced we will use this term because we will perform + * dataflow to find a use of a pointer `x` such that `x <= end` which is dereferenced. In the above example, `x` is `p` + * on line 4). * * Merely _constructing_ a pointer that's out-of-bounds is fine if the pointer is never dereferenced (in reality, the * standard only guarentees that it's safe to move the pointer one element past the last element. But we ignore that From 359a9e5fe80787825533debda0434782fcd0705c Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 09:07:01 +0200 Subject: [PATCH 19/43] C++: 'Step 1' does not make a lot of sense now that the files have been split. --- .../InvalidPointerToDereference.qll | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 9c9a7b79261..ef570344a3e 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -18,8 +18,9 @@ * * Merely _constructing_ a pointer that's out-of-bounds is fine if the pointer is never dereferenced (in reality, the * standard only guarentees that it's safe to move the pointer one element past the last element. But we ignore that - * here). So this step is about identifying which of those out-of-bounds pointers identified from step 1 that are - * actually being dereferenced. We do this using a regular dataflow configuration (see `InvalidPointerToDerefConfig`). + * here). So this step is about identifying which of those out-of-bounds pointers found by `pointerAddInstructionHasBounds` + * in `AllocationToInvalidPointer.qll` that are actually being dereferenced. We do this using a regular dataflow + * configuration (see `InvalidPointerToDerefConfig`). * * This dataflow traversal defines the set of sources as any dataflow node that is non-strictly lower-bounded by the * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. That is, the set of sources is any From 55cfadb1f4bff2919c129f0127372dde1d9c705c Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 09:13:27 +0200 Subject: [PATCH 20/43] C++: Simplify the description of the source. --- .../InvalidPointerToDereference.qll | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index ef570344a3e..16339b91bd8 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -22,9 +22,10 @@ * in `AllocationToInvalidPointer.qll` that are actually being dereferenced. We do this using a regular dataflow * configuration (see `InvalidPointerToDerefConfig`). * - * This dataflow traversal defines the set of sources as any dataflow node that is non-strictly lower-bounded by the - * pointer-arithmetic instruction identified by `AllocationToInvalidPointer.qll`. That is, the set of sources is any - * dataflow node `source` such that `source.asInstruction() >= pai + delta1` for some `delta1 >= 0`. + * This dataflow traversal defines the set of sources as any dataflow node `n` such that there exists a pointer-arithmetic + * instruction `pai` found by `AllocationToInvalidPointer.qll` and `n.asInstruction() >= pai + deltaDerefSourceAndPai`. + * Here, `deltaDerefSourceAndPai` is the constant difference between the source we track for finding a dereference and the + * pointer-arithmetic instruction. * * The set of sinks is defined to be any address operand `addr` that is non-strictly upper-bounded by the sink. That is, * any dataflow node `n` such that `addr <= sink.asInstruction() + delta2` for some `delta2`. We call the instruction that From 1612ee3e9a046480c726f13c4f027c133cd5e9b5 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 09:19:14 +0200 Subject: [PATCH 21/43] C++: Simplify the description of the sink. --- .../InvalidPointerToDereference.qll | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 16339b91bd8..a2bcd741ee1 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -27,9 +27,10 @@ * Here, `deltaDerefSourceAndPai` is the constant difference between the source we track for finding a dereference and the * pointer-arithmetic instruction. * - * The set of sinks is defined to be any address operand `addr` that is non-strictly upper-bounded by the sink. That is, - * any dataflow node `n` such that `addr <= sink.asInstruction() + delta2` for some `delta2`. We call the instruction that - * consumes the address operand the "operation". + * The set of sinks is defined as any dataflow node `n` such that `addr <= n.asInstruction() + deltaDerefSinkAndDerefAddress` + * for some address operand `addr` and constant difference `deltaDerefSinkAndDerefAddress`. Since an address operand is + * always consumed by an instruction that performs a dereference this lets us identify a "bad dereference". We call the + * instruction that consumes the address operand the "operation". * * For example, consider the flow from `base + size` to `end` above. The sink is `end` on line 3 because that is a dataflow * node whose underlying instruction non-strictly upper bounds the address operand `p` in `use(*p)`. The load attached to `*p` From 0d116a00fb0ef0612e78f56b720766789e33101a Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:25:34 +0200 Subject: [PATCH 22/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index a2bcd741ee1..8189011d2a8 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -17,7 +17,7 @@ * on line 4). * * Merely _constructing_ a pointer that's out-of-bounds is fine if the pointer is never dereferenced (in reality, the - * standard only guarentees that it's safe to move the pointer one element past the last element. But we ignore that + * standard only guarantees that it is safe to move the pointer one element past the last element, but we ignore that * here). So this step is about identifying which of those out-of-bounds pointers found by `pointerAddInstructionHasBounds` * in `AllocationToInvalidPointer.qll` that are actually being dereferenced. We do this using a regular dataflow * configuration (see `InvalidPointerToDerefConfig`). From 6ebd5ab3edbccf72bf0ae0be0f6c06c58903b206 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:25:41 +0200 Subject: [PATCH 23/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 8189011d2a8..d7745b50d72 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -18,7 +18,7 @@ * * Merely _constructing_ a pointer that's out-of-bounds is fine if the pointer is never dereferenced (in reality, the * standard only guarantees that it is safe to move the pointer one element past the last element, but we ignore that - * here). So this step is about identifying which of those out-of-bounds pointers found by `pointerAddInstructionHasBounds` + * here). So this step is about identifying which of the out-of-bounds pointers found by `pointerAddInstructionHasBounds` * in `AllocationToInvalidPointer.qll` that are actually being dereferenced. We do this using a regular dataflow * configuration (see `InvalidPointerToDerefConfig`). * From a176ba262bfe18ca3f428ed545235df21e3d8d0d Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:30:21 +0200 Subject: [PATCH 24/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index d7745b50d72..b4ac80ff245 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -45,7 +45,7 @@ * Consider the following snippet: * ```cpp * 1. char *p = new char[size]; - * 2. char *end = p + size; // $ alloc=L363 + * 2. char *end = p + size; * 3. if (p < end) { * 4. p += 1; * 5. } From 5cad8ec0a2cfefdfdd6f3b257b2f49179eada275 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:30:38 +0200 Subject: [PATCH 25/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index 3ff208f1d3f..51c953a5446 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -42,7 +42,7 @@ * `test_missing_call_context_1` and `test_missing_call_context_2` tests for how this may flag false positives * without this final configuration. * - * The source of the final path is an allocation that's: + * The source of the final path is an allocation that is: * 1. identified as flowing to an invalid pointer (by `AllocationToInvalidPointer`), and * 2. for which the invalid pointer flows to a dereference (as identified by `InvalidPointerToDereference`). * From 2cfa14b91ff86e3d60bf5c1336e1fc5351bc94ce Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:30:44 +0200 Subject: [PATCH 26/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index 51c953a5446..9d8d75ecaf3 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -34,7 +34,7 @@ * Step 1 is implemented in `AllocationToInvalidPointer.qll`, and step 2 is implemented by * `InvalidPointerToDereference.qll`. See those files for the description of these. * - * This file imports both libraries and define a final dataflow configuration that constructs the full path from + * This file imports both libraries and defines a final dataflow configuration that constructs the full path from * the allocation to the dereference of the out-of-bounds pointer. This is done for several reasons: * 1. It means the user is able to inspect the entire path from the allocation to the dereference, which can be useful * to understand the problem highlighted. From 70ac0a5462694909f812701615f0372bb3a784b4 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:30:56 +0200 Subject: [PATCH 27/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index b4ac80ff245..3553d8523eb 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -54,7 +54,7 @@ * 8. } * ``` * this is safe because `p` is guarded to be strictly less than `end` on line 6 before the dereference on line 7. However, if we - * run the query on the above without further modifications we'd see an alert on line 7. This is because range analysis infers + * run the query on the above without further modifications we would see an alert on line 7. This is because range analysis infers * that `p <= end` after the increment on line 4, and thus the result of `p += 1` is seen as a valid dereference source. This * node then flows to `p` on line 6 (which is a valid dereference sink since it non-strictly upper bounds an address operand), and * range analysis then infers that the address operand of `*p` (i.e., `p`) is non-strictly upper bounded by `p`, and thus reports From 9f9cf9f765c09d87d6b4c3f019e06d04b4a08c0d Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:31:06 +0200 Subject: [PATCH 28/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index 9d8d75ecaf3..2a9ed2d5b6d 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -49,7 +49,7 @@ * The path can be described in 3 "chunks": * 1. One path from the allocation to the construction of the invalid pointer * 2. Another path from the construction of the invalid pointer to the final pointer that's about to be dereferenced. - * 3. Finally, there's a single step from the dataflow node that represents the final pointer to the dereference. + * 3. Finally, a single step from the dataflow node that represents the final pointer to the dereference. * * Step 1 happens when the flow state is `TInitial`, and step 2 and 3 happens when the flow state is `TPointerArith(pai)` * where the pointer-arithmetic instruction `pai` tracks the instruction that generated the out-of-bounds pointer. This From e1763db36c2d70bc892efa2dbd6adc674a9a73c8 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:31:37 +0200 Subject: [PATCH 29/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 3553d8523eb..bce0f6e68b6 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -60,7 +60,7 @@ * range analysis then infers that the address operand of `*p` (i.e., `p`) is non-strictly upper bounded by `p`, and thus reports * an alert on line 7. * - * In order to handle this false positive, we define a barrier that identifies guards such as `p < end` that ensures that a value + * In order to handle the above false positive, we define a barrier that identifies guards such as `p < end` that ensures that a value * is less than the pointer-arithmetic instruction that computed the invalid pointer. This is done in the `InvalidPointerToDerefBarrier` * module. Since the node we're tracking isn't necessarily _equal_ to the pointer-arithmetic instruction, but rather satisfies * `node.asInstruction() <= pai + delta`, we need to account for the delta when checking if a guard is sufficiently strong to infer From e1f519fab755b8d42add89c99a823b2fe7fff841 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:33:17 +0200 Subject: [PATCH 30/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index 2a9ed2d5b6d..f59df356ac0 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -51,7 +51,7 @@ * 2. Another path from the construction of the invalid pointer to the final pointer that's about to be dereferenced. * 3. Finally, a single step from the dataflow node that represents the final pointer to the dereference. * - * Step 1 happens when the flow state is `TInitial`, and step 2 and 3 happens when the flow state is `TPointerArith(pai)` + * Step 1 happens when the flow state is `TInitial`, and step 2 and 3 happen when the flow state is `TPointerArith(pai)` * where the pointer-arithmetic instruction `pai` tracks the instruction that generated the out-of-bounds pointer. This * instruction is used in the construction of the alert message. * From 97809e764625d29542adbbffad40d8a9f3af55c6 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:33:38 +0200 Subject: [PATCH 31/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index f59df356ac0..c29e78e5c1f 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -57,7 +57,7 @@ * * The set of pointer-arithmetic instructions that define the `TPointerArith` flow state is restricted to be the pointer- * arithmetic instructions that both receive flow from the allocation (as identified by `AllocationToInvalidPointer.qll`), - * and further flows to a dereference (as identified by `InvalidPointerToDereference.qll`). + * and further flow to a dereference (as identified by `InvalidPointerToDereference.qll`). */ import cpp From af904f5cfea51b959e87cf33df119c2935e35778 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:33:57 +0200 Subject: [PATCH 32/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index bce0f6e68b6..c88a268bd36 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -62,7 +62,7 @@ * * In order to handle the above false positive, we define a barrier that identifies guards such as `p < end` that ensures that a value * is less than the pointer-arithmetic instruction that computed the invalid pointer. This is done in the `InvalidPointerToDerefBarrier` - * module. Since the node we're tracking isn't necessarily _equal_ to the pointer-arithmetic instruction, but rather satisfies + * module. Since the node we are tracking is not necessarily _equal_ to the pointer-arithmetic instruction, but rather satisfies * `node.asInstruction() <= pai + delta`, we need to account for the delta when checking if a guard is sufficiently strong to infer * that a future dereference is safe. To do this, we check that the guard guarantees that a node `n` satisfies `n < node + d` where * `node` is a node we know is equal to the value of the dereference source (i.e., it satisfies `node.asInstruction() <= pai + delta`) From 7f7930b3bb66119a80f8faf73cb0b56fdfe87a34 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:36:12 +0200 Subject: [PATCH 33/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index c29e78e5c1f..1292011a534 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -29,7 +29,7 @@ * In order to do this, we split the problem into three subtasks: * 1. First, we find flow from `new int[size]` to `base + size`. * 2. Then, we find flow from `base + size` to `end` (on line 3). - * 3. Finally, we use range-analysis to find a write to (or read from) a pointer that may be equal to `end`. + * 3. Finally, we use range-analysis to find a write to (or read from) a pointer that may be greater than or equal to `end`. * * Step 1 is implemented in `AllocationToInvalidPointer.qll`, and step 2 is implemented by * `InvalidPointerToDereference.qll`. See those files for the description of these. From e75f6041721043758725fa1a9432281cfac13da7 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:48:23 +0200 Subject: [PATCH 34/43] C++: Replace more text with formulas. --- .../InvalidPointerToDereference.qll | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index c88a268bd36..8e64c99a564 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -32,10 +32,10 @@ * always consumed by an instruction that performs a dereference this lets us identify a "bad dereference". We call the * instruction that consumes the address operand the "operation". * - * For example, consider the flow from `base + size` to `end` above. The sink is `end` on line 3 because that is a dataflow - * node whose underlying instruction non-strictly upper bounds the address operand `p` in `use(*p)`. The load attached to `*p` - * is the "operation". To ensure that the path makes intuitive sense, we only pick operations that are control-flow reachable - * from the dereference sink. + * For example, consider the flow from `base + size` to `end` above. The sink is `end` on line 3 because + * `p <= end.asInstruction() + deltaDerefSinkAndDerefAddress` where `p` is the address operand in `use(*p)` and + * `deltaDerefSinkAndDerefAddress >= 0`. The load attached to `*p` is the "operation". To ensure that the path makes + * intuitive sense, we only pick operations that are control-flow reachable from the dereference sink. * * To compute the amount of the dereference is away from the final entry of the allocation, we sum the two deltas `delta1` and * `delta2`. This is done in the `operationIsOffBy` predicate (which is the only predicate exposed by this file). From 46832d0b170bc8aacee6882894ccfc05c883c1ca Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 10:50:51 +0200 Subject: [PATCH 35/43] C++: Rename 'delta1' and 'delta2' in documentation. --- .../InvalidPointerToDereference.qll | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 8e64c99a564..5627b6bd2ce 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -37,8 +37,9 @@ * `deltaDerefSinkAndDerefAddress >= 0`. The load attached to `*p` is the "operation". To ensure that the path makes * intuitive sense, we only pick operations that are control-flow reachable from the dereference sink. * - * To compute the amount of the dereference is away from the final entry of the allocation, we sum the two deltas `delta1` and - * `delta2`. This is done in the `operationIsOffBy` predicate (which is the only predicate exposed by this file). + * To compute the amount of the dereference is away from the final entry of the allocation, we sum the two deltas + * `deltaDerefSourceAndPai` and `deltaDerefSinkAndDerefAddress`. This is done in the `operationIsOffBy` predicate + * (which is the only predicate exposed by this file). * * Handling false positives: * From 54afed6e1d68c89f01ace3a4eeb55dc45a140aa3 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:04:18 +0200 Subject: [PATCH 36/43] C++: Rename 'delta' to 'deltaDerefSourceAndPai'. --- .../InvalidPointerToDereference.qll | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 5627b6bd2ce..550e96e1098 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -64,12 +64,13 @@ * In order to handle the above false positive, we define a barrier that identifies guards such as `p < end` that ensures that a value * is less than the pointer-arithmetic instruction that computed the invalid pointer. This is done in the `InvalidPointerToDerefBarrier` * module. Since the node we are tracking is not necessarily _equal_ to the pointer-arithmetic instruction, but rather satisfies - * `node.asInstruction() <= pai + delta`, we need to account for the delta when checking if a guard is sufficiently strong to infer - * that a future dereference is safe. To do this, we check that the guard guarantees that a node `n` satisfies `n < node + d` where - * `node` is a node we know is equal to the value of the dereference source (i.e., it satisfies `node.asInstruction() <= pai + delta`) - * and `d <= delta`. Combining this we have `n < node + d <= node + delta <= pai + 2*delta` (TODO: Oops. This math doesn't quite work - * out. This is because we need to redefine the `BarrierConfig` to start flow at the pointer-arithmetic instruction instead of at the - * dereference source. When combined with TODO above it's easy to show that this guard ensures that the dereference is safe). + * `node.asInstruction() <= pai + deltaDerefSourceAndPai`, we need to account for the delta when checking if a guard is sufficiently + * strong to infer that a future dereference is safe. To do this, we check that the guard guarantees that a node `n` satisfies + * `n < node + d` where `node` is a node we know is equal to the value of the dereference source (i.e., it satisfies + * `node.asInstruction() <= pai + deltaDerefSourceAndPai`) and `d <= delta`. Combining this we have + * `n < node + d <= node + delta <= pai + 2*delta` (TODO: Oops. This math doesn't quite work out. This is because we need to + * redefine the `BarrierConfig` to start flow at the pointer-arithmetic instruction instead of at the dereference source. When + * combined with TODO above it's easy to show that this guard ensures that the dereference is safe). */ private import cpp From 99f6e685c7b2bfdceab6d227a739acf5646deb94 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:18:46 +0200 Subject: [PATCH 37/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 550e96e1098..4496f0ee94c 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -19,7 +19,7 @@ * Merely _constructing_ a pointer that's out-of-bounds is fine if the pointer is never dereferenced (in reality, the * standard only guarantees that it is safe to move the pointer one element past the last element, but we ignore that * here). So this step is about identifying which of the out-of-bounds pointers found by `pointerAddInstructionHasBounds` - * in `AllocationToInvalidPointer.qll` that are actually being dereferenced. We do this using a regular dataflow + * in `AllocationToInvalidPointer.qll` are actually being dereferenced. We do this using a regular dataflow * configuration (see `InvalidPointerToDerefConfig`). * * This dataflow traversal defines the set of sources as any dataflow node `n` such that there exists a pointer-arithmetic From 2caad679808969961a7e95f506b68ed82154b203 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:18:54 +0200 Subject: [PATCH 38/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 4496f0ee94c..31b2b917ef2 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -22,7 +22,7 @@ * in `AllocationToInvalidPointer.qll` are actually being dereferenced. We do this using a regular dataflow * configuration (see `InvalidPointerToDerefConfig`). * - * This dataflow traversal defines the set of sources as any dataflow node `n` such that there exists a pointer-arithmetic + * The dataflow traversal defines the set of sources as any dataflow node `n` such that there exists a pointer-arithmetic * instruction `pai` found by `AllocationToInvalidPointer.qll` and `n.asInstruction() >= pai + deltaDerefSourceAndPai`. * Here, `deltaDerefSourceAndPai` is the constant difference between the source we track for finding a dereference and the * pointer-arithmetic instruction. From 997eb1caf21cc4a9178045a5c30bb4ef2b4a3bcb Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:19:14 +0200 Subject: [PATCH 39/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 31b2b917ef2..6e3ea48ce41 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -23,7 +23,7 @@ * configuration (see `InvalidPointerToDerefConfig`). * * The dataflow traversal defines the set of sources as any dataflow node `n` such that there exists a pointer-arithmetic - * instruction `pai` found by `AllocationToInvalidPointer.qll` and `n.asInstruction() >= pai + deltaDerefSourceAndPai`. + * instruction `pai` found by `AllocationToInvalidPointer.qll` and a `n.asInstruction() >= pai + deltaDerefSourceAndPai`. * Here, `deltaDerefSourceAndPai` is the constant difference between the source we track for finding a dereference and the * pointer-arithmetic instruction. * From 099e11fb0ca80667a7ab7721780723ac5a8a6067 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:19:49 +0200 Subject: [PATCH 40/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 6e3ea48ce41..c12c64c8512 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -33,7 +33,7 @@ * instruction that consumes the address operand the "operation". * * For example, consider the flow from `base + size` to `end` above. The sink is `end` on line 3 because - * `p <= end.asInstruction() + deltaDerefSinkAndDerefAddress` where `p` is the address operand in `use(*p)` and + * `p <= end.asInstruction() + deltaDerefSinkAndDerefAddress`, where `p` is the address operand in `use(*p)` and * `deltaDerefSinkAndDerefAddress >= 0`. The load attached to `*p` is the "operation". To ensure that the path makes * intuitive sense, we only pick operations that are control-flow reachable from the dereference sink. * From 13989dba91498188c2679bdbc1d1c4167a3e2fa1 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:20:03 +0200 Subject: [PATCH 41/43] Update cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../InvalidPointerDereference/InvalidPointerToDereference.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index c12c64c8512..8e6012a427d 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -37,7 +37,7 @@ * `deltaDerefSinkAndDerefAddress >= 0`. The load attached to `*p` is the "operation". To ensure that the path makes * intuitive sense, we only pick operations that are control-flow reachable from the dereference sink. * - * To compute the amount of the dereference is away from the final entry of the allocation, we sum the two deltas + * To compute how many elements the dereference is beyond the end position of the allocation, we sum the two deltas * `deltaDerefSourceAndPai` and `deltaDerefSinkAndDerefAddress`. This is done in the `operationIsOffBy` predicate * (which is the only predicate exposed by this file). * From b1c6ee4396981e1da1fab85610311d548ae0710e Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:20:49 +0200 Subject: [PATCH 42/43] Update cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com> --- .../experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql index 1292011a534..f778b4f776f 100644 --- a/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql +++ b/cpp/ql/src/experimental/Security/CWE/CWE-193/InvalidPointerDeref.ql @@ -48,7 +48,7 @@ * * The path can be described in 3 "chunks": * 1. One path from the allocation to the construction of the invalid pointer - * 2. Another path from the construction of the invalid pointer to the final pointer that's about to be dereferenced. + * 2. Another path from the construction of the invalid pointer to the final pointer that is about to be dereferenced. * 3. Finally, a single step from the dataflow node that represents the final pointer to the dereference. * * Step 1 happens when the flow state is `TInitial`, and step 2 and 3 happen when the flow state is `TPointerArith(pai)` From 9f2ee0d7c2eee8f6357f9dc3a2bb88e1cd6069e4 Mon Sep 17 00:00:00 2001 From: Mathias Vorreiter Pedersen Date: Tue, 25 Jul 2023 11:25:24 +0200 Subject: [PATCH 43/43] C++: Rename 'delta' to 'deltaDerefSourceAndPai'. --- .../InvalidPointerToDereference.qll | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll index 8e6012a427d..b84bd819df5 100644 --- a/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll +++ b/cpp/ql/lib/semmle/code/cpp/security/InvalidPointerDereference/InvalidPointerToDereference.qll @@ -66,11 +66,11 @@ * module. Since the node we are tracking is not necessarily _equal_ to the pointer-arithmetic instruction, but rather satisfies * `node.asInstruction() <= pai + deltaDerefSourceAndPai`, we need to account for the delta when checking if a guard is sufficiently * strong to infer that a future dereference is safe. To do this, we check that the guard guarantees that a node `n` satisfies - * `n < node + d` where `node` is a node we know is equal to the value of the dereference source (i.e., it satisfies - * `node.asInstruction() <= pai + deltaDerefSourceAndPai`) and `d <= delta`. Combining this we have - * `n < node + d <= node + delta <= pai + 2*delta` (TODO: Oops. This math doesn't quite work out. This is because we need to - * redefine the `BarrierConfig` to start flow at the pointer-arithmetic instruction instead of at the dereference source. When - * combined with TODO above it's easy to show that this guard ensures that the dereference is safe). + * `n < node + k` where `node` is a node we know is equal to the value of the dereference source (i.e., it satisfies + * `node.asInstruction() <= pai + deltaDerefSourceAndPai`) and `k <= deltaDerefSourceAndPai`. Combining this we have + * `n < node + k <= node + deltaDerefSourceAndPai <= pai + 2*deltaDerefSourceAndPai` (TODO: Oops. This math doesn't quite work out. + * I think this is because we need to redefine the `BarrierConfig` to start flow at the pointer-arithmetic instruction instead of + * at the dereference source. When combined with TODO above it's easy to show that this guard ensures that the dereference is safe). */ private import cpp