From 36a6d9ea81ecc30a6384d631f246eb11a6ab848d Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Thu, 6 Jun 2019 17:45:26 +0100 Subject: [PATCH 01/22] Update text for consistency --- change-notes/1.21/analysis-cpp.md | 31 ++++++++++++++++--------------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/change-notes/1.21/analysis-cpp.md b/change-notes/1.21/analysis-cpp.md index 39667c5e70a..fba6e217ad4 100644 --- a/change-notes/1.21/analysis-cpp.md +++ b/change-notes/1.21/analysis-cpp.md @@ -6,10 +6,10 @@ | **Query** | **Tags** | **Purpose** | |-----------------------------|-----------|--------------------------------------------------------------------| -| `()`-declared function called with too few arguments (`cpp/too-few-arguments`) | Correctness | Find all cases where the number of arguments is less than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. | -| `()`-declared function called with mismatched arguments (`cpp/mismatched-function-arguments`) | Correctness | Find all cases where the types of arguments do not match the types of parameters of the function, provided the function is also properly declared/defined elsewhere. | -| Call to alloca in a loop (`cpp/alloca-in-loop`) | reliability, correctness, external/cwe/cwe-770 | Finds calls to `alloca` in loops, which can lead to stack overflow if the number of iterations is large. Newly displayed on LGTM. | -| Use of dangerous function (`dangerous-function-overflow`) | reliability, security, external/cwe/cwe-242 | Finds calls to `gets`, which does not guard against buffer overflow. These results were previously detected by the `cpp/potentially-dangerous-function` query. | +| Call to function with fewer arguments than declared parameters + (`cpp/too-few-arguments`) | Correctness | Find all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default on [LGTM](https://lgtm.com/rules/1508860726279/). | +| Call to alloca in a loop (`cpp/alloca-in-loop`) | reliability, correctness, external/cwe/cwe-770 | Finds calls to `alloca` in loops, which can lead to stack overflow if the number of iterations is large. Newly displayed on [LGTM](https://lgtm.com/rules/1508831665988/). | +| Use of dangerous function (`cpp/dangerous-function-overflow`) | reliability, security, external/cwe/cwe-242 | Finds calls to `gets`, which does not guard against buffer overflow. These results were previously detected by the `cpp/potentially-dangerous-function` query. Results for both queries are displayed by default on LGTM. | ## Changes to existing queries @@ -17,26 +17,27 @@ |----------------------------|------------------------|------------------------------------------------------------------| | Buffer not sufficient for string (`cpp/overflow-calculated`) | Fewer results | This query no longer reports results that would be found by the 'No space for zero terminator' (`cpp/no-space-for-terminator`) query. | | No space for zero terminator (`cpp/no-space-for-terminator`) | More correct results | This query now detects calls to `std::malloc`. | -| Commented-out code (`cpp/commented-out-code`) | More correct results | Commented out preprocessor code is now detected by this query. | +| Commented-out code (`cpp/commented-out-code`) | More correct results | Commented-out preprocessor code is now detected by this query. | | Dead code due to goto or break statement (`cpp/dead-code-goto`) | Fewer false positive results | Functions containing preprocessor logic are now excluded from this analysis. | -| Mismatching new/free or malloc/delete (`cpp/new-free-mismatch`) | Fewer false positive results | Fixed an issue where functions were being identified as allocation functions inappropriately. Also affects `cpp/new-array-delete-mismatch` and `cpp/new-delete-array-mismatch`. | +| Mismatching new/free or malloc/delete (`cpp/new-free-mismatch`) | Fewer false positive results | Fixed an issue where functions were being identified as allocation functions inappropriately. This correction also affects `cpp/new-array-delete-mismatch` and `cpp/new-delete-array-mismatch`. | | Overflow in uncontrolled allocation size (`cpp/uncontrolled-allocation-size`) | More correct results | This query has been reworked so that it can find a wider variety of results. | | Memory may not be freed (`cpp/memory-may-not-be-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. | | Memory is never freed (`cpp/memory-never-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. | | Resource not released in destructor (`cpp/resource-not-released-in-destructor`) | Fewer false positive results | Resource allocation and deallocation functions are now determined more accurately. | -| Comparison result is always the same | Fewer false positive results | The range analysis library is now more conservative about floating point values being possibly `NaN` | +| Comparison result is always the same | Fewer false positive results | The range analysis library is now more conservative about floating point values being possibly `NaN`. | | Use of potentially dangerous function | More correct results | Calls to `localtime`, `ctime` and `asctime` are now detected by this query. | | Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now more accurately identifies wide and non-wide string/character format arguments on different platforms. Platform detection has also been made more accurate for the purposes of this query. | -| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | Fewer false positive results | Non-standard uses of %L are now understood. | -| `()`-declared function called with too many arguments (`cpp/futile-params`) | Improved coverage | Query has been generalized to find all cases where the number of arguments exceedes the number of parameters of the function, provided the function is also properly declared/defined elsewhere. | -| Use of potentially dangerous function (`cpp/potentially-dangerous-function`) | Fewer results | Results relating to the standard library `gets` function have been moved into a new query (`dangerous-function-overflow`). | -| Constructor with default arguments will be used as a copy constructor (`cpp/constructor-used-as-copy-constructor`) | Lowered severity and precision | The severity and precision of this query have been reduced to "warning" and "low", respectively, due to this coding pattern being used intentionally and safely in a number of real-world projects. | +| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | Fewer false positive results | Non-standard uses of `%L` are now understood. | +| Call to function with extraneous arguments + (`cpp/futile-params`) | Improved coverage | Query has been generalized to find all cases where the number of arguments exceeds the number of parameters of the function, provided the function is also properly declared/defined elsewhere. | +| Use of potentially dangerous function (`cpp/potentially-dangerous-function`) | Fewer results | Results relating to the standard library `gets` function have been moved into a new query (`cpp/dangerous-function-overflow`). | +| Constructor with default arguments will be used as a copy constructor (`cpp/constructor-used-as-copy-constructor`) | Lowered severity and precision | The severity and precision of this query have been reduced to "warning" and "low", respectively. This coding pattern is used intentionally and safely in a number of real-world projects. Results are no longer displayed on LGTM unless you choose to display them. | ## Changes to QL libraries - The predicate `Declaration.hasGlobalName` now only holds for declarations that are not nested in a class. For example, it no longer holds for a member function `MyClass::myFunction` or a constructor `MyClass::MyClass`, whereas previously it would classify those two declarations as global names. -- In class `Declaration`, predicates `getQualifiedName/0` and `hasQualifiedName/1` are no longer recommended for finding functions by name. Instead, use `hasGlobalName/1` and the new `hasQualifiedName/2` and `hasQualifiedName/3` predicates. This improves performance and makes it more reliable to identify names involving templates and inline namespaces. -- Additional support for definition by reference has been added to the `semmle.code.cpp.dataflow.TaintTracking` library. - - The taint tracking library now includes taint-specific edges for functions modeled in `semmle.code.cpp.models.interfaces.DataFlow`. - - The taint tracking library adds flow through library functions that are modeled in `semmle.code.cpp.models.interfaces.Taint`. Queries can add subclasses of `TaintFunction` to specify additional flow. +- In class `Declaration`, predicates `getQualifiedName/0` and `hasQualifiedName/1` are no longer recommended for finding functions by name. Instead, use `hasGlobalName/1` and the new `hasQualifiedName/2` and `hasQualifiedName/3` predicates. This improves performance and identifies names involving templates and inline namespaces more reliably. +- Additional support for definition by reference has been added to the `semmle.code.cpp.dataflow.TaintTracking` library, including: + - Taint-specific edges for functions modeled in `semmle.code.cpp.models.interfaces.DataFlow`. + - Flow through library functions that are modeled in `semmle.code.cpp.models.interfaces.Taint`. Queries can add subclasses of `TaintFunction` to specify additional flow. - There is a new `FoldExpr` class, representing C++17 fold expressions. - The member predicates `DeclarationEntry.getUnspecifiedType`, `Expr.getUnspecifiedType`, and `Variable.getUnspecifiedType` have been added. These should be preferred over the existing `getUnderlyingType` predicates. \ No newline at end of file From 2f97aac64ad4884a7ae2b96294bae8c2ec89056d Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Thu, 6 Jun 2019 17:50:17 +0100 Subject: [PATCH 02/22] Sort table rows alphabetically --- change-notes/1.21/analysis-cpp.md | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/change-notes/1.21/analysis-cpp.md b/change-notes/1.21/analysis-cpp.md index fba6e217ad4..e3da1779f76 100644 --- a/change-notes/1.21/analysis-cpp.md +++ b/change-notes/1.21/analysis-cpp.md @@ -6,9 +6,8 @@ | **Query** | **Tags** | **Purpose** | |-----------------------------|-----------|--------------------------------------------------------------------| -| Call to function with fewer arguments than declared parameters - (`cpp/too-few-arguments`) | Correctness | Find all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default on [LGTM](https://lgtm.com/rules/1508860726279/). | | Call to alloca in a loop (`cpp/alloca-in-loop`) | reliability, correctness, external/cwe/cwe-770 | Finds calls to `alloca` in loops, which can lead to stack overflow if the number of iterations is large. Newly displayed on [LGTM](https://lgtm.com/rules/1508831665988/). | +| Call to function with fewer arguments than declared parameters (`cpp/too-few-arguments`) | Correctness | Find all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default on [LGTM](https://lgtm.com/rules/1508860726279/). | | Use of dangerous function (`cpp/dangerous-function-overflow`) | reliability, security, external/cwe/cwe-242 | Finds calls to `gets`, which does not guard against buffer overflow. These results were previously detected by the `cpp/potentially-dangerous-function` query. Results for both queries are displayed by default on LGTM. | ## Changes to existing queries @@ -16,22 +15,21 @@ | **Query** | **Expected impact** | **Change** | |----------------------------|------------------------|------------------------------------------------------------------| | Buffer not sufficient for string (`cpp/overflow-calculated`) | Fewer results | This query no longer reports results that would be found by the 'No space for zero terminator' (`cpp/no-space-for-terminator`) query. | -| No space for zero terminator (`cpp/no-space-for-terminator`) | More correct results | This query now detects calls to `std::malloc`. | +| Call to function with extraneous arguments (`cpp/futile-params`) | Improved coverage | Query has been generalized to find all cases where the number of arguments exceeds the number of parameters of the function, provided the function is also properly declared/defined elsewhere. | | Commented-out code (`cpp/commented-out-code`) | More correct results | Commented-out preprocessor code is now detected by this query. | -| Dead code due to goto or break statement (`cpp/dead-code-goto`) | Fewer false positive results | Functions containing preprocessor logic are now excluded from this analysis. | -| Mismatching new/free or malloc/delete (`cpp/new-free-mismatch`) | Fewer false positive results | Fixed an issue where functions were being identified as allocation functions inappropriately. This correction also affects `cpp/new-array-delete-mismatch` and `cpp/new-delete-array-mismatch`. | -| Overflow in uncontrolled allocation size (`cpp/uncontrolled-allocation-size`) | More correct results | This query has been reworked so that it can find a wider variety of results. | -| Memory may not be freed (`cpp/memory-may-not-be-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. | -| Memory is never freed (`cpp/memory-never-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. | -| Resource not released in destructor (`cpp/resource-not-released-in-destructor`) | Fewer false positive results | Resource allocation and deallocation functions are now determined more accurately. | | Comparison result is always the same | Fewer false positive results | The range analysis library is now more conservative about floating point values being possibly `NaN`. | +| Constructor with default arguments will be used as a copy constructor (`cpp/constructor-used-as-copy-constructor`) | Lowered severity and precision | The severity and precision of this query have been reduced to "warning" and "low", respectively. This coding pattern is used intentionally and safely in a number of real-world projects. Results are no longer displayed on LGTM unless you choose to display them. | +| Dead code due to goto or break statement (`cpp/dead-code-goto`) | Fewer false positive results | Functions containing preprocessor logic are now excluded from this analysis. | +| Memory is never freed (`cpp/memory-never-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. | +| Memory may not be freed (`cpp/memory-may-not-be-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. | +| Mismatching new/free or malloc/delete (`cpp/new-free-mismatch`) | Fewer false positive results | Fixed an issue where functions were being identified as allocation functions inappropriately. This correction also affects `cpp/new-array-delete-mismatch` and `cpp/new-delete-array-mismatch`. | +| No space for zero terminator (`cpp/no-space-for-terminator`) | More correct results | This query now detects calls to `std::malloc`. | +| Overflow in uncontrolled allocation size (`cpp/uncontrolled-allocation-size`) | More correct results | This query has been reworked so that it can find a wider variety of results. | +| Resource not released in destructor (`cpp/resource-not-released-in-destructor`) | Fewer false positive results | Resource allocation and deallocation functions are now determined more accurately. | | Use of potentially dangerous function | More correct results | Calls to `localtime`, `ctime` and `asctime` are now detected by this query. | | Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now more accurately identifies wide and non-wide string/character format arguments on different platforms. Platform detection has also been made more accurate for the purposes of this query. | | Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | Fewer false positive results | Non-standard uses of `%L` are now understood. | -| Call to function with extraneous arguments - (`cpp/futile-params`) | Improved coverage | Query has been generalized to find all cases where the number of arguments exceeds the number of parameters of the function, provided the function is also properly declared/defined elsewhere. | | Use of potentially dangerous function (`cpp/potentially-dangerous-function`) | Fewer results | Results relating to the standard library `gets` function have been moved into a new query (`cpp/dangerous-function-overflow`). | -| Constructor with default arguments will be used as a copy constructor (`cpp/constructor-used-as-copy-constructor`) | Lowered severity and precision | The severity and precision of this query have been reduced to "warning" and "low", respectively. This coding pattern is used intentionally and safely in a number of real-world projects. Results are no longer displayed on LGTM unless you choose to display them. | ## Changes to QL libraries - The predicate `Declaration.hasGlobalName` now only holds for declarations that are not nested in a class. For example, it no longer holds for a member function `MyClass::myFunction` or a constructor `MyClass::MyClass`, whereas previously it would classify those two declarations as global names. From 463371aeb987c36e2b0619aac40aa83acc41a9c2 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Thu, 6 Jun 2019 17:52:52 +0100 Subject: [PATCH 03/22] Merge two rows for one query --- change-notes/1.21/analysis-cpp.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/change-notes/1.21/analysis-cpp.md b/change-notes/1.21/analysis-cpp.md index e3da1779f76..07c8c97b1ba 100644 --- a/change-notes/1.21/analysis-cpp.md +++ b/change-notes/1.21/analysis-cpp.md @@ -27,8 +27,7 @@ | Overflow in uncontrolled allocation size (`cpp/uncontrolled-allocation-size`) | More correct results | This query has been reworked so that it can find a wider variety of results. | | Resource not released in destructor (`cpp/resource-not-released-in-destructor`) | Fewer false positive results | Resource allocation and deallocation functions are now determined more accurately. | | Use of potentially dangerous function | More correct results | Calls to `localtime`, `ctime` and `asctime` are now detected by this query. | -| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now more accurately identifies wide and non-wide string/character format arguments on different platforms. Platform detection has also been made more accurate for the purposes of this query. | -| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | Fewer false positive results | Non-standard uses of `%L` are now understood. | +| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now understands non-standard uses of `%L`. In addition, it more accurately identifies wide and non-wide string/character format arguments on different platforms. Platform detection has also been made more accurate for this query. | | Use of potentially dangerous function (`cpp/potentially-dangerous-function`) | Fewer results | Results relating to the standard library `gets` function have been moved into a new query (`cpp/dangerous-function-overflow`). | ## Changes to QL libraries From eba8abe3d40855db54dc3f5d61f0139f3f4b8476 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Thu, 6 Jun 2019 18:33:56 +0100 Subject: [PATCH 04/22] Minor text changes --- change-notes/1.21/analysis-csharp.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/change-notes/1.21/analysis-csharp.md b/change-notes/1.21/analysis-csharp.md index d38316872ac..a7a62e8f2bb 100644 --- a/change-notes/1.21/analysis-csharp.md +++ b/change-notes/1.21/analysis-csharp.md @@ -1,12 +1,16 @@ # Improvements to C# analysis +## General improvements + +C# analysis now supports the extraction and analysis of many C# 8 features. For details see [Changes to code extraction](#changes-to-code-extraction) and [Changes to QL libraries](#changes-to-ql-libraries) below. + ## Changes to existing queries | **Query** | **Expected impact** | **Change** | |------------------------------|------------------------|-----------------------------------| -| Class defines a field that uses an ICryptoTransform class in a way that would be unsafe for concurrent threads (`cs/thread-unsafe-icryptotransform-field-in-class`) | Fewer false positive results | The criteria for a result has changed to include nested properties, nested fields and collections. The format of the alert message has changed to highlight the static field. | -| Constant condition (`cs/constant-condition`) | Fewer false positive results | Results have been removed where the `null` value is in a conditional expression on the left hand side of a null-coalescing expression. For example, in `(a ? b : null) ?? c`, `null` is not considered to be a constant condition. | -| Useless upcast (`cs/useless-upcast`) | Fewer false positive results | Results have been removed where the upcast is used to disambiguate the target of a constructor call. | +| Class defines a field that uses an ICryptoTransform class in a way that would be unsafe for concurrent threads (`cs/thread-unsafe-icryptotransform-field-in-class`) | Fewer false positive results | The criteria for a result has changed to include nested properties, nested fields, and collections. The format of the alert message has changed to highlight the static field. | +| Constant condition (`cs/constant-condition`) | Fewer false positive results | Results where the `null` value is in a conditional expression on the left hand side of a null-coalescing expression are now ignored. For example, in `(a ? b : null) ?? c`, `null` is not considered to be a constant condition. | +| Useless upcast (`cs/useless-upcast`) | Fewer false positive results | Results where the upcast is used to disambiguate the target of a constructor call are now ignored. | ## Changes to code extraction @@ -17,12 +21,12 @@ - `static` modifiers on local functions - Null-coalescing assignment expressions -* The `unmanaged` type parameter constraint is now extracted. +* The `unmanaged` type parameter constraint is also now extracted. ## Changes to QL libraries -* The class `Attribute` has two new predicates: `getConstructorArgument()` and `getNamedArgument()`. The first predicate returns arguments to the underlying constructor call and the latter returns named arguments for initializing fields and properties. -* The class `TypeParameterConstraints` has a new predicate `hasUnmanagedTypeConstraint()`, indicating that the type parameter has the `unmanaged` constraint. +* The class `Attribute` has two new predicates: `getConstructorArgument()` and `getNamedArgument()`. The first predicate returns arguments to the underlying constructor call and the second returns named arguments for initializing fields and properties. +* The class `TypeParameterConstraints` has a new predicate `hasUnmanagedTypeConstraint()`. This shows whether the type parameter has the `unmanaged` constraint. * The following QL classes have been added to model C# 8 features: - Class `AssignCoalesceExpr` models null-coalescing assignment, for example `x ??= y` - Class `IndexExpr` models from-end index expressions, for example `^1` @@ -37,5 +41,3 @@ - Class `Switch` models both `SwitchExpr` and `SwitchStmt` - Class `Case` models both `CaseStmt` and `SwitchCaseExpr` - Class `UsingStmt` models both `UsingBlockStmt` and `UsingDeclStmt` - -## Changes to autobuilder From 5be8576ee2901059155f913a31191dbfe1c5cb0e Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Thu, 6 Jun 2019 18:48:23 +0100 Subject: [PATCH 05/22] Minor text updates --- change-notes/1.21/analysis-java.md | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/change-notes/1.21/analysis-java.md b/change-notes/1.21/analysis-java.md index 2b71148b3a4..b3ccee73c78 100644 --- a/change-notes/1.21/analysis-java.md +++ b/change-notes/1.21/analysis-java.md @@ -1,10 +1,5 @@ # Improvements to Java analysis -## New queries - -| **Query** | **Tags** | **Purpose** | -|-----------------------------|-----------|--------------------------------------------------------------------| - ## Changes to existing queries | **Query** | **Expected impact** | **Change** | @@ -19,14 +14,13 @@ `checkArgument` and `checkState` methods in `com.google.common.base.Preconditions`, the `isTrue` and `validState` methods in `org.apache.commons.lang3.Validate`, as well as any similar custom - methods. This means that more guards are recognized yielding precision - improvements in a number of queries including `java/index-out-of-bounds`, + methods. This means that more guards are recognized which improves the precision of a number of queries including `java/index-out-of-bounds`, `java/dereferenced-value-may-be-null`, and `java/useless-null-check`. * The default sanitizer in taint tracking has been made more precise. The - sanitizer works by looking for guards that inspect tainted strings, and it - used to work at the level of individual variables. This has been changed to - use the `Guards` library, such that only guarded variable accesses are - sanitized. This may give additional results in the security queries. -* Spring framework support is enhanced by taking into account additional + sanitizer works by looking for guards that inspect tainted strings. It + previously worked at the level of individual variables. Now it + uses the `Guards` library, such that only guarded variable accesses are + sanitized. This may give additional results for security queries. +* Spring framework support now takes into account additional annotations that indicate remote user input. This affects all security - queries, which may yield additional results. + queries, which may give additional results. From 53ea76ba486d54ae908911861910677d6ba93425 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 7 Jun 2019 10:51:19 +0100 Subject: [PATCH 06/22] Update for feedback --- change-notes/1.21/analysis-cpp.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/change-notes/1.21/analysis-cpp.md b/change-notes/1.21/analysis-cpp.md index 07c8c97b1ba..2ea412e0c39 100644 --- a/change-notes/1.21/analysis-cpp.md +++ b/change-notes/1.21/analysis-cpp.md @@ -7,7 +7,8 @@ | **Query** | **Tags** | **Purpose** | |-----------------------------|-----------|--------------------------------------------------------------------| | Call to alloca in a loop (`cpp/alloca-in-loop`) | reliability, correctness, external/cwe/cwe-770 | Finds calls to `alloca` in loops, which can lead to stack overflow if the number of iterations is large. Newly displayed on [LGTM](https://lgtm.com/rules/1508831665988/). | -| Call to function with fewer arguments than declared parameters (`cpp/too-few-arguments`) | Correctness | Find all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default on [LGTM](https://lgtm.com/rules/1508860726279/). | +| Call to function with fewer arguments than declared parameters (`cpp/too-few-arguments`) | correctness, maintainability, security | Finds all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default on [LGTM](https://lgtm.com/rules/1508860726279/). | +| Call to a function with one or more incompatible arguments (`cpp/mismatched-function-arguments`) | correctness, maintainability | Finds all cases where the types of arguments do not match the types of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are not displayed by default on [LGTM](https://lgtm.com/rules/1508849286093/). | | Use of dangerous function (`cpp/dangerous-function-overflow`) | reliability, security, external/cwe/cwe-242 | Finds calls to `gets`, which does not guard against buffer overflow. These results were previously detected by the `cpp/potentially-dangerous-function` query. Results for both queries are displayed by default on LGTM. | ## Changes to existing queries From fe38417765e73f459f68e1eb9f736757bd42af43 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 7 Jun 2019 11:54:22 +0100 Subject: [PATCH 07/22] Update for feedback --- change-notes/1.21/analysis-csharp.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/change-notes/1.21/analysis-csharp.md b/change-notes/1.21/analysis-csharp.md index a7a62e8f2bb..f46218c230d 100644 --- a/change-notes/1.21/analysis-csharp.md +++ b/change-notes/1.21/analysis-csharp.md @@ -9,8 +9,8 @@ C# analysis now supports the extraction and analysis of many C# 8 features. For | **Query** | **Expected impact** | **Change** | |------------------------------|------------------------|-----------------------------------| | Class defines a field that uses an ICryptoTransform class in a way that would be unsafe for concurrent threads (`cs/thread-unsafe-icryptotransform-field-in-class`) | Fewer false positive results | The criteria for a result has changed to include nested properties, nested fields, and collections. The format of the alert message has changed to highlight the static field. | -| Constant condition (`cs/constant-condition`) | Fewer false positive results | Results where the `null` value is in a conditional expression on the left hand side of a null-coalescing expression are now ignored. For example, in `(a ? b : null) ?? c`, `null` is not considered to be a constant condition. | -| Useless upcast (`cs/useless-upcast`) | Fewer false positive results | Results where the upcast is used to disambiguate the target of a constructor call are now ignored. | +| Constant condition (`cs/constant-condition`) | Fewer false positive results | The query now ignores code where the `null` value is in a conditional expression on the left hand side of a null-coalescing expression. For example, in `(a ? b : null) ?? c`, `null` is not considered to be a constant condition. | +| Useless upcast (`cs/useless-upcast`) | Fewer false positive results | The query now ignores code where the upcast is used to disambiguate the target of a constructor call. | ## Changes to code extraction From da04086385f4e0f04bc47501875856e4f9e28729 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Fri, 7 Jun 2019 11:59:43 +0100 Subject: [PATCH 08/22] Add missing extractor change notes for 1.21. --- change-notes/1.21/analysis-python.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/change-notes/1.21/analysis-python.md b/change-notes/1.21/analysis-python.md index fdd34f78dc4..808079b3734 100644 --- a/change-notes/1.21/analysis-python.md +++ b/change-notes/1.21/analysis-python.md @@ -19,7 +19,12 @@ ## Changes to code extraction -* *Series of bullet points* +* The extractor now exits gracefully if passed a non-existent file or directory either as a `--path` option or as a file name. + +* The extractor now exits gracefully if an invalid number was given as the `--max-procs` option. + +* String literals as expressions within literal string interpolation (f-strings) are now handled correctly. + ## Changes to QL libraries From da9b9edde23a7c2866bff38d332800359e7ec36e Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 7 Jun 2019 12:03:19 +0100 Subject: [PATCH 09/22] Remove redundant sentence --- change-notes/1.21/analysis-cpp.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/change-notes/1.21/analysis-cpp.md b/change-notes/1.21/analysis-cpp.md index 2ea412e0c39..df67ef837a0 100644 --- a/change-notes/1.21/analysis-cpp.md +++ b/change-notes/1.21/analysis-cpp.md @@ -28,7 +28,7 @@ | Overflow in uncontrolled allocation size (`cpp/uncontrolled-allocation-size`) | More correct results | This query has been reworked so that it can find a wider variety of results. | | Resource not released in destructor (`cpp/resource-not-released-in-destructor`) | Fewer false positive results | Resource allocation and deallocation functions are now determined more accurately. | | Use of potentially dangerous function | More correct results | Calls to `localtime`, `ctime` and `asctime` are now detected by this query. | -| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now understands non-standard uses of `%L`. In addition, it more accurately identifies wide and non-wide string/character format arguments on different platforms. Platform detection has also been made more accurate for this query. | +| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now understands non-standard uses of `%L`. In addition, it more accurately identifies wide and non-wide string/character format arguments on different platforms. | | Use of potentially dangerous function (`cpp/potentially-dangerous-function`) | Fewer results | Results relating to the standard library `gets` function have been moved into a new query (`cpp/dangerous-function-overflow`). | ## Changes to QL libraries From 80909687d3c31a68d4ef16114e19527c63d18fd4 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Fri, 7 Jun 2019 15:09:22 +0100 Subject: [PATCH 10/22] Python extractor change notes: Rephrase for clarity. --- change-notes/1.21/analysis-python.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/change-notes/1.21/analysis-python.md b/change-notes/1.21/analysis-python.md index 808079b3734..80269f5f4e1 100644 --- a/change-notes/1.21/analysis-python.md +++ b/change-notes/1.21/analysis-python.md @@ -19,12 +19,13 @@ ## Changes to code extraction -* The extractor now exits gracefully if passed a non-existent file or directory either as a `--path` option or as a file name. - -* The extractor now exits gracefully if an invalid number was given as the `--max-procs` option. - * String literals as expressions within literal string interpolation (f-strings) are now handled correctly. +* The Python extractor now handles invalid input more robustly. In particular, it exits gracefully when: + + * A non-existent file or directory is specified using the `--path` option, or as a file name. + * An invalid number is specified for the `--max-procs` option. + ## Changes to QL libraries From 1a97936d5cd85208d25c1e9b298950f780c55d20 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 7 Jun 2019 15:21:45 +0100 Subject: [PATCH 11/22] Further corrections --- change-notes/1.21/analysis-cpp.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/change-notes/1.21/analysis-cpp.md b/change-notes/1.21/analysis-cpp.md index df67ef837a0..efe1c3e6c40 100644 --- a/change-notes/1.21/analysis-cpp.md +++ b/change-notes/1.21/analysis-cpp.md @@ -6,9 +6,9 @@ | **Query** | **Tags** | **Purpose** | |-----------------------------|-----------|--------------------------------------------------------------------| -| Call to alloca in a loop (`cpp/alloca-in-loop`) | reliability, correctness, external/cwe/cwe-770 | Finds calls to `alloca` in loops, which can lead to stack overflow if the number of iterations is large. Newly displayed on [LGTM](https://lgtm.com/rules/1508831665988/). | -| Call to function with fewer arguments than declared parameters (`cpp/too-few-arguments`) | correctness, maintainability, security | Finds all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default on [LGTM](https://lgtm.com/rules/1508860726279/). | -| Call to a function with one or more incompatible arguments (`cpp/mismatched-function-arguments`) | correctness, maintainability | Finds all cases where the types of arguments do not match the types of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are not displayed by default on [LGTM](https://lgtm.com/rules/1508849286093/). | +| Call to alloca in a loop (`cpp/alloca-in-loop`) | reliability, correctness, external/cwe/cwe-770 | Finds calls to `alloca` in loops, which can lead to stack overflow if the number of iterations is large. Newly displayed [on LGTM](https://lgtm.com/rules/1508831665988/). | +| Call to function with fewer arguments than declared parameters (`cpp/too-few-arguments`) | correctness, maintainability, security | Finds all cases where the number of arguments is fewer than the number of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are displayed by default [on LGTM](https://lgtm.com/rules/1508860726279/). | +| Call to a function with one or more incompatible arguments (`cpp/mistyped-function-arguments`) | correctness, maintainability | Finds all cases where the types of arguments do not match the types of parameters of the function, provided the function is also properly declared/defined elsewhere. Results are not displayed by default [on LGTM](https://lgtm.com/rules/1508849286093/). | | Use of dangerous function (`cpp/dangerous-function-overflow`) | reliability, security, external/cwe/cwe-242 | Finds calls to `gets`, which does not guard against buffer overflow. These results were previously detected by the `cpp/potentially-dangerous-function` query. Results for both queries are displayed by default on LGTM. | ## Changes to existing queries From 468975b0e522f185d792a5e8e842ede7193d5c89 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 3 Jun 2019 12:00:59 +0100 Subject: [PATCH 12/22] Python points-to: Handle varargs in callee. --- python/ql/src/semmle/python/Exprs.qll | 4 ++ .../src/semmle/python/objects/Sequences.qll | 32 ++++++++++++++- .../ql/src/semmle/python/objects/TObject.qll | 5 +++ .../src/semmle/python/pointsto/PointsTo.qll | 39 ++++++++++++++----- .../library-tests/PointsTo/new/Call.expected | 3 ++ .../PointsTo/new/NameSpace.expected | 3 ++ .../PointsTo/new/PointsToWithContext.expected | 37 ++++++++++++++++++ .../PointsTo/new/PointsToWithType.expected | 33 ++++++++++++++++ .../PointsTo/new/Values.expected | 32 +++++++++++++++ .../PointsTo/new/code/l_calls.py | 13 +++++++ 10 files changed, 191 insertions(+), 10 deletions(-) diff --git a/python/ql/src/semmle/python/Exprs.qll b/python/ql/src/semmle/python/Exprs.qll index 82c5bfd5219..b471afef95d 100644 --- a/python/ql/src/semmle/python/Exprs.qll +++ b/python/ql/src/semmle/python/Exprs.qll @@ -227,6 +227,10 @@ class Call extends Call_ { result = this.getKwargs().(Dict).getAKey().(StrConst).getText() } + int getPositionalArgumentCount() { + count(this.getStarargs()) < 2 and + result = count(this.getAPositionalArg()) + } } /** A conditional expression such as, `body if test else orelse` */ diff --git a/python/ql/src/semmle/python/objects/Sequences.qll b/python/ql/src/semmle/python/objects/Sequences.qll index 17ec3ec6db6..e274f6f40a2 100644 --- a/python/ql/src/semmle/python/objects/Sequences.qll +++ b/python/ql/src/semmle/python/objects/Sequences.qll @@ -39,7 +39,9 @@ abstract class TupleObjectInternal extends SequenceObjectInternal { } private string contents(int n) { - n = this.length() and result = "" + n < 4 and n = this.length() and result = "" + or + n = 4 and n < this.length() and result = "... " + (this.length()-4).toString() + " more" or result = this.getItem(n).toString() + ", " + this.contents(n+1) } @@ -145,6 +147,34 @@ class PythonTupleObjectInternal extends TPythonTuple, TupleObjectInternal { } +class VarargsTupleObjectInternal extends TVarargsTuple, TupleObjectInternal { + + override predicate introducedAt(ControlFlowNode node, PointsToContext context) { + none() + } + + override Builtin getBuiltin() { + none() + } + + override ControlFlowNode getOrigin() { + none() + } + + override ObjectInternal getItem(int n) { + exists(CallNode call, PointsToContext context, int offset, int length | + this = TVarargsTuple(call, context, offset, length) and + n < length and + PointsToInternal::pointsTo(call.getArg(offset+n), context, result, _) + ) + } + + override int length() { + this = TVarargsTuple(_, _, _, result) + } +} + + /** The `sys.version_info` object. We treat this specially to prevent premature pruning and * false positives when we are unsure of the actual version of Python that the code is expecting. */ diff --git a/python/ql/src/semmle/python/objects/TObject.qll b/python/ql/src/semmle/python/objects/TObject.qll index c6e398e5691..3b2ef393716 100644 --- a/python/ql/src/semmle/python/objects/TObject.qll +++ b/python/ql/src/semmle/python/objects/TObject.qll @@ -179,6 +179,11 @@ cached newtype TObject = context.appliesTo(origin) } or + /* Varargs tuple */ + TVarargsTuple(CallNode call, PointsToContext context, int offset, int length) { + InterProceduralPointsTo::varargs_tuple(call, _, context, _, offset, length) + } + or /* `type` */ TType() or diff --git a/python/ql/src/semmle/python/pointsto/PointsTo.qll b/python/ql/src/semmle/python/pointsto/PointsTo.qll index af5133ca3c8..a0e33021744 100644 --- a/python/ql/src/semmle/python/pointsto/PointsTo.qll +++ b/python/ql/src/semmle/python/pointsto/PointsTo.qll @@ -899,17 +899,38 @@ module InterProceduralPointsTo { pragma [noinline] private predicate special_parameter_points_to(ParameterDefinition def, PointsToContext context, ObjectInternal value, ControlFlowNode origin) { special_parameter_value(def, value) and - ( - context.isRuntime() - or - exists(PointsToContext caller, CallNode call | - context.fromCall(call, caller) and - context.appliesToScope(def.getScope()) and - not exists(call.getArg(def.getParameter().getPosition())) and - not exists(call.getArgByName(def.getParameter().getName())) - ) + context.isRuntime() and + origin = def.getDefiningNode() + or + exists(CallNode call, Function scope, PointsToContext caller, int offset, int length | + varargs_tuple(call, scope, caller, context, offset, length) and + value = TVarargsTuple(call, caller, offset, length) and + def.getScope() = scope ) and origin = def.getDefiningNode() + or + exists(Function scope | + varargs_empty_tuple(scope, context) and + value.(BuiltinTupleObjectInternal).length() = 0 and + def.getScope() = scope + ) and + origin = def.getDefiningNode() + } + + predicate varargs_tuple(CallNode call, Function scope, PointsToContext caller, PointsToContext callee, int startOffset, int length) { + exists(int parameter_offset | + callsite_calls_function(call, caller, scope, callee, parameter_offset) and + startOffset = scope.getPositionalParameterCount() - parameter_offset and + length = call.getNode().getPositionalArgumentCount() - startOffset and + length > 0 + ) + } + + predicate varargs_empty_tuple(Function scope, PointsToContext callee) { + exists(CallNode call, PointsToContext caller, int parameter_offset | + callsite_calls_function(call, caller, scope, callee, parameter_offset) and + scope.getPositionalParameterCount() - parameter_offset >= call.getNode().getPositionalArgumentCount() + ) } /** Helper predicate for special_parameter_points_to */ diff --git a/python/ql/test/library-tests/PointsTo/new/Call.expected b/python/ql/test/library-tests/PointsTo/new/Call.expected index b97734a8302..70b024ae874 100644 --- a/python/ql/test/library-tests/PointsTo/new/Call.expected +++ b/python/ql/test/library-tests/PointsTo/new/Call.expected @@ -16,6 +16,9 @@ | l_calls.py:10 | ControlFlowNode for bar() | bar | | l_calls.py:24 | ControlFlowNode for Attribute() | Owner.cm | | l_calls.py:25 | ControlFlowNode for Attribute() | Owner.cm2 | +| l_calls.py:37 | ControlFlowNode for f() | f | +| l_calls.py:38 | ControlFlowNode for Attribute() | E.m | +| l_calls.py:39 | ControlFlowNode for Attribute() | E.m | | q_super.py:4 | ControlFlowNode for Attribute() | object.__init__ | | q_super.py:12 | ControlFlowNode for Attribute() | Base2.__init__ | | q_super.py:22 | ControlFlowNode for Attribute() | Base1.meth | diff --git a/python/ql/test/library-tests/PointsTo/new/NameSpace.expected b/python/ql/test/library-tests/PointsTo/new/NameSpace.expected index 481b25a8258..bb6901f1213 100644 --- a/python/ql/test/library-tests/PointsTo/new/NameSpace.expected +++ b/python/ql/test/library-tests/PointsTo/new/NameSpace.expected @@ -112,12 +112,15 @@ | k_getsetattr.py:0 | Module code.k_getsetattr | k | Function k | | k_getsetattr.py:4 | Class C | meth1 | Function meth1 | | k_getsetattr.py:4 | Class C | meth2 | Function meth2 | +| l_calls.py:0 | Module code.l_calls | E | class E | | l_calls.py:0 | Module code.l_calls | Owner | class Owner | | l_calls.py:0 | Module code.l_calls | bar | Function bar | +| l_calls.py:0 | Module code.l_calls | f | Function f | | l_calls.py:0 | Module code.l_calls | foo | Function foo | | l_calls.py:12 | Class Owner | cm | classmethod() | | l_calls.py:12 | Class Owner | cm2 | classmethod() | | l_calls.py:12 | Class Owner | m | Function m | +| l_calls.py:32 | Class E | m | Function m | | o_no_returns.py:0 | Module code.o_no_returns | bar | Function bar | | o_no_returns.py:0 | Module code.o_no_returns | fail | Function fail | | o_no_returns.py:0 | Module code.o_no_returns | foo | Function foo | diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected b/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected index 0a933761e78..f69685edcde 100755 --- a/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected @@ -598,6 +598,43 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:25 | ControlFlowNode for Attribute() | int 1 | builtin-class int | 25 | runtime | | l_calls.py:25 | ControlFlowNode for IntegerLiteral | int 1 | builtin-class int | 25 | runtime | | l_calls.py:25 | ControlFlowNode for a | class Owner | builtin-class type | 12 | runtime | +| l_calls.py:29 | ControlFlowNode for FunctionExpr | Function f | builtin-class function | 29 | import | +| l_calls.py:29 | ControlFlowNode for args | args | builtin-class tuple | 29 | runtime | +| l_calls.py:29 | ControlFlowNode for f | Function f | builtin-class function | 29 | import | +| l_calls.py:30 | ControlFlowNode for args | args | builtin-class tuple | 29 | code/l_calls.py:37 from import | +| l_calls.py:30 | ControlFlowNode for args | args | builtin-class tuple | 29 | runtime | +| l_calls.py:32 | ControlFlowNode for ClassExpr | class E | builtin-class type | 32 | import | +| l_calls.py:32 | ControlFlowNode for E | class E | builtin-class type | 32 | import | +| l_calls.py:32 | ControlFlowNode for object | builtin-class object | builtin-class type | 32 | import | +| l_calls.py:33 | ControlFlowNode for FunctionExpr | Function m | builtin-class function | 33 | import | +| l_calls.py:33 | ControlFlowNode for args | args | builtin-class tuple | 33 | runtime | +| l_calls.py:33 | ControlFlowNode for m | Function m | builtin-class function | 33 | import | +| l_calls.py:34 | ControlFlowNode for self | E() | class E | 38 | code/l_calls.py:38 from import | +| l_calls.py:34 | ControlFlowNode for self | int 3 | builtin-class int | 39 | code/l_calls.py:39 from import | +| l_calls.py:34 | ControlFlowNode for self | self | builtin-class tuple | 33 | code/l_calls.py:38 from import | +| l_calls.py:34 | ControlFlowNode for self | self | builtin-class tuple | 33 | code/l_calls.py:39 from import | +| l_calls.py:34 | ControlFlowNode for self | self | class E | 33 | runtime | +| l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | code/l_calls.py:38 from import | +| l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | code/l_calls.py:39 from import | +| l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | runtime | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 1 | builtin-class int | 37 | import | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 2 | builtin-class int | 37 | import | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 37 | import | +| l_calls.py:37 | ControlFlowNode for f | Function f | builtin-class function | 29 | import | +| l_calls.py:37 | ControlFlowNode for f() | args | builtin-class tuple | 29 | import | +| l_calls.py:38 | ControlFlowNode for Attribute | Attribute | builtin-class method | 38 | import | +| l_calls.py:38 | ControlFlowNode for Attribute() | args | builtin-class tuple | 33 | import | +| l_calls.py:38 | ControlFlowNode for E | class E | builtin-class type | 32 | import | +| l_calls.py:38 | ControlFlowNode for E() | E() | class E | 38 | import | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | int 2 | builtin-class int | 38 | import | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 38 | import | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | int 4 | builtin-class int | 38 | import | +| l_calls.py:39 | ControlFlowNode for Attribute | Function m | builtin-class function | 33 | import | +| l_calls.py:39 | ControlFlowNode for Attribute() | args | builtin-class tuple | 33 | import | +| l_calls.py:39 | ControlFlowNode for E | class E | builtin-class type | 32 | import | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 39 | import | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 4 | builtin-class int | 39 | import | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 5 | builtin-class int | 39 | import | | m_attributes.py:3 | ControlFlowNode for C | class C | builtin-class type | 3 | import | | m_attributes.py:3 | ControlFlowNode for ClassExpr | class C | builtin-class type | 3 | import | | m_attributes.py:3 | ControlFlowNode for object | builtin-class object | builtin-class type | 3 | import | diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected b/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected index 999cebadfa7..0700fe3cb48 100644 --- a/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected @@ -675,6 +675,39 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:25 | ControlFlowNode for Attribute() | int 1 | builtin-class int | 25 | | l_calls.py:25 | ControlFlowNode for IntegerLiteral | int 1 | builtin-class int | 25 | | l_calls.py:25 | ControlFlowNode for a | class Owner | builtin-class type | 12 | +| l_calls.py:29 | ControlFlowNode for FunctionExpr | Function f | builtin-class function | 29 | +| l_calls.py:29 | ControlFlowNode for args | args | builtin-class tuple | 29 | +| l_calls.py:29 | ControlFlowNode for f | Function f | builtin-class function | 29 | +| l_calls.py:30 | ControlFlowNode for args | args | builtin-class tuple | 29 | +| l_calls.py:32 | ControlFlowNode for ClassExpr | class E | builtin-class type | 32 | +| l_calls.py:32 | ControlFlowNode for E | class E | builtin-class type | 32 | +| l_calls.py:32 | ControlFlowNode for object | builtin-class object | builtin-class type | 32 | +| l_calls.py:33 | ControlFlowNode for FunctionExpr | Function m | builtin-class function | 33 | +| l_calls.py:33 | ControlFlowNode for args | args | builtin-class tuple | 33 | +| l_calls.py:33 | ControlFlowNode for m | Function m | builtin-class function | 33 | +| l_calls.py:34 | ControlFlowNode for self | E() | class E | 38 | +| l_calls.py:34 | ControlFlowNode for self | int 3 | builtin-class int | 39 | +| l_calls.py:34 | ControlFlowNode for self | self | builtin-class tuple | 33 | +| l_calls.py:34 | ControlFlowNode for self | self | class E | 33 | +| l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 1 | builtin-class int | 37 | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 2 | builtin-class int | 37 | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 37 | +| l_calls.py:37 | ControlFlowNode for f | Function f | builtin-class function | 29 | +| l_calls.py:37 | ControlFlowNode for f() | args | builtin-class tuple | 29 | +| l_calls.py:38 | ControlFlowNode for Attribute | Attribute | builtin-class method | 38 | +| l_calls.py:38 | ControlFlowNode for Attribute() | args | builtin-class tuple | 33 | +| l_calls.py:38 | ControlFlowNode for E | class E | builtin-class type | 32 | +| l_calls.py:38 | ControlFlowNode for E() | E() | class E | 38 | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | int 2 | builtin-class int | 38 | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 38 | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | int 4 | builtin-class int | 38 | +| l_calls.py:39 | ControlFlowNode for Attribute | Function m | builtin-class function | 33 | +| l_calls.py:39 | ControlFlowNode for Attribute() | args | builtin-class tuple | 33 | +| l_calls.py:39 | ControlFlowNode for E | class E | builtin-class type | 32 | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 39 | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 4 | builtin-class int | 39 | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 5 | builtin-class int | 39 | | s_scopes.py:4 | ControlFlowNode for True | bool True | builtin-class bool | 4 | | s_scopes.py:4 | ControlFlowNode for float | bool True | builtin-class bool | 4 | | s_scopes.py:7 | ControlFlowNode for C2 | class C2 | builtin-class type | 7 | diff --git a/python/ql/test/library-tests/PointsTo/new/Values.expected b/python/ql/test/library-tests/PointsTo/new/Values.expected index 650515e70a8..7fe8471cf38 100644 --- a/python/ql/test/library-tests/PointsTo/new/Values.expected +++ b/python/ql/test/library-tests/PointsTo/new/Values.expected @@ -466,6 +466,38 @@ | l_calls.py:25 | ControlFlowNode for Attribute() | runtime | int 1 | builtin-class int | | l_calls.py:25 | ControlFlowNode for IntegerLiteral | runtime | int 1 | builtin-class int | | l_calls.py:25 | ControlFlowNode for a | runtime | class Owner | builtin-class type | +| l_calls.py:29 | ControlFlowNode for FunctionExpr | import | Function f | builtin-class function | +| l_calls.py:30 | ControlFlowNode for args | code/l_calls.py:37 from import | (int 1, int 2, int 3, ) | builtin-class tuple | +| l_calls.py:30 | ControlFlowNode for args | runtime | instance of tuple | builtin-class tuple | +| l_calls.py:32 | ControlFlowNode for ClassExpr | import | class E | builtin-class type | +| l_calls.py:32 | ControlFlowNode for object | import | builtin-class object | builtin-class type | +| l_calls.py:33 | ControlFlowNode for FunctionExpr | import | Function E.m | builtin-class function | +| l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:38 from import | (int 2, int 3, int 4, ) | builtin-class tuple | +| l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:38 from import | E() | class E | +| l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:39 from import | (int 4, int 5, ) | builtin-class tuple | +| l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:39 from import | int 3 | builtin-class int | +| l_calls.py:34 | ControlFlowNode for self | runtime | self instance of E | class E | +| l_calls.py:35 | ControlFlowNode for args | code/l_calls.py:38 from import | (int 2, int 3, int 4, ) | builtin-class tuple | +| l_calls.py:35 | ControlFlowNode for args | code/l_calls.py:39 from import | (int 4, int 5, ) | builtin-class tuple | +| l_calls.py:35 | ControlFlowNode for args | runtime | instance of tuple | builtin-class tuple | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | import | int 1 | builtin-class int | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | import | int 2 | builtin-class int | +| l_calls.py:37 | ControlFlowNode for IntegerLiteral | import | int 3 | builtin-class int | +| l_calls.py:37 | ControlFlowNode for f | import | Function f | builtin-class function | +| l_calls.py:37 | ControlFlowNode for f() | import | (int 1, int 2, int 3, ) | builtin-class tuple | +| l_calls.py:38 | ControlFlowNode for Attribute | import | Method(Function E.m, E()) | builtin-class method | +| l_calls.py:38 | ControlFlowNode for Attribute() | import | (int 2, int 3, int 4, ) | builtin-class tuple | +| l_calls.py:38 | ControlFlowNode for E | import | class E | builtin-class type | +| l_calls.py:38 | ControlFlowNode for E() | import | E() | class E | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | import | int 2 | builtin-class int | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | import | int 3 | builtin-class int | +| l_calls.py:38 | ControlFlowNode for IntegerLiteral | import | int 4 | builtin-class int | +| l_calls.py:39 | ControlFlowNode for Attribute | import | Function E.m | builtin-class function | +| l_calls.py:39 | ControlFlowNode for Attribute() | import | (int 4, int 5, ) | builtin-class tuple | +| l_calls.py:39 | ControlFlowNode for E | import | class E | builtin-class type | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | import | int 3 | builtin-class int | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | import | int 4 | builtin-class int | +| l_calls.py:39 | ControlFlowNode for IntegerLiteral | import | int 5 | builtin-class int | | m_attributes.py:3 | ControlFlowNode for ClassExpr | import | class C | builtin-class type | | m_attributes.py:3 | ControlFlowNode for object | import | builtin-class object | builtin-class type | | m_attributes.py:5 | ControlFlowNode for FunctionExpr | import | Function C.__init__ | builtin-class function | diff --git a/python/ql/test/library-tests/PointsTo/new/code/l_calls.py b/python/ql/test/library-tests/PointsTo/new/code/l_calls.py index d49f373cec4..7d5b9e348a6 100644 --- a/python/ql/test/library-tests/PointsTo/new/code/l_calls.py +++ b/python/ql/test/library-tests/PointsTo/new/code/l_calls.py @@ -24,3 +24,16 @@ class Owner(object): a = self.cm(0) return a.cm2(1) +# *args + +def f(*args): + return args + +class E(object): + def m(self, *args): + self + return args + +f(1, 2, 3) +E().m(2, 3, 4) +E.m(3, 4, 5) From 9e537a76dceda56f158564c96f13386f0823a6d9 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 3 Jun 2019 18:50:03 +0100 Subject: [PATCH 13/22] Python points-to: Handle varargs in caller --- python/ql/src/semmle/python/Exprs.qll | 14 ++++-- python/ql/src/semmle/python/Flow.qll | 5 +++ .../src/semmle/python/objects/Sequences.qll | 4 +- .../src/semmle/python/pointsto/PointsTo.qll | 45 ++++++++++++++----- .../library-tests/PointsTo/new/Call.expected | 4 ++ .../PointsTo/new/NameSpace.expected | 4 ++ .../PointsTo/new/PointsToNone.expected | 1 + .../PointsTo/new/PointsToWithContext.expected | 32 +++++++++++++ .../PointsTo/new/PointsToWithType.expected | 31 +++++++++++++ .../PointsTo/new/Values.expected | 28 ++++++++++++ .../PointsTo/new/code/l_calls.py | 15 +++++++ 11 files changed, 168 insertions(+), 15 deletions(-) diff --git a/python/ql/src/semmle/python/Exprs.qll b/python/ql/src/semmle/python/Exprs.qll index b471afef95d..794456744f6 100644 --- a/python/ql/src/semmle/python/Exprs.qll +++ b/python/ql/src/semmle/python/Exprs.qll @@ -168,12 +168,12 @@ class Call extends Call_ { override CallNode getAFlowNode() { result = super.getAFlowNode() } - /** Gets a tuple (*) argument of this class definition. */ + /** Gets a tuple (*) argument of this call. */ Expr getStarargs() { result = this.getAPositionalArg().(Starred).getValue() } - /** Gets a dictionary (**) argument of this class definition. */ + /** Gets a dictionary (**) argument of this call. */ Expr getKwargs() { result = this.getANamedArg().(DictUnpacking).getValue() } @@ -227,10 +227,18 @@ class Call extends Call_ { result = this.getKwargs().(Dict).getAKey().(StrConst).getText() } + /** Gets the positional argument count of this call, provided there is no more than one tuple (*) argument. */ int getPositionalArgumentCount() { count(this.getStarargs()) < 2 and - result = count(this.getAPositionalArg()) + result = count(Expr arg | arg = this.getAPositionalArg() and not arg instanceof Starred) } + + /** Gets the tuple (*) argument of this call, provided there is exactly one. */ + Expr getStarArg() { + count(this.getStarargs()) < 2 and + result = getStarargs() + } + } /** A conditional expression such as, `body if test else orelse` */ diff --git a/python/ql/src/semmle/python/Flow.qll b/python/ql/src/semmle/python/Flow.qll index d2cb71fd547..5b8f3dabd67 100755 --- a/python/ql/src/semmle/python/Flow.qll +++ b/python/ql/src/semmle/python/Flow.qll @@ -481,6 +481,11 @@ class CallNode extends ControlFlowNode { ) } + ControlFlowNode getStarArg() { + result.getNode() = this.getNode().getStarArg() and + result.getBasicBlock().dominates(this.getBasicBlock()) + } + } /** A control flow corresponding to an attribute expression, such as `value.attr` */ diff --git a/python/ql/src/semmle/python/objects/Sequences.qll b/python/ql/src/semmle/python/objects/Sequences.qll index e274f6f40a2..691d4743686 100644 --- a/python/ql/src/semmle/python/objects/Sequences.qll +++ b/python/ql/src/semmle/python/objects/Sequences.qll @@ -41,7 +41,7 @@ abstract class TupleObjectInternal extends SequenceObjectInternal { private string contents(int n) { n < 4 and n = this.length() and result = "" or - n = 4 and n < this.length() and result = "... " + (this.length()-4).toString() + " more" + n = 3 and this.length() > 3 and result = (this.length()-3).toString() + " more..." or result = this.getItem(n).toString() + ", " + this.contents(n+1) } @@ -165,7 +165,7 @@ class VarargsTupleObjectInternal extends TVarargsTuple, TupleObjectInternal { exists(CallNode call, PointsToContext context, int offset, int length | this = TVarargsTuple(call, context, offset, length) and n < length and - PointsToInternal::pointsTo(call.getArg(offset+n), context, result, _) + InterProceduralPointsTo::positional_argument_points_to(call, offset+n, context, result, _) ) } diff --git a/python/ql/src/semmle/python/pointsto/PointsTo.qll b/python/ql/src/semmle/python/pointsto/PointsTo.qll index a0e33021744..3112d9ee68e 100644 --- a/python/ql/src/semmle/python/pointsto/PointsTo.qll +++ b/python/ql/src/semmle/python/pointsto/PointsTo.qll @@ -844,11 +844,13 @@ module InterProceduralPointsTo { private predicate normal_parameter_points_to(ParameterDefinition def, PointsToContext context, ObjectInternal value, ControlFlowNode origin) { exists(PointsToContext caller, ControlFlowNode arg | PointsToInternal::pointsTo(arg, caller, value, origin) and - callsite_argument_transfer(arg, caller, def, context) + named_argument_transfer(arg, caller, def, context) ) or not def.isSelf() and not def.isVarargs() and not def.isKwargs() and context.isRuntime() and value = ObjectInternal::unknown() and origin = def.getDefiningNode() + or + positional_parameter_points_to(def, context, value, origin) } pragma [noinline] @@ -921,7 +923,7 @@ module InterProceduralPointsTo { exists(int parameter_offset | callsite_calls_function(call, caller, scope, callee, parameter_offset) and startOffset = scope.getPositionalParameterCount() - parameter_offset and - length = call.getNode().getPositionalArgumentCount() - startOffset and + length = positional_argument_count(call, caller) - startOffset and length > 0 ) } @@ -929,7 +931,7 @@ module InterProceduralPointsTo { predicate varargs_empty_tuple(Function scope, PointsToContext callee) { exists(CallNode call, PointsToContext caller, int parameter_offset | callsite_calls_function(call, caller, scope, callee, parameter_offset) and - scope.getPositionalParameterCount() - parameter_offset >= call.getNode().getPositionalArgumentCount() + scope.getPositionalParameterCount() - parameter_offset >= positional_argument_count(call, caller) ) } @@ -940,16 +942,39 @@ module InterProceduralPointsTo { p.isKwargs() and value = TUnknownInstance(ObjectInternal::builtin("dict")) } - /** Holds if the `(argument, caller)` pair matches up with `(param, callee)` pair across call. */ - cached predicate callsite_argument_transfer(ControlFlowNode argument, PointsToContext caller, ParameterDefinition param, PointsToContext callee) { + predicate positional_argument_points_to(CallNode call, int argument, PointsToContext caller, ObjectInternal value, ControlFlowNode origin) { + PointsToInternal::pointsTo(call.getArg(argument), caller, value, origin) + or + exists(SequenceObjectInternal arg, int pos | + pos = call.getNode().getPositionalArgumentCount() and + PointsToInternal::pointsTo(origin, caller, arg, _) and + value = arg.getItem(argument-pos) and + origin = call.getStarArg() + ) + } + + private int positional_argument_count(CallNode call, PointsToContext caller) { + result = call.getNode().getPositionalArgumentCount() and not exists(call.getStarArg()) and caller.appliesTo(call) + or + exists(SequenceObjectInternal arg, int pos | + pos = call.getNode().getPositionalArgumentCount() and + PointsToInternal::pointsTo(call.getStarArg(), caller, arg, _) and + result = pos + arg.length() + ) + } + + predicate positional_parameter_points_to(ParameterDefinition def, PointsToContext context, ObjectInternal value, ControlFlowNode origin) { + exists(CallNode call, int argument, PointsToContext caller, Function func, int offset | + positional_argument_points_to(call, argument, caller, value, origin) and + callsite_calls_function(call, caller, func, context, offset) and + def.getParameter() = func.getArg(argument+offset) + ) + } + + cached predicate named_argument_transfer(ControlFlowNode argument, PointsToContext caller, ParameterDefinition param, PointsToContext callee) { exists(CallNode call, Function func, int offset | callsite_calls_function(call, caller, func, callee, offset) | - exists(int n | - argument = call.getArg(n) and - param.getParameter() = func.getArg(n+offset) - ) - or exists(string name | argument = call.getArgByName(name) and param.getParameter() = func.getArgByName(name) diff --git a/python/ql/test/library-tests/PointsTo/new/Call.expected b/python/ql/test/library-tests/PointsTo/new/Call.expected index 70b024ae874..dbd95c9d356 100644 --- a/python/ql/test/library-tests/PointsTo/new/Call.expected +++ b/python/ql/test/library-tests/PointsTo/new/Call.expected @@ -19,6 +19,10 @@ | l_calls.py:37 | ControlFlowNode for f() | f | | l_calls.py:38 | ControlFlowNode for Attribute() | E.m | | l_calls.py:39 | ControlFlowNode for Attribute() | E.m | +| l_calls.py:42 | ControlFlowNode for f() | f | +| l_calls.py:51 | ControlFlowNode for g() | g | +| l_calls.py:52 | ControlFlowNode for Attribute() | F.m | +| l_calls.py:53 | ControlFlowNode for Attribute() | F.m | | q_super.py:4 | ControlFlowNode for Attribute() | object.__init__ | | q_super.py:12 | ControlFlowNode for Attribute() | Base2.__init__ | | q_super.py:22 | ControlFlowNode for Attribute() | Base1.meth | diff --git a/python/ql/test/library-tests/PointsTo/new/NameSpace.expected b/python/ql/test/library-tests/PointsTo/new/NameSpace.expected index bb6901f1213..bcf7d969dd7 100644 --- a/python/ql/test/library-tests/PointsTo/new/NameSpace.expected +++ b/python/ql/test/library-tests/PointsTo/new/NameSpace.expected @@ -113,14 +113,18 @@ | k_getsetattr.py:4 | Class C | meth1 | Function meth1 | | k_getsetattr.py:4 | Class C | meth2 | Function meth2 | | l_calls.py:0 | Module code.l_calls | E | class E | +| l_calls.py:0 | Module code.l_calls | F | class F | | l_calls.py:0 | Module code.l_calls | Owner | class Owner | | l_calls.py:0 | Module code.l_calls | bar | Function bar | | l_calls.py:0 | Module code.l_calls | f | Function f | | l_calls.py:0 | Module code.l_calls | foo | Function foo | +| l_calls.py:0 | Module code.l_calls | g | Function g | +| l_calls.py:0 | Module code.l_calls | t | Tuple | | l_calls.py:12 | Class Owner | cm | classmethod() | | l_calls.py:12 | Class Owner | cm2 | classmethod() | | l_calls.py:12 | Class Owner | m | Function m | | l_calls.py:32 | Class E | m | Function m | +| l_calls.py:47 | Class F | m | Function m | | o_no_returns.py:0 | Module code.o_no_returns | bar | Function bar | | o_no_returns.py:0 | Module code.o_no_returns | fail | Function fail | | o_no_returns.py:0 | Module code.o_no_returns | foo | Function foo | diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToNone.expected b/python/ql/test/library-tests/PointsTo/new/PointsToNone.expected index a72036844ab..b26108780af 100644 --- a/python/ql/test/library-tests/PointsTo/new/PointsToNone.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToNone.expected @@ -75,6 +75,7 @@ | k_getsetattr.py:15 | ControlFlowNode for Attribute() | 6 | | l_calls.py:4 | ControlFlowNode for Attribute() | 4 | | l_calls.py:9 | ControlFlowNode for foo() | 4 | +| l_calls.py:48 | ControlFlowNode for None | 48 | | m_attributes.py:12 | ControlFlowNode for Attribute() | 8 | | m_attributes.py:13 | ControlFlowNode for Attribute() | 8 | | o_no_returns.py:7 | ControlFlowNode for fail() | 10 | diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected b/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected index f69685edcde..0dec2c5da97 100755 --- a/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected @@ -602,6 +602,7 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:29 | ControlFlowNode for args | args | builtin-class tuple | 29 | runtime | | l_calls.py:29 | ControlFlowNode for f | Function f | builtin-class function | 29 | import | | l_calls.py:30 | ControlFlowNode for args | args | builtin-class tuple | 29 | code/l_calls.py:37 from import | +| l_calls.py:30 | ControlFlowNode for args | args | builtin-class tuple | 29 | code/l_calls.py:42 from import | | l_calls.py:30 | ControlFlowNode for args | args | builtin-class tuple | 29 | runtime | | l_calls.py:32 | ControlFlowNode for ClassExpr | class E | builtin-class type | 32 | import | | l_calls.py:32 | ControlFlowNode for E | class E | builtin-class type | 32 | import | @@ -635,6 +636,37 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 39 | import | | l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 4 | builtin-class int | 39 | import | | l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 5 | builtin-class int | 39 | import | +| l_calls.py:41 | ControlFlowNode for Str | 'a' | builtin-class str | 41 | import | +| l_calls.py:41 | ControlFlowNode for Str | 'b' | builtin-class str | 41 | import | +| l_calls.py:41 | ControlFlowNode for Str | 'c' | builtin-class str | 41 | import | +| l_calls.py:41 | ControlFlowNode for Tuple | Tuple | builtin-class tuple | 41 | import | +| l_calls.py:41 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | import | +| l_calls.py:42 | ControlFlowNode for f | Function f | builtin-class function | 29 | import | +| l_calls.py:42 | ControlFlowNode for f() | args | builtin-class tuple | 29 | import | +| l_calls.py:42 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | import | +| l_calls.py:44 | ControlFlowNode for FunctionExpr | Function g | builtin-class function | 44 | import | +| l_calls.py:44 | ControlFlowNode for g | Function g | builtin-class function | 44 | import | +| l_calls.py:45 | ControlFlowNode for a | 'a' | builtin-class str | 51 | code/l_calls.py:51 from import | +| l_calls.py:47 | ControlFlowNode for ClassExpr | class F | builtin-class type | 47 | import | +| l_calls.py:47 | ControlFlowNode for F | class F | builtin-class type | 47 | import | +| l_calls.py:47 | ControlFlowNode for object | builtin-class object | builtin-class type | 47 | import | +| l_calls.py:48 | ControlFlowNode for FunctionExpr | Function m | builtin-class function | 48 | import | +| l_calls.py:48 | ControlFlowNode for None | NoneType None | builtin-class NoneType | 48 | import | +| l_calls.py:48 | ControlFlowNode for m | Function m | builtin-class function | 48 | import | +| l_calls.py:49 | ControlFlowNode for x | 'a' | builtin-class str | 52 | code/l_calls.py:52 from import | +| l_calls.py:49 | ControlFlowNode for x | 'b' | builtin-class str | 53 | code/l_calls.py:53 from import | +| l_calls.py:51 | ControlFlowNode for g | Function g | builtin-class function | 44 | import | +| l_calls.py:51 | ControlFlowNode for g() | 'a' | builtin-class str | 51 | import | +| l_calls.py:51 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | import | +| l_calls.py:52 | ControlFlowNode for Attribute | Attribute | builtin-class method | 52 | import | +| l_calls.py:52 | ControlFlowNode for Attribute() | 'a' | builtin-class str | 52 | import | +| l_calls.py:52 | ControlFlowNode for F | class F | builtin-class type | 47 | import | +| l_calls.py:52 | ControlFlowNode for F() | F() | class F | 52 | import | +| l_calls.py:52 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | import | +| l_calls.py:53 | ControlFlowNode for Attribute | Function m | builtin-class function | 48 | import | +| l_calls.py:53 | ControlFlowNode for Attribute() | 'b' | builtin-class str | 53 | import | +| l_calls.py:53 | ControlFlowNode for F | class F | builtin-class type | 47 | import | +| l_calls.py:53 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | import | | m_attributes.py:3 | ControlFlowNode for C | class C | builtin-class type | 3 | import | | m_attributes.py:3 | ControlFlowNode for ClassExpr | class C | builtin-class type | 3 | import | | m_attributes.py:3 | ControlFlowNode for object | builtin-class object | builtin-class type | 3 | import | diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected b/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected index 0700fe3cb48..2f681f63d03 100644 --- a/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected @@ -708,6 +708,37 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 3 | builtin-class int | 39 | | l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 4 | builtin-class int | 39 | | l_calls.py:39 | ControlFlowNode for IntegerLiteral | int 5 | builtin-class int | 39 | +| l_calls.py:41 | ControlFlowNode for Str | 'a' | builtin-class str | 41 | +| l_calls.py:41 | ControlFlowNode for Str | 'b' | builtin-class str | 41 | +| l_calls.py:41 | ControlFlowNode for Str | 'c' | builtin-class str | 41 | +| l_calls.py:41 | ControlFlowNode for Tuple | Tuple | builtin-class tuple | 41 | +| l_calls.py:41 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | +| l_calls.py:42 | ControlFlowNode for f | Function f | builtin-class function | 29 | +| l_calls.py:42 | ControlFlowNode for f() | args | builtin-class tuple | 29 | +| l_calls.py:42 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | +| l_calls.py:44 | ControlFlowNode for FunctionExpr | Function g | builtin-class function | 44 | +| l_calls.py:44 | ControlFlowNode for g | Function g | builtin-class function | 44 | +| l_calls.py:45 | ControlFlowNode for a | 'a' | builtin-class str | 51 | +| l_calls.py:47 | ControlFlowNode for ClassExpr | class F | builtin-class type | 47 | +| l_calls.py:47 | ControlFlowNode for F | class F | builtin-class type | 47 | +| l_calls.py:47 | ControlFlowNode for object | builtin-class object | builtin-class type | 47 | +| l_calls.py:48 | ControlFlowNode for FunctionExpr | Function m | builtin-class function | 48 | +| l_calls.py:48 | ControlFlowNode for None | NoneType None | builtin-class NoneType | 48 | +| l_calls.py:48 | ControlFlowNode for m | Function m | builtin-class function | 48 | +| l_calls.py:49 | ControlFlowNode for x | 'a' | builtin-class str | 52 | +| l_calls.py:49 | ControlFlowNode for x | 'b' | builtin-class str | 53 | +| l_calls.py:51 | ControlFlowNode for g | Function g | builtin-class function | 44 | +| l_calls.py:51 | ControlFlowNode for g() | 'a' | builtin-class str | 51 | +| l_calls.py:51 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | +| l_calls.py:52 | ControlFlowNode for Attribute | Attribute | builtin-class method | 52 | +| l_calls.py:52 | ControlFlowNode for Attribute() | 'a' | builtin-class str | 52 | +| l_calls.py:52 | ControlFlowNode for F | class F | builtin-class type | 47 | +| l_calls.py:52 | ControlFlowNode for F() | F() | class F | 52 | +| l_calls.py:52 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | +| l_calls.py:53 | ControlFlowNode for Attribute | Function m | builtin-class function | 48 | +| l_calls.py:53 | ControlFlowNode for Attribute() | 'b' | builtin-class str | 53 | +| l_calls.py:53 | ControlFlowNode for F | class F | builtin-class type | 47 | +| l_calls.py:53 | ControlFlowNode for t | Tuple | builtin-class tuple | 41 | | s_scopes.py:4 | ControlFlowNode for True | bool True | builtin-class bool | 4 | | s_scopes.py:4 | ControlFlowNode for float | bool True | builtin-class bool | 4 | | s_scopes.py:7 | ControlFlowNode for C2 | class C2 | builtin-class type | 7 | diff --git a/python/ql/test/library-tests/PointsTo/new/Values.expected b/python/ql/test/library-tests/PointsTo/new/Values.expected index 7fe8471cf38..f58be65ea7a 100644 --- a/python/ql/test/library-tests/PointsTo/new/Values.expected +++ b/python/ql/test/library-tests/PointsTo/new/Values.expected @@ -468,6 +468,7 @@ | l_calls.py:25 | ControlFlowNode for a | runtime | class Owner | builtin-class type | | l_calls.py:29 | ControlFlowNode for FunctionExpr | import | Function f | builtin-class function | | l_calls.py:30 | ControlFlowNode for args | code/l_calls.py:37 from import | (int 1, int 2, int 3, ) | builtin-class tuple | +| l_calls.py:30 | ControlFlowNode for args | code/l_calls.py:42 from import | ('a', 'b', 'c', ) | builtin-class tuple | | l_calls.py:30 | ControlFlowNode for args | runtime | instance of tuple | builtin-class tuple | | l_calls.py:32 | ControlFlowNode for ClassExpr | import | class E | builtin-class type | | l_calls.py:32 | ControlFlowNode for object | import | builtin-class object | builtin-class type | @@ -498,6 +499,33 @@ | l_calls.py:39 | ControlFlowNode for IntegerLiteral | import | int 3 | builtin-class int | | l_calls.py:39 | ControlFlowNode for IntegerLiteral | import | int 4 | builtin-class int | | l_calls.py:39 | ControlFlowNode for IntegerLiteral | import | int 5 | builtin-class int | +| l_calls.py:41 | ControlFlowNode for Str | import | 'a' | builtin-class str | +| l_calls.py:41 | ControlFlowNode for Str | import | 'b' | builtin-class str | +| l_calls.py:41 | ControlFlowNode for Str | import | 'c' | builtin-class str | +| l_calls.py:41 | ControlFlowNode for Tuple | import | ('a', 'b', 'c', ) | builtin-class tuple | +| l_calls.py:42 | ControlFlowNode for f | import | Function f | builtin-class function | +| l_calls.py:42 | ControlFlowNode for f() | import | ('a', 'b', 'c', ) | builtin-class tuple | +| l_calls.py:42 | ControlFlowNode for t | import | ('a', 'b', 'c', ) | builtin-class tuple | +| l_calls.py:44 | ControlFlowNode for FunctionExpr | import | Function g | builtin-class function | +| l_calls.py:45 | ControlFlowNode for a | code/l_calls.py:51 from import | 'a' | builtin-class str | +| l_calls.py:47 | ControlFlowNode for ClassExpr | import | class F | builtin-class type | +| l_calls.py:47 | ControlFlowNode for object | import | builtin-class object | builtin-class type | +| l_calls.py:48 | ControlFlowNode for FunctionExpr | import | Function F.m | builtin-class function | +| l_calls.py:48 | ControlFlowNode for None | import | None | builtin-class NoneType | +| l_calls.py:49 | ControlFlowNode for x | code/l_calls.py:52 from import | 'a' | builtin-class str | +| l_calls.py:49 | ControlFlowNode for x | code/l_calls.py:53 from import | 'b' | builtin-class str | +| l_calls.py:51 | ControlFlowNode for g | import | Function g | builtin-class function | +| l_calls.py:51 | ControlFlowNode for g() | import | 'a' | builtin-class str | +| l_calls.py:51 | ControlFlowNode for t | import | ('a', 'b', 'c', ) | builtin-class tuple | +| l_calls.py:52 | ControlFlowNode for Attribute | import | Method(Function F.m, F()) | builtin-class method | +| l_calls.py:52 | ControlFlowNode for Attribute() | import | 'a' | builtin-class str | +| l_calls.py:52 | ControlFlowNode for F | import | class F | builtin-class type | +| l_calls.py:52 | ControlFlowNode for F() | import | F() | class F | +| l_calls.py:52 | ControlFlowNode for t | import | ('a', 'b', 'c', ) | builtin-class tuple | +| l_calls.py:53 | ControlFlowNode for Attribute | import | Function F.m | builtin-class function | +| l_calls.py:53 | ControlFlowNode for Attribute() | import | 'b' | builtin-class str | +| l_calls.py:53 | ControlFlowNode for F | import | class F | builtin-class type | +| l_calls.py:53 | ControlFlowNode for t | import | ('a', 'b', 'c', ) | builtin-class tuple | | m_attributes.py:3 | ControlFlowNode for ClassExpr | import | class C | builtin-class type | | m_attributes.py:3 | ControlFlowNode for object | import | builtin-class object | builtin-class type | | m_attributes.py:5 | ControlFlowNode for FunctionExpr | import | Function C.__init__ | builtin-class function | diff --git a/python/ql/test/library-tests/PointsTo/new/code/l_calls.py b/python/ql/test/library-tests/PointsTo/new/code/l_calls.py index 7d5b9e348a6..2e866cfe89d 100644 --- a/python/ql/test/library-tests/PointsTo/new/code/l_calls.py +++ b/python/ql/test/library-tests/PointsTo/new/code/l_calls.py @@ -37,3 +37,18 @@ class E(object): f(1, 2, 3) E().m(2, 3, 4) E.m(3, 4, 5) + +t = 'a', 'b', 'c' +f(*t) + +def g(a, b, c): + return a + +class F(object): + def m(self, x, y, z=None): + return x + +g(*t) +F().m(*t) +F.m(*t) + From de39f9208b83b7c1541783f8bcab0c674252b6fd Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Tue, 4 Jun 2019 11:43:30 +0100 Subject: [PATCH 14/22] Python: Clarify and document points-to and object model for calls involving starargs. --- python/ql/src/semmle/python/Flow.qll | 1 + .../semmle/python/objects/ObjectInternal.qll | 4 ++ .../src/semmle/python/objects/Sequences.qll | 4 +- .../ql/src/semmle/python/objects/TObject.qll | 2 +- .../src/semmle/python/pointsto/PointsTo.qll | 39 +++++++++++++------ 5 files changed, 37 insertions(+), 13 deletions(-) diff --git a/python/ql/src/semmle/python/Flow.qll b/python/ql/src/semmle/python/Flow.qll index 5b8f3dabd67..8f6d34cfed3 100755 --- a/python/ql/src/semmle/python/Flow.qll +++ b/python/ql/src/semmle/python/Flow.qll @@ -481,6 +481,7 @@ class CallNode extends ControlFlowNode { ) } + /** Gets the tuple (*) argument of this call, provided there is exactly one. */ ControlFlowNode getStarArg() { result.getNode() = this.getNode().getStarArg() and result.getBasicBlock().dominates(this.getBasicBlock()) diff --git a/python/ql/src/semmle/python/objects/ObjectInternal.qll b/python/ql/src/semmle/python/objects/ObjectInternal.qll index 973d7dca998..a59b57b70fd 100644 --- a/python/ql/src/semmle/python/objects/ObjectInternal.qll +++ b/python/ql/src/semmle/python/objects/ObjectInternal.qll @@ -481,6 +481,10 @@ module ObjectInternal { result = TBuiltinClassObject(Builtin::special("ClassType")) } + ObjectInternal emptyTuple() { + result.(BuiltinTupleObjectInternal).length() = 0 + } + } /** Helper for boolean predicates returning both `true` and `false` */ diff --git a/python/ql/src/semmle/python/objects/Sequences.qll b/python/ql/src/semmle/python/objects/Sequences.qll index 691d4743686..93e2dc2d2fe 100644 --- a/python/ql/src/semmle/python/objects/Sequences.qll +++ b/python/ql/src/semmle/python/objects/Sequences.qll @@ -90,6 +90,7 @@ abstract class TupleObjectInternal extends SequenceObjectInternal { } +/** A tuple built-in to the interpreter, including the empty tuple. */ class BuiltinTupleObjectInternal extends TBuiltinTuple, TupleObjectInternal { override predicate introducedAt(ControlFlowNode node, PointsToContext context) { @@ -116,7 +117,7 @@ class BuiltinTupleObjectInternal extends TBuiltinTuple, TupleObjectInternal { } } - +/** A tuple declared by a tuple expression in the Python source code */ class PythonTupleObjectInternal extends TPythonTuple, TupleObjectInternal { override predicate introducedAt(ControlFlowNode node, PointsToContext context) { @@ -147,6 +148,7 @@ class PythonTupleObjectInternal extends TPythonTuple, TupleObjectInternal { } +/** A tuple created by a `*` parameter */ class VarargsTupleObjectInternal extends TVarargsTuple, TupleObjectInternal { override predicate introducedAt(ControlFlowNode node, PointsToContext context) { diff --git a/python/ql/src/semmle/python/objects/TObject.qll b/python/ql/src/semmle/python/objects/TObject.qll index 3b2ef393716..5dbe08e8e7f 100644 --- a/python/ql/src/semmle/python/objects/TObject.qll +++ b/python/ql/src/semmle/python/objects/TObject.qll @@ -181,7 +181,7 @@ cached newtype TObject = or /* Varargs tuple */ TVarargsTuple(CallNode call, PointsToContext context, int offset, int length) { - InterProceduralPointsTo::varargs_tuple(call, _, context, _, offset, length) + InterProceduralPointsTo::varargs_tuple(call, context, _, _, offset, length) } or /* `type` */ diff --git a/python/ql/src/semmle/python/pointsto/PointsTo.qll b/python/ql/src/semmle/python/pointsto/PointsTo.qll index 3112d9ee68e..7b2aaf3f3d7 100644 --- a/python/ql/src/semmle/python/pointsto/PointsTo.qll +++ b/python/ql/src/semmle/python/pointsto/PointsTo.qll @@ -900,38 +900,46 @@ module InterProceduralPointsTo { /** Helper for parameter_points_to */ pragma [noinline] private predicate special_parameter_points_to(ParameterDefinition def, PointsToContext context, ObjectInternal value, ControlFlowNode origin) { + /* Runtime: Just an unknown tuple (or dict for `**` args) */ special_parameter_value(def, value) and context.isRuntime() and origin = def.getDefiningNode() or + /* A tuple constructed from positional arguments for a `*` parameter. */ exists(CallNode call, Function scope, PointsToContext caller, int offset, int length | - varargs_tuple(call, scope, caller, context, offset, length) and + varargs_tuple(call, caller, scope, context, offset, length) and value = TVarargsTuple(call, caller, offset, length) and def.getScope() = scope ) and origin = def.getDefiningNode() or + /* A `*` parameter with no surplus positional arguments; an empty tuple */ exists(Function scope | varargs_empty_tuple(scope, context) and - value.(BuiltinTupleObjectInternal).length() = 0 and + value = ObjectInternal::emptyTuple() and def.getScope() = scope ) and origin = def.getDefiningNode() } - predicate varargs_tuple(CallNode call, Function scope, PointsToContext caller, PointsToContext callee, int startOffset, int length) { + /** Holds if `call` in context `caller` calls into the function scope `func` in context `callee` and + * that the number of position arguments (including expansion of `*` argument) exceeds the number of positional arguments by + * `length` and that the excess arguments start at `start`. + */ + predicate varargs_tuple(CallNode call, PointsToContext caller, Function scope, PointsToContext callee, int start, int length) { exists(int parameter_offset | callsite_calls_function(call, caller, scope, callee, parameter_offset) and - startOffset = scope.getPositionalParameterCount() - parameter_offset and - length = positional_argument_count(call, caller) - startOffset and + start = scope.getPositionalParameterCount() - parameter_offset and + length = positional_argument_count(call, caller) - start and length > 0 ) } - predicate varargs_empty_tuple(Function scope, PointsToContext callee) { + /** Holds if for function scope `func` in context `callee` the `*` parameter will hold the empty tuple. */ + predicate varargs_empty_tuple(Function func, PointsToContext callee) { exists(CallNode call, PointsToContext caller, int parameter_offset | - callsite_calls_function(call, caller, scope, callee, parameter_offset) and - scope.getPositionalParameterCount() - parameter_offset >= positional_argument_count(call, caller) + callsite_calls_function(call, caller, func, callee, parameter_offset) and + func.getPositionalParameterCount() - parameter_offset >= positional_argument_count(call, caller) ) } @@ -942,17 +950,21 @@ module InterProceduralPointsTo { p.isKwargs() and value = TUnknownInstance(ObjectInternal::builtin("dict")) } - predicate positional_argument_points_to(CallNode call, int argument, PointsToContext caller, ObjectInternal value, ControlFlowNode origin) { - PointsToInternal::pointsTo(call.getArg(argument), caller, value, origin) + /** Holds if the `n`th argument in call `call` with context `caller` points-to `value` from `origin`, including values in tuples + * expanded by a `*` argument. For example, for the call `f('a', *(`x`,`y`))` the arguments are `('a', 'x', y')` + */ + predicate positional_argument_points_to(CallNode call, int n, PointsToContext caller, ObjectInternal value, ControlFlowNode origin) { + PointsToInternal::pointsTo(call.getArg(n), caller, value, origin) or exists(SequenceObjectInternal arg, int pos | pos = call.getNode().getPositionalArgumentCount() and PointsToInternal::pointsTo(origin, caller, arg, _) and - value = arg.getItem(argument-pos) and + value = arg.getItem(n-pos) and origin = call.getStarArg() ) } + /** Gets the number of positional arguments including values in tuples expanded by a `*` argument.*/ private int positional_argument_count(CallNode call, PointsToContext caller) { result = call.getNode().getPositionalArgumentCount() and not exists(call.getStarArg()) and caller.appliesTo(call) or @@ -963,6 +975,7 @@ module InterProceduralPointsTo { ) } + /** Holds if the parameter definition `def` points-to `value` from `origin` given the context `context` */ predicate positional_parameter_points_to(ParameterDefinition def, PointsToContext context, ObjectInternal value, ControlFlowNode origin) { exists(CallNode call, int argument, PointsToContext caller, Function func, int offset | positional_argument_points_to(call, argument, caller, value, origin) and @@ -971,6 +984,7 @@ module InterProceduralPointsTo { ) } + /** Holds if the named `argument` given the context `caller` is transferred to the parameter `param` with conntext `callee` by a call. */ cached predicate named_argument_transfer(ControlFlowNode argument, PointsToContext caller, ParameterDefinition param, PointsToContext callee) { exists(CallNode call, Function func, int offset | callsite_calls_function(call, caller, func, callee, offset) @@ -982,6 +996,9 @@ module InterProceduralPointsTo { ) } + /** Holds if the `call` with context `caller` calls the function `scope` in context `callee` + * and the offset from argument to parameter is `parameter_offset` + */ cached predicate callsite_calls_function(CallNode call, PointsToContext caller, Function scope, PointsToContext callee, int parameter_offset) { exists(ObjectInternal func | callWithContext(call, caller, func, callee) and From cd793091ca1cf27734742e2bc495d816dc50dd7e Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Wed, 5 Jun 2019 11:10:38 +0100 Subject: [PATCH 15/22] Python points-to: Make sure that vararg tuples are only assigned to vararg parameters. --- python/ql/src/semmle/python/pointsto/PointsTo.qll | 2 ++ .../library-tests/PointsTo/new/PointsToWithContext.expected | 2 -- .../test/library-tests/PointsTo/new/PointsToWithType.expected | 1 - python/ql/test/library-tests/PointsTo/new/Values.expected | 2 -- 4 files changed, 2 insertions(+), 5 deletions(-) diff --git a/python/ql/src/semmle/python/pointsto/PointsTo.qll b/python/ql/src/semmle/python/pointsto/PointsTo.qll index 7b2aaf3f3d7..235aaa0a0b6 100644 --- a/python/ql/src/semmle/python/pointsto/PointsTo.qll +++ b/python/ql/src/semmle/python/pointsto/PointsTo.qll @@ -906,6 +906,7 @@ module InterProceduralPointsTo { origin = def.getDefiningNode() or /* A tuple constructed from positional arguments for a `*` parameter. */ + def.isVarargs() and exists(CallNode call, Function scope, PointsToContext caller, int offset, int length | varargs_tuple(call, caller, scope, context, offset, length) and value = TVarargsTuple(call, caller, offset, length) and @@ -914,6 +915,7 @@ module InterProceduralPointsTo { origin = def.getDefiningNode() or /* A `*` parameter with no surplus positional arguments; an empty tuple */ + def.isVarargs() and exists(Function scope | varargs_empty_tuple(scope, context) and value = ObjectInternal::emptyTuple() and diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected b/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected index 0dec2c5da97..b7cced59a58 100755 --- a/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToWithContext.expected @@ -612,8 +612,6 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:33 | ControlFlowNode for m | Function m | builtin-class function | 33 | import | | l_calls.py:34 | ControlFlowNode for self | E() | class E | 38 | code/l_calls.py:38 from import | | l_calls.py:34 | ControlFlowNode for self | int 3 | builtin-class int | 39 | code/l_calls.py:39 from import | -| l_calls.py:34 | ControlFlowNode for self | self | builtin-class tuple | 33 | code/l_calls.py:38 from import | -| l_calls.py:34 | ControlFlowNode for self | self | builtin-class tuple | 33 | code/l_calls.py:39 from import | | l_calls.py:34 | ControlFlowNode for self | self | class E | 33 | runtime | | l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | code/l_calls.py:38 from import | | l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | code/l_calls.py:39 from import | diff --git a/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected b/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected index 2f681f63d03..131957da9fb 100644 --- a/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected +++ b/python/ql/test/library-tests/PointsTo/new/PointsToWithType.expected @@ -687,7 +687,6 @@ WARNING: Predicate points_to has been deprecated and may be removed in future (P | l_calls.py:33 | ControlFlowNode for m | Function m | builtin-class function | 33 | | l_calls.py:34 | ControlFlowNode for self | E() | class E | 38 | | l_calls.py:34 | ControlFlowNode for self | int 3 | builtin-class int | 39 | -| l_calls.py:34 | ControlFlowNode for self | self | builtin-class tuple | 33 | | l_calls.py:34 | ControlFlowNode for self | self | class E | 33 | | l_calls.py:35 | ControlFlowNode for args | args | builtin-class tuple | 33 | | l_calls.py:37 | ControlFlowNode for IntegerLiteral | int 1 | builtin-class int | 37 | diff --git a/python/ql/test/library-tests/PointsTo/new/Values.expected b/python/ql/test/library-tests/PointsTo/new/Values.expected index f58be65ea7a..d89f6a4fdef 100644 --- a/python/ql/test/library-tests/PointsTo/new/Values.expected +++ b/python/ql/test/library-tests/PointsTo/new/Values.expected @@ -473,9 +473,7 @@ | l_calls.py:32 | ControlFlowNode for ClassExpr | import | class E | builtin-class type | | l_calls.py:32 | ControlFlowNode for object | import | builtin-class object | builtin-class type | | l_calls.py:33 | ControlFlowNode for FunctionExpr | import | Function E.m | builtin-class function | -| l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:38 from import | (int 2, int 3, int 4, ) | builtin-class tuple | | l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:38 from import | E() | class E | -| l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:39 from import | (int 4, int 5, ) | builtin-class tuple | | l_calls.py:34 | ControlFlowNode for self | code/l_calls.py:39 from import | int 3 | builtin-class int | | l_calls.py:34 | ControlFlowNode for self | runtime | self instance of E | class E | | l_calls.py:35 | ControlFlowNode for args | code/l_calls.py:38 from import | (int 2, int 3, int 4, ) | builtin-class tuple | From 2fab40111bce09da4d001eb7b7330d8fd92cd771 Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Wed, 5 Jun 2019 17:54:09 +0100 Subject: [PATCH 16/22] Fix typo in qldoc. --- python/ql/src/semmle/python/pointsto/PointsTo.qll | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/python/ql/src/semmle/python/pointsto/PointsTo.qll b/python/ql/src/semmle/python/pointsto/PointsTo.qll index 235aaa0a0b6..771ada01956 100644 --- a/python/ql/src/semmle/python/pointsto/PointsTo.qll +++ b/python/ql/src/semmle/python/pointsto/PointsTo.qll @@ -924,7 +924,7 @@ module InterProceduralPointsTo { origin = def.getDefiningNode() } - /** Holds if `call` in context `caller` calls into the function scope `func` in context `callee` and + /** Holds if `call` in context `caller` calls into the function scope `scope` in context `callee` and * that the number of position arguments (including expansion of `*` argument) exceeds the number of positional arguments by * `length` and that the excess arguments start at `start`. */ From 9fb31c65d9cd6021b523961d7d2bc5c5828111c9 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 7 Jun 2019 15:04:05 +0100 Subject: [PATCH 17/22] Update support info for 1.21 --- change-notes/support/language-support.rst | 10 ++++++---- change-notes/support/versions-compilers.csv | 20 +++++++++++--------- 2 files changed, 17 insertions(+), 13 deletions(-) diff --git a/change-notes/support/language-support.rst b/change-notes/support/language-support.rst index 451138c10a5..02cf112b468 100644 --- a/change-notes/support/language-support.rst +++ b/change-notes/support/language-support.rst @@ -13,7 +13,9 @@ Note that where there are several versions or dialects of a language, the suppor .. container:: footnote-group - .. [1] The best results are achieved with COBOL code that stays close to the ANSI 85 standard. - .. [2] Builds that execute on Java 6 to 11 can be analyzed. The analysis understands Java 11 language features. - .. [3] JSX and Flow code, YAML, JSON, and HTML files may also be analyzed with JavaScript files. - .. [4] TypeScript analysis is performed by running the JavaScript extractor with TypeScript enabled. This is the default for LGTM. + .. [1] Support for the Arm Compiler (armcc) is preliminary. + .. [2] In addition, support for the preview features of C#8 and .NET Core 3.0. + .. [3] The best results are achieved with COBOL code that stays close to the ANSI 85 standard. + .. [4] Builds that execute on Java 6 to 12 can be analyzed. The analysis understands Java 12 language features. + .. [5] JSX and Flow code, YAML, JSON, HTML, and XML files may also be analyzed with JavaScript files. + .. [6] TypeScript analysis is performed by running the JavaScript extractor with TypeScript enabled. This is the default for LGTM. diff --git a/change-notes/support/versions-compilers.csv b/change-notes/support/versions-compilers.csv index 348eb19ee48..df55c38f732 100644 --- a/change-notes/support/versions-compilers.csv +++ b/change-notes/support/versions-compilers.csv @@ -1,16 +1,18 @@ Language,Variants,Compilers,Extensions -C/C++,"C89, C99, C11, C++98, C++03, C++11, C++14, C++17","Clang extensions (up to Clang 6.0) +C/C++,"C89, C99, C11, C++98, C++03, C++11, C++14, C++17","Clang extensions (up to Clang 8.0) -GNU extensions (up to GCC 7.3), +GNU extensions (up to GCC 8.3), -Microsoft extensions (up to VS 2017)","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``" -C#,C# up to 7.2 together with .NET versions up to 4.7.1,"Microsoft Visual Studio up to 2017, +Microsoft extensions (up to VS 2019), -.NET Core up to 2.1","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``" -COBOL,ANSI 85 or newer [1]_.,Not applicable,"``.cbl``, ``.CBL``, ``.cpy``, ``.CPY``, ``.copy``, ``.COPY``" -Java,"Java 6 to 11 [2]_.","javac (OpenJDK and Oracle JDK) +Arm Compiler 5.0 [1]_.","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``" +C#,C# up to 7.3. with .NET up to 4.8 [2]_.,"Microsoft Visual Studio up to 2019, + +.NET Core up to 2.2","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``" +COBOL,ANSI 85 or newer [3]_.,Not applicable,"``.cbl``, ``.CBL``, ``.cpy``, ``.CPY``, ``.copy``, ``.COPY``" +Java,"Java 6 to 12 [4]_.","javac (OpenJDK and Oracle JDK) Eclipse compiler for Java (ECJ) batch compiler",``.java`` -JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml`` [3]_." +JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [5]_." Python,"2.7, 3.5, 3.6, 3.7",Not applicable,``.py`` -TypeScript [4]_.,"2.6-3.5",Standard TypeScript compiler,"``.ts``, ``.tsx``" +TypeScript [6]_.,"2.6-3.5",Standard TypeScript compiler,"``.ts``, ``.tsx``" From 6bfbd773266beee4b007cb3fb81e1f5ee93085ac Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Fri, 7 Jun 2019 17:51:13 +0100 Subject: [PATCH 18/22] Update description of C# 8 support --- change-notes/support/language-support.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/change-notes/support/language-support.rst b/change-notes/support/language-support.rst index 02cf112b468..d4fcfd5ae71 100644 --- a/change-notes/support/language-support.rst +++ b/change-notes/support/language-support.rst @@ -14,7 +14,7 @@ Note that where there are several versions or dialects of a language, the suppor .. container:: footnote-group .. [1] Support for the Arm Compiler (armcc) is preliminary. - .. [2] In addition, support for the preview features of C#8 and .NET Core 3.0. + .. [2] In addition, support is included for the preview features of C# 8.0 and .NET Core 3.0. .. [3] The best results are achieved with COBOL code that stays close to the ANSI 85 standard. .. [4] Builds that execute on Java 6 to 12 can be analyzed. The analysis understands Java 12 language features. .. [5] JSX and Flow code, YAML, JSON, HTML, and XML files may also be analyzed with JavaScript files. From ec0c1dc70e5b04cb7fe5c7e067f376d5a940ec2a Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Fri, 7 Jun 2019 15:05:59 +0100 Subject: [PATCH 19/22] Python: Change note for new points-to implementation. --- change-notes/1.21/analysis-python.md | 29 ++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/change-notes/1.21/analysis-python.md b/change-notes/1.21/analysis-python.md index 80269f5f4e1..7c8fa1b161c 100644 --- a/change-notes/1.21/analysis-python.md +++ b/change-notes/1.21/analysis-python.md @@ -3,8 +3,33 @@ ## General improvements -> Changes that affect alerts in many files or from many queries -> For example, changes to file classification +Points-to analysis has been re-implemented to support more language features and provide better reachability analysis. +The new implementation adds the following new features: + +* Non-local tracking of bound methods and instances of `super()` +* Superior analysis of conditionals and thus improved reachability analysis. +* Superior modelling of descriptors, for example, classmethods and staticmethods. +* Superior tracking of values through parameters, especially `*` arguments. + +A new object API has been provided to complement the new points-to implementation. +A new class `Value` replaces the old `Object` class. The `Value` class has a simpler and more consistent API compared to `Object`. +Some of the functionality of `FunctionObject` and `ClassObject` has been added to `Value` to reduce the number of casts to more specific classes. +For example, the QL to find calls to `os.path.open` has changed from +`ModuleObject::named("os").attr("path").(ModuleObject).attr("join").(FunctionObject).getACall()` +to +`Value::called("os.path.join").getACall()` + +The old API is now deprecated, but will be continued to be supported for at least another year. + +### Impact on existing queries. + +As points-to underpins many queries and provides the call-graph and reachability analysis required for taint-tracking, many queries will have additional results and some may have fewer results. + +New results are a result of the improved reachability analysis and non-local tracking of bound-methods. +Removed results are a result of more precise tracking of values through `*` arguments. +The expectation is that number of true positives will increase and the number of false negatives will decline. +However, this is new code and may still contain errors. + ## New queries | **Query** | **Tags** | **Purpose** | From b7141ad998c5c7f74ed05868ae214f0f35de2fee Mon Sep 17 00:00:00 2001 From: Mark Shannon Date: Mon, 10 Jun 2019 10:56:28 +0100 Subject: [PATCH 20/22] Python: Clarify change note. --- change-notes/1.21/analysis-python.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/change-notes/1.21/analysis-python.md b/change-notes/1.21/analysis-python.md index 7c8fa1b161c..970b86976f4 100644 --- a/change-notes/1.21/analysis-python.md +++ b/change-notes/1.21/analysis-python.md @@ -23,12 +23,13 @@ The old API is now deprecated, but will be continued to be supported for at leas ### Impact on existing queries. -As points-to underpins many queries and provides the call-graph and reachability analysis required for taint-tracking, many queries will have additional results and some may have fewer results. +As points-to analysis underpins many queries, and provides the call-graph and reachability analysis required for taint-tracking, the results of many queries may change. -New results are a result of the improved reachability analysis and non-local tracking of bound-methods. -Removed results are a result of more precise tracking of values through `*` arguments. -The expectation is that number of true positives will increase and the number of false negatives will decline. -However, this is new code and may still contain errors. +The improved reachability analysis and non-local tracking of bound methods may identify new results. +The increased precision in tracking of values through `*` arguments may remove false positive results. + +Overall the number of true positive results should increase and the number false negative results should decline. +We welcome feedback on the new implementation, particularly any surprising changes in results. ## New queries From b242b847649e5f1493c3b7d46df147cebc930b94 Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Mon, 10 Jun 2019 11:32:34 +0100 Subject: [PATCH 21/22] Remove boilerplate text and fix table formatting --- change-notes/1.21/analysis-python.md | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/change-notes/1.21/analysis-python.md b/change-notes/1.21/analysis-python.md index 970b86976f4..7345ce3966d 100644 --- a/change-notes/1.21/analysis-python.md +++ b/change-notes/1.21/analysis-python.md @@ -31,17 +31,12 @@ The increased precision in tracking of values through `*` arguments may remove f Overall the number of true positive results should increase and the number false negative results should decline. We welcome feedback on the new implementation, particularly any surprising changes in results. - ## New queries - | **Query** | **Tags** | **Purpose** | - |-----------|----------|-------------| - | Accepting unknown SSH host keys when using Paramiko (`py/paramiko-missing-host-key-validation`) | security, external/cwe/cwe-295 | Finds instances where Paramiko is configured to accept unknown host keys. Results are shown on LGTM by default. | - | Use of 'return' or 'yield' outside a function (`py/return-or-yield-outside-function`) | reliability, correctness | Finds instances where `return`, `yield`, and `yield from` are used outside a function. Results are not shown on LGTM by default. | -## Changes to existing queries - - | **Query** | **Expected impact** | **Change** | - |-----------|---------------------|------------| +| **Query** | **Tags** | **Purpose** | +|-----------|----------|-------------| +| Accepting unknown SSH host keys when using Paramiko (`py/paramiko-missing-host-key-validation`) | security, external/cwe/cwe-295 | Finds instances where Paramiko is configured to accept unknown host keys. Results are shown on LGTM by default. | +| Use of 'return' or 'yield' outside a function (`py/return-or-yield-outside-function`) | reliability, correctness | Finds instances where `return`, `yield`, and `yield from` are used outside a function. Results are not shown on LGTM by default. | ## Changes to code extraction @@ -51,8 +46,3 @@ We welcome feedback on the new implementation, particularly any surprising chang * A non-existent file or directory is specified using the `--path` option, or as a file name. * An invalid number is specified for the `--max-procs` option. - - -## Changes to QL libraries - -* *Series of bullet points* From 9b9f257e0a63c8c14dd2e1436342f438396e195e Mon Sep 17 00:00:00 2001 From: Felicity Chapman Date: Mon, 10 Jun 2019 11:35:19 +0100 Subject: [PATCH 22/22] Add a link to a new query on LGTM.com --- change-notes/1.21/analysis-python.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/change-notes/1.21/analysis-python.md b/change-notes/1.21/analysis-python.md index 7345ce3966d..487535a46ec 100644 --- a/change-notes/1.21/analysis-python.md +++ b/change-notes/1.21/analysis-python.md @@ -35,7 +35,7 @@ We welcome feedback on the new implementation, particularly any surprising chang | **Query** | **Tags** | **Purpose** | |-----------|----------|-------------| -| Accepting unknown SSH host keys when using Paramiko (`py/paramiko-missing-host-key-validation`) | security, external/cwe/cwe-295 | Finds instances where Paramiko is configured to accept unknown host keys. Results are shown on LGTM by default. | +| Accepting unknown SSH host keys when using Paramiko (`py/paramiko-missing-host-key-validation`) | security, external/cwe/cwe-295 | Finds instances where Paramiko is configured to accept unknown host keys. Results are shown [on LGTM](https://lgtm.com/rules/1508297729270/) by default. | | Use of 'return' or 'yield' outside a function (`py/return-or-yield-outside-function`) | reliability, correctness | Finds instances where `return`, `yield`, and `yield from` are used outside a function. Results are not shown on LGTM by default. | ## Changes to code extraction