Compare commits

..

3 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Max Schaefer | 6e3293e30f | Go: Add library overview. | 2020-02-14 13:03:12 +00:00 |
| Max Schaefer | 7277ebe2cf | JavaScript: Sort lines in change notes. | 2020-02-14 10:36:46 +00:00 |
| Max Schaefer | 46f8dda86b | JavaScript: Add model of http2 compatibility API. Also deprecated the `httpOrHttps` predicate, which was now only used in one place and seemed a little pointless anyway. | 2020-02-14 10:36:27 +00:00 |
2280 changed files with 63515 additions and 109424 deletions


@@ -2,4 +2,5 @@
"*/ql/test/qlpack.yml",
"*/upgrades/qlpack.yml",
"misc/legacy-support/*/qlpack.yml",
"misc/suite-helpers/qlpack.yml" ] }
"misc/suite-helpers/qlpack.yml",
"codeql/.codeqlmanifest.json" ] }

7
.gitignore vendored

@@ -1,7 +1,6 @@
# editor and OS artifacts
*~
.DS_STORE
*.swp
# query compilation caches
.cache
@@ -14,11 +13,7 @@
.vs/*
!.vs/VSWorkspaceSettings.json
# Byte-compiled python files
*.pyc
# It's useful (though not required) to be able to unpack codeql in the ql checkout itself
/codeql/
.vscode/settings.json
csharp/extractor/Semmle.Extraction.CSharp.Driver/Properties/launchSettings.json
.vscode


@@ -1,66 +1,88 @@
# Contributing to CodeQL
We welcome contributions to our CodeQL libraries and queries. Got an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Contributions to this project are [released](https://help.github.com/articles/github-terms-of-service/#6-contributions-under-repository-license) to the public under the [project's open source license](LICENSE).
We welcome contributions to our standard library and standard checks. Got an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request!
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html) on [help.semmle.com](https://help.semmle.com).
Before we accept your pull request, we require that you have agreed to our Contributor License Agreement. You do not need to do this before you submit your pull request, but until you have done so, we will be unable to accept your contribution.
## Adding a new query
## Submitting a new experimental query
If you have an idea for a query that you would like to share with other Semmle users, please open a pull request to add it to this repository.
Follow the steps below to help other users understand what your query does, and to ensure that your query is consistent with the other Semmle queries.
If you have an idea for a query that you would like to share with other CodeQL users, please open a pull request to add it to this repository. New queries start out in a `<language>/ql/src/experimental` directory, to which they can be merged when they meet the following requirements.
1. **Consult the documentation for query writers**
1. **Directory structure**
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [Writing CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html) on [help.semmle.com](https://help.semmle.com).
There are five language-specific query directories in this repository:
2. **Format your code correctly**
* C/C++: `cpp/ql/src`
* C#: `csharp/ql/src`
* Java: `java/ql/src`
* JavaScript: `javascript/ql/src`
* Python: `python/ql/src`
All of Semmle's standard queries and libraries are uniformly formatted for clarity and consistency, so we strongly recommend that all contributions follow the same formatting guidelines. If you use CodeQL for VS Code, you can autoformat your query in the [Editor](https://help.semmle.com/codeql/codeql-for-vscode/reference/editor.html#autoformatting). For more information, see the [CodeQL style guide](https://github.com/Semmle/ql/blob/master/docs/ql-style-guide.md).
Each language-specific directory contains further subdirectories that group queries based on their `@tags` or purpose.
- Experimental queries and libraries are stored in the `experimental` subdirectory within each language-specific directory in the [CodeQL repository](https://github.com/github/codeql). For example, experimental Java queries and libraries are stored in `java/ql/src/experimental` and any corresponding tests in `java/ql/test/experimental`.
- The structure of an `experimental` subdirectory mirrors the structure of its parent directory.
- Select or create an appropriate directory in `experimental` based on the existing directory structure of `experimental` or its parent directory.
3. **Make sure your query has the correct metadata**
2. **Query metadata**
Query metadata is used by Semmle's analysis to identify your query and make sure the query results are displayed properly.
The most important metadata to include are the `@name`, `@description`, and the `@kind`.
Other metadata properties (`@precision`, `@severity`, and `@tags`) are usually added after the query has been reviewed by Semmle staff.
For more information on writing query metadata, see the [Query metadata style guide](https://github.com/Semmle/ql/blob/master/docs/query-metadata-style-guide.md).
- The query `@id` must conform to all the requirements in the [guide on query metadata](docs/query-metadata-style-guide.md#query-id-id). In particular, it must not clash with any other queries in the repository, and it must start with the appropriate language-specific prefix.
- The query must have a `@name` and `@description` to explain its purpose.
- The query must have a `@kind` and `@problem.severity` as required by CodeQL tools.
4. **Make sure the `select` statement is compatible with the query type**
For details, see the [guide on query metadata](docs/query-metadata-style-guide.md).
The `select` statement of your query must be compatible with the query type (determined by the `@kind` metadata property) for alert or path results to be displayed correctly in LGTM and CodeQL for VS Code.
For more information on `select` statement format, see [Introduction to query files](https://help.semmle.com/QL/learn-ql/writing-queries/introduction-to-queries.html#select-clause) on help.semmle.com.
Make sure the `select` statement is compatible with the query `@kind`. See [About CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/introduction-to-queries.html#select-clause) on help.semmle.com.
5. **Save your query in a `.ql` file in the correct language directory in this repository**
3. **Formatting**
There are five language-specific directories in this repository:
* C/C++: `ql/cpp/ql/src`
* C#: `ql/csharp/ql/src`
* Java: `ql/java/ql/src`
* JavaScript: `ql/javascript/ql/src`
* Python: `ql/python/ql/src`
- The queries and libraries must be autoformatted, for example using the "Format Document" command in [CodeQL for Visual Studio Code](https://help.semmle.com/codeql/codeql-for-vscode/procedures/about-codeql-for-vscode.html).
Each language-specific directory contains further subdirectories that group queries based on their `@tags` properties or purpose. Select the appropriate subdirectory for your new query, or create a new one if necessary.
4. **Compilation**
6. **Write a query help file**
- Compilation of the query and any associated libraries and tests must be resilient to future development of the [supported](docs/supported-queries.md) libraries. This means that the functionality cannot use internal libraries, cannot depend on the output of `getAQlClass`, and cannot make use of regexp matching on `toString`.
- The query and any associated libraries and tests must not cause any compiler warnings to be emitted (such as use of deprecated functionality or missing `override` annotations).
5. **Results**
- The query must have at least one true positive result on some revision of a real project.
Experimental queries and libraries may not be actively maintained as the [supported](docs/supported-queries.md) libraries evolve. They may also be changed in backwards-incompatible ways or may be removed entirely in the future without deprecation warnings.
After the experimental query is merged, we welcome pull requests to improve it. Before a query can be moved out of the `experimental` subdirectory, it must satisfy [the requirements for being a supported query](docs/supported-queries.md).
Query help files explain the purpose of your query to other users. Write your query help in a `.qhelp` file and save it in the same directory as your new query.
For more information on writing query help, see the [Query help style guide](https://github.com/Semmle/ql/blob/master/docs/query-help-style-guide.md).
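
The metadata requirements in the list above (an `@id` with a language-specific prefix, plus `@name`, `@description`, `@kind`, and `@problem.severity`) can be checked mechanically before opening a pull request. The following is a minimal sketch in Python, not a tool that exists in this repository: the function names `parse_metadata` and `check_metadata` are hypothetical, and the prefix table is an assumption based on the query metadata style guide. It does not check that the `@id` is unique across the repository.

```python
import re

# Metadata properties that the contributing guide requires for a new query.
REQUIRED_TAGS = ("id", "name", "description", "kind", "problem.severity")

# Assumed mapping from language directory to query id prefix (verify against
# the query metadata style guide before relying on it).
ID_PREFIXES = {
    "cpp": "cpp/",
    "csharp": "cs/",
    "java": "java/",
    "javascript": "js/",
    "python": "py/",
}

# Matches lines such as " * @kind problem" inside a QLDoc comment.
# Only the first line of a multi-line value is captured.
TAG_PATTERN = re.compile(
    r"^[ \t]*\*[ \t]*@([\w.]+)[ \t]+(.*\S)[ \t]*$", re.MULTILINE
)


def parse_metadata(query_text):
    """Return a dict of tag -> value from the first /** ... */ comment."""
    match = re.search(r"/\*\*(.*?)\*/", query_text, re.DOTALL)
    if match is None:
        return {}
    return {tag: value for tag, value in TAG_PATTERN.findall(match.group(1))}


def check_metadata(query_text, language):
    """Return a list of problems; an empty list means the checks passed."""
    metadata = parse_metadata(query_text)
    problems = [f"missing @{tag}" for tag in REQUIRED_TAGS if tag not in metadata]
    prefix = ID_PREFIXES[language]
    query_id = metadata.get("id")
    if query_id is not None and not query_id.startswith(prefix):
        problems.append(f"@id should start with {prefix!r}")
    return problems


# Placeholder query header used only for illustration.
EXAMPLE = """\
/**
 * @name Example check
 * @description Placeholder description for illustration only.
 * @kind problem
 * @problem.severity warning
 * @id java/example/placeholder
 */
"""

print(check_metadata(EXAMPLE, "java"))  # prints []
```

Running the same header through `check_metadata(EXAMPLE, "python")` reports the prefix mismatch, which is the kind of error the `@id` requirement is meant to catch.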
## Using your personal data
If you contribute to this project, we will record your name and email
address (as provided by you with your contributions) as part of the code
repositories, which are public. We might also use this information
repositories, which might be made public. We might also use this information
to contact you in relation to your contributions, as well as in the
normal course of software development. We also store records of your
CLA agreements. Under GDPR legislation, we do this
on the basis of our legitimate interest in creating the CodeQL product.
Please do get in touch (privacy@github.com) if you have any questions about
Please do get in touch (privacy@semmle.com) if you have any questions about
this or our data protection policies.
## Contributor License Agreement
This Contributor License Agreement (“Agreement”) is entered into between Semmle Limited (“Semmle,” “we” or “us” etc.), and You (as defined and further identified below).
Accordingly, You hereby agree to the following terms for Your present and future Contributions submitted to Semmle:
1. **Definitions**.
* "You" (or "Your") shall mean the Contribution copyright owner (whether an individual or organization) or legal entity authorized by the copyright owner that is making this Agreement with Semmle. For legal entities, the entity making a Contribution and all other entities that control, are controlled by, or are under common control with that entity are considered to be a single Contributor. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
* "Contribution(s)" shall mean the code, documentation or other original works of authorship, including any modifications or additions to an existing work, submitted by You to Semmle for inclusion in, or documentation of, any of the products or projects owned or managed by Semmle (the "Work(s)"). For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to Semmle or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, Semmle for the purpose of discussing and/or improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by You as "Not a Contribution."
2. **Grant of Copyright License**. You hereby grant to Semmle and to recipients of software distributed by Semmle a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense, and distribute Your Contributions and such derivative works.
3. **Grant of Patent License**. You hereby grant to Semmle and to recipients of software distributed by Semmle a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by You that are necessarily infringed by Your Contribution(s) alone or by combination of Your Contribution(s) with the Work to which such Contribution(s) was submitted. If any entity institutes patent litigation against You or any other entity (including a cross-claim or counterclaim in a lawsuit) alleging that Your Contribution, or the Work to which You have contributed, constitutes direct or contributory patent infringement, then any patent licenses granted to that entity under this Agreement for that Contribution or Work shall terminate as of the date such litigation is filed.
4. **Ownership**. Except as set out above, You keep all right, title, and interest in Your Contribution. The rights that You grant to us under this Agreement are effective on the date You first submitted a Contribution to us, even if Your submission took place before the date You entered this Agreement.
5. **Representations**. You represent and warrant that: (i) the Contributions are an original work and that You can legally grant the rights set out in this Agreement; (ii) the Contributions and Semmle's exercise of any license rights granted hereunder, does not and will not, infringe the rights of any third party; (iii) You are not aware of any pending or threatened claims, suits, actions, or charges pertaining to the Contributions, including without limitation any claims or allegations that any or all of the Contributions infringes, violates, or misappropriates the intellectual property rights of any third party (You further agree that You will notify Semmle immediately if You become aware of any such actual or potential claims, suits, actions, allegations or charges).
6. **Employer**. If Your employer(s) has rights to intellectual property that You create that includes Your Contributions, You represent and warrant that Your employer has waived such rights for Your Contributions to Semmle, or that You have received permission to make Contributions on behalf of that employer and that You are authorized to execute this Agreement on behalf of Your employer.
7. **Inclusion of Code**. We determine the code that is in our Works. You understand that the decision to include the Contribution in any project or source repository is entirely that of Semmle, and this agreement does not guarantee that the Contributions will be included in any product.
8. **Disclaimer**. You are not expected to provide support for Your Contributions, except to the extent You desire to provide support. You may provide support for free, for a fee, or not at all. Except as set forth herein, and unless required by applicable law or agreed to in writing, You provide Your Contributions on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND.
9. **General**. The failure of either party to enforce its rights under this Agreement for any period shall not be construed as a waiver of such rights. No changes or modifications or waivers to this Agreement will be effective unless in writing and signed by both parties. In the event that any provision of this Agreement shall be determined to be illegal or unenforceable, that provision will be limited or eliminated to the minimum extent necessary so that this Agreement shall otherwise remain in full force and effect and enforceable. This Agreement shall be governed by and construed in accordance with the laws of the State of California in the United States without regard to the conflicts of laws provisions thereof. In any action or proceeding to enforce rights under this Agreement, the prevailing party will be entitled to recover costs and attorneys' fees.

13
COPYRIGHT Normal file

@@ -0,0 +1,13 @@
Copyright (c) Semmle Inc and other contributors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of the
License at http://www.apache.org/licenses/LICENSE-2.0
THIS CODE IS PROVIDED ON AN *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED
WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,
MERCHANTABILITY OR NON-INFRINGEMENT.
See the Apache Version 2.0 License for specific language governing permissions
and limitations under the License.

189
LICENSE

@@ -1,21 +1,176 @@
MIT License
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
Copyright (c) 2006-2020 GitHub, Inc.
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
1. Definitions.
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS


@@ -1,6 +1,6 @@
# CodeQL
This open source repository contains the standard CodeQL libraries and queries that power [LGTM](https://lgtm.com) and the other CodeQL products that [GitHub](https://github.com) makes available to its customers worldwide.
This open source repository contains the standard CodeQL libraries and queries that power [LGTM](https://lgtm.com), and the other products that [Semmle](https://semmle.com) makes available to its customers worldwide.
## How do I learn CodeQL and run queries?
@@ -9,8 +9,8 @@ You can use the [interactive query console](https://lgtm.com/help/lgtm/using-que
## Contributing
We welcome contributions to our standard library and standard checks. Do you have an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Before you do, though, please take the time to read our [contributing guidelines](CONTRIBUTING.md). You can also consult our [style guides](https://github.com/github/codeql/tree/master/docs) to learn how to format your code for consistency and clarity, how to write query metadata, and how to write query help documentation for your query.
We welcome contributions to our standard library and standard checks. Do you have an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Before you do, though, please take the time to read our [contributing guidelines](CONTRIBUTING.md). You can also consult our [style guides](https://github.com/Semmle/ql/tree/master/docs) to learn how to format your code for consistency and clarity, how to write query metadata, and how to write query help documentation for your query.
## License
The code in this repository is licensed under [Apache License 2.0](LICENSE) by [GitHub](https://github.com).
The code in this repository is licensed under [Apache License 2.0](LICENSE) by [Semmle](https://semmle.com).


@@ -4,8 +4,6 @@ The following changes in version 1.24 affect C/C++ analysis in all applications.
## General improvements
You can now suppress alerts using either single-line block comments (`/* ... */`) or line comments (`// ...`).
## New queries
| **Query** | **Tags** | **Purpose** |
@@ -14,71 +12,36 @@ You can now suppress alerts using either single-line block comments (`/* ... */`
## Changes to existing queries
A new taint-tracking library is used by all the security queries that track tainted values
(`cpp/path-injection`, `cpp/cgi-xss`, `cpp/sql-injection`, `cpp/uncontrolled-process-operation`,
`cpp/unbounded-write`, `cpp/tainted-format-string`, `cpp/tainted-format-string-through-global`,
`cpp/uncontrolled-arithmetic`, `cpp/uncontrolled-allocation-size`, `cpp/user-controlled-bypass`,
`cpp/cleartext-storage-buffer`, `cpp/tainted-permissions-check`).
These queries now have more precise results and also offer _path explanations_ so you can explore the results easily.
There is a performance cost to this, and the LGTM query suite will overall run slower than before.
| **Query** | **Expected impact** | **Change** |
|----------------------------|------------------------|------------------------------------------------------------------|
| Boost\_asio TLS Settings Misconfiguration (`cpp/boost/tls-settings-misconfiguration`) | Query id change | The identifier was updated to use dashes in place of underscores (previous identifier `cpp/boost/tls_settings_misconfiguration`). |
| Buffer not sufficient for string (`cpp/overflow-calculated`) | More true positive results | This query now identifies a wider variety of buffer allocations using the `semmle.code.cpp.models.interfaces.Allocation` library. |
| Hard-coded Japanese era start date (`cpp/japanese-era/exact-era-date`) | | This query is no longer run on LGTM. |
| No space for zero terminator (`cpp/no-space-for-terminator`) | More true positive results | This query now identifies a wider variety of buffer allocations using the `semmle.code.cpp.models.interfaces.Allocation` library. |
| Memory is never freed (`cpp/memory-never-freed`) | More true positive results | This query now identifies a wider variety of buffer allocations using the `semmle.code.cpp.models.interfaces.Allocation` library. |
| Memory may not be freed (`cpp/memory-may-not-be-freed`) | More true positive results | This query now identifies a wider variety of buffer allocations using the `semmle.code.cpp.models.interfaces.Allocation` library. |
| Mismatching new/free or malloc/delete (`cpp/new-free-mismatch`) | Fewer false positive results | Improved handling of template code gives greater precision. |
| Missing return statement (`cpp/missing-return`) | Fewer false positive results and more accurate locations | Functions containing `asm` statements are no longer highlighted by this query. The locations reported by this query are now more accurate in some cases. |
| No space for zero terminator (`cpp/no-space-for-terminator`) | More results with greater precision | The query gives more precise results for a wider variety of buffer allocations. String arguments to formatting functions are now (usually) expected to be null terminated strings. Use of the `semmle.code.cpp.models.interfaces.Allocation` library identifies problems with a wider variety of buffer allocations. This query is also more conservative when identifying which pointers point to null-terminated strings. |
| Overflow in uncontrolled allocation size (`cpp/uncontrolled-allocation-size`) | Fewer false positive results | The query now produces fewer, more accurate results. Cases where the tainted allocation size is range checked are more reliably excluded. |
| Missing return statement (`cpp/missing-return`) | Fewer false positive results | Functions containing `asm` statements are no longer highlighted by this query. |
| Hard-coded Japanese era start date (`cpp/japanese-era/exact-era-date`) | | This query is no longer run on LGTM. |
| No space for zero terminator (`cpp/no-space-for-terminator`) | Fewer false positive results | This query has been modified to be more conservative when identifying which pointers point to null-terminated strings. This approach produces fewer, more accurate results. |
| Overloaded assignment does not return 'this' (`cpp/assignment-does-not-return-this`) | Fewer false positive results | This query no longer reports incorrect results in template classes. |
| Pointer overflow check (`cpp/pointer-overflow-check`),<br> Possibly wrong buffer size in string copy (`cpp/bad-strncpy-size`),<br> Signed overflow check (`cpp/signed-overflow-check`) | More correct results | A new library is used for determining which expressions have identical value, giving more precise results. There is a performance cost to this, and the LGTM suite will overall run slower than before. |
| Unsafe array for days of the year (`cpp/leap-year/unsafe-array-for-days-of-the-year`) | | This query is no longer run on LGTM. |
| Unsigned comparison to zero (`cpp/unsigned-comparison-zero`) | More correct results | This query now also looks for comparisons of the form `0 <= x`. |
## Changes to libraries
* The built-in C++20 "spaceship operator" (`<=>`) is now supported via the QL
class `SpaceshipExpr`. Overloaded forms are modeled as calls to functions
named `operator<=>`.
* The data-flow library (`semmle.code.cpp.dataflow.DataFlow` and
`semmle.code.cpp.dataflow.TaintTracking`) has been improved, which affects
and improves some security queries. The improvements are:
- Track flow through functions that combine taint tracking with flow through fields.
- Track flow through clone-like functions, that is, functions that read the contents of a field from a
parameter and store the value in the field of a returned object.
* The security pack taint tracking library
(`semmle.code.cpp.security.TaintTracking`) uses a new intermediate
representation. This provides a more precise analysis of flow through
parameters and pointers. For new queries, however, we continue to recommend
using `semmle.code.cpp.dataflow.TaintTracking`.
* The global value numbering library
(`semmle.code.cpp.valuenumbering.GlobalValueNumbering`) uses a new
intermediate representation to provide a more precise analysis of
heap-allocated memory and pointers to stack variables.
* New libraries have been created to provide a more consistent and useful interface
for modeling allocation and deallocation. These replace the old
`semmle.code.cpp.commons.Alloc` library.
* The new `semmle.code.cpp.models.interfaces.Allocation` library models
allocations, such as `new` expressions and calls to `malloc`.
* The new `semmle.code.cpp.models.interfaces.Deallocation` library
models deallocations, such as `delete` expressions and calls to `free`.
* The predicate `freeCall` in `semmle.code.cpp.commons.Alloc` has been
deprecated. The `Allocation` and `Deallocation` models in
`semmle.code.cpp.models.interfaces` should be used instead.
* The data-flow library has been improved when flow through functions needs to be
combined with both taint tracking and flow through fields allowing more flow
to be tracked. This affects and improves some security queries, which may
report additional results.
* Created the `semmle.code.cpp.models.interfaces.Allocation` library to model allocation such as `new` expressions and calls to `malloc`. This is intended to replace the functionality in `semmle.code.cpp.commons.Alloc` with a more consistent and useful interface.
* Created the `semmle.code.cpp.models.interfaces.Deallocation` library to model deallocation such as `delete` expressions and calls to `free`. This is intended to replace the functionality in `semmle.code.cpp.commons.Alloc` with a more consistent and useful interface.
* The new class `StackVariable` should be used in place of `LocalScopeVariable`
in most cases. The difference is that `StackVariable` does not include
variables declared with `static` or `thread_local`.
* As a rule of thumb, custom queries about the _values_ of variables should
be changed from `LocalScopeVariable` to `StackVariable`, while queries
about the _name or scope_ of variables should remain unchanged.
* The `LocalScopeVariableReachability` library is deprecated in favor of
`StackVariableReachability`. The functionality is the same.
* Taint tracking and data flow now feature better modeling of commonly-used
library functions:
* `gets` and similar functions,
* the most common operations on `std::string`,
* `strdup` and similar functions, and
* formatting functions such as `sprintf`.
* As a rule of thumb, custom queries about the _values_ of variables should
be changed from `LocalScopeVariable` to `StackVariable`, while queries
about the _name or scope_ of variables should remain unchanged.
* The `LocalScopeVariableReachability` library is deprecated in favor of
`StackVariableReachability`. The functionality is the same.
* The models library models `strlen` in more detail, and includes common variations such as `wcslen`.
* The taint tracking library (`semmle.code.cpp.dataflow.TaintTracking`) has had
the following improvements:
* The library now models data flow through `strdup` and similar functions.
* The library now models data flow through formatting functions such as `sprintf`.

View File

@@ -2,47 +2,39 @@
The following changes in version 1.24 affect C# analysis in all applications.
## General improvements
You can now suppress alerts using either single-line block comments (`/* ... */`) or line comments (`// ...`).
## New queries
| **Query** | **Tags** | **Purpose** |
|-----------------------------|-----------|--------------------------------------------------------------------|
| Assembly path injection (`cs/assembly-path-injection`) | security, external/cwe/cwe-114 | Finds user-controlled data used to load an assembly. Results are shown on LGTM by default. |
| Insecure configuration for ASP.NET requestValidationMode (`cs/insecure-request-validation-mode`) | security, external/cwe/cwe-016 | Finds where this attribute has been set to a value less than 4.5, which turns off some validation features and makes the application less secure. By default, the query is not run on LGTM. |
| Insecure SQL connection (`cs/insecure-sql-connection`) | security, external/cwe/cwe-327 | Finds unencrypted SQL connection strings. Results are not shown on LGTM by default. |
| Page request validation is disabled (`cs/web/request-validation-disabled`) | security, frameworks/asp.net, external/cwe/cwe-016 | Finds where ASP.NET page request validation has been disabled, which could make the application less secure. By default, the query is not run on LGTM. |
| Serialization check bypass (`cs/serialization-check-bypass`) | security, external/cwe/cwe-20 | Finds where data is not validated in a deserialization method. Results are not shown on LGTM by default. |
| XML injection (`cs/xml-injection`) | security, external/cwe/cwe-091 | Finds user-controlled data that is used to write directly to an XML document. Results are shown on LGTM by default. |
| Assembly path injection (`cs/assembly-path-injection`) | security, external/cwe/cwe-114 | Finds user-controlled data used to load an assembly. |
| Insecure configuration for ASP.NET requestValidationMode (`cs/insecure-request-validation-mode`) | security, external/cwe/cwe-016 | Finds where this attribute has been set to a value less than 4.5, which turns off some validation features and makes the application less secure. |
| Insecure SQL connection (`cs/insecure-sql-connection`) | security, external/cwe/cwe-327 | Finds unencrypted SQL connection strings. |
| Page request validation is disabled (`cs/web/request-validation-disabled`) | security, frameworks/asp.net, external/cwe/cwe-016 | Finds where ASP.NET page request validation has been disabled, which could make the application less secure. |
| Serialization check bypass (`cs/serialization-check-bypass`) | security, external/cwe/cwe-20 | Finds where data is not validated in a deserialization method. |
| XML injection (`cs/xml-injection`) | security, external/cwe/cwe-091 | Finds user-controlled data that is used to write directly to an XML document. |
## Changes to existing queries
| **Query** | **Expected impact** | **Change** |
|------------------------------|------------------------|-----------------------------------|
| Useless assignment to local variable (`cs/useless-assignment-to-local`) | Fewer false positive results | Results have been removed when the variable is named `_` in a `foreach` statement. |
| Dereferenced variable may be null (`cs/dereferenced-value-may-be-null`) | More results | Results are reported from parameters with a default value of `null`. |
| Information exposure through an exception (`cs/information-exposure-through-exception`) | More results | The query now recognizes writes to cookies, writes to ASP.NET (`Inner`)`Text` properties, and email contents as additional sinks. |
| Information exposure through transmitted data (`cs/sensitive-data-transmission`) | More results | The query now recognizes writes to cookies and writes to ASP.NET (`Inner`)`Text` properties as additional sinks. |
| Potentially dangerous use of non-short-circuit logic (`cs/non-short-circuit`) | Fewer false positive results | Results have been removed when the expression contains an `out` parameter. |
| Useless assignment to local variable (`cs/useless-assignment-to-local`) | Fewer false positive results | Results have been removed when the value assigned is an (implicitly or explicitly) cast default-like value. For example, `var s = (string)null` and `string s = default`. Results have also been removed when the variable is named `_` in a `foreach` statement. |
| XPath injection (`cs/xml/xpath-injection`) | More results | The query now recognizes calls to methods on `System.Xml.XPath.XPathNavigator` objects. |
## Removal of old queries
## Changes to code extraction
* Tuple expressions, for example `(int,bool)` in `default((int,bool))`, are now extracted correctly.
* Expression nullability flow state is extracted.
* Implicitly typed `stackalloc` expressions are now extracted correctly.
* The difference between `stackalloc` array creations and normal array creations is extracted.
* Expression nullability flow state is extracted.
## Changes to libraries
* The data-flow library has been improved, which affects and improves most security queries. The improvements are:
- Track flow through methods that combine taint tracking with flow through fields.
- Track flow through clone-like methods, that is, methods that read the contents of a field from a
parameter and store the value in the field of a returned object.
* The data-flow library has been improved when flow through methods needs to be
combined with both taint tracking and flow through fields allowing more flow
to be tracked. This affects and improves most security queries, which may
report additional results.
* The taint tracking library now tracks flow through (implicit or explicit) conversion operator calls.
* [Code contracts](https://docs.microsoft.com/en-us/dotnet/framework/debug-trace-profile/code-contracts) are now recognized, and are treated like any other assertion methods.
* Expression nullability flow state is given by the predicates `Expr.hasNotNullFlowState()` and `Expr.hasMaybeNullFlowState()`.
* `stackalloc` array creations are now represented by the QL class `Stackalloc`. Previously they were represented by the class `ArrayCreation`.
* A new class `RemoteFlowSink` has been added to model sinks where data might be exposed to external users. Examples include web page output, emails, and cookies.
## Changes to autobuilder

View File

@@ -4,33 +4,31 @@ The following changes in version 1.24 affect Java analysis in all applications.
## General improvements
* You can now suppress alerts using either single-line block comments (`/* ... */`) or line comments (`// ...`).
* A `Customizations.qll` file has been added to allow customizations of the standard library that apply to all queries.
* Alert suppression can now be done with single-line block comments (`/* ... */`) as well as line comments (`// ...`).
## New queries
| **Query** | **Tags** | **Purpose** |
|-----------------------------|-----------|--------------------------------------------------------------------|
| Disabled Spring CSRF protection (`java/spring-disabled-csrf-protection`) | security, external/cwe/cwe-352 | Finds disabled Cross-Site Request Forgery (CSRF) protection in Spring. Results are shown on LGTM by default. |
| Disabled Spring CSRF protection (`java/spring-disabled-csrf-protection`) | security, external/cwe/cwe-352 | Finds disabled Cross-Site Request Forgery (CSRF) protection in Spring. |
| Failure to use HTTPS or SFTP URL in Maven artifact upload/download (`java/maven/non-https-url`) | security, external/cwe/cwe-300, external/cwe/cwe-319, external/cwe/cwe-494, external/cwe/cwe-829 | Finds use of insecure protocols during Maven dependency resolution. Results are shown on LGTM by default. |
| LDAP query built from user-controlled sources (`java/ldap-injection`) | security, external/cwe/cwe-090 | Finds LDAP queries vulnerable to injection of unsanitized user-controlled input. Results are shown on LGTM by default. |
| Left shift by more than the type width (`java/lshift-larger-than-type-width`) | correctness | Finds left shifts of ints by 32 bits or more and left shifts of longs by 64 bits or more. Results are shown on LGTM by default. |
| Suspicious date format (`java/suspicious-date-format`) | correctness | Finds date format patterns that use placeholders that are likely to be incorrect. Results are shown on LGTM by default. |
| Suspicious date format (`java/suspicious-date-format`) | correctness | Finds date format patterns that use placeholders that are likely to be incorrect. |
## Changes to existing queries
| **Query** | **Expected impact** | **Change** |
|------------------------------|------------------------|-----------------------------------|
| Dereferenced variable may be null (`java/dereferenced-value-may-be-null`) | Fewer false positive results | Final fields with a non-null initializer are no longer reported. |
| Expression always evaluates to the same value (`java/evaluation-to-constant`) | Fewer false positive results | Expressions of the form `0 * x` are usually intended and no longer reported. Also, left shifts of ints by 32 bits and of longs by 64 bits are no longer reported because they are not constant; these results are instead reported by the new query `java/lshift-larger-than-type-width`. |
| Useless null check (`java/useless-null-check`) | More true positive results | Useless checks on final fields with a non-null initializer are now reported. |
| Dereferenced variable may be null (`java/dereferenced-value-may-be-null`) | Fewer false positives | Final fields with a non-null initializer are no longer reported. |
| Expression always evaluates to the same value (`java/evaluation-to-constant`) | Fewer false positives | Expressions of the form `0 * x` are usually intended and no longer reported. Also, left shifts of ints by 32 bits and of longs by 64 bits are no longer reported because they are not constant; these results are instead reported by the new query `java/lshift-larger-than-type-width`. |
| Useless null check (`java/useless-null-check`) | More true positives | Useless checks on final fields with a non-null initializer are now reported. |
## Changes to libraries
* The data-flow library has been improved, which affects and improves most security queries. The improvements are:
- Track flow through methods that combine taint tracking with flow through fields.
- Track flow through clone-like methods, that is, methods that read the contents of a field from a
parameter and store the value in the field of a returned object.
* The data-flow library has been improved when flow through methods needs to be
combined with both taint tracking and flow through fields allowing more flow
to be tracked. This affects and improves most security queries, which may
report additional results.
* Identification of test classes has been improved. Previously, one of the
match conditions would classify any class with a name containing the string
"Test" as a test class, but now this matching has been replaced with one that
@@ -38,6 +36,6 @@ The following changes in version 1.24 affect Java analysis in all applications.
general file classification mechanism and thus suppression of alerts, and
also any security queries using taint tracking, as test classes act as
default barriers stopping taint flow.
* Parentheses are now no longer modeled directly in the AST, that is, the
* Parentheses are now no longer modelled directly in the AST, that is, the
`ParExpr` class is empty. Instead, a parenthesized expression can be
identified with the `Expr.isParenthesized()` member predicate.

View File

@@ -2,99 +2,47 @@
## General improvements
* TypeScript 3.8 is now supported.
* Alert suppression can now be done with single-line block comments (`/* ... */`) as well as line comments (`// ...`).
* You can now suppress alerts using either single-line block comments (`/* ... */`) or line comments (`// ...`).
* Imports with the `.js` extension can now be resolved to a TypeScript file,
when the import refers to a file generated by TypeScript.
* Resolution of imports has improved, leading to more results from the security queries:
- Imports with the `.js` extension can now be resolved to a TypeScript file,
when the import refers to a file generated by TypeScript.
- Imports that rely on path-mappings from a `tsconfig.json` file can now be resolved.
- Export declarations of the form `export * as ns from "x"` are now analyzed more precisely.
* The analysis of sanitizers has improved, leading to more accurate results from the security queries.
In particular:
- Sanitizer guards now act across function boundaries in more cases.
- Sanitizers can now better distinguish between a tainted value and an object _containing_ a tainted value.
* Call graph construction has been improved, leading to more results from the security queries:
- Calls can now be resolved to indirectly-defined class members in more cases.
- Calls through partial invocations such as `.bind` can now be resolved in more cases.
* Support for flow summaries has been more clearly marked as being experimental and moved to the new `experimental` folder.
- The analysis of sanitizer guards has improved, leading to fewer false-positive results from the security queries.
* Support for the following frameworks and libraries has been improved:
- [chrome-remote-interface](https://www.npmjs.com/package/chrome-remote-interface)
- [Electron](https://electronjs.org/)
- [for-in](https://www.npmjs.com/package/for-in)
- [for-own](https://www.npmjs.com/package/for-own)
- [fstream](https://www.npmjs.com/package/fstream)
- [Handlebars](https://www.npmjs.com/package/handlebars)
- [http2](https://nodejs.org/api/http2.html)
- [jQuery](https://jquery.com/)
- [jsonfile](https://www.npmjs.com/package/jsonfile)
- [Koa](https://www.npmjs.com/package/koa)
- [lazy-cache](https://www.npmjs.com/package/lazy-cache)
- [mongodb](https://www.npmjs.com/package/mongodb)
- [ncp](https://www.npmjs.com/package/ncp)
- [Node.js](https://nodejs.org/)
- [node-dir](https://www.npmjs.com/package/node-dir)
- [path-exists](https://www.npmjs.com/package/path-exists)
- [pg](https://www.npmjs.com/package/pg)
- [react](https://www.npmjs.com/package/react)
- [recursive-readdir](https://www.npmjs.com/package/recursive-readdir)
- [request](https://www.npmjs.com/package/request)
- [rimraf](https://www.npmjs.com/package/rimraf)
- [send](https://www.npmjs.com/package/send)
- [Socket.IO](https://socket.io/)
- [SockJS](https://www.npmjs.com/package/sockjs)
- [SockJS-client](https://www.npmjs.com/package/sockjs-client)
- [typeahead.js](https://www.npmjs.com/package/typeahead.js)
- [vinyl-fs](https://www.npmjs.com/package/vinyl-fs)
- [WebSocket](https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API)
- [write-file-atomic](https://www.npmjs.com/package/write-file-atomic)
- [ws](https://github.com/websockets/ws)
- [Electron](https://electronjs.org/)
- [Handlebars](https://www.npmjs.com/package/handlebars)
- [Koa](https://www.npmjs.com/package/koa)
- [Node.js](https://nodejs.org/)
- [Socket.IO](https://socket.io/)
- [WebSocket](https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API)
- [http2](https://nodejs.org/api/http2.html)
- [react](https://www.npmjs.com/package/react)
- [typeahead.js](https://www.npmjs.com/package/typeahead.js)
- [ws](https://github.com/websockets/ws)
## New queries
| **Query** | **Tags** | **Purpose** |
|---------------------------------------------------------------------------------|-------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Cross-site scripting through exception (`js/xss-through-exception`) | security, external/cwe/cwe-079, external/cwe/cwe-116 | Highlights potential XSS vulnerabilities where an exception is written to the DOM. Results are not shown on LGTM by default. |
| Missing await (`js/missing-await`) | correctness | Highlights expressions that operate directly on a promise object in a nonsensical way, instead of awaiting its result. Results are shown on LGTM by default. |
| Polynomial regular expression used on uncontrolled data (`js/polynomial-redos`) | security, external/cwe/cwe-730, external/cwe/cwe-400 | Highlights expensive regular expressions that may be used on malicious input. Results are shown on LGTM by default. |
| Prototype pollution in utility function (`js/prototype-pollution-utility`) | security, external/cwe/cwe-400, external/cwe/cwe-471 | Highlights recursive assignment operations that are susceptible to prototype pollution. Results are shown on LGTM by default. |
| Regular expression always matches (`js/regex/always-matches`) | correctness, regular-expressions | Highlights regular expression checks that trivially succeed by matching an empty substring. Results are shown on LGTM by default. |
| Unsafe jQuery plugin (`js/unsafe-jquery-plugin`) | | Highlights potential XSS vulnerabilities in unsafely designed jQuery plugins. Results are shown on LGTM by default. |
| Unnecessary use of `cat` process (`js/unnecessary-use-of-cat`) | correctness, security, maintainability | Highlights command executions of `cat` where the fs API should be used instead. Results are shown on LGTM by default. |
| Missing await (`js/missing-await`) | correctness | Highlights expressions that operate directly on a promise object in a nonsensical way, instead of awaiting its result. Results are shown on LGTM by default. |
| Prototype pollution in utility function (`js/prototype-pollution-utility`) | security, external/cwe/cwe-400, external/cwe/cwe-471 | Highlights recursive copying operations that are susceptible to prototype pollution. Results are shown on LGTM by default. |
## Changes to existing queries
| **Query** | **Expected impact** | **Change** |
|--------------------------------|------------------------------|---------------------------------------------------------------------------|
| Clear-text logging of sensitive information (`js/clear-text-logging`) | More results | More results involving `process.env` and indirect calls to logging methods are recognized. |
| Duplicate parameter names (`js/duplicate-parameter-name`) | Fewer results | This query now ignores additional parameters that reasonably can have duplicated names. |
| Duplicate parameter names (`js/duplicate-parameter-name`) | Fewer results | This query now recognizes additional parameters that reasonably can have duplicated names. |
| Incomplete string escaping or encoding (`js/incomplete-sanitization`) | Fewer false positive results | This query now recognizes additional cases where a single replacement is likely to be intentional. |
| Unbound event handler receiver (`js/unbound-event-handler-receiver`) | Fewer false positive results | This query now recognizes additional ways event handler receivers can be bound. |
| Expression has no effect (`js/useless-expression`) | Fewer false positive results | The query now recognizes block-level flow type annotations and ignores the first statement of a try block. |
| Identical operands (`js/redundant-operation`) | Fewer results | This query now excludes cases where the operands change a value using ++/-- expressions. |
| Incomplete string escaping or encoding (`js/incomplete-sanitization`) | Fewer false positive results | This query now recognizes and excludes additional cases where a single replacement is likely to be intentional. |
| Incomplete URL scheme check (`js/incomplete-url-scheme-check`) | More results | This query now recognizes additional variations of URL scheme checks. |
| Missing CSRF middleware (`js/missing-token-validation`) | Fewer false positive results | The query reports fewer duplicates and only flags handlers that explicitly access cookie data. |
| Superfluous trailing arguments (`js/superfluous-trailing-arguments`) | Fewer results | This query now excludes cases where a function uses the `Function.arguments` value to process a variable number of parameters. |
| Syntax error (`js/syntax-error`) | Lower severity | The results of this query are now displayed with lower severity. |
| Unbound event handler receiver (`js/unbound-event-handler-receiver`) | Fewer false positive results | This query now recognizes additional ways event handler receivers can be bound. |
| Uncontrolled command line (`js/command-line-injection`) | More results | This query now recognizes additional ways of constructing arguments to `cmd.exe` and `/bin/sh`. |
| Uncontrolled data used in path expression (`js/path-injection`) | More results | This query now recognizes additional ways dangerous paths can be constructed and used. |
| Use of call stack introspection in strict mode (`js/strict-mode-call-stack-introspection`) | Fewer false positive results | The query no longer flags expression statements. |
| Use of password hash with insufficient computational effort (`js/insufficient-password-hash`) | Fewer false positive results | This query now recognizes and excludes additional cases that do not require secure hashing. |
| Useless regular-expression character escape (`js/useless-regexp-character-escape`) | Fewer false positive results | This query now distinguishes between escapes in strings and regular expression literals. |
| Missing CSRF middleware (`js/missing-token-validation`) | Fewer false positive results | The query reports fewer duplicates and only flags handlers that explicitly access cookie data. |
## Changes to libraries
* The predicates `RegExpTerm.getSuccessor` and `RegExpTerm.getPredecessor` have been changed to reflect textual, not operational, matching order. This only makes a difference in lookbehind assertions, which are operationally matched backwards. Previously, `getSuccessor` would mimic this, so in an assertion `(?<=ab)` the term `b` would be considered the predecessor, not the successor, of `a`. Textually, however, `a` is still matched before `b`, and this is the order we now follow.
* An extensible model of the `EventEmitter` pattern has been implemented.
* Taint-tracking configurations now interact differently with the `data` flow label, which may affect queries
that combine taint-tracking and flow labels.
- Sources added by the 1-argument `isSource` predicate are associated with the `taint` label now, instead of the `data` label.
- Sanitizers now only block the `taint` label. As a result, sanitizers no longer block the flow of tainted values wrapped inside a property of an object.
To retain the old behavior, instead use a barrier, or block the `data` flow label using a labeled sanitizer.
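The textual order described for lookbehind assertions in the `RegExpTerm` note above can be seen with any regex engine that supports lookbehind. The following is a minimal sketch using Python's standard `re` module rather than the CodeQL library or a JavaScript engine; the pattern and input strings are made up for illustration. It only shows that, inside `(?<=ab)`, the text must contain `a` before `b`, regardless of the direction in which an engine evaluates the assertion.

```python
import re

# Hypothetical pattern: a "c" that is preceded by the text "ab".
lookbehind = re.compile(r"(?<=ab)c")

# "a" comes textually before "b", so the assertion holds here.
match_in_order = lookbehind.search("abc")

# With the letters swapped in the input, the assertion fails.
match_swapped = lookbehind.search("bac")

print(match_in_order is not None)
print(match_swapped is None)
```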

View File

@@ -1,55 +0,0 @@
# Improvements to Python analysis
The following changes in version 1.24 affect Python analysis in all applications.
## General improvements
- Support for Django version 2.x and 3.x
- Taint tracking now correctly tracks taint in destructuring assignments. For example, if `tainted_list` is a list of tainted elements, then
```python
head, *tail = tainted_list
```
will result in `tail` being tainted with the same taint as `tainted_list`, and `head` being tainted with the taint of the elements of `tainted_list`.
- A large number of libraries and queries have been moved to the new `Value` API, which should result in more precise results.
- The `Value` interface has been extended in various ways:
- A new `StringValue` class has been added, for tracking string literals.
- Values now have a `booleanValue` method which returns the boolean interpretation of the given value.
- Built-in methods for which the return type is not fixed are now modeled as returning an unknown value by default.
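As a reminder of the runtime behavior that the destructuring item above relies on, here is a small plain-Python sketch. It does not use any CodeQL API, and the list contents are hypothetical placeholder values. It shows that the first target receives a single element while the starred target receives a new list of the remaining elements, which is why the two names carry different kinds of taint.

```python
# Hypothetical attacker-controlled values, for illustration only.
tainted_list = ["user_a", "user_b", "user_c"]

head, *tail = tainted_list

# head is one element of the original list.
print(head)

# tail is a list holding the remaining elements, so it has the
# same "collection of elements" shape as tainted_list.
print(tail)
```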
## Changes to existing queries
| **Query** | **Expected impact** | **Change** |
|----------------------------|------------------------|------------------------------------------------------------------|
| Arbitrary file write during tarfile extraction (`py/tarslip`) | Fewer false negative results | Negations are now handled correctly in conditional expressions that may sanitize tainted values. |
| First parameter of a method is not named 'self' (`py/not-named-self`) | Fewer false positive results | `__class_getitem__` is now recognized as a class method. |
| Import of deprecated module (`py/import-deprecated-module`) | Fewer false positive results | Deprecated modules that are used to provide backwards compatibility are no longer reported.|
| Module imports itself (`py/import-own-module`) | Fewer false positive results | Imports local to a given package are no longer classified as self-imports. |
| Uncontrolled command line (`py/command-line-injection`) | More results | We now model the `fabric` and `invoke` packages for command execution. |
### Web framework support
The CodeQL library has improved support for the web frameworks: Bottle, CherryPy, Falcon, Pyramid, TurboGears, Tornado, and Twisted. They now provide a proper `HttpRequestTaintSource`, instead of a `TaintSource`. This will enable results for the following queries:
- `py/path-injection`
- `py/command-line-injection`
- `py/reflective-xss`
- `py/sql-injection`
- `py/code-injection`
- `py/unsafe-deserialization`
- `py/url-redirection`
The library also has improved support for the web framework Twisted. It now provides a proper
`HttpResponseTaintSink`, instead of a `TaintSink`. This will enable results for the following
queries:
- `py/reflective-xss`
- `py/stack-trace-exposure`
## Changes to libraries
### Taint tracking
- The `urlsplit` and `urlparse` functions now propagate taint appropriately.
- HTTP requests using the `requests` library are now modeled.
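To illustrate why taint propagation through `urlsplit` and `urlparse` matters, the sketch below uses the standard `urllib.parse` module on a made-up URL. It is not CodeQL code, and `untrusted_url` is a hypothetical stand-in for user input. Every component of the result is a substring of the input, so if the input is attacker-controlled, the components are too.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical user-controlled input.
untrusted_url = "https://example.com/search?q=payload#frag"

split = urlsplit(untrusted_url)
parsed = urlparse(untrusted_url)

# Each component is derived directly from the untrusted string.
print(split.netloc, split.path, split.query, split.fragment)
print(parsed.query)
```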

View File

@@ -39,12 +39,6 @@
"java/ql/src/semmle/code/java/dataflow/internal/tainttracking1/TaintTrackingImpl.qll",
"java/ql/src/semmle/code/java/dataflow/internal/tainttracking2/TaintTrackingImpl.qll"
],
"DataFlow Java/C++/C# Consistency checks": [
"java/ql/src/semmle/code/java/dataflow/internal/DataFlowImplConsistency.qll",
"cpp/ql/src/semmle/code/cpp/dataflow/internal/DataFlowImplConsistency.qll",
"cpp/ql/src/semmle/code/cpp/ir/dataflow/internal/DataFlowImplConsistency.qll",
"csharp/ql/src/semmle/code/csharp/dataflow/internal/DataFlowImplConsistency.qll"
],
"C++ SubBasicBlocks": [
"cpp/ql/src/semmle/code/cpp/controlflow/SubBasicBlocks.qll",
"cpp/ql/src/semmle/code/cpp/dataflow/internal/SubBasicBlocks.qll"
@@ -228,27 +222,13 @@
"cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/PrintSSA.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/PrintSSA.qll"
],
"IR ValueNumberInternal": [
"cpp/ql/src/semmle/code/cpp/ir/implementation/raw/gvn/internal/ValueNumberingInternal.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/gvn/internal/ValueNumberingInternal.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/internal/ValueNumberingInternal.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/raw/gvn/internal/ValueNumberingInternal.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/gvn/internal/ValueNumberingInternal.qll"
],
"C++ IR ValueNumber": [
"IR ValueNumber": [
"cpp/ql/src/semmle/code/cpp/ir/implementation/raw/gvn/ValueNumbering.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/gvn/ValueNumbering.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/ValueNumbering.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/raw/gvn/ValueNumbering.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/gvn/ValueNumbering.qll"
],
"C++ IR PrintValueNumbering": [
"cpp/ql/src/semmle/code/cpp/ir/implementation/raw/gvn/PrintValueNumbering.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/gvn/PrintValueNumbering.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/PrintValueNumbering.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/raw/gvn/PrintValueNumbering.qll",
"csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/gvn/PrintValueNumbering.qll"
],
"C++ IR ConstantAnalysis": [
"cpp/ql/src/semmle/code/cpp/ir/implementation/raw/constant/ConstantAnalysis.qll",
"cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/constant/ConstantAnalysis.qll",

View File

@@ -1,140 +0,0 @@
#!/usr/bin/env python3
# Due to various technical limitations, we sometimes have files that need to be
# kept identical in the repository. This script loads a database of such
# files and can perform two functions: check whether they are still identical,
# and overwrite the others with a master copy if needed.
import hashlib
import shutil
import os
import sys
import json
import re
path = os.path
file_groups = {}
def add_prefix(prefix, relative):
result = path.join(prefix, relative)
if path.commonprefix((path.realpath(result), path.realpath(prefix))) != \
path.realpath(prefix):
raise Exception("Path {} is not below {}".format(
result, prefix))
return result
def load_if_exists(prefix, json_file_relative):
json_file_name = path.join(prefix, json_file_relative)
if path.isfile(json_file_name):
print("Loading file groups from", json_file_name)
with open(json_file_name, 'r', encoding='utf-8') as fp:
raw_groups = json.load(fp)
prefixed_groups = {
name: [
add_prefix(prefix, relative)
for relative in relatives
]
for name, relatives in raw_groups.items()
}
file_groups.update(prefixed_groups)
# Generates a list of C# test files that should be in sync
def csharp_test_files():
test_file_re = re.compile('.*(Bad|Good)[0-9]*\\.cs$')
csharp_doc_files = {
file:os.path.join(root, file)
for root, dirs, files in os.walk("csharp/ql/src")
for file in files
if test_file_re.match(file)
}
return {
"C# test '" + file + "'" : [os.path.join(root, file), csharp_doc_files[file]]
for root, dirs, files in os.walk("csharp/ql/test")
for file in files
if file in csharp_doc_files
}
def file_checksum(filename):
with open(filename, 'rb') as file_handle:
return hashlib.sha1(file_handle.read()).hexdigest()
def check_group(group_name, files, master_file_picker, emit_error):
checksums = {file_checksum(f) for f in files}
if len(checksums) == 1:
return
master_file = master_file_picker(files)
if master_file is None:
emit_error(__file__, 0,
"Files from group '"+ group_name +"' not in sync.")
emit_error(__file__, 0,
"Run this script with a file-name argument among the "
"following to overwrite the remaining files with the contents "
"of that file or run with the --latest switch to update each "
"group of files from the most recently modified file in the group.")
for filename in files:
emit_error(__file__, 0, " " + filename)
else:
print(" Syncing others from", master_file)
for filename in files:
if filename == master_file:
continue
print(" " + filename)
os.replace(filename, filename + '~')
shutil.copy(master_file, filename)
print(" Backups written with '~' appended to file names")
def chdir_repo_root():
root_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), '..')
os.chdir(root_path)
def choose_master_file(master_file, files):
if master_file in files:
return master_file
else:
return None
def choose_latest_file(files):
latest_time = None
latest_file = None
for filename in files:
file_time = os.path.getmtime(filename)
if (latest_time is None) or (latest_time < file_time):
latest_time = file_time
latest_file = filename
return latest_file
local_error_count = 0
def emit_local_error(path, line, error):
print('ERROR: ' + path + ':' + str(line) + " - " + error)
global local_error_count
local_error_count += 1
# This function is invoked directly by a CI script, which passes a different error-handling
# callback.
def sync_identical_files(emit_error):
if len(sys.argv) == 1:
master_file_picker = lambda files: None
elif len(sys.argv) == 2:
if sys.argv[1] == "--latest":
master_file_picker = choose_latest_file
elif os.path.isfile(sys.argv[1]):
master_file_picker = lambda files: choose_master_file(sys.argv[1], files)
else:
raise Exception("File not found")
else:
raise Exception("Bad command line or file not found")
chdir_repo_root()
load_if_exists('.', 'config/identical-files.json')
file_groups.update(csharp_test_files())
for group_name, files in file_groups.items():
check_group(group_name, files, master_file_picker, emit_error)
def main():
sync_identical_files(emit_local_error)
if local_error_count > 0:
sys.exit(1)
if __name__ == "__main__":
main()
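
The core idea of the script above (hash every file in a group, treat the group as in sync when only one distinct checksum remains, and otherwise pick the most recently modified file as the master copy) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the script itself: `file_checksum` mirrors the function of the same name, while `group_in_sync`, `latest_file`, `demo`, and the temporary file names are hypothetical.

```python
import hashlib
import os
import tempfile


def file_checksum(filename):
    # Same approach as the script: SHA-1 over the raw file bytes.
    with open(filename, "rb") as file_handle:
        return hashlib.sha1(file_handle.read()).hexdigest()


def group_in_sync(files):
    # A group is in sync when all files share a single checksum.
    return len({file_checksum(f) for f in files}) == 1


def latest_file(files):
    # Counterpart of the `--latest` switch: most recently modified file.
    return max(files, key=os.path.getmtime)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.qll")
        b = os.path.join(tmp, "b.qll")
        for name in (a, b):
            with open(name, "w", encoding="utf-8") as fh:
                fh.write("predicate p() { any() }\n")
        same = group_in_sync([a, b])

        # Simulate a local edit to one copy and make it clearly newer.
        with open(b, "a", encoding="utf-8") as fh:
            fh.write("// local edit\n")
        newer = os.path.getmtime(a) + 10
        os.utime(b, (newer, newer))

        differ = not group_in_sync([a, b])
        newest = os.path.basename(latest_file([a, b]))
        return same, differ, newest


result = demo()
```

Comparing a set of checksums keeps the check independent of group size, and it reports a mismatch without needing to know which copy is authoritative; that decision is deferred to the master-file picker.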

View File

@@ -4,7 +4,7 @@ private predicate freed(Expr e) {
e = any(DeallocationExpr de).getFreedExpr()
or
exists(ExprCall c |
// cautiously assume that any `ExprCall` could be a deallocation expression.
// cautiously assume that any ExprCall could be a freeCall.
c.getAnArgument() = e
)
}

View File

@@ -5,34 +5,16 @@
import cpp
import semmle.code.cpp.controlflow.SSA
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.models.implementations.Allocation
import semmle.code.cpp.models.implementations.Deallocation
/**
* Holds if `alloc` is a use of `malloc` or `new`. `kind` is
* a string describing the type of the allocation.
*/
predicate allocExpr(Expr alloc, string kind) {
isAllocationExpr(alloc) and
(
exists(Function target |
alloc.(AllocationExpr).(FunctionCall).getTarget() = target and
(
target.getName() = "operator new" and
kind = "new" and
// exclude placement new and custom overloads as they
// may not conform to assumptions
not target.getNumberOfParameters() > 1
or
target.getName() = "operator new[]" and
kind = "new[]" and
// exclude placement new and custom overloads as they
// may not conform to assumptions
not target.getNumberOfParameters() > 1
or
not target instanceof OperatorNewAllocationFunction and
kind = "malloc"
)
)
alloc instanceof FunctionCall and
kind = "malloc"
or
alloc instanceof NewExpr and
kind = "new" and
@@ -45,8 +27,7 @@ predicate allocExpr(Expr alloc, string kind) {
// exclude placement new and custom overloads as they
// may not conform to assumptions
not alloc.(NewArrayExpr).getAllocatorCall().getTarget().getNumberOfParameters() > 1
) and
not alloc.isFromUninstantiatedTemplate(_)
)
}
/**
@@ -128,20 +109,8 @@ predicate allocReaches(Expr e, Expr alloc, string kind) {
* describing the type of that free or delete.
*/
predicate freeExpr(Expr free, Expr freed, string kind) {
exists(Function target |
freed = free.(DeallocationExpr).getFreedExpr() and
free.(FunctionCall).getTarget() = target and
(
target.getName() = "operator delete" and
kind = "delete"
or
target.getName() = "operator delete[]" and
kind = "delete[]"
or
not target instanceof OperatorDeleteDeallocationFunction and
kind = "free"
)
)
freeCall(free, freed) and
kind = "free"
or
free.(DeleteExpr).getExpr() = freed and
kind = "delete"

View File

@@ -15,37 +15,24 @@ class ConstantZero extends Expr {
}
}
/**
* Holds if `candidate` is an expression such that if it's unsigned then we
* want an alert at `ge`.
*/
private predicate lookForUnsignedAt(RelationalOperation ge, Expr candidate) {
// Base case: `candidate >= 0` (or `0 <= candidate`)
(
ge instanceof GEExpr or
ge instanceof LEExpr
) and
ge.getLesserOperand() instanceof ConstantZero and
candidate = ge.getGreaterOperand().getFullyConverted() and
// left/greater operand was a signed or unsigned IntegralType before conversions
// (not a pointer, checking a pointer >= 0 is an entirely different mistake)
// (not an enum, as the fully converted type of an enum is compiler dependent
// so checking an enum >= 0 is always reasonable)
ge.getGreaterOperand().getUnderlyingType() instanceof IntegralType
or
// Recursive case: `...(largerType)candidate >= 0`
exists(Conversion conversion |
lookForUnsignedAt(ge, conversion) and
candidate = conversion.getExpr() and
conversion.getType().getSize() > candidate.getType().getSize()
)
}
class UnsignedGEZero extends ComparisonOperation {
class UnsignedGEZero extends GEExpr {
UnsignedGEZero() {
this.getRightOperand() instanceof ConstantZero and
// left operand was a signed or unsigned IntegralType before conversions
// (not a pointer, checking a pointer >= 0 is an entirely different mistake)
// (not an enum, as the fully converted type of an enum is compiler dependent
// so checking an enum >= 0 is always reasonable)
getLeftOperand().getUnderlyingType() instanceof IntegralType and
exists(Expr ue |
lookForUnsignedAt(this, ue) and
ue.getUnderlyingType().(IntegralType).isUnsigned()
// ue is some conversion of the left operand
ue = getLeftOperand().getConversion*() and
// ue is unsigned
ue.getUnderlyingType().(IntegralType).isUnsigned() and
// ue may be converted to zero or more strictly larger possibly signed types
// before it is fully converted
forall(Expr following | following = ue.getConversion+() |
following.getType().getSize() > ue.getType().getSize()
)
)
}
}

View File

@@ -3,7 +3,7 @@
* @description Using the TLS or SSLv23 protocol from the boost::asio library, but not disabling deprecated protocols, or disabling minimum-recommended protocols.
* @kind problem
* @problem.severity error
* @id cpp/boost/tls-settings-misconfiguration
* @id cpp/boost/tls_settings_misconfiguration
* @tags security
*/

View File

@@ -8,6 +8,6 @@ struct S {
// Whereas here it does make a semantic difference.
auto getValCorrect() const -> int {
return val;
return val
}
};

View File

@@ -6,7 +6,6 @@
import cpp
pragma[inline]
private predicate arithTypesMatch(Type arg, Type parm) {
arg = parm
or

View File

@@ -4,7 +4,7 @@
*
* By default they fall back to the reasonable defaults provided in
* `DefaultOptions.qll`, but by modifying this file, you can customize
* the standard analyses to give better results for your project.
* the standard Semmle analyses to give better results for your project.
*/
import cpp

View File

@@ -2,7 +2,7 @@
* @name Uncontrolled data used in path expression
* @description Accessing paths influenced by users can allow an
* attacker to access unexpected resources.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/path-injection
@@ -17,7 +17,6 @@ import cpp
import semmle.code.cpp.security.FunctionWithWrappers
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
/**
* A function for opening a file.
@@ -52,19 +51,12 @@ class FileFunction extends FunctionWithWrappers {
override predicate interestingArg(int arg) { arg = 0 }
}
class TaintedPathConfiguration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) {
exists(FileFunction fileFunction | fileFunction.outermostWrapperFunctionCall(tainted, _))
}
}
from
FileFunction fileFunction, Expr taintedArg, Expr taintSource, PathNode sourceNode,
PathNode sinkNode, string taintCause, string callChain
FileFunction fileFunction, Expr taintedArg, Expr taintSource, string taintCause, string callChain
where
fileFunction.outermostWrapperFunctionCall(taintedArg, callChain) and
taintedWithPath(taintSource, taintedArg, sourceNode, sinkNode) and
tainted(taintSource, taintedArg) and
isUserInput(taintSource, taintCause)
select taintedArg, sourceNode, sinkNode,
select taintedArg,
"This argument to a file access function is derived from $@ and then passed to " + callChain,
taintSource, "user input (" + taintCause + ")"

View File

@@ -2,7 +2,7 @@
* @name CGI script vulnerable to cross-site scripting
* @description Writing user input directly to a web page
* allows for a cross-site scripting vulnerability.
* @kind path-problem
* @kind problem
* @problem.severity error
* @precision high
* @id cpp/cgi-xss
@@ -13,7 +13,6 @@
import cpp
import semmle.code.cpp.commons.Environment
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
/** A call that prints its arguments to `stdout`. */
class PrintStdoutCall extends FunctionCall {
@@ -28,13 +27,8 @@ class QueryString extends EnvironmentRead {
QueryString() { getEnvironmentVariable() = "QUERY_STRING" }
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) {
exists(PrintStdoutCall call | call.getAnArgument() = tainted)
}
}
from QueryString query, Element printedArg, PathNode sourceNode, PathNode sinkNode
where taintedWithPath(query, printedArg, sourceNode, sinkNode)
select printedArg, sourceNode, sinkNode, "Cross-site scripting vulnerability due to $@.", query,
"this query data"
from QueryString query, PrintStdoutCall call, Element printedArg
where
call.getAnArgument() = printedArg and
tainted(query, printedArg)
select printedArg, "Cross-site scripting vulnerability due to $@.", query, "this query data"

View File

@@ -3,7 +3,7 @@
* @description Including user-supplied data in a SQL query without
* neutralizing special elements can make code vulnerable
* to SQL Injection.
* @kind path-problem
* @kind problem
* @problem.severity error
* @precision high
* @id cpp/sql-injection
@@ -15,7 +15,6 @@ import cpp
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.FunctionWithWrappers
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
class SQLLikeFunction extends FunctionWithWrappers {
SQLLikeFunction() { sqlArgument(this.getName(), _) }
@@ -23,19 +22,11 @@ class SQLLikeFunction extends FunctionWithWrappers {
override predicate interestingArg(int arg) { sqlArgument(this.getName(), arg) }
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) {
exists(SQLLikeFunction runSql | runSql.outermostWrapperFunctionCall(tainted, _))
}
}
from
SQLLikeFunction runSql, Expr taintedArg, Expr taintSource, PathNode sourceNode, PathNode sinkNode,
string taintCause, string callChain
from SQLLikeFunction runSql, Expr taintedArg, Expr taintSource, string taintCause, string callChain
where
runSql.outermostWrapperFunctionCall(taintedArg, callChain) and
taintedWithPath(taintSource, taintedArg, sourceNode, sinkNode) and
tainted(taintSource, taintedArg) and
isUserInput(taintSource, taintCause)
select taintedArg, sourceNode, sinkNode,
select taintedArg,
"This argument to a SQL query function is derived from $@ and then passed to " + callChain,
taintSource, "user input (" + taintCause + ")"

View File

@@ -3,7 +3,7 @@
* @description Using externally controlled strings in a process
* operation can allow an attacker to execute malicious
* commands.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/uncontrolled-process-operation
@@ -14,24 +14,13 @@
import cpp
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
predicate isProcessOperationExplanation(Expr arg, string processOperation) {
exists(int processOperationArg, FunctionCall call |
isProcessOperationArgument(processOperation, processOperationArg) and
call.getTarget().getName() = processOperation and
call.getArgument(processOperationArg) = arg
)
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element arg) { isProcessOperationExplanation(arg, _) }
}
from string processOperation, Expr arg, Expr source, PathNode sourceNode, PathNode sinkNode
from string processOperation, int processOperationArg, FunctionCall call, Expr arg, Element source
where
isProcessOperationExplanation(arg, processOperation) and
taintedWithPath(source, arg, sourceNode, sinkNode)
select arg, sourceNode, sinkNode,
isProcessOperationArgument(processOperation, processOperationArg) and
call.getTarget().getName() = processOperation and
call.getArgument(processOperationArg) = arg and
tainted(source, arg)
select arg,
"The value of this argument may come from $@ and is being passed to " + processOperation, source,
source.toString()

View File

@@ -2,7 +2,7 @@
* @name Unbounded write
* @description Buffer write operations that do not control the length
* of data written may overflow.
* @kind path-problem
* @kind problem
* @problem.severity error
* @precision medium
* @id cpp/unbounded-write
@@ -16,7 +16,6 @@
import semmle.code.cpp.security.BufferWrite
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
/*
* --- Summary of CWE-120 alerts ---
@@ -55,48 +54,32 @@ predicate isUnboundedWrite(BufferWrite bw) {
* }
*/
/**
* Holds if `e` is a source buffer going into an unbounded write `bw` or a
* qualifier of (a qualifier of ...) such a source.
*/
predicate unboundedWriteSource(Expr e, BufferWrite bw) {
isUnboundedWrite(bw) and e = bw.getASource()
or
exists(FieldAccess fa | unboundedWriteSource(fa, bw) and e = fa.getQualifier())
}
/*
* --- user input reach ---
*/
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) { unboundedWriteSource(tainted, _) }
override predicate taintThroughGlobals() { any() }
/**
* Identifies expressions that are potentially tainted with user
* input. Most of the work for this is actually done by the
* TaintTracking library.
*/
predicate tainted2(Expr expr, Expr inputSource, string inputCause) {
taintedIncludingGlobalVars(inputSource, expr, _) and
inputCause = inputSource.toString()
or
exists(Expr e | tainted2(e, inputSource, inputCause) |
// field accesses of a tainted struct are tainted
e = expr.(FieldAccess).getQualifier()
)
}
/*
* --- put it together ---
*/
/*
* An unbounded write is, for example `strcpy(..., tainted)`. We're looking
* for a tainted source buffer of an unbounded write, where this source buffer
* is a sink in the taint-tracking analysis.
*
* In the case of `gets` and `scanf`, where the source buffer is implicit, the
* `BufferWrite` library reports the source buffer to be the same as the
* destination buffer. Since those destination-buffer arguments are also
* modeled in the taint-tracking library as being _sources_ of taint, they are
* in practice reported as being tainted because the `security.TaintTracking`
* library does not distinguish between taint going into an argument and out of
* an argument. Thus, we get the desired alerts.
*/
from BufferWrite bw, Expr inputSource, Expr tainted, PathNode sourceNode, PathNode sinkNode
from BufferWrite bw, Expr inputSource, string inputCause
where
taintedWithPath(inputSource, tainted, sourceNode, sinkNode) and
unboundedWriteSource(tainted, bw)
select bw, sourceNode, sinkNode,
"This '" + bw.getBWDesc() + "' with input from $@ may overflow the destination.", inputSource,
inputSource.toString()
isUnboundedWrite(bw) and
tainted2(bw.getASource(), inputSource, inputCause)
select bw, "This '" + bw.getBWDesc() + "' with input from $@ may overflow the destination.",
inputSource, inputCause

View File

@@ -22,25 +22,16 @@ import semmle.code.cpp.models.interfaces.Allocation
predicate terminationProblem(AllocationExpr malloc, string msg) {
// malloc(strlen(...))
exists(StrlenCall strlen | DataFlow::localExprFlow(strlen, malloc.getSizeExpr())) and
// flows to a call that implies this is a null-terminated string
// flows into a null-terminated string function
exists(ArrayFunction af, FunctionCall fc, int arg |
DataFlow::localExprFlow(malloc, fc.getArgument(arg)) and
fc.getTarget() = af and
(
// flows into null terminated string argument
// null terminated string
af.hasArrayWithNullTerminator(arg)
or
// flows into likely null terminated string argument (such as `strcpy`, `strcat`)
// likely a null terminated string (such as `strcpy`, `strcat`)
af.hasArrayWithUnknownSize(arg)
or
// flows into string argument to a formatting function (such as `printf`)
exists(int n, FormatLiteral fl |
fc.getArgument(arg) = fc.(FormattingFunctionCall).getConversionArgument(n) and
fl = fc.(FormattingFunctionCall).getFormat() and
fl.getConversionType(n) instanceof PointerType and // `%s`, `%ws` etc
not fl.getConversionType(n) instanceof VoidPointerType and // exclude: `%p`
not fl.hasPrecision(n) // exclude: `%.*s`
)
)
) and
msg = "This allocation does not include space to null-terminate the string."

View File

@@ -3,7 +3,7 @@
* @description Using externally-controlled format strings in
* printf-style functions can lead to buffer overflows
* or data representation problems.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/tainted-format-string
@@ -16,21 +16,12 @@ import cpp
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.FunctionWithWrappers
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) {
exists(PrintfLikeFunction printf | printf.outermostWrapperFunctionCall(tainted, _))
}
}
from
PrintfLikeFunction printf, Expr arg, PathNode sourceNode, PathNode sinkNode,
string printfFunction, Expr userValue, string cause
from PrintfLikeFunction printf, Expr arg, string printfFunction, Expr userValue, string cause
where
printf.outermostWrapperFunctionCall(arg, printfFunction) and
taintedWithPath(userValue, arg, sourceNode, sinkNode) and
tainted(userValue, arg) and
isUserInput(userValue, cause)
select arg, sourceNode, sinkNode,
select arg,
"The value of this argument may come from $@ and is being used as a formatting argument to " +
printfFunction, userValue, cause

View File

@@ -3,7 +3,7 @@
* @description Using externally-controlled format strings in
* printf-style functions can lead to buffer overflows
* or data representation problems.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/tainted-format-string-through-global
@@ -16,24 +16,15 @@ import cpp
import semmle.code.cpp.security.FunctionWithWrappers
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) {
exists(PrintfLikeFunction printf | printf.outermostWrapperFunctionCall(tainted, _))
}
override predicate taintThroughGlobals() { any() }
}
from
PrintfLikeFunction printf, Expr arg, PathNode sourceNode, PathNode sinkNode,
string printfFunction, Expr userValue, string cause
PrintfLikeFunction printf, Expr arg, string printfFunction, Expr userValue, string cause,
string globalVar
where
printf.outermostWrapperFunctionCall(arg, printfFunction) and
not taintedWithoutGlobals(arg) and
taintedWithPath(userValue, arg, sourceNode, sinkNode) and
not tainted(_, arg) and
taintedIncludingGlobalVars(userValue, arg, globalVar) and
isUserInput(userValue, cause)
select arg, sourceNode, sinkNode,
"The value of this argument may come from $@ and is being used as a formatting argument to " +
printfFunction, userValue, cause
select arg,
"This value may flow through $@, originating from $@, and is a formatting argument to " +
printfFunction + ".", globalVarFromId(globalVar), globalVar, userValue, cause

View File

@@ -2,7 +2,7 @@
* @name Uncontrolled data in arithmetic expression
* @description Arithmetic operations on uncontrolled data that is not
* validated can cause overflows.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/uncontrolled-arithmetic
@@ -15,7 +15,6 @@ import cpp
import semmle.code.cpp.security.Overflow
import semmle.code.cpp.security.Security
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
predicate isRandCall(FunctionCall fc) { fc.getTarget().getName() = "rand" }
@@ -41,22 +40,9 @@ class SecurityOptionsArith extends SecurityOptions {
}
}
predicate isDiv(VariableAccess va) { exists(AssignDivExpr div | div.getLValue() = va) }
predicate missingGuard(VariableAccess va, string effect) {
exists(Operation op | op.getAnOperand() = va |
missingGuardAgainstUnderflow(op, va) and effect = "underflow"
or
missingGuardAgainstOverflow(op, va) and effect = "overflow"
)
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element e) {
isDiv(e)
or
missingGuard(e, _)
}
predicate taintedVarAccess(Expr origin, VariableAccess va) {
isUserInput(origin, _) and
tainted(origin, va)
}
/**
@@ -64,17 +50,19 @@ class Configuration extends TaintTrackingConfiguration {
* range.
*/
predicate guardedByAssignDiv(Expr origin) {
exists(VariableAccess va |
taintedWithPath(origin, va, _, _) and
isDiv(va)
)
isUserInput(origin, _) and
exists(AssignDivExpr div, VariableAccess va | tainted(origin, va) and div.getLValue() = va)
}
from Expr origin, VariableAccess va, string effect, PathNode sourceNode, PathNode sinkNode
from Expr origin, Operation op, VariableAccess va, string effect
where
taintedWithPath(origin, va, sourceNode, sinkNode) and
missingGuard(va, effect) and
taintedVarAccess(origin, va) and
op.getAnOperand() = va and
(
missingGuardAgainstUnderflow(op, va) and effect = "underflow"
or
missingGuardAgainstOverflow(op, va) and effect = "overflow"
) and
not guardedByAssignDiv(origin)
select va, sourceNode, sinkNode,
"$@ flows to here and is used in arithmetic, potentially causing an " + effect + ".", origin,
"Uncontrolled value"
select va, "$@ flows to here and is used in arithmetic, potentially causing an " + effect + ".",
origin, "Uncontrolled value"

View File

@@ -2,7 +2,7 @@
* @name Overflow in uncontrolled allocation size
* @description Allocating memory with a size controlled by an external
* user can result in integer overflow.
* @kind path-problem
* @kind problem
* @problem.severity error
* @precision high
* @id cpp/uncontrolled-allocation-size
@@ -13,33 +13,21 @@
import cpp
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
/**
* Holds if `alloc` is an allocation, and `tainted` is a child of it that is a
* taint sink.
*/
predicate allocSink(Expr alloc, Expr tainted) {
isAllocationExpr(alloc) and
tainted = alloc.getAChild() and
tainted.getUnspecifiedType() instanceof IntegralType
}
class TaintedAllocationSizeConfiguration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) { allocSink(_, tainted) }
}
predicate taintedAllocSize(
Expr source, Expr alloc, PathNode sourceNode, PathNode sinkNode, string taintCause
) {
isUserInput(source, taintCause) and
predicate taintedAllocSize(Expr e, Expr source, string taintCause) {
(
isAllocationExpr(e) or
any(MulExpr me | me.getAChild() instanceof SizeofOperator) = e
) and
exists(Expr tainted |
allocSink(alloc, tainted) and
taintedWithPath(source, tainted, sourceNode, sinkNode)
tainted = e.getAChild() and
tainted.getUnspecifiedType() instanceof IntegralType and
isUserInput(source, taintCause) and
tainted(source, tainted)
)
}
from Expr source, Expr alloc, PathNode sourceNode, PathNode sinkNode, string taintCause
where taintedAllocSize(source, alloc, sourceNode, sinkNode, taintCause)
select alloc, sourceNode, sinkNode, "This allocation size is derived from $@ and might overflow",
source, "user input (" + taintCause + ")"
from Expr e, Expr source, string taintCause
where taintedAllocSize(e, source, taintCause)
select e, "This allocation size is derived from $@ and might overflow", source,
"user input (" + taintCause + ")"

View File

@@ -3,7 +3,7 @@
* @description Authentication by checking that the peer's address
* matches a known IP or web address is unsafe as it is
* vulnerable to spoofing attacks.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/user-controlled-bypass
@@ -12,7 +12,6 @@
*/
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
predicate hardCodedAddressOrIP(StringLiteral txt) {
exists(string s | s = txt.getValueText() |
@@ -103,21 +102,16 @@ predicate useOfHardCodedAddressOrIP(Expr use) {
* untrusted input then it might be vulnerable to a spoofing
* attack.
*/
predicate hardCodedAddressInCondition(Expr subexpression, Expr condition) {
subexpression = condition.getAChild+() and
predicate hardCodedAddressInCondition(Expr source, Expr condition) {
// One of the sub-expressions of the condition is tainted.
exists(Expr taintedExpr | taintedExpr.getParent+() = condition | tainted(source, taintedExpr)) and
// One of the sub-expressions of the condition is a hard-coded
// IP or web-address.
exists(Expr use | use = condition.getAChild+() | useOfHardCodedAddressOrIP(use)) and
exists(Expr use | use.getParent+() = condition | useOfHardCodedAddressOrIP(use)) and
condition = any(IfStmt ifStmt).getCondition()
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element sink) { hardCodedAddressInCondition(sink, _) }
}
from Expr subexpression, Expr source, Expr condition, PathNode sourceNode, PathNode sinkNode
where
hardCodedAddressInCondition(subexpression, condition) and
taintedWithPath(source, subexpression, sourceNode, sinkNode)
select condition, sourceNode, sinkNode,
"Untrusted input $@ might be vulnerable to a spoofing attack.", source, source.toString()
from Expr source, Expr condition
where hardCodedAddressInCondition(source, condition)
select condition, "Untrusted input $@ might be vulnerable to a spoofing attack.", source,
source.toString()

View File

@@ -2,7 +2,7 @@
* @name Cleartext storage of sensitive information in buffer
* @description Storing sensitive information in cleartext can expose it
* to an attacker.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/cleartext-storage-buffer
@@ -14,20 +14,12 @@ import cpp
import semmle.code.cpp.security.BufferWrite
import semmle.code.cpp.security.TaintTracking
import semmle.code.cpp.security.SensitiveExprs
import TaintedWithPath
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) { exists(BufferWrite w | w.getASource() = tainted) }
}
from
BufferWrite w, Expr taintedArg, Expr taintSource, PathNode sourceNode, PathNode sinkNode,
string taintCause, SensitiveExpr dest
from BufferWrite w, Expr taintedArg, Expr taintSource, string taintCause, SensitiveExpr dest
where
taintedWithPath(taintSource, taintedArg, sourceNode, sinkNode) and
tainted(taintSource, taintedArg) and
isUserInput(taintSource, taintCause) and
w.getASource() = taintedArg and
dest = w.getDest()
select w, sourceNode, sinkNode,
"This write into buffer '" + dest.toString() + "' may contain unencrypted data from $@",
select w, "This write into buffer '" + dest.toString() + "' may contain unencrypted data from $@",
taintSource, "user input (" + taintCause + ")"

View File

@@ -2,7 +2,7 @@
* @name Cleartext storage of sensitive information in an SQLite database
* @description Storing sensitive information in a non-encrypted
* database can expose it to an attacker.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/cleartext-storage-database
@@ -13,7 +13,6 @@
import cpp
import semmle.code.cpp.security.SensitiveExprs
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
class UserInputIsSensitiveExpr extends SecurityOptions {
override predicate isUserInput(Expr expr, string cause) {
@@ -33,21 +32,10 @@ predicate sqlite_encryption_used() {
any(FunctionCall fc).getTarget().getName().matches("sqlite%\\_key\\_%")
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element taintedArg) {
exists(SqliteFunctionCall sqliteCall |
taintedArg = sqliteCall.getASource() and
not sqlite_encryption_used()
)
}
}
from
SensitiveExpr taintSource, Expr taintedArg, SqliteFunctionCall sqliteCall, PathNode sourceNode,
PathNode sinkNode
from SensitiveExpr taintSource, Expr taintedArg, SqliteFunctionCall sqliteCall
where
taintedWithPath(taintSource, taintedArg, sourceNode, sinkNode) and
taintedArg = sqliteCall.getASource()
select sqliteCall, sourceNode, sinkNode,
"This SQLite call may store $@ in a non-encrypted SQLite database", taintSource,
tainted(taintSource, taintedArg) and
taintedArg = sqliteCall.getASource() and
not sqlite_encryption_used()
select sqliteCall, "This SQLite call may store $@ in a non-encrypted SQLite database", taintSource,
"sensitive information"

View File

@@ -28,5 +28,5 @@ where
// is probably a mistake.
addWithSizeof(e, sizeofExpr, _) and not isCharSzPtrExpr(e)
select sizeofExpr,
"Suspicious sizeof offset in a pointer arithmetic expression. The type of the pointer is $@.",
e.getFullyConverted().getType() as t, t.toString()
"Suspicious sizeof offset in a pointer arithmetic expression. " + "The type of the pointer is " +
e.getFullyConverted().getType().toString() + "."
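
The alert message in this hunk concerns a `sizeof` value used as an offset in pointer arithmetic. As a minimal illustration of why that is suspicious (the function names and buffer size here are hypothetical, not taken from the query or its tests), the sketch below measures how far a pointer actually moves: C already scales the offset by the pointee size, so adding `sizeof(int)` to an `int *` moves `sizeof(int) * sizeof(int)` bytes.

```c
#include <stddef.h>

/* Hypothetical helper: bytes advanced by `base + sizeof(int)` on an int pointer.
 * Pointer arithmetic scales the offset by sizeof(int) a second time. */
size_t bytes_advanced_by_sizeof_offset(void) {
    int buffer[16] = {0};
    int *base = buffer;
    int *moved = base + sizeof(int); /* the pattern the query reports */
    return (size_t)((char *)moved - (char *)base);
}

/* Hypothetical helper: bytes advanced by the usual one-element step. */
size_t bytes_advanced_by_one_element(void) {
    int buffer[16] = {0};
    int *base = buffer;
    int *moved = base + 1; /* one element, i.e. sizeof(int) bytes */
    return (size_t)((char *)moved - (char *)base);
}
```

The same offset on a `char *` would advance exactly `sizeof(int)` bytes, which is presumably why the query excludes character pointers through `isCharSzPtrExpr`.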

View File

@@ -3,7 +3,7 @@
* @description Using untrusted inputs in a statement that makes a
* security decision makes code vulnerable to
* attack.
* @kind path-problem
* @kind problem
* @problem.severity warning
* @precision medium
* @id cpp/tainted-permissions-check
@@ -12,9 +12,14 @@
*/
import semmle.code.cpp.security.TaintTracking
import TaintedWithPath
predicate sensitiveCondition(Expr condition, Expr raise) {
/**
* Holds if there is an 'if' statement whose condition `condition`
* is influenced by tainted data `source`, and the body contains
* `raise` which escalates privilege.
*/
predicate cwe807violation(Expr source, Expr condition, Expr raise) {
tainted(source, condition) and
raisesPrivilege(raise) and
exists(IfStmt ifstmt |
ifstmt.getCondition() = condition and
@@ -22,19 +27,7 @@ predicate sensitiveCondition(Expr condition, Expr raise) {
)
}
class Configuration extends TaintTrackingConfiguration {
override predicate isSink(Element tainted) { sensitiveCondition(tainted, _) }
}
/*
* Produce an alert if there is an 'if' statement whose condition `condition`
* is influenced by tainted data `source`, and the body contains
* `raise` which escalates privilege.
*/
from Expr source, Expr condition, Expr raise, PathNode sourceNode, PathNode sinkNode
where
taintedWithPath(source, condition, sourceNode, sinkNode) and
sensitiveCondition(condition, raise)
select condition, sourceNode, sinkNode, "Reliance on untrusted input $@ to raise privilege at $@",
source, source.toString(), raise, raise.toString()
from Expr source, Expr condition, Expr raise
where cwe807violation(source, condition, raise)
select condition, "Reliance on untrusted input $@ to raise privilege at $@", source,
source.toString(), raise, raise.toString()

View File

@@ -1,4 +0,0 @@
- description: Standard Code Scanning queries for C and C++
- qlpack: codeql-cpp
- apply: code-scanning-selectors.yml
from: codeql-suite-helpers

View File

@@ -1 +0,0 @@
This directory contains [experimental](../../../../docs/experimental.md) CodeQL queries and libraries.

View File

@@ -1,138 +0,0 @@
#include <stdlib.h>
#include <sys/param.h>
#include <unistd.h>
#include <pwd.h>
void callSetuidAndCheck(int uid) {
if (setuid(uid) != 0) {
exit(1);
}
}
void callSetgidAndCheck(int gid) {
if (setgid(gid) != 0) {
exit(1);
}
}
/// Correct ways to drop priv.
void correctDropPrivInline() {
if (setgroups(0, NULL)) {
exit(1);
}
if (setgid(-2) != 0) {
exit(1);
}
if (setuid(-2) != 0) {
exit(1);
}
}
void correctDropPrivInScope() {
{
if (setgroups(0, NULL)) {
exit(1);
}
}
{
if (setgid(-2) != 0) {
exit(1);
}
}
{
if (setuid(-2) != 0) {
exit(1);
}
}
}
void correctOrderForInitgroups() {
struct passwd *pw = getpwuid(0);
if (pw) {
if (initgroups(pw->pw_name, -2)) {
exit(1);
}
} else {
// Unhandled.
}
int rc = setuid(-2);
if (rc) {
exit(1);
}
}
void correctDropPrivInScopeParent() {
{
callSetgidAndCheck(-2);
}
correctOrderForInitgroups();
}
void incorrectNoReturnCodeCheck() {
int user = -2;
if (user) {
if (user) {
int rc = setgid(user);
(void)rc;
initgroups("nobody", user);
}
if (user) {
setuid(user);
}
}
}
void correctDropPrivInFunctionCall() {
if (setgroups(0, NULL)) {
exit(1);
}
callSetgidAndCheck(-2);
callSetuidAndCheck(-2);
}
/// Incorrect: uid and gid are dropped out of order.
/// Calling setuid before setgid makes the later setgid call fail.
void incorrectDropPrivOutOfOrderInline() {
if (setuid(-2) != 0) {
exit(1);
}
if (setgid(-2) != 0) {
exit(1);
}
}
void incorrectDropPrivOutOfOrderInScope() {
{
if (setuid(-2) != 0) {
exit(1);
}
}
setgid(-2);
}
void incorrectDropPrivOutOfOrderWithFunction() {
callSetuidAndCheck(-2);
if (setgid(-2) != 0) {
exit(1);
}
}
void incorrectDropPrivOutOfOrderWithFunction2() {
callSetuidAndCheck(-2);
callSetgidAndCheck(-2);
}
void incorrectDropPrivNoCheck() {
setgid(-2);
setuid(-2);
}
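
The test cases above separate checked, correctly ordered calls from unchecked or out-of-order ones. A condensed sketch of the correct pattern follows, using the same order as `correctDropPrivInline`. The helper name `drop_privileges` and its return convention are assumptions made for illustration and are not part of the test file.

```c
#define _DEFAULT_SOURCE /* exposes setgroups() on glibc under strict -std modes */
#include <grp.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: drops supplementary groups, then the group ID,
 * then the user ID. Returns 0 on success and -1 as soon as any step
 * fails, so no later step runs after a failed one.
 */
int drop_privileges(uid_t target_uid, gid_t target_gid) {
    /* 1. Clear supplementary groups while the process may still do so. */
    if (setgroups(0, NULL) != 0) {
        return -1;
    }
    /* 2. Change the group ID before giving up the user ID. */
    if (setgid(target_gid) != 0) {
        return -1;
    }
    /* 3. Give up the user ID last. */
    if (setuid(target_uid) != 0) {
        return -1;
    }
    return 0;
}
```

A caller would typically treat a `-1` result as fatal, as the test file does with `exit(1)`, rather than continue with whatever privileges remain.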

View File

@@ -1,35 +0,0 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>The code attempts to drop privilege in an incorrect order by
erroneously dropping user privilege before groups. This has security
impact if the return codes are not checked.</p>
<p>False positives include code performing negative checks, making
sure that setgid or setgroups does not work, meaning permissions are
dropped. Additionally, other forms of sandboxing may be present, removing
any residual risk, for example a dedicated user namespace.</p>
</overview>
<recommendation>
<p>Set the new group ID, then set the target user's intended groups by
dropping previous supplemental source groups and initializing target
groups, and finally set the target user.</p>
</recommendation>
<example>
<p>The following example demonstrates out of order calls.</p>
<sample src="PrivilegeDroppingOutoforder.c" />
</example>
<references>
<li>CERT C Coding Standard:
<a href="https://wiki.sei.cmu.edu/confluence/display/c/POS37-C.+Ensure+that+privilege+relinquishment+is+successful">POS37-C. Ensure that privilege relinquishment is successful</a>.
</li>
</references>
</qhelp>

View File

@@ -1,101 +0,0 @@
/**
* @name LinuxPrivilegeDroppingOutoforder
* @description A syscall commonly associated with privilege dropping is being called out of order.
* Normally a process drops group ID and sets supplemental groups for the target user
* before setting the target user ID. This can have security impact if the return code
* from these methods is not checked.
* @kind problem
* @problem.severity recommendation
* @id cpp/drop-linux-privileges-outoforder
* @tags security
* external/cwe/cwe-273
* @precision medium
*/
import cpp
predicate argumentMayBeRoot(Expr e) {
e.getValue() = "0" or
e.(VariableAccess).getTarget().getName().toLowerCase().matches("%root%")
}
class SetuidLikeFunctionCall extends FunctionCall {
SetuidLikeFunctionCall() {
(getTarget().hasGlobalName("setuid") or getTarget().hasGlobalName("setresuid")) and
// setuid/setresuid with the root user are false positives.
not argumentMayBeRoot(getArgument(0))
}
}
class SetuidLikeWrapperCall extends FunctionCall {
SetuidLikeFunctionCall baseCall;
SetuidLikeWrapperCall() {
this = baseCall
or
exists(SetuidLikeWrapperCall fc |
this.getTarget() = fc.getEnclosingFunction() and
baseCall = fc.getBaseCall()
)
}
SetuidLikeFunctionCall getBaseCall() { result = baseCall }
}
class CallBeforeSetuidFunctionCall extends FunctionCall {
CallBeforeSetuidFunctionCall() {
(
getTarget().hasGlobalName("setgid") or
getTarget().hasGlobalName("setresgid") or
// Compatibility may require skipping initgroups and setgroups return checks.
// A stricter best practice is to check the result and errno for EPERM.
getTarget().hasGlobalName("initgroups") or
getTarget().hasGlobalName("setgroups")
) and
// setgid/setresgid/etc with the root group are false positives.
not argumentMayBeRoot(getArgument(0))
}
}
class CallBeforeSetuidWrapperCall extends FunctionCall {
CallBeforeSetuidFunctionCall baseCall;
CallBeforeSetuidWrapperCall() {
this = baseCall
or
exists(CallBeforeSetuidWrapperCall fc |
this.getTarget() = fc.getEnclosingFunction() and
baseCall = fc.getBaseCall()
)
}
CallBeforeSetuidFunctionCall getBaseCall() { result = baseCall }
}
predicate setuidBeforeSetgid(
SetuidLikeWrapperCall setuidWrapper, CallBeforeSetuidWrapperCall setgidWrapper
) {
setgidWrapper.getAPredecessor+() = setuidWrapper
}
predicate isAccessed(FunctionCall fc) {
exists(Variable v | v.getAnAssignedValue() = fc)
or
exists(Operation c | fc = c.getAChild() | c.isCondition())
or
// ignore pattern where result is intentionally ignored by a cast to void.
fc.hasExplicitConversion()
}
from Function func, CallBeforeSetuidFunctionCall fc, SetuidLikeFunctionCall setuid
where
setuidBeforeSetgid(setuid, fc) and
// Require the call return code to be used in a condition or assigned.
// This introduces false negatives where the return is checked but then
// errno == EPERM allows execution to continue.
not isAccessed(fc) and
func = fc.getEnclosingFunction()
select fc,
"This function is called within " + func + ", and potentially after " +
"$@, and may not succeed. Be sure to check the return code and errno, otherwise permissions " +
"may not be dropped.", setuid, setuid.getTarget().getName()

View File

@@ -30,13 +30,7 @@ predicate functionsMissingReturnStmt(Function f, ControlFlowNode blame) {
) and
exists(ReturnStmt s |
f.getAPredecessor() = s and
(
blame = s.getAPredecessor() and
count(blame.getASuccessor()) = 1
or
blame = s and
exists(ControlFlowNode pred | pred = s.getAPredecessor() | count(pred.getASuccessor()) != 1)
)
blame = s.getAPredecessor()
)
}

View File

@@ -2,4 +2,3 @@ name: codeql-cpp
version: 0.0.0
dbscheme: semmlecode.cpp.dbscheme
suites: codeql-suites
extractor: cpp

View File

@@ -21,9 +21,9 @@ private predicate idOf(@compilation x, int y) = equivalenceRelation(id/2)(x, y)
* Three things happen to each file during a compilation:
*
* 1. The file is compiled by a real compiler, such as gcc or VC.
* 2. The file is parsed by the CodeQL C++ front-end.
* 2. The file is parsed by Semmle's C++ front-end.
* 3. The parsed representation is converted to database tables by
* the CodeQL extractor.
* Semmle's extractor.
*
* This class provides CPU and elapsed time information for steps 2 and 3,
* but not for step 1.

View File

@@ -25,7 +25,7 @@ private import semmle.code.cpp.internal.QualifiedName as Q
* `DeclarationEntry`, because they always have a unique source location.
* `EnumConstant` and `FriendDecl` are both examples of this.
*/
class Declaration extends Locatable, @declaration {
abstract class Declaration extends Locatable, @declaration {
/**
* Gets the innermost namespace which contains this declaration.
*

View File

@@ -19,8 +19,6 @@ import semmle.code.cpp.exprs.Access
class Field extends MemberVariable {
Field() { fieldoffsets(underlyingElement(this), _, _) }
override string getCanonicalQLClass() { result = "Field" }
/**
* Gets the offset of this field in bytes from the start of its declaring
* type (on the machine where facts were extracted).
@@ -86,8 +84,6 @@ class Field extends MemberVariable {
class BitField extends Field {
BitField() { bitfield(underlyingElement(this), _, _) }
override string getCanonicalQLClass() { result = "BitField" }
/**
* Gets the size of this bitfield in bits (on the machine where facts
* were extracted).

View File

@@ -133,25 +133,12 @@ class Function extends Declaration, ControlFlowNode, AccessHolder, @function {
*/
Type getUnspecifiedType() { result = getType().getUnspecifiedType() }
/**
* Gets the nth parameter of this function. There is no result for the
* implicit `this` parameter, and there is no `...` varargs pseudo-parameter.
*/
/** Gets the nth parameter of this function. */
Parameter getParameter(int n) { params(unresolveElement(result), underlyingElement(this), n, _) }
/**
* Gets a parameter of this function. There is no result for the implicit
* `this` parameter, and there is no `...` varargs pseudo-parameter.
*/
/** Gets a parameter of this function. */
Parameter getAParameter() { params(unresolveElement(result), underlyingElement(this), _, _) }
/**
* Gets an access of this function.
*
* To get calls to this function, use `getACallToThisFunction` instead.
*/
FunctionAccess getAnAccess() { result.getTarget() = this }
/**
* Gets the number of parameters of this function, _not_ including any
* implicit `this` parameter or any `...` varargs pseudo-parameter.
@@ -187,7 +174,6 @@ class Function extends Declaration, ControlFlowNode, AccessHolder, @function {
result = getParameter(index).getTypedName() + ", " + getParameterStringFrom(index + 1)
}
/** Gets a call to this function. */
FunctionCall getACallToThisFunction() { result.getTarget() = this }
/**

View File

@@ -28,8 +28,6 @@ private import semmle.code.cpp.internal.ResolveClass
* can have multiple declarations.
*/
class Variable extends Declaration, @variable {
override string getCanonicalQLClass() { result = "Variable" }
/** Gets the initializer of this variable, if any. */
Initializer getInitializer() { result.getDeclaration() = this }
@@ -353,8 +351,6 @@ class StackVariable extends LocalScopeVariable {
* A local variable can be declared by a `DeclStmt` or a `ConditionDeclExpr`.
*/
class LocalVariable extends LocalScopeVariable, @localvariable {
override string getCanonicalQLClass() { result = "LocalVariable" }
override string getName() { localvariables(underlyingElement(this), _, result) }
override Type getType() { localvariables(underlyingElement(this), unresolveElement(result), _) }
@@ -366,59 +362,6 @@ class LocalVariable extends LocalScopeVariable, @localvariable {
}
}
/**
* A variable whose contents always have static storage duration. This can be a
* global variable, a namespace variable, a static local variable, or a static
* member variable.
*/
class StaticStorageDurationVariable extends Variable {
StaticStorageDurationVariable() {
this instanceof GlobalOrNamespaceVariable
or
this.(LocalVariable).isStatic()
or
this.(MemberVariable).isStatic()
}
/**
* Holds if the initializer for this variable is evaluated at runtime.
*/
predicate hasDynamicInitialization() {
runtimeExprInStaticInitializer(this.getInitializer().getExpr())
}
}
/**
* Holds if `e` is an expression in a static initializer that must be evaluated
* at run time. This predicate computes "is non-const" instead of "is const"
* since computing "is const" for an aggregate literal with many children would
* either involve recursion through `forall` on those children or an iteration
* through the rank numbers of the children, both of which can be slow.
*/
private predicate runtimeExprInStaticInitializer(Expr e) {
inStaticInitializer(e) and
if e instanceof AggregateLiteral // in sync with the cast in `inStaticInitializer`
then runtimeExprInStaticInitializer(e.getAChild())
else not e.getFullyConverted().isConstant()
}
/**
* Holds if `e` is the initializer of a `StaticStorageDurationVariable`, either
* directly or below some top-level `AggregateLiteral`s.
*/
private predicate inStaticInitializer(Expr e) {
exists(StaticStorageDurationVariable var | e = var.getInitializer().getExpr())
or
// The cast to `AggregateLiteral` ensures we only compute what'll later be
// needed by `runtimeExprInStaticInitializer`.
inStaticInitializer(e.getParent().(AggregateLiteral))
}
/**
* A C++ local variable declared as `static`.
*/
class StaticLocalVariable extends LocalVariable, StaticStorageDurationVariable { }
/**
* A C/C++ variable which has global scope or namespace scope. For example the
* variables `a` and `b` in the following code:
@@ -453,8 +396,6 @@ class NamespaceVariable extends GlobalOrNamespaceVariable {
NamespaceVariable() {
exists(Namespace n | namespacembrs(unresolveElement(n), underlyingElement(this)))
}
override string getCanonicalQLClass() { result = "NamespaceVariable" }
}
/**
@@ -474,8 +415,6 @@ class NamespaceVariable extends GlobalOrNamespaceVariable {
*/
class GlobalVariable extends GlobalOrNamespaceVariable {
GlobalVariable() { not this instanceof NamespaceVariable }
override string getCanonicalQLClass() { result = "GlobalVariable" }
}
/**
@@ -495,8 +434,6 @@ class GlobalVariable extends GlobalOrNamespaceVariable {
class MemberVariable extends Variable, @membervariable {
MemberVariable() { this.isMember() }
override string getCanonicalQLClass() { result = "MemberVariable" }
/** Holds if this member is private. */
predicate isPrivate() { this.hasSpecifier("private") }

View File

@@ -23,8 +23,6 @@ predicate freeFunction(Function f, int argNum) { argNum = f.(DeallocationFunctio
/**
* A call to a library routine that frees memory.
*
* DEPRECATED: Use `DeallocationExpr` instead (this also includes `delete` expressions).
*/
predicate freeCall(FunctionCall fc, Expr arg) { arg = fc.(DeallocationExpr).getFreedExpr() }

View File

@@ -441,9 +441,10 @@ private Node getControlOrderChildSparse(Node n, int i) {
* thus should not have control flow computed.
*/
private predicate skipInitializer(Initializer init) {
exists(StaticLocalVariable local |
exists(LocalVariable local |
init = local.getInitializer() and
not local.hasDynamicInitialization()
local.isStatic() and
not runtimeExprInStaticInitializer(init.getExpr())
)
}

View File

@@ -1,6 +1,6 @@
/**
* DEPRECATED: Recursion through `DataFlow::Configuration` is impossible in
* any supported tooling. There is no need for this module because it's
* Semmle Core 1.17 and above. There is no need for this module because it's
* impossible to accidentally depend on recursion through
* `DataFlow::Configuration` in current releases.
*

View File

@@ -1,5 +1,7 @@
private import cpp
Function viableImpl(Call call) { result = viableCallable(call) }
/**
* Gets a function that might be called by `call`.
*/

File diff suppressed because it is too large (4 files)

View File

@@ -1,175 +0,0 @@
/**
* Provides consistency queries for checking invariants in the language-specific
* data-flow classes and predicates.
*/
private import DataFlowImplSpecific::Private
private import DataFlowImplSpecific::Public
private import tainttracking1.TaintTrackingParameter::Private
private import tainttracking1.TaintTrackingParameter::Public
module Consistency {
private class RelevantNode extends Node {
RelevantNode() {
this instanceof ArgumentNode or
this instanceof ParameterNode or
this instanceof ReturnNode or
this = getAnOutNode(_, _) or
simpleLocalFlowStep(this, _) or
simpleLocalFlowStep(_, this) or
jumpStep(this, _) or
jumpStep(_, this) or
storeStep(this, _, _) or
storeStep(_, _, this) or
readStep(this, _, _) or
readStep(_, _, this) or
defaultAdditionalTaintStep(this, _) or
defaultAdditionalTaintStep(_, this)
}
}
query predicate uniqueEnclosingCallable(Node n, string msg) {
exists(int c |
n instanceof RelevantNode and
c = count(n.getEnclosingCallable()) and
c != 1 and
msg = "Node should have one enclosing callable but has " + c + "."
)
}
query predicate uniqueTypeBound(Node n, string msg) {
exists(int c |
n instanceof RelevantNode and
c = count(n.getTypeBound()) and
c != 1 and
msg = "Node should have one type bound but has " + c + "."
)
}
query predicate uniqueTypeRepr(Node n, string msg) {
exists(int c |
n instanceof RelevantNode and
c = count(getErasedRepr(n.getTypeBound())) and
c != 1 and
msg = "Node should have one type representation but has " + c + "."
)
}
query predicate uniqueNodeLocation(Node n, string msg) {
exists(int c |
c =
count(string filepath, int startline, int startcolumn, int endline, int endcolumn |
n.hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
) and
c != 1 and
msg = "Node should have one location but has " + c + "."
)
}
query predicate missingLocation(string msg) {
exists(int c |
c =
strictcount(Node n |
not exists(string filepath, int startline, int startcolumn, int endline, int endcolumn |
n.hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
)
) and
msg = "Nodes without location: " + c
)
}
query predicate uniqueNodeToString(Node n, string msg) {
exists(int c |
c = count(n.toString()) and
c != 1 and
msg = "Node should have one toString but has " + c + "."
)
}
query predicate missingToString(string msg) {
exists(int c |
c = strictcount(Node n | not exists(n.toString())) and
msg = "Nodes without toString: " + c
)
}
query predicate parameterCallable(ParameterNode p, string msg) {
exists(DataFlowCallable c | p.isParameterOf(c, _) and c != p.getEnclosingCallable()) and
msg = "Callable mismatch for parameter."
}
query predicate localFlowIsLocal(Node n1, Node n2, string msg) {
simpleLocalFlowStep(n1, n2) and
n1.getEnclosingCallable() != n2.getEnclosingCallable() and
msg = "Local flow step does not preserve enclosing callable."
}
private DataFlowType typeRepr() { result = getErasedRepr(any(Node n).getTypeBound()) }
query predicate compatibleTypesReflexive(DataFlowType t, string msg) {
t = typeRepr() and
not compatibleTypes(t, t) and
msg = "Type compatibility predicate is not reflexive."
}
query predicate unreachableNodeCCtx(Node n, DataFlowCall call, string msg) {
isUnreachableInCall(n, call) and
exists(DataFlowCallable c |
c = n.getEnclosingCallable() and
not viableCallable(call) = c
) and
msg = "Call context for isUnreachableInCall is inconsistent with call graph."
}
query predicate localCallNodes(DataFlowCall call, Node n, string msg) {
(
n = getAnOutNode(call, _) and
msg = "OutNode and call does not share enclosing callable."
or
n.(ArgumentNode).argumentOf(call, _) and
msg = "ArgumentNode and call does not share enclosing callable."
) and
n.getEnclosingCallable() != call.getEnclosingCallable()
}
query predicate postIsNotPre(PostUpdateNode n, string msg) {
n.getPreUpdateNode() = n and msg = "PostUpdateNode should not equal its pre-update node."
}
query predicate postHasUniquePre(PostUpdateNode n, string msg) {
exists(int c |
c = count(n.getPreUpdateNode()) and
c != 1 and
msg = "PostUpdateNode should have one pre-update node but has " + c + "."
)
}
query predicate uniquePostUpdate(Node n, string msg) {
1 < strictcount(PostUpdateNode post | post.getPreUpdateNode() = n) and
msg = "Node has multiple PostUpdateNodes."
}
query predicate postIsInSameCallable(PostUpdateNode n, string msg) {
n.getEnclosingCallable() != n.getPreUpdateNode().getEnclosingCallable() and
msg = "PostUpdateNode does not share callable with its pre-update node."
}
private predicate hasPost(Node n) { exists(PostUpdateNode post | post.getPreUpdateNode() = n) }
query predicate reverseRead(Node n, string msg) {
exists(Node n2 | readStep(n, _, n2) and hasPost(n2) and not hasPost(n)) and
msg = "Origin of readStep is missing a PostUpdateNode."
}
query predicate storeIsPostUpdate(Node n, string msg) {
storeStep(_, _, n) and
not n instanceof PostUpdateNode and
msg = "Store targets should be PostUpdateNodes."
}
query predicate argHasPostUpdate(ArgumentNode n, string msg) {
not hasPost(n) and
not isImmutableOrUnobservable(n) and
msg = "ArgumentNode is missing PostUpdateNode."
}
}

View File

@@ -132,6 +132,16 @@ OutNode getAnOutNode(DataFlowCall call, ReturnKind kind) {
*/
predicate jumpStep(Node n1, Node n2) { none() }
/**
* Holds if `call` passes an implicit or explicit qualifier, i.e., a
* `this` parameter.
*/
predicate callHasQualifier(Call call) {
call.hasQualifier()
or
call.getTarget() instanceof Destructor
}
private newtype TContent =
TFieldContent(Field f) or
TCollectionContent() or
@@ -291,29 +301,3 @@ class DataFlowCall extends Expr {
}
predicate isUnreachableInCall(Node n, DataFlowCall call) { none() } // stub implementation
int accessPathLimit() { result = 5 }
/**
* Holds if `n` does not require a `PostUpdateNode` as it either cannot be
* modified or its modification cannot be observed, for example if it is a
* freshly created object that is not saved in a variable.
*
* This predicate is only used for consistency checks.
*/
predicate isImmutableOrUnobservable(Node n) {
// Is the null pointer (or something that's not really a pointer)
exists(n.asExpr().getValue())
or
// Isn't a pointer or is a pointer to const
forall(DerivedType dt | dt = n.asExpr().getActualType() |
dt.getBaseType().isConst()
or
dt.getBaseType() instanceof RoutineType
)
or
// Isn't something we can track
n.asExpr() instanceof Call
// The above list of cases isn't exhaustive, but it narrows down the
// consistency alerts enough that most of them are interesting.
}

View File

@@ -6,6 +6,7 @@ private import cpp
private import semmle.code.cpp.dataflow.internal.FlowVar
private import semmle.code.cpp.models.interfaces.DataFlow
private import semmle.code.cpp.controlflow.Guards
private import semmle.code.cpp.valuenumbering.GlobalValueNumbering
cached
private newtype TNode =
@@ -688,9 +689,9 @@ class BarrierGuard extends GuardCondition {
/** Gets a node guarded by this guard. */
final ExprNode getAGuardedNode() {
exists(SsaDefinition def, Variable v, boolean branch |
result.getExpr() = def.getAUse(v) and
this.checks(def.getAUse(v), branch) and
exists(GVN value, boolean branch |
result.getExpr() = value.getAnExpr() and
this.checks(value.getAnExpr(), branch) and
this.controls(result.getExpr().getBasicBlock(), branch)
)
}

View File

@@ -1,13 +1,3 @@
/**
* Provides an implementation of global (interprocedural) taint tracking.
* This file re-exports the local (intraprocedural) taint-tracking analysis
* from `TaintTrackingParameter::Public` and adds a global analysis, mainly
* exposed through the `Configuration` class. For some languages, this file
* exists in several identical copies, allowing queries to use multiple
* `Configuration` classes that depend on each other without introducing
* mutual recursion among those configurations.
*/
import TaintTrackingParameter::Public
private import TaintTrackingParameter::Private

View File

@@ -1,13 +1,3 @@
/**
* Provides an implementation of global (interprocedural) taint tracking.
* This file re-exports the local (intraprocedural) taint-tracking analysis
* from `TaintTrackingParameter::Public` and adds a global analysis, mainly
* exposed through the `Configuration` class. For some languages, this file
* exists in several identical copies, allowing queries to use multiple
* `Configuration` classes that depend on each other without introducing
* mutual recursion among those configurations.
*/
import TaintTrackingParameter::Public
private import TaintTrackingParameter::Private

View File

@@ -16,7 +16,7 @@ class UnaryMinusExpr extends UnaryArithmeticOperation, @arithnegexpr {
override string getCanonicalQLClass() { result = "UnaryMinusExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -30,7 +30,7 @@ class UnaryPlusExpr extends UnaryArithmeticOperation, @unaryplusexpr {
override string getCanonicalQLClass() { result = "UnaryPlusExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -109,7 +109,7 @@ class PrefixIncrExpr extends IncrementOperation, PrefixCrementOperation, @preinc
override string getCanonicalQLClass() { result = "PrefixIncrExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -125,7 +125,7 @@ class PrefixDecrExpr extends DecrementOperation, PrefixCrementOperation, @predec
override string getCanonicalQLClass() { result = "PrefixDecrExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -141,7 +141,7 @@ class PostfixIncrExpr extends IncrementOperation, PostfixCrementOperation, @post
override string getCanonicalQLClass() { result = "PostfixIncrExpr" }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
override string toString() { result = "... " + getOperator() }
}
@@ -159,7 +159,7 @@ class PostfixDecrExpr extends DecrementOperation, PostfixCrementOperation, @post
override string getCanonicalQLClass() { result = "PostfixDecrExpr" }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
override string toString() { result = "... " + getOperator() }
}
@@ -210,7 +210,7 @@ class AddExpr extends BinaryArithmeticOperation, @addexpr {
override string getCanonicalQLClass() { result = "AddExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -224,7 +224,7 @@ class SubExpr extends BinaryArithmeticOperation, @subexpr {
override string getCanonicalQLClass() { result = "SubExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -238,7 +238,7 @@ class MulExpr extends BinaryArithmeticOperation, @mulexpr {
override string getCanonicalQLClass() { result = "MulExpr" }
override int getPrecedence() { result = 14 }
override int getPrecedence() { result = 13 }
}
/**
@@ -252,7 +252,7 @@ class DivExpr extends BinaryArithmeticOperation, @divexpr {
override string getCanonicalQLClass() { result = "DivExpr" }
override int getPrecedence() { result = 14 }
override int getPrecedence() { result = 13 }
}
/**
@@ -266,7 +266,7 @@ class RemExpr extends BinaryArithmeticOperation, @remexpr {
override string getCanonicalQLClass() { result = "RemExpr" }
override int getPrecedence() { result = 14 }
override int getPrecedence() { result = 13 }
}
/**
@@ -283,7 +283,7 @@ class ImaginaryMulExpr extends BinaryArithmeticOperation, @jmulexpr {
override string getCanonicalQLClass() { result = "ImaginaryMulExpr" }
override int getPrecedence() { result = 14 }
override int getPrecedence() { result = 13 }
}
/**
@@ -300,7 +300,7 @@ class ImaginaryDivExpr extends BinaryArithmeticOperation, @jdivexpr {
override string getCanonicalQLClass() { result = "ImaginaryDivExpr" }
override int getPrecedence() { result = 14 }
override int getPrecedence() { result = 13 }
}
/**
@@ -318,7 +318,7 @@ class RealImaginaryAddExpr extends BinaryArithmeticOperation, @fjaddexpr {
override string getCanonicalQLClass() { result = "RealImaginaryAddExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -336,7 +336,7 @@ class ImaginaryRealAddExpr extends BinaryArithmeticOperation, @jfaddexpr {
override string getCanonicalQLClass() { result = "ImaginaryRealAddExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -354,7 +354,7 @@ class RealImaginarySubExpr extends BinaryArithmeticOperation, @fjsubexpr {
override string getCanonicalQLClass() { result = "RealImaginarySubExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -372,7 +372,7 @@ class ImaginaryRealSubExpr extends BinaryArithmeticOperation, @jfsubexpr {
override string getCanonicalQLClass() { result = "ImaginaryRealSubExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -416,7 +416,7 @@ class PointerAddExpr extends PointerArithmeticOperation, @paddexpr {
override string getCanonicalQLClass() { result = "PointerAddExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -431,7 +431,7 @@ class PointerSubExpr extends PointerArithmeticOperation, @psubexpr {
override string getCanonicalQLClass() { result = "PointerSubExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}
/**
@@ -446,5 +446,5 @@ class PointerDiffExpr extends PointerArithmeticOperation, @pdiffexpr {
override string getCanonicalQLClass() { result = "PointerDiffExpr" }
override int getPrecedence() { result = 13 }
override int getPrecedence() { result = 12 }
}

View File

@@ -14,7 +14,7 @@ class UnaryBitwiseOperation extends UnaryOperation, @un_bitwise_op_expr { }
class ComplementExpr extends UnaryBitwiseOperation, @complementexpr {
override string getOperator() { result = "~" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
override string getCanonicalQLClass() { result = "ComplementExpr" }
}
@@ -33,7 +33,7 @@ class BinaryBitwiseOperation extends BinaryOperation, @bin_bitwise_op_expr { }
class LShiftExpr extends BinaryBitwiseOperation, @lshiftexpr {
override string getOperator() { result = "<<" }
override int getPrecedence() { result = 12 }
override int getPrecedence() { result = 11 }
override string getCanonicalQLClass() { result = "LShiftExpr" }
}
@@ -47,7 +47,7 @@ class LShiftExpr extends BinaryBitwiseOperation, @lshiftexpr {
class RShiftExpr extends BinaryBitwiseOperation, @rshiftexpr {
override string getOperator() { result = ">>" }
override int getPrecedence() { result = 12 }
override int getPrecedence() { result = 11 }
override string getCanonicalQLClass() { result = "RShiftExpr" }
}

View File

@@ -8,22 +8,6 @@ abstract class BuiltInOperation extends Expr {
override string getCanonicalQLClass() { result = "BuiltInOperation" }
}
/**
* A C/C++ built-in operation that is used to support functions with variable numbers of arguments.
* This includes `va_start`, `va_end`, `va_copy`, and `va_arg`.
*/
class VarArgsExpr extends BuiltInOperation {
VarArgsExpr() {
this instanceof BuiltInVarArgsStart
or
this instanceof BuiltInVarArgsEnd
or
this instanceof BuiltInVarArg
or
this instanceof BuiltInVarArgCopy
}
}
/**
* A C/C++ `__builtin_va_start` built-in operation (used by some
* implementations of `va_start`).
@@ -36,16 +20,6 @@ class BuiltInVarArgsStart extends BuiltInOperation, @vastartexpr {
override string toString() { result = "__builtin_va_start" }
override string getCanonicalQLClass() { result = "BuiltInVarArgsStart" }
/**
* Gets the `va_list` argument.
*/
final Expr getVAList() { result = getChild(0) }
/**
* Gets the argument that specifies the last named parameter before the ellipsis.
*/
final VariableAccess getLastNamedParameter() { result = getChild(1) }
}
/**
@@ -61,11 +35,6 @@ class BuiltInVarArgsEnd extends BuiltInOperation, @vaendexpr {
override string toString() { result = "__builtin_va_end" }
override string getCanonicalQLClass() { result = "BuiltInVarArgsEnd" }
/**
* Gets the `va_list` argument.
*/
final Expr getVAList() { result = getChild(0) }
}
/**
@@ -79,11 +48,6 @@ class BuiltInVarArg extends BuiltInOperation, @vaargexpr {
override string toString() { result = "__builtin_va_arg" }
override string getCanonicalQLClass() { result = "BuiltInVarArg" }
/**
* Gets the `va_list` argument.
*/
final Expr getVAList() { result = getChild(0) }
}
/**
@@ -99,16 +63,6 @@ class BuiltInVarArgCopy extends BuiltInOperation, @vacopyexpr {
override string toString() { result = "__builtin_va_copy" }
override string getCanonicalQLClass() { result = "BuiltInVarArgCopy" }
/**
* Gets the destination `va_list` argument.
*/
final Expr getDestinationVAList() { result = getChild(0) }
/**
* Gets the source `va_list` argument.
*/
final Expr getSourceVAList() { result = getChild(1) }
}
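For orientation, the four operations these classes model usually reach the extractor through the standard `<cstdarg>` macros. The sketch below is illustrative only: `sum_twice` is a made-up function, and whether each macro expands to the corresponding `__builtin_va_*` operation depends on the compiler. It shows the argument positions that the `getVAList`, `getLastNamedParameter`, `getDestinationVAList` and `getSourceVAList` accessors refer to.

```cpp
#include <cstdarg>

// Hypothetical helper: returns twice the sum of `count` int arguments.
// It walks the argument list once through the original va_list and once
// through a copy, so that all four va_* operations appear.
int sum_twice(int count, ...) {
    va_list args;
    va_start(args, count);   // child 0: the va_list; child 1: last named parameter

    va_list copy;
    va_copy(copy, args);     // child 0: destination va_list; child 1: source va_list

    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);   // child 0: the va_list
    }
    va_end(args);            // child 0: the va_list

    for (int i = 0; i < count; ++i) {
        total += va_arg(copy, int);
    }
    va_end(copy);
    return total;
}
```

A call such as `sum_twice(3, 1, 2, 3)` adds each argument twice, once per pass over the list.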
/**

View File

@@ -74,7 +74,7 @@ abstract class Call extends Expr, NameQualifiableElement {
*/
abstract Function getTarget();
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
override string toString() { none() }
@@ -196,7 +196,7 @@ class FunctionCall extends Call, @funbindexpr {
* constructor calls, this predicate instead gets the `Class` of the constructor
* being called.
*/
Type getTargetType() { result = Call.super.getType().stripType() }
private Type getTargetType() { result = Call.super.getType().stripType() }
/**
* Gets the expected return type of the function called by this call.

View File

@@ -84,7 +84,7 @@ class CStyleCast extends Cast, @c_style_cast {
override string getCanonicalQLClass() { result = "CStyleCast" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -103,7 +103,7 @@ class StaticCast extends Cast, @static_cast {
override string getCanonicalQLClass() { result = "StaticCast" }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
}
/**
@@ -121,7 +121,7 @@ class ConstCast extends Cast, @const_cast {
override string getCanonicalQLClass() { result = "ConstCast" }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
}
/**
@@ -139,7 +139,7 @@ class ReinterpretCast extends Cast, @reinterpret_cast {
override string getCanonicalQLClass() { result = "ReinterpretCast" }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
}
private predicate isArithmeticOrEnum(Type type) {
@@ -608,7 +608,7 @@ class PrvalueAdjustmentConversion extends Cast {
class DynamicCast extends Cast, @dynamic_cast {
override string toString() { result = "dynamic_cast<" + this.getType().getName() + ">..." }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
override string getCanonicalQLClass() { result = "DynamicCast" }
@@ -631,7 +631,7 @@ class UuidofOperator extends Expr, @uuidof {
else result = "__uuidof(0)"
}
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
/** Gets the contained type. */
Type getTypeOperand() { uuidof_bind(underlyingElement(this), unresolveElement(result)) }
@@ -669,7 +669,7 @@ class TypeidOperator extends Expr, @type_id {
override string toString() { result = "typeid ..." }
override int getPrecedence() { result = 17 }
override int getPrecedence() { result = 16 }
override predicate mayBeImpure() { this.getExpr().mayBeImpure() }
@@ -700,7 +700,7 @@ class SizeofPackOperator extends Expr, @sizeof_pack {
* A C/C++ sizeof expression.
*/
abstract class SizeofOperator extends Expr, @runtime_sizeof {
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -763,7 +763,7 @@ class SizeofTypeOperator extends SizeofOperator {
* A C++11 `alignof` expression.
*/
abstract class AlignofOperator extends Expr, @runtime_alignof {
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**

View File

@@ -2,7 +2,6 @@ import semmle.code.cpp.Element
private import semmle.code.cpp.Enclosing
private import semmle.code.cpp.internal.ResolveClass
private import semmle.code.cpp.internal.AddressConstantExpression
private import semmle.code.cpp.models.implementations.Allocation
/**
* A C/C++ expression.
@@ -443,20 +442,6 @@ class Expr extends StmtParent, @expr {
else result = this
}
/**
* Gets the unique non-`Conversion` expression `e` for which
* `this = e.getConversion*()`.
*
* For example, if called on the expression `(int)(char)x`, this predicate
* gets the expression `x`.
*/
Expr getUnconverted() {
not this instanceof Conversion and
result = this
or
result = this.(Conversion).getExpr().getUnconverted()
}
/**
* Gets the type of this expression, after any implicit conversions and explicit casts, and after resolving typedefs.
*
@@ -643,7 +628,7 @@ class AddressOfExpr extends UnaryOperation, @address_of {
override string getOperator() { result = "&" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
override predicate mayBeImpure() { this.getOperand().mayBeImpure() }
@@ -665,7 +650,7 @@ class ReferenceToExpr extends Conversion, @reference_to {
override string getCanonicalQLClass() { result = "ReferenceToExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**
@@ -688,7 +673,7 @@ class PointerDereferenceExpr extends UnaryOperation, @indirect {
override string getOperator() { result = "*" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
override predicate mayBeImpure() {
this.getChild(0).mayBeImpure() or
@@ -722,7 +707,7 @@ class ReferenceDereferenceExpr extends Conversion, @ref_indirect {
* A C++ `new` or `new[]` expression.
*/
class NewOrNewArrayExpr extends Expr, @any_new_expr {
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
/**
* Gets the `operator new` or `operator new[]` that allocates storage.
@@ -805,10 +790,8 @@ class NewOrNewArrayExpr extends Expr, @any_new_expr {
* call the constructor of `T` but will not allocate memory.
*/
Expr getPlacementPointer() {
result =
this
.getAllocatorCall()
.getArgument(this.getAllocator().(OperatorNewAllocationFunction).getPlacementArgument())
isStandardPlacementNewAllocator(this.getAllocator()) and
result = this.getAllocatorCall().getArgument(1)
}
}
@@ -901,7 +884,7 @@ class DeleteExpr extends Expr, @delete_expr {
override string getCanonicalQLClass() { result = "DeleteExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
/**
* Gets the compile-time type of the object being deleted.
@@ -975,7 +958,7 @@ class DeleteArrayExpr extends Expr, @delete_array_expr {
override string getCanonicalQLClass() { result = "DeleteArrayExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
/**
* Gets the element type of the array being deleted.
@@ -1197,6 +1180,12 @@ private predicate convparents(Expr child, int idx, Element parent) {
)
}
private predicate isStandardPlacementNewAllocator(Function operatorNew) {
operatorNew.getName().matches("operator new%") and
operatorNew.getNumberOfParameters() = 2 and
operatorNew.getParameter(1).getType() instanceof VoidPointerType
}
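As a rough illustration of what `isStandardPlacementNewAllocator` is meant to recognise, the C++ sketch below uses the standard placement form `operator new(std::size_t, void*)`: an `operator new` with two parameters whose second parameter is a `void *`. The names `Point` and `placement_demo` are invented for the example. In a `new` expression like this one, the buffer passed in parentheses is argument 1 of the allocator call, which is what `getPlacementPointer` returns.

```cpp
#include <new>

// Hypothetical type used only for this illustration.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Constructs a Point inside caller-provided storage and returns x + y.
int placement_demo() {
    // Raw, suitably aligned storage; no heap allocation takes place.
    alignas(Point) unsigned char storage[sizeof(Point)];

    // Placement new: resolves to operator new(std::size_t, void*).
    // `storage` is the placement pointer (argument 1 of the allocator call).
    Point *p = new (storage) Point(3, 4);

    int sum = p->x + p->y;
    p->~Point();   // placement new requires an explicit destructor call
    return sum;
}
```

User-defined placement overloads with a different second parameter type would not match the two-parameter `void *` pattern sketched here.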
// Pulled out for performance. See QL-796.
private predicate hasNoConversions(Expr e) { not e.hasConversion() }
@@ -1213,18 +1202,3 @@ private predicate constantTemplateLiteral(Expr e) {
or
constantTemplateLiteral(e.(Cast).getExpr())
}
/**
* A C++ three-way comparison operation, also known as the _spaceship
* operation_. This is specific to C++20 and later.
* ```
* auto c = (a <=> b);
* ```
*/
class SpaceshipExpr extends BinaryOperation, @spaceshipexpr {
override string getCanonicalQLClass() { result = "SpaceshipExpr" }
override int getPrecedence() { result = 11 }
override string getOperator() { result = "<=>" }
}

View File

@@ -16,7 +16,7 @@ class NotExpr extends UnaryLogicalOperation, @notexpr {
override string getCanonicalQLClass() { result = "NotExpr" }
override int getPrecedence() { result = 16 }
override int getPrecedence() { result = 15 }
}
/**

View File

@@ -2,7 +2,6 @@ import cpp
import semmle.code.cpp.security.Security
private import semmle.code.cpp.ir.dataflow.DataFlow
private import semmle.code.cpp.ir.dataflow.DataFlow2
private import semmle.code.cpp.ir.dataflow.DataFlow3
private import semmle.code.cpp.ir.IR
private import semmle.code.cpp.ir.dataflow.internal.DataFlowDispatch as Dispatch
private import semmle.code.cpp.models.interfaces.Taint
@@ -22,53 +21,12 @@ private predicate predictableInstruction(Instruction instr) {
predictableInstruction(instr.(UnaryInstruction).getUnary())
}
/**
* Functions that we should only allow taint to flow through (to the return
* value) if all but the source argument are 'predictable'. This is done to
* emulate the old security library's implementation rather than due to any
* strong belief that this is the right approach.
*
* Note that the list itself is not very principled; it consists of all the
* functions listed in the old security library's [default] `isPureFunction`
* that have more than one argument, but are not in the old taint tracking
* library's `returnArgument` predicate. In addition, `strlen` is included
* because it's also a special case in flow to return values.
*/
predicate predictableOnlyFlow(string name) {
name = "strcasestr" or
name = "strchnul" or
name = "strchr" or
name = "strchrnul" or
name = "strcmp" or
name = "strcspn" or
name = "strlen" or // special case
name = "strncmp" or
name = "strndup" or
name = "strnlen" or
name = "strrchr" or
name = "strspn" or
name = "strstr" or
name = "strtod" or
name = "strtof" or
name = "strtol" or
name = "strtoll" or
name = "strtoq" or
name = "strtoul"
}
private DataFlow::Node getNodeForSource(Expr source) {
isUserInput(source, _) and
(
result = DataFlow::exprNode(source)
or
// Some of the sources in `isUserInput` are intended to match the value of
// an expression, while others (those modeled below) are intended to match
// the taint that propagates out of an argument, like the `char *` argument
// to `gets`. It's impossible here to tell which is which, but the "access
// to argv" source is definitely not intended to match an output argument,
// and it causes false positives if we let it.
result = DataFlow::definitionByReferenceNode(source) and
not argv(source.(VariableAccess).getTarget())
result = DataFlow::definitionByReferenceNode(source)
)
}
@@ -77,15 +35,13 @@ private class DefaultTaintTrackingCfg extends DataFlow::Configuration {
override predicate isSource(DataFlow::Node source) { source = getNodeForSource(_) }
override predicate isSink(DataFlow::Node sink) { exists(adjustedSink(sink)) }
override predicate isSink(DataFlow::Node sink) { any() }
override predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
instructionTaintStep(n1.asInstruction(), n2.asInstruction())
}
override predicate isBarrier(DataFlow::Node node) { nodeIsBarrier(node) }
override predicate isBarrierIn(DataFlow::Node node) { nodeIsBarrierIn(node) }
}
private class ToGlobalVarTaintTrackingCfg extends DataFlow::Configuration {
@@ -94,45 +50,43 @@ private class ToGlobalVarTaintTrackingCfg extends DataFlow::Configuration {
override predicate isSource(DataFlow::Node source) { source = getNodeForSource(_) }
override predicate isSink(DataFlow::Node sink) {
sink.asVariable() instanceof GlobalOrNamespaceVariable
exists(GlobalOrNamespaceVariable gv | writesVariable(sink.asInstruction(), gv))
}
override predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
instructionTaintStep(n1.asInstruction(), n2.asInstruction())
or
writesVariable(n1.asInstruction(), n2.asVariable().(GlobalOrNamespaceVariable))
or
readsVariable(n2.asInstruction(), n1.asVariable().(GlobalOrNamespaceVariable))
exists(StoreInstruction i1, LoadInstruction i2, GlobalOrNamespaceVariable gv |
writesVariable(i1, gv) and
readsVariable(i2, gv) and
i1 = n1.asInstruction() and
i2 = n2.asInstruction()
)
}
override predicate isBarrier(DataFlow::Node node) { nodeIsBarrier(node) }
override predicate isBarrierIn(DataFlow::Node node) { nodeIsBarrierIn(node) }
}
private class FromGlobalVarTaintTrackingCfg extends DataFlow2::Configuration {
FromGlobalVarTaintTrackingCfg() { this = "FromGlobalVarTaintTrackingCfg" }
override predicate isSource(DataFlow::Node source) {
// This set of sources should be reasonably small, which is good for
// performance since the set of sinks is very large.
exists(ToGlobalVarTaintTrackingCfg otherCfg | otherCfg.hasFlowTo(source))
exists(
ToGlobalVarTaintTrackingCfg other, DataFlow::Node prevSink, GlobalOrNamespaceVariable gv
|
other.hasFlowTo(prevSink) and
writesVariable(prevSink.asInstruction(), gv) and
readsVariable(source.asInstruction(), gv)
)
}
override predicate isSink(DataFlow::Node sink) { exists(adjustedSink(sink)) }
override predicate isSink(DataFlow::Node sink) { any() }
override predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
instructionTaintStep(n1.asInstruction(), n2.asInstruction())
or
// Additional step for flow out of variables. There is no flow _into_
// variables in this configuration, so this step only serves to take flow
// out of a variable that's a source.
readsVariable(n2.asInstruction(), n1.asVariable())
}
override predicate isBarrier(DataFlow::Node node) { nodeIsBarrier(node) }
override predicate isBarrierIn(DataFlow::Node node) { nodeIsBarrierIn(node) }
}
private predicate readsVariable(LoadInstruction load, Variable var) {
@@ -144,17 +98,7 @@ private predicate writesVariable(StoreInstruction store, Variable var) {
}
/**
* A variable that has any kind of upper-bound check anywhere in the program. This is
* biased towards being inclusive because there are a lot of valid ways of doing an
* upper bounds check if we don't consider where it occurs, for example:
* ```
* if (x < 10) { sink(x); }
*
* if (10 > y) { sink(y); }
*
* if (z > 10) { z = 10; }
* sink(z);
* ```
* A variable that has any kind of upper-bound check anywhere in the program.
*/
// TODO: This coarse overapproximation, ported from the old taint tracking
// library, could be replaced with an actual semantic check that a particular
@@ -163,10 +107,10 @@ private predicate writesVariable(StoreInstruction store, Variable var) {
// previously suppressed by this predicate by coincidence.
private predicate hasUpperBoundsCheck(Variable var) {
exists(RelationalOperation oper, VariableAccess access |
oper.getAnOperand() = access and
oper.getLeftOperand() = access and
access.getTarget() = var and
// Comparing to 0 is not an upper bound check
not oper.getAnOperand().getValue() = "0"
not oper.getRightOperand().getValue() = "0"
)
}
@@ -177,65 +121,25 @@ private predicate nodeIsBarrier(DataFlow::Node node) {
)
}
private predicate nodeIsBarrierIn(DataFlow::Node node) {
// don't use dataflow into taint sources, as this leads to duplicate results.
node = getNodeForSource(any(Expr e))
}
cached
private predicate instructionTaintStep(Instruction i1, Instruction i2) {
// Expressions computed from tainted data are also tainted
exists(CallInstruction call, int argIndex | call = i2 |
isPureFunction(call.getStaticCallTarget().getName()) and
i1 = getACallArgumentOrIndirection(call, argIndex) and
forall(Instruction arg | arg = call.getAnArgument() |
arg = getACallArgumentOrIndirection(call, argIndex) or predictableInstruction(arg)
) and
// flow through `strlen` tends to cause dubious results, if the length is
// bounded.
not call.getStaticCallTarget().getName() = "strlen"
)
i2 =
any(CallInstruction call |
isPureFunction(call.getStaticCallTarget().getName()) and
call.getAnArgument() = i1 and
forall(Instruction arg | arg = call.getAnArgument() | arg = i1 or predictableInstruction(arg)) and
// flow through `strlen` tends to cause dubious results, if the length is
// bounded.
not call.getStaticCallTarget().getName() = "strlen"
)
or
// Flow through pointer dereference
i2.(LoadInstruction).getSourceAddress() = i1
or
// Flow through partial reads of arrays and unions
i2.(LoadInstruction).getSourceValueOperand().getAnyDef() = i1 and
not i1.isResultConflated() and
(
i1.getResultType() instanceof ArrayType or
i1.getResultType() instanceof Union
)
i2.(UnaryInstruction).getUnary() = i1
or
// Unary instructions tend to preserve enough information in practice that we
// want taint to flow through.
// The exception is `FieldAddressInstruction`. Together with the rule for
// `LoadInstruction` above and for `ChiInstruction` below, flow through
// `FieldAddressInstruction` could cause flow into one field to come out an
// unrelated field. This would happen across function boundaries, where the IR
// would not be able to match loads to stores.
i2.(UnaryInstruction).getUnary() = i1 and
(
not i2 instanceof FieldAddressInstruction
or
i2.(FieldAddressInstruction).getField().getDeclaringType() instanceof Union
)
or
// Flow out of definition-by-reference
i2.(ChiInstruction).getPartial() = i1.(WriteSideEffectInstruction) and
not i2.isResultConflated()
or
// Flow from an element to an array or union that contains it.
i2.(ChiInstruction).getPartial() = i1 and
not i2.isResultConflated() and
exists(Type t | i2.getResultLanguageType().hasType(t, false) |
t instanceof Union
or
t instanceof ArrayType
or
// Buffers of unknown size
t instanceof UnknownType
)
not isChiForAllAliasedMemory(i2)
or
exists(BinaryInstruction bin |
bin = i2 and
@@ -247,29 +151,12 @@ private predicate instructionTaintStep(Instruction i1, Instruction i2) {
// from `a`.
i2.(PointerAddInstruction).getLeft() = i1
or
// Until we have flow through indirections across calls, we'll take flow out
// of the parameter and into its indirection.
exists(IRFunction f, Parameter parameter |
i1 = getInitializeParameter(f, parameter) and
i2 = getInitializeIndirection(f, parameter)
)
or
// Until we have flow through indirections across calls, we'll take flow out
// of the indirection and into the argument.
// When we get proper flow through indirections across calls, this code can be
// moved to `adjustedSink` or possibly into the `DataFlow::ExprNode` class.
exists(ReadSideEffectInstruction read |
read.getAnOperand().(SideEffectOperand).getAnyDef() = i1 and
read.getArgumentDef() = i2
)
or
// Flow from argument to return value
i2 =
any(CallInstruction call |
exists(int indexIn |
modelTaintToReturnValue(call.getStaticCallTarget(), indexIn) and
i1 = getACallArgumentOrIndirection(call, indexIn) and
not predictableOnlyFlow(call.getStaticCallTarget().getName())
i1 = getACallArgumentOrIndirection(call, indexIn)
)
)
or
@@ -289,18 +176,6 @@ private predicate instructionTaintStep(Instruction i1, Instruction i2) {
)
}
pragma[noinline]
private InitializeIndirectionInstruction getInitializeIndirection(IRFunction f, Parameter p) {
result.getParameter() = p and
result.getEnclosingIRFunction() = f
}
pragma[noinline]
private InitializeParameterInstruction getInitializeParameter(IRFunction f, Parameter p) {
result.getParameter() = p and
result.getEnclosingIRFunction() = f
}
/**
* Get an instruction that goes into argument `argumentIndex` of `call`. This
* can be either directly or through one pointer indirection.
@@ -328,6 +203,19 @@ private predicate modelTaintToParameter(Function f, int parameterIn, int paramet
)
}
/**
* Holds if `chi` is on the chain of chi-instructions for all aliased memory.
* Taint should not pass through these instructions since they tend to mix up
* unrelated objects.
*/
private predicate isChiForAllAliasedMemory(Instruction instr) {
instr.(ChiInstruction).getTotal() instanceof AliasedDefinitionInstruction
or
isChiForAllAliasedMemory(instr.(ChiInstruction).getTotal())
or
isChiForAllAliasedMemory(instr.(PhiInstruction).getAnInput())
}
private predicate modelTaintToReturnValue(Function f, int parameterIn) {
// Taint flow from parameter to return value
exists(FunctionInput modelIn, FunctionOutput modelOut |
@@ -386,24 +274,24 @@ private Element adjustedSink(DataFlow::Node sink) {
// short-circuiting condition and thus might get skipped.
result.(NotExpr).getOperand() = sink.asExpr()
or
// Taint postfix and prefix crement operations when their operand is tainted.
result.(CrementOperation).getAnOperand() = sink.asExpr()
// For compatibility, send flow from argument read side effects to their
// corresponding argument expression
exists(IndirectReadSideEffectInstruction read |
read.getAnOperand().(SideEffectOperand).getAnyDef() = sink.asInstruction() and
read.getArgumentDef().getUnconvertedResultExpression() = result
)
or
// Taint `e1 += e2`, `e1 &= e2` and friends when `e1` or `e2` is tainted.
result.(AssignOperation).getAnOperand() = sink.asExpr()
exists(BufferReadSideEffectInstruction read |
read.getAnOperand().(SideEffectOperand).getAnyDef() = sink.asInstruction() and
read.getArgumentDef().getUnconvertedResultExpression() = result
)
or
exists(SizedBufferReadSideEffectInstruction read |
read.getAnOperand().(SideEffectOperand).getAnyDef() = sink.asInstruction() and
read.getArgumentDef().getUnconvertedResultExpression() = result
)
}
/**
* Holds if `tainted` may contain taint from `source`.
*
* A tainted expression is either directly user input, or is
* computed from user input in a way that users can probably
* control the exact output of the computation.
*
* This doesn't include data flow through global variables.
* If you need that you must call `taintedIncludingGlobalVars`.
*/
cached
predicate tainted(Expr source, Element tainted) {
exists(DefaultTaintTrackingCfg cfg, DataFlow::Node sink |
cfg.hasFlow(getNodeForSource(source), sink) and
@@ -411,277 +299,38 @@ predicate tainted(Expr source, Element tainted) {
)
}
/**
* Holds if `tainted` may contain taint from `source`, where the taint passed
* through a global variable named `globalVar`.
*
* A tainted expression is either directly user input, or is
* computed from user input in a way that users can probably
* control the exact output of the computation.
*
* This version gives the same results as tainted but also includes
* data flow through global variables.
*
* The parameter `globalVar` is the qualified name of the last global variable
* used to move the value from source to tainted. If the taint did not pass
* through a global variable, then `globalVar = ""`.
*/
cached
predicate taintedIncludingGlobalVars(Expr source, Element tainted, string globalVar) {
tainted(source, tainted) and
globalVar = ""
or
exists(
ToGlobalVarTaintTrackingCfg toCfg, FromGlobalVarTaintTrackingCfg fromCfg,
DataFlow::VariableNode variableNode, GlobalOrNamespaceVariable global, DataFlow::Node sink
ToGlobalVarTaintTrackingCfg toCfg, FromGlobalVarTaintTrackingCfg fromCfg, DataFlow::Node store,
GlobalOrNamespaceVariable global, DataFlow::Node load, DataFlow::Node sink
|
global = variableNode.getVariable() and
toCfg.hasFlow(getNodeForSource(source), variableNode) and
fromCfg.hasFlow(variableNode, sink) and
toCfg.hasFlow(getNodeForSource(source), store) and
store
.asInstruction()
.(StoreInstruction)
.getDestinationAddress()
.(VariableAddressInstruction)
.getASTVariable() = global and
load
.asInstruction()
.(LoadInstruction)
.getSourceAddress()
.(VariableAddressInstruction)
.getASTVariable() = global and
fromCfg.hasFlow(load, sink) and
tainted = adjustedSink(sink) and
global = globalVarFromId(globalVar)
)
}
/**
* Gets the global variable whose qualified name is `id`. Use this predicate
* together with `taintedIncludingGlobalVars`. Example:
*
* ```
* exists(string varName |
* taintedIncludingGlobalVars(source, tainted, varName) and
* var = globalVarFromId(varName)
* )
* ```
*/
GlobalOrNamespaceVariable globalVarFromId(string id) { id = result.getQualifiedName() }
/**
* Resolve potential target function(s) for `call`.
*
* If `call` is a call through a function pointer (`ExprCall`) or
* targets a virtual method, simple data flow analysis is performed
* in order to identify target(s).
*/
Function resolveCall(Call call) {
exists(CallInstruction callInstruction |
callInstruction.getAST() = call and
result = Dispatch::viableCallable(callInstruction)
)
}
/**
* Provides definitions for augmenting source/sink pairs with data-flow paths
* between them. From a `@kind path-problem` query, import this module in the
* global scope, extend `TaintTrackingConfiguration`, and use `taintedWithPath`
* in place of `tainted`.
*
* Importing this module will also import the query predicates that contain the
* taint paths.
*/
module TaintedWithPath {
private newtype TSingleton = MkSingleton()
/**
* A taint-tracking configuration that matches sources and sinks in the same
* way as the `tainted` predicate.
*
* Override `isSink` and `taintThroughGlobals` as needed, but do not provide
* a characteristic predicate.
*/
class TaintTrackingConfiguration extends TSingleton {
/** Override this to specify which elements are sinks in this configuration. */
abstract predicate isSink(Element e);
/**
* Override this predicate to `any()` to allow taint to flow through global
* variables.
*/
predicate taintThroughGlobals() { none() }
/** Gets a textual representation of this element. */
string toString() { result = "TaintTrackingConfiguration" }
}
private class AdjustedConfiguration extends DataFlow3::Configuration {
AdjustedConfiguration() { this = "AdjustedConfiguration" }
override predicate isSource(DataFlow::Node source) { source = getNodeForSource(_) }
override predicate isSink(DataFlow::Node sink) {
exists(TaintTrackingConfiguration cfg | cfg.isSink(adjustedSink(sink)))
}
override predicate isAdditionalFlowStep(DataFlow::Node n1, DataFlow::Node n2) {
instructionTaintStep(n1.asInstruction(), n2.asInstruction())
or
exists(TaintTrackingConfiguration cfg | cfg.taintThroughGlobals() |
writesVariable(n1.asInstruction(), n2.asVariable().(GlobalOrNamespaceVariable))
or
readsVariable(n2.asInstruction(), n1.asVariable().(GlobalOrNamespaceVariable))
)
}
override predicate isBarrier(DataFlow::Node node) { nodeIsBarrier(node) }
override predicate isBarrierIn(DataFlow::Node node) { nodeIsBarrierIn(node) }
}
/*
* A sink `Element` may map to multiple `DataFlowX::PathNode`s via (the
* inverse of) `adjustedSink`. For example, an `Expr` maps to all its
* conversions, and a `Variable` maps to all loads and stores from it. Because
* the path node is part of the tuple that constitutes the alert, this leads
* to duplicate alerts.
*
* To avoid showing duplicates, we edit the graph to replace the final node
* coming from the data-flow library with a node that matches exactly the
* `Element` sink that's requested.
*
* The same is done for sources.
*/
private newtype TPathNode =
TWrapPathNode(DataFlow3::PathNode n) or
// There's a single newtype constructor for both sources and sinks since
// that makes it easiest to deal with the case where source = sink.
TEndpointPathNode(Element e) {
exists(AdjustedConfiguration cfg, DataFlow3::Node sourceNode, DataFlow3::Node sinkNode |
cfg.hasFlow(sourceNode, sinkNode)
|
sourceNode = getNodeForSource(e)
or
e = adjustedSink(sinkNode) and
exists(TaintTrackingConfiguration ttCfg | ttCfg.isSink(e))
)
}
/** An opaque type used for the nodes of a data-flow path. */
class PathNode extends TPathNode {
/** Gets a textual representation of this element. */
string toString() { none() }
/**
* Holds if this element is at the specified location.
* The location spans column `startcolumn` of line `startline` to
* column `endcolumn` of line `endline` in file `filepath`.
* For more information, see
* [Locations](https://help.semmle.com/QL/learn-ql/ql/locations.html).
*/
predicate hasLocationInfo(
string filepath, int startline, int startcolumn, int endline, int endcolumn
) {
none()
}
}
private class WrapPathNode extends PathNode, TWrapPathNode {
DataFlow3::PathNode inner() { this = TWrapPathNode(result) }
override string toString() { result = this.inner().toString() }
override predicate hasLocationInfo(
string filepath, int startline, int startcolumn, int endline, int endcolumn
) {
this.inner().hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
}
}
private class EndpointPathNode extends PathNode, TEndpointPathNode {
Expr inner() { this = TEndpointPathNode(result) }
override string toString() { result = this.inner().toString() }
override predicate hasLocationInfo(
string filepath, int startline, int startcolumn, int endline, int endcolumn
) {
this
.inner()
.getLocation()
.hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
}
}
/** A PathNode whose `Element` is a source. It may also be a sink. */
private class InitialPathNode extends EndpointPathNode {
InitialPathNode() { exists(getNodeForSource(this.inner())) }
}
/** A PathNode whose `Element` is a sink. It may also be a source. */
private class FinalPathNode extends EndpointPathNode {
FinalPathNode() { exists(TaintTrackingConfiguration cfg | cfg.isSink(this.inner())) }
}
/** Holds if `(a,b)` is an edge in the graph of data flow path explanations. */
query predicate edges(PathNode a, PathNode b) {
DataFlow3::PathGraph::edges(a.(WrapPathNode).inner(), b.(WrapPathNode).inner())
or
// To avoid showing trivial-looking steps, we _replace_ the last node instead
// of adding an edge out of it.
exists(WrapPathNode sinkNode |
DataFlow3::PathGraph::edges(a.(WrapPathNode).inner(), sinkNode.inner()) and
b.(FinalPathNode).inner() = adjustedSink(sinkNode.inner().getNode())
)
or
// Same for the first node
exists(WrapPathNode sourceNode |
DataFlow3::PathGraph::edges(sourceNode.inner(), b.(WrapPathNode).inner()) and
sourceNode.inner().getNode() = getNodeForSource(a.(InitialPathNode).inner())
)
or
// Finally, handle the case where the path goes directly from a source to a
// sink, meaning that they both need to be translated.
exists(WrapPathNode sinkNode, WrapPathNode sourceNode |
DataFlow3::PathGraph::edges(sourceNode.inner(), sinkNode.inner()) and
sourceNode.inner().getNode() = getNodeForSource(a.(InitialPathNode).inner()) and
b.(FinalPathNode).inner() = adjustedSink(sinkNode.inner().getNode())
)
}
/** Holds if `n` is a node in the graph of data flow path explanations. */
query predicate nodes(PathNode n, string key, string val) {
key = "semmle.label" and val = n.toString()
}
/**
* Holds if `tainted` may contain taint from `source`, where `sourceNode` and
* `sinkNode` are the corresponding `PathNode`s that can be used in a query
* to provide path explanations. Extend `TaintTrackingConfiguration` to use
* this predicate.
*
* A tainted expression is either directly user input, or is computed from
* user input in a way that users can probably control the exact output of
* the computation.
*/
predicate taintedWithPath(Expr source, Element tainted, PathNode sourceNode, PathNode sinkNode) {
exists(AdjustedConfiguration cfg, DataFlow3::Node flowSource, DataFlow3::Node flowSink |
source = sourceNode.(InitialPathNode).inner() and
flowSource = getNodeForSource(source) and
cfg.hasFlow(flowSource, flowSink) and
tainted = adjustedSink(flowSink) and
tainted = sinkNode.(FinalPathNode).inner()
)
}
private predicate isGlobalVariablePathNode(WrapPathNode n) {
n.inner().getNode().asVariable() instanceof GlobalOrNamespaceVariable
}
private predicate edgesWithoutGlobals(PathNode a, PathNode b) {
edges(a, b) and
not isGlobalVariablePathNode(a) and
not isGlobalVariablePathNode(b)
}
/**
* Holds if `tainted` can be reached from a taint source without passing
* through a global variable.
*/
predicate taintedWithoutGlobals(Element tainted) {
exists(PathNode sourceNode, FinalPathNode sinkNode |
sourceNode.(WrapPathNode).inner().getNode() = getNodeForSource(_) and
edgesWithoutGlobals+(sourceNode, sinkNode) and
tainted = sinkNode.inner()
)
}
}
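
For orientation, a minimal sketch (not part of this diff) of how a path query might consume `taintedWithPath` together with the `edges` and `nodes` query predicates above. The import path `semmle.code.cpp.security.TaintTracking`, the enclosing module name `TaintedWithPath`, the class name `SystemArgumentConfiguration`, and the choice of `system` as a sink are all assumptions made for illustration; only `TaintTrackingConfiguration`, `PathNode`, and `taintedWithPath` come from the code shown here.

```ql
/**
 * @name Tainted argument to `system` (illustrative sketch)
 * @kind path-problem
 * @id cpp/example/tainted-system-argument
 */

import cpp
// Assumption: this file re-exports the default taint-tracking library.
import semmle.code.cpp.security.TaintTracking
// Assumption: the module that declares `taintedWithPath`, `edges` and `nodes`.
import TaintedWithPath

/** Hypothetical configuration: the first argument of `system` is a sink. */
class SystemArgumentConfiguration extends TaintTrackingConfiguration {
  override predicate isSink(Element e) {
    exists(FunctionCall call |
      call.getTarget().hasGlobalName("system") and
      e = call.getArgument(0)
    )
  }
}

from Expr source, Expr sink, PathNode sourceNode, PathNode sinkNode
where taintedWithPath(source, sink, sourceNode, sinkNode)
select sink, sourceNode, sinkNode, "This argument depends on $@.", source, "user input"
```

Importing the module rather than qualifying each name is what brings the `edges` and `nodes` query predicates into scope, which a `path-problem` query needs in order to render path explanations.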

@@ -3,6 +3,8 @@ private import semmle.code.cpp.ir.IR
private import semmle.code.cpp.ir.dataflow.DataFlow
private import semmle.code.cpp.ir.dataflow.internal.DataFlowPrivate
Function viableImpl(CallInstruction call) { result = viableCallable(call) }
/**
* Gets a function that might be called by `call`.
*/
@@ -70,69 +72,59 @@ private module VirtualDispatch {
// Call return
exists(DataFlowCall call, ReturnKind returnKind |
other = getAnOutNode(call, returnKind) and
returnNodeWithKindAndEnclosingCallable(src, returnKind, call.getStaticCallTarget())
src.(ReturnNode).getKind() = returnKind and
call.getStaticCallTarget() = src.getEnclosingCallable()
) and
allowFromArg = false
or
// Local flow
DataFlow::localFlowStep(src, other) and
allowFromArg = allowOtherFromArg
or
// Flow from global variable to load.
exists(LoadInstruction load, GlobalOrNamespaceVariable var |
var = src.asVariable() and
other.asInstruction() = load and
// The `allowFromArg` concept doesn't play a role when `src` is a
// global variable, so we just set it to a single arbitrary value for
// performance.
allowFromArg = true
|
// Load directly from the global variable
load.getSourceAddress().(VariableAddressInstruction).getASTVariable() = var
or
// Load from a field on a global union
exists(FieldAddressInstruction fa |
fa = load.getSourceAddress() and
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
fa.getField().getDeclaringType() instanceof Union
)
or
// Flow through global variable
exists(StoreInstruction store |
store = src.asInstruction() and
(
exists(Variable var |
var = store.getDestinationAddress().(VariableAddressInstruction).getASTVariable() and
this.flowsFromGlobal(var)
)
)
or
// Flow from store to global variable. These cases are similar to the
// above but have `StoreInstruction` instead of `LoadInstruction` and
// have the roles swapped between `other` and `src`.
exists(StoreInstruction store, GlobalOrNamespaceVariable var |
var = other.asVariable() and
store = src.asInstruction() and
// Setting `allowFromArg` to `true` like in the base case means we
// treat a store to a global variable like the dispatch itself: flow
// may come from anywhere.
allowFromArg = true
|
// Store directly to the global variable
store.getDestinationAddress().(VariableAddressInstruction).getASTVariable() = var
or
// Store to a field on a global union
exists(FieldAddressInstruction fa |
fa = store.getDestinationAddress() and
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
fa.getField().getDeclaringType() instanceof Union
exists(Variable var, FieldAccess a |
var =
store
.getDestinationAddress()
.(FieldAddressInstruction)
.getObjectAddress()
.(VariableAddressInstruction)
.getASTVariable() and
this.flowsFromGlobalUnionField(var, a)
)
)
) and
allowFromArg = true
)
}
}
/**
* A ReturnNode with its ReturnKind and its enclosing callable.
*
* Used to fix a join ordering issue in flowsFrom.
*/
private predicate returnNodeWithKindAndEnclosingCallable(
ReturnNode node, ReturnKind kind, DataFlowCallable callable
) {
node.getKind() = kind and
node.getEnclosingCallable() = callable
private predicate flowsFromGlobal(GlobalOrNamespaceVariable var) {
exists(LoadInstruction load |
this.flowsFrom(DataFlow::instructionNode(load), _) and
load.getSourceAddress().(VariableAddressInstruction).getASTVariable() = var
)
}
private predicate flowsFromGlobalUnionField(Variable var, FieldAccess a) {
a.getTarget().getDeclaringType() instanceof Union and
exists(LoadInstruction load |
this.flowsFrom(DataFlow::instructionNode(load), _) and
load
.getSourceAddress()
.(FieldAddressInstruction)
.getObjectAddress()
.(VariableAddressInstruction)
.getASTVariable() = var
)
}
}
/** Call through a function pointer. */
@@ -145,12 +137,6 @@ private module VirtualDispatch {
exists(FunctionInstruction fi |
this.flowsFrom(DataFlow::instructionNode(fi), _) and
result = fi.getFunctionSymbol()
) and
(
this.getNumberOfArguments() <= result.getEffectiveNumberOfParameters() and
this.getNumberOfArguments() >= result.getEffectiveNumberOfParameters()
or
result.isVarargs()
)
}
}

@@ -1,175 +0,0 @@
/**
* Provides consistency queries for checking invariants in the language-specific
* data-flow classes and predicates.
*/
private import DataFlowImplSpecific::Private
private import DataFlowImplSpecific::Public
private import tainttracking1.TaintTrackingParameter::Private
private import tainttracking1.TaintTrackingParameter::Public
module Consistency {
private class RelevantNode extends Node {
RelevantNode() {
this instanceof ArgumentNode or
this instanceof ParameterNode or
this instanceof ReturnNode or
this = getAnOutNode(_, _) or
simpleLocalFlowStep(this, _) or
simpleLocalFlowStep(_, this) or
jumpStep(this, _) or
jumpStep(_, this) or
storeStep(this, _, _) or
storeStep(_, _, this) or
readStep(this, _, _) or
readStep(_, _, this) or
defaultAdditionalTaintStep(this, _) or
defaultAdditionalTaintStep(_, this)
}
}
query predicate uniqueEnclosingCallable(Node n, string msg) {
exists(int c |
n instanceof RelevantNode and
c = count(n.getEnclosingCallable()) and
c != 1 and
msg = "Node should have one enclosing callable but has " + c + "."
)
}
query predicate uniqueTypeBound(Node n, string msg) {
exists(int c |
n instanceof RelevantNode and
c = count(n.getTypeBound()) and
c != 1 and
msg = "Node should have one type bound but has " + c + "."
)
}
query predicate uniqueTypeRepr(Node n, string msg) {
exists(int c |
n instanceof RelevantNode and
c = count(getErasedRepr(n.getTypeBound())) and
c != 1 and
msg = "Node should have one type representation but has " + c + "."
)
}
query predicate uniqueNodeLocation(Node n, string msg) {
exists(int c |
c =
count(string filepath, int startline, int startcolumn, int endline, int endcolumn |
n.hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
) and
c != 1 and
msg = "Node should have one location but has " + c + "."
)
}
query predicate missingLocation(string msg) {
exists(int c |
c =
strictcount(Node n |
not exists(string filepath, int startline, int startcolumn, int endline, int endcolumn |
n.hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
)
) and
msg = "Nodes without location: " + c
)
}
query predicate uniqueNodeToString(Node n, string msg) {
exists(int c |
c = count(n.toString()) and
c != 1 and
msg = "Node should have one toString but has " + c + "."
)
}
query predicate missingToString(string msg) {
exists(int c |
c = strictcount(Node n | not exists(n.toString())) and
msg = "Nodes without toString: " + c
)
}
query predicate parameterCallable(ParameterNode p, string msg) {
exists(DataFlowCallable c | p.isParameterOf(c, _) and c != p.getEnclosingCallable()) and
msg = "Callable mismatch for parameter."
}
query predicate localFlowIsLocal(Node n1, Node n2, string msg) {
simpleLocalFlowStep(n1, n2) and
n1.getEnclosingCallable() != n2.getEnclosingCallable() and
msg = "Local flow step does not preserve enclosing callable."
}
private DataFlowType typeRepr() { result = getErasedRepr(any(Node n).getTypeBound()) }
query predicate compatibleTypesReflexive(DataFlowType t, string msg) {
t = typeRepr() and
not compatibleTypes(t, t) and
msg = "Type compatibility predicate is not reflexive."
}
query predicate unreachableNodeCCtx(Node n, DataFlowCall call, string msg) {
isUnreachableInCall(n, call) and
exists(DataFlowCallable c |
c = n.getEnclosingCallable() and
not viableCallable(call) = c
) and
msg = "Call context for isUnreachableInCall is inconsistent with call graph."
}
query predicate localCallNodes(DataFlowCall call, Node n, string msg) {
(
n = getAnOutNode(call, _) and
msg = "OutNode and call does not share enclosing callable."
or
n.(ArgumentNode).argumentOf(call, _) and
msg = "ArgumentNode and call does not share enclosing callable."
) and
n.getEnclosingCallable() != call.getEnclosingCallable()
}
query predicate postIsNotPre(PostUpdateNode n, string msg) {
n.getPreUpdateNode() = n and msg = "PostUpdateNode should not equal its pre-update node."
}
query predicate postHasUniquePre(PostUpdateNode n, string msg) {
exists(int c |
c = count(n.getPreUpdateNode()) and
c != 1 and
msg = "PostUpdateNode should have one pre-update node but has " + c + "."
)
}
query predicate uniquePostUpdate(Node n, string msg) {
1 < strictcount(PostUpdateNode post | post.getPreUpdateNode() = n) and
msg = "Node has multiple PostUpdateNodes."
}
query predicate postIsInSameCallable(PostUpdateNode n, string msg) {
n.getEnclosingCallable() != n.getPreUpdateNode().getEnclosingCallable() and
msg = "PostUpdateNode does not share callable with its pre-update node."
}
private predicate hasPost(Node n) { exists(PostUpdateNode post | post.getPreUpdateNode() = n) }
query predicate reverseRead(Node n, string msg) {
exists(Node n2 | readStep(n, _, n2) and hasPost(n2) and not hasPost(n)) and
msg = "Origin of readStep is missing a PostUpdateNode."
}
query predicate storeIsPostUpdate(Node n, string msg) {
storeStep(_, _, n) and
not n instanceof PostUpdateNode and
msg = "Store targets should be PostUpdateNodes."
}
query predicate argHasPostUpdate(ArgumentNode n, string msg) {
not hasPost(n) and
not isImmutableOrUnobservable(n) and
msg = "ArgumentNode is missing PostUpdateNode."
}
}
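
The consistency predicates above rely on two counting idioms: `count(...) != 1` also reports entities with zero matches, while `strictcount(...)` has no result when nothing matches, so the summary row disappears for a clean database. A minimal standalone sketch of the same two idioms follows, applied to function bodies instead of data-flow nodes. The predicate names `uniqueFunctionBody` and `missingFunctionBody` are hypothetical and are not part of this diff.

```ql
import cpp

/**
 * Per-entity check: `count` evaluates to 0 when `f` has no body, so
 * functions with no body are reported as well as functions with several.
 */
query predicate uniqueFunctionBody(Function f, string msg) {
  exists(int c |
    c = count(f.getBlock()) and
    c != 1 and
    msg = "Function should have one body but has " + c + "."
  )
}

/**
 * Summary check: `strictcount` has no result when nothing matches, so this
 * predicate produces no row at all if every function has a body.
 */
query predicate missingFunctionBody(string msg) {
  exists(int c |
    c = strictcount(Function f | not exists(f.getBlock())) and
    msg = "Functions without a body: " + c
  )
}
```

Declarations without definitions are normal in C and C++, so both predicates are expected to produce rows on real code. The sketch only demonstrates the counting pattern, not a meaningful invariant.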

@@ -7,26 +7,24 @@ private import DataFlowDispatch
* A data flow node that occurs as the argument of a call and is passed as-is
* to the callable. Instance arguments (`this` pointer) are also included.
*/
class ArgumentNode extends InstructionNode {
ArgumentNode() { exists(CallInstruction call | this.getInstruction() = call.getAnArgument()) }
class ArgumentNode extends Node {
ArgumentNode() { exists(CallInstruction call | this.asInstruction() = call.getAnArgument()) }
/**
* Holds if this argument occurs at the given position in the given call.
* The instance argument is considered to have index `-1`.
*/
predicate argumentOf(DataFlowCall call, int pos) {
this.getInstruction() = call.getPositionalArgument(pos)
this.asInstruction() = call.getPositionalArgument(pos)
or
this.getInstruction() = call.getThisArgument() and pos = -1
this.asInstruction() = call.getThisArgument() and pos = -1
}
/** Gets the call in which this node is an argument. */
DataFlowCall getCall() { this.argumentOf(result, _) }
}
private newtype TReturnKind =
TNormalReturnKind() or
TIndirectReturnKind(ParameterIndex index)
private newtype TReturnKind = TNormalReturnKind()
/**
* A return kind. A return kind describes how a value can be returned
@@ -34,76 +32,23 @@ private newtype TReturnKind =
*/
class ReturnKind extends TReturnKind {
/** Gets a textual representation of this return kind. */
abstract string toString();
}
private class NormalReturnKind extends ReturnKind, TNormalReturnKind {
override string toString() { result = "return" }
}
private class IndirectReturnKind extends ReturnKind, TIndirectReturnKind {
ParameterIndex index;
IndirectReturnKind() { this = TIndirectReturnKind(index) }
override string toString() { result = "outparam[" + index.toString() + "]" }
string toString() { result = "return" }
}
/** A data flow node that occurs as the result of a `ReturnStmt`. */
class ReturnNode extends InstructionNode {
Instruction primary;
ReturnNode() {
exists(ReturnValueInstruction ret | instr = ret.getReturnValue() and primary = ret)
or
exists(ReturnIndirectionInstruction rii |
instr = rii.getSideEffectOperand().getAnyDef() and primary = rii
)
}
class ReturnNode extends Node {
ReturnNode() { exists(ReturnValueInstruction ret | this.asInstruction() = ret.getReturnValue()) }
/** Gets the kind of this returned value. */
abstract ReturnKind getKind();
}
class ReturnValueNode extends ReturnNode {
override ReturnValueInstruction primary;
override ReturnKind getKind() { result = TNormalReturnKind() }
}
class ReturnIndirectionNode extends ReturnNode {
override ReturnIndirectionInstruction primary;
override ReturnKind getKind() { result = TIndirectReturnKind(primary.getParameter().getIndex()) }
ReturnKind getKind() { result = TNormalReturnKind() }
}
/** A data flow node that represents the output of a call. */
class OutNode extends InstructionNode {
OutNode() {
instr instanceof CallInstruction or
instr instanceof WriteSideEffectInstruction
}
/** Gets the underlying call. */
abstract DataFlowCall getCall();
abstract ReturnKind getReturnKind();
}
private class CallOutNode extends OutNode {
class OutNode extends Node {
override CallInstruction instr;
override DataFlowCall getCall() { result = instr }
override ReturnKind getReturnKind() { result instanceof NormalReturnKind }
}
private class SideEffectOutNode extends OutNode {
override WriteSideEffectInstruction instr;
override DataFlowCall getCall() { result = instr.getPrimaryInstruction() }
override ReturnKind getReturnKind() { result = TIndirectReturnKind(instr.getIndex()) }
/** Gets the underlying call. */
DataFlowCall getCall() { result = instr }
}
/**
@@ -112,7 +57,7 @@ private class SideEffectOutNode extends OutNode {
*/
OutNode getAnOutNode(DataFlowCall call, ReturnKind kind) {
result.getCall() = call and
result.getReturnKind() = kind
kind = TNormalReturnKind()
}
/**
@@ -122,6 +67,16 @@ OutNode getAnOutNode(DataFlowCall call, ReturnKind kind) {
*/
predicate jumpStep(Node n1, Node n2) { none() }
/**
* Holds if `call` passes an implicit or explicit qualifier, i.e., a
* `this` parameter.
*/
predicate callHasQualifier(Call call) {
call.hasQualifier()
or
call.getTarget() instanceof Destructor
}
private newtype TContent =
TFieldContent(Field f) or
TCollectionContent() or
@@ -209,7 +164,7 @@ Type getErasedRepr(Type t) {
}
/** Gets a string representation of a type returned by `getErasedRepr`. */
string ppReprType(Type t) { none() } // stub implementation
string ppReprType(Type t) { result = t.toString() }
/**
* Holds if `t1` and `t2` are compatible, that is, whether data can flow from
@@ -226,17 +181,11 @@ private predicate suppressUnusedType(Type t) { any() }
// Java QL library compatibility wrappers
//////////////////////////////////////////////////////////////////////////////
/** A node that performs a type cast. */
class CastNode extends InstructionNode {
class CastNode extends Node {
CastNode() { none() } // stub implementation
}
/**
* A function that may contain code or a variable that may contain itself. When
* flow crosses from one _enclosing callable_ to another, the interprocedural
* data-flow library discards call contexts and inserts a node in the big-step
* relation used for human-readable path explanations.
*/
class DataFlowCallable = Declaration;
class DataFlowCallable = Function;
class DataFlowExpr = Expr;
@@ -255,18 +204,3 @@ class DataFlowCall extends CallInstruction {
}
predicate isUnreachableInCall(Node n, DataFlowCall call) { none() } // stub implementation
int accessPathLimit() { result = 5 }
/**
* Holds if `n` does not require a `PostUpdateNode` as it either cannot be
* modified or its modification cannot be observed, for example if it is a
* freshly created object that is not saved in a variable.
*
* This predicate is only used for consistency checks.
*/
predicate isImmutableOrUnobservable(Node n) {
// The rules for whether an IR argument gets a post-update node are too
// complex to model here.
any()
}
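
As a rough illustration of the `argumentOf` convention described in this file, where the instance argument has index `-1`, the following sketch lists the `this` argument of each call. It is not part of the diff. The import path is taken from the `private import` lines shown earlier in this compare view, and importing an `internal` module directly from a query is done here for demonstration only.

```ql
import cpp
import semmle.code.cpp.ir.dataflow.internal.DataFlowPrivate

// Select every call together with the node that is passed as its `this`
// argument. Positional arguments would use `pos >= 0` instead.
from ArgumentNode arg, DataFlowCall call, int pos
where
  arg.argumentOf(call, pos) and
  pos = -1
select call, arg
```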

@@ -3,17 +3,16 @@
*/
private import cpp
// The `ValueNumbering` library has to be imported right after `cpp` to ensure
// that the cached IR gets the same checksum here as it does in queries that use
// `ValueNumbering` without `DataFlow`.
private import semmle.code.cpp.ir.ValueNumbering
private import semmle.code.cpp.ir.IR
private import semmle.code.cpp.controlflow.IRGuards
private import semmle.code.cpp.models.interfaces.DataFlow
private import semmle.code.cpp.ir.ValueNumbering
private newtype TIRDataFlowNode =
TInstructionNode(Instruction i) or
TVariableNode(Variable var)
/**
* A newtype wrapper to prevent accidental casts between `Node` and
* `Instruction`. This ensures we can add `Node`s that are not `Instruction`s
* in the future.
*/
private newtype TIRDataFlowNode = MkIRDataFlowNode(Instruction i)
/**
* A node in a data flow graph.
@@ -23,19 +22,21 @@ private newtype TIRDataFlowNode =
* `DataFlow::parameterNode`, and `DataFlow::uninitializedNode` respectively.
*/
class Node extends TIRDataFlowNode {
/**
* INTERNAL: Do not use.
*/
Declaration getEnclosingCallable() { none() } // overridden in subclasses
Instruction instr;
/** Gets the function to which this node belongs, if any. */
Function getFunction() { none() } // overridden in subclasses
Node() { this = MkIRDataFlowNode(instr) }
/**
* INTERNAL: Do not use. Alternative name for `getFunction`.
*/
Function getEnclosingCallable() { result = this.getFunction() }
Function getFunction() { result = instr.getEnclosingFunction() }
/** Gets the type of this node. */
Type getType() { none() } // overridden in subclasses
Type getType() { result = instr.getResultType() }
/** Gets the instruction corresponding to this node, if any. */
Instruction asInstruction() { result = this.(InstructionNode).getInstruction() }
Instruction asInstruction() { this = MkIRDataFlowNode(result) }
/**
* Gets the non-conversion expression corresponding to this node, if any. If
@@ -43,25 +44,22 @@ class Node extends TIRDataFlowNode {
* `Conversion`, then the result is that `Conversion`'s non-`Conversion` base
* expression.
*/
Expr asExpr() { result = this.(ExprNode).getExpr() }
Expr asExpr() {
result.getConversion*() = instr.getConvertedResultExpression() and
not result instanceof Conversion
}
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
Expr asConvertedExpr() { result = this.(ExprNode).getConvertedExpr() }
Expr asConvertedExpr() { result = instr.getConvertedResultExpression() }
/** Gets the argument that defines this `DefinitionByReferenceNode`, if any. */
Expr asDefiningArgument() { result = this.(DefinitionByReferenceNode).getArgument() }
/** Gets the parameter corresponding to this node, if any. */
Parameter asParameter() { result = this.(ParameterNode).getParameter() }
/**
* Gets the variable corresponding to this node, if any. This can be used for
* modelling flow in and out of global variables.
*/
Variable asVariable() { result = this.(VariableNode).getVariable() }
Parameter asParameter() { result = instr.(InitializeParameterInstruction).getParameter() }
/**
* DEPRECATED: See UninitializedNode.
@@ -77,7 +75,7 @@ class Node extends TIRDataFlowNode {
Type getTypeBound() { result = getType() }
/** Gets the location of this element. */
Location getLocation() { none() } // overridden by subclasses
Location getLocation() { result = instr.getLocation() }
/**
* Holds if this element is at the specified location.
@@ -92,38 +90,18 @@ class Node extends TIRDataFlowNode {
this.getLocation().hasLocationInfo(filepath, startline, startcolumn, endline, endcolumn)
}
/** Gets a textual representation of this element. */
string toString() { none() } // overridden by subclasses
}
class InstructionNode extends Node, TInstructionNode {
Instruction instr;
InstructionNode() { this = TInstructionNode(instr) }
/** Gets the instruction corresponding to this node. */
Instruction getInstruction() { result = instr }
override Declaration getEnclosingCallable() { result = this.getFunction() }
override Function getFunction() { result = instr.getEnclosingFunction() }
override Type getType() { result = instr.getResultType() }
override Location getLocation() { result = instr.getLocation() }
override string toString() {
string toString() {
// This predicate is overridden in subclasses. This default implementation
// does not use `Instruction.toString` because that's expensive to compute.
result = this.getInstruction().getOpcode().toString()
result = this.asInstruction().getOpcode().toString()
}
}
/**
* An expression, viewed as a node in a data flow graph.
*/
class ExprNode extends InstructionNode {
ExprNode() { exists(instr.getConvertedResultExpression()) }
class ExprNode extends Node {
ExprNode() { exists(this.asExpr()) }
/**
* Gets the non-conversion expression corresponding to this node, if any. If
@@ -131,13 +109,13 @@ class ExprNode extends InstructionNode {
* `Conversion`, then the result is that `Conversion`'s non-`Conversion` base
* expression.
*/
Expr getExpr() { result = instr.getUnconvertedResultExpression() }
Expr getExpr() { result = this.asExpr() }
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
Expr getConvertedExpr() { result = instr.getConvertedResultExpression() }
Expr getConvertedExpr() { result = this.asConvertedExpr() }
override string toString() { result = this.asConvertedExpr().toString() }
}
@@ -146,7 +124,7 @@ class ExprNode extends InstructionNode {
* The value of a parameter at function entry, viewed as a node in a data
* flow graph.
*/
class ParameterNode extends InstructionNode {
class ParameterNode extends Node {
override InitializeParameterInstruction instr;
/**
@@ -160,7 +138,7 @@ class ParameterNode extends InstructionNode {
override string toString() { result = instr.getParameter().toString() }
}
private class ThisParameterNode extends InstructionNode {
private class ThisParameterNode extends Node {
override InitializeThisInstruction instr;
override string toString() { result = "this" }
@@ -197,7 +175,7 @@ deprecated class UninitializedNode extends Node {
* This class exists to match the interface used by Java. There are currently no non-abstract
* classes that extend it. When we implement field flow, we can revisit this.
*/
abstract class PostUpdateNode extends InstructionNode {
abstract class PostUpdateNode extends Node {
/**
* Gets the node before the state update.
*/
@@ -214,7 +192,7 @@ abstract class PostUpdateNode extends InstructionNode {
* returned. This node will have its `getArgument()` equal to `&x` and its
* `getVariableAccess()` equal to `x`.
*/
class DefinitionByReferenceNode extends InstructionNode {
class DefinitionByReferenceNode extends Node {
override WriteSideEffectInstruction instr;
/** Gets the argument corresponding to this node. */
@@ -239,54 +217,12 @@ class DefinitionByReferenceNode extends InstructionNode {
Parameter getParameter() {
exists(CallInstruction ci | result = ci.getStaticCallTarget().getParameter(instr.getIndex()))
}
override string toString() {
// This string should be unique enough to be helpful but common enough to
// avoid storing too many different strings.
result =
instr.getPrimaryInstruction().(CallInstruction).getStaticCallTarget().getName() +
" output argument"
or
not exists(instr.getPrimaryInstruction().(CallInstruction).getStaticCallTarget()) and
result = "output argument"
}
}
/**
* A `Node` corresponding to a variable in the program, as opposed to the
* value of that variable at some particular point. This can be used for
* modelling flow in and out of global variables.
*/
class VariableNode extends Node, TVariableNode {
Variable v;
VariableNode() { this = TVariableNode(v) }
/** Gets the variable corresponding to this node. */
Variable getVariable() { result = v }
override Function getFunction() { none() }
override Declaration getEnclosingCallable() {
// When flow crosses from one _enclosing callable_ to another, the
// interprocedural data-flow library discards call contexts and inserts a
// node in the big-step relation used for human-readable path explanations.
// Therefore we want a distinct enclosing callable for each `VariableNode`,
// and that can be the `Variable` itself.
result = v
}
override Type getType() { result = v.getType() }
override Location getLocation() { result = v.getLocation() }
override string toString() { result = v.toString() }
}
/**
* Gets the node corresponding to `instr`.
*/
InstructionNode instructionNode(Instruction instr) { result.getInstruction() = instr }
Node instructionNode(Instruction instr) { result.asInstruction() = instr }
DefinitionByReferenceNode definitionByReferenceNode(Expr e) { result.getArgument() = e }
@@ -300,23 +236,18 @@ ExprNode exprNode(Expr e) { result.getExpr() = e }
* Gets the `Node` corresponding to `e`, if any. Here, `e` may be a
* `Conversion`.
*/
ExprNode convertedExprNode(Expr e) { result.getConvertedExpr() = e }
ExprNode convertedExprNode(Expr e) { result.getExpr() = e }
/**
* Gets the `Node` corresponding to the value of `p` at function entry.
*/
ParameterNode parameterNode(Parameter p) { result.getParameter() = p }
/** Gets the `VariableNode` corresponding to the variable `v`. */
VariableNode variableNode(Variable v) { result.getVariable() = v }
/**
* DEPRECATED: See UninitializedNode.
*
* Gets the `Node` corresponding to the value of an uninitialized local
* variable `v`.
*/
Node uninitializedNode(LocalVariable v) { none() }
UninitializedNode uninitializedNode(LocalVariable v) { result.getLocalVariable() = v }
/**
* Holds if data flows from `nodeFrom` to `nodeTo` in exactly one local
@@ -334,7 +265,6 @@ predicate simpleLocalFlowStep(Node nodeFrom, Node nodeTo) {
simpleInstructionLocalFlowStep(nodeFrom.asInstruction(), nodeTo.asInstruction())
}
cached
private predicate simpleInstructionLocalFlowStep(Instruction iFrom, Instruction iTo) {
iTo.(CopyInstruction).getSourceValue() = iFrom
or
@@ -359,62 +289,6 @@ private predicate simpleInstructionLocalFlowStep(Instruction iFrom, Instruction
// Flow through the partial operand belongs in the taint-tracking libraries
// for now.
iTo.getAnOperand().(ChiTotalOperand).getDef() = iFrom
or
// Flow through modeled functions
modelFlow(iFrom, iTo)
}
private predicate modelFlow(Instruction iFrom, Instruction iTo) {
exists(
CallInstruction call, DataFlowFunction func, FunctionInput modelIn, FunctionOutput modelOut
|
call.getStaticCallTarget() = func and
func.hasDataFlow(modelIn, modelOut)
|
(
modelOut.isReturnValue() and
iTo = call
or
// TODO: Add write side effects for return values
modelOut.isReturnValueDeref() and
iTo = call
or
exists(int index, WriteSideEffectInstruction outNode |
modelOut.isParameterDeref(index) and
iTo = outNode and
outNode = getSideEffectFor(call, index)
)
// TODO: add write side effects for qualifiers
) and
(
exists(int index |
modelIn.isParameter(index) and
iFrom = call.getPositionalArgument(index)
)
or
exists(int index, ReadSideEffectInstruction read |
modelIn.isParameterDeref(index) and
read = getSideEffectFor(call, index) and
iFrom = read.getSideEffectOperand().getAnyDef()
)
or
modelIn.isQualifierAddress() and
iFrom = call.getThisArgument()
// TODO: add read side effects for qualifiers
)
)
}
/**
* Holds if the result is a side effect for instruction `call` on argument
* index `argument`. This helper predicate makes it easy to join on both of
* these columns at once, avoiding pathological join orders in case the
* argument index should get joined first.
*/
pragma[noinline]
SideEffectInstruction getSideEffectFor(CallInstruction call, int argument) {
call = result.getPrimaryInstruction() and
argument = result.(IndexedInstruction).getIndex()
}
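
The `pragma[noinline]` helper pattern used by `getSideEffectFor` (and by `returnNodeWithKindAndEnclosingCallable` earlier in this diff) can be sketched in isolation as follows. This is an illustrative example, not code from the diff: the helper name `getCallWithArgumentCount` is hypothetical, and whether the optimizer would otherwise pick a poor join order for this particular query is not claimed.

```ql
import cpp

/**
 * Gets a call to `target` that passes exactly `n` arguments.
 *
 * Marking the helper `pragma[noinline]` keeps it as its own relation, so a
 * caller joins on `target` and `n` together instead of joining on the
 * argument count first.
 */
pragma[noinline]
private FunctionCall getCallWithArgumentCount(Function target, int n) {
  result.getTarget() = target and
  n = result.getNumberOfArguments()
}

from Function f, FunctionCall call
where call = getCallWithArgumentCount(f, f.getNumberOfParameters())
select call, f
```

The design choice is the same as in the helper above: both columns that the caller constrains are bound inside one non-inlined predicate, which removes the opportunity to join on the low-selectivity integer column alone.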
/**

@@ -1,13 +1,3 @@
/**
* Provides an implementation of global (interprocedural) taint tracking.
* This file re-exports the local (intraprocedural) taint-tracking analysis
* from `TaintTrackingParameter::Public` and adds a global analysis, mainly
* exposed through the `Configuration` class. For some languages, this file
* exists in several identical copies, allowing queries to use multiple
* `Configuration` classes that depend on each other without introducing
* mutual recursion among those configurations.
*/
import TaintTrackingParameter::Public
private import TaintTrackingParameter::Private

@@ -1,13 +1,3 @@
/**
* Provides an implementation of global (interprocedural) taint tracking.
* This file re-exports the local (intraprocedural) taint-tracking analysis
* from `TaintTrackingParameter::Public` and adds a global analysis, mainly
* exposed through the `Configuration` class. For some languages, this file
* exists in several identical copies, allowing queries to use multiple
* `Configuration` classes that depend on each other without introducing
* mutual recursion among those configurations.
*/
import TaintTrackingParameter::Public
private import TaintTrackingParameter::Private

@@ -16,8 +16,6 @@ class IRConfiguration extends TIRConfiguration {
* Holds if IR should be created for function `func`. By default, holds for all functions.
*/
predicate shouldCreateIRForFunction(Language::Function func) { any() }
predicate shouldEvaluateDebugStringsForFunction(Language::Function func) { any() }
}
private newtype TIREscapeAnalysisConfiguration = MkIREscapeAnalysisConfiguration()

@@ -63,14 +63,13 @@ private newtype TOpcode =
TUnmodeledDefinition() or
TUnmodeledUse() or
TAliasedDefinition() or
TInitializeNonLocal() or
TAliasedUse() or
TPhi() or
TBuiltIn() or
TVarArgsStart() or
TVarArgsEnd() or
TVarArg() or
TNextVarArg() or
TVarArgCopy() or
TCallSideEffect() or
TCallReadSideEffect() or
TIndirectReadSideEffect() or
@@ -82,7 +81,6 @@ private newtype TOpcode =
TSizedBufferReadSideEffect() or
TSizedBufferMustWriteSideEffect() or
TSizedBufferMayWriteSideEffect() or
TInitializeDynamicAllocation() or
TChi() or
TInlineAsm() or
TUnreached() or
@@ -214,28 +212,23 @@ abstract class IndirectReadOpcode extends IndirectMemoryAccessOpcode {
}
/**
* An opcode that accesses a memory buffer.
* An opcode that accesses a memory buffer of unknown size.
*/
abstract class BufferAccessOpcode extends Opcode {
final override predicate hasAddressOperand() { any() }
}
/**
* An opcode that accesses a memory buffer of unknown size.
*/
abstract class UnsizedBufferAccessOpcode extends BufferAccessOpcode { }
/**
* An opcode that writes to a memory buffer of unknown size.
*/
abstract class UnsizedBufferWriteOpcode extends UnsizedBufferAccessOpcode {
abstract class BufferWriteOpcode extends BufferAccessOpcode {
final override MemoryAccessKind getWriteMemoryAccess() { result instanceof BufferMemoryAccess }
}
/**
* An opcode that reads from a memory buffer of unknown size.
*/
abstract class UnsizedBufferReadOpcode extends UnsizedBufferAccessOpcode {
abstract class BufferReadOpcode extends BufferAccessOpcode {
final override MemoryAccessKind getReadMemoryAccess() { result instanceof BufferMemoryAccess }
}
@@ -267,7 +260,9 @@ abstract class EntireAllocationReadOpcode extends EntireAllocationAccessOpcode {
/**
* An opcode that accesses a memory buffer whose size is determined by a `BufferSizeOperand`.
*/
abstract class SizedBufferAccessOpcode extends BufferAccessOpcode {
abstract class SizedBufferAccessOpcode extends Opcode {
final override predicate hasAddressOperand() { any() }
final override predicate hasBufferSizeOperand() { any() }
}
@@ -601,14 +596,6 @@ module Opcode {
final override MemoryAccessKind getWriteMemoryAccess() { result instanceof EscapedMemoryAccess }
}
class InitializeNonLocal extends Opcode, TInitializeNonLocal {
final override string toString() { result = "InitializeNonLocal" }
final override MemoryAccessKind getWriteMemoryAccess() {
result instanceof NonLocalMemoryAccess
}
}
class AliasedUse extends Opcode, TAliasedUse {
final override string toString() { result = "AliasedUse" }
@@ -629,20 +616,20 @@ module Opcode {
final override string toString() { result = "BuiltIn" }
}
class VarArgsStart extends UnaryOpcode, TVarArgsStart {
class VarArgsStart extends BuiltInOperationOpcode, TVarArgsStart {
final override string toString() { result = "VarArgsStart" }
}
class VarArgsEnd extends UnaryOpcode, TVarArgsEnd {
class VarArgsEnd extends BuiltInOperationOpcode, TVarArgsEnd {
final override string toString() { result = "VarArgsEnd" }
}
class VarArg extends UnaryOpcode, TVarArg {
class VarArg extends BuiltInOperationOpcode, TVarArg {
final override string toString() { result = "VarArg" }
}
class NextVarArg extends UnaryOpcode, TNextVarArg {
final override string toString() { result = "NextVarArg" }
class VarArgCopy extends BuiltInOperationOpcode, TVarArgCopy {
final override string toString() { result = "VarArgCopy" }
}
class CallSideEffect extends WriteSideEffectOpcode, EscapedWriteOpcode, MayWriteOpcode,
@@ -670,18 +657,17 @@ module Opcode {
final override string toString() { result = "IndirectMayWriteSideEffect" }
}
class BufferReadSideEffect extends ReadSideEffectOpcode, UnsizedBufferReadOpcode,
TBufferReadSideEffect {
class BufferReadSideEffect extends ReadSideEffectOpcode, BufferReadOpcode, TBufferReadSideEffect {
final override string toString() { result = "BufferReadSideEffect" }
}
class BufferMustWriteSideEffect extends WriteSideEffectOpcode, UnsizedBufferWriteOpcode,
class BufferMustWriteSideEffect extends WriteSideEffectOpcode, BufferWriteOpcode,
TBufferMustWriteSideEffect {
final override string toString() { result = "BufferMustWriteSideEffect" }
}
class BufferMayWriteSideEffect extends WriteSideEffectOpcode, UnsizedBufferWriteOpcode,
MayWriteOpcode, TBufferMayWriteSideEffect {
class BufferMayWriteSideEffect extends WriteSideEffectOpcode, BufferWriteOpcode, MayWriteOpcode,
TBufferMayWriteSideEffect {
final override string toString() { result = "BufferMayWriteSideEffect" }
}
@@ -700,11 +686,6 @@ module Opcode {
final override string toString() { result = "SizedBufferMayWriteSideEffect" }
}
class InitializeDynamicAllocation extends SideEffectOpcode, EntireAllocationWriteOpcode,
TInitializeDynamicAllocation {
final override string toString() { result = "InitializeDynamicAllocation" }
}
class Chi extends Opcode, TChi {
final override string toString() { result = "Chi" }

@@ -27,9 +27,6 @@ class IRBlockBase extends TIRBlock {
* by debugging and printing code only.
*/
int getDisplayIndex() {
exists(IRConfiguration::IRConfiguration config |
config.shouldEvaluateDebugStringsForFunction(this.getEnclosingFunction())
) and
this =
rank[result + 1](IRBlock funcBlock |
funcBlock.getEnclosingFunction() = getEnclosingFunction()

@@ -1,320 +1,3 @@
private import IR
import InstructionSanity // module is below
import IRTypeSanity // module is in IRType.qll
module InstructionSanity {
private import internal.InstructionImports as Imports
private import Imports::OperandTag
private import Imports::Overlap
private import internal.IRInternal
/**
* Holds if instruction `instr` is missing an expected operand with tag `tag`.
*/
query predicate missingOperand(Instruction instr, string message, IRFunction func, string funcText) {
exists(OperandTag tag |
instr.getOpcode().hasOperand(tag) and
not exists(NonPhiOperand operand |
operand = instr.getAnOperand() and
operand.getOperandTag() = tag
) and
message =
"Instruction '" + instr.getOpcode().toString() +
"' is missing an expected operand with tag '" + tag.toString() + "' in function '$@'." and
func = instr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
/**
* Holds if instruction `instr` has an unexpected operand with tag `tag`.
*/
query predicate unexpectedOperand(Instruction instr, OperandTag tag) {
exists(NonPhiOperand operand |
operand = instr.getAnOperand() and
operand.getOperandTag() = tag
) and
not instr.getOpcode().hasOperand(tag) and
not (instr instanceof CallInstruction and tag instanceof ArgumentOperandTag) and
not (
instr instanceof BuiltInOperationInstruction and tag instanceof PositionalArgumentOperandTag
) and
not (instr instanceof InlineAsmInstruction and tag instanceof AsmOperandTag)
}
/**
* Holds if instruction `instr` has multiple operands with tag `tag`.
*/
query predicate duplicateOperand(
Instruction instr, string message, IRFunction func, string funcText
) {
exists(OperandTag tag, int operandCount |
operandCount =
strictcount(NonPhiOperand operand |
operand = instr.getAnOperand() and
operand.getOperandTag() = tag
) and
operandCount > 1 and
not tag instanceof UnmodeledUseOperandTag and
message =
"Instruction has " + operandCount + " operands with tag '" + tag.toString() + "'" +
" in function '$@'." and
func = instr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
/**
* Holds if `Phi` instruction `instr` is missing an operand corresponding to
* the predecessor block `pred`.
*/
query predicate missingPhiOperand(PhiInstruction instr, IRBlock pred) {
pred = instr.getBlock().getAPredecessor() and
not exists(PhiInputOperand operand |
operand = instr.getAnOperand() and
operand.getPredecessorBlock() = pred
)
}
query predicate missingOperandType(Operand operand, string message) {
exists(Language::Function func, Instruction use |
not exists(operand.getType()) and
use = operand.getUse() and
func = use.getEnclosingFunction() and
message =
"Operand '" + operand.toString() + "' of instruction '" + use.getOpcode().toString() +
"' missing type in function '" + Language::getIdentityString(func) + "'."
)
}
query predicate duplicateChiOperand(
ChiInstruction chi, string message, IRFunction func, string funcText
) {
chi.getTotal() = chi.getPartial() and
message =
"Chi instruction for " + chi.getPartial().toString() +
" has duplicate operands in function $@" and
func = chi.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
}
query predicate sideEffectWithoutPrimary(
SideEffectInstruction instr, string message, IRFunction func, string funcText
) {
not exists(instr.getPrimaryInstruction()) and
message = "Side effect instruction missing primary instruction in function $@" and
func = instr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
}
/**
* Holds if an instruction, other than `ExitFunction`, has no successors.
*/
query predicate instructionWithoutSuccessor(Instruction instr) {
not exists(instr.getASuccessor()) and
not instr instanceof ExitFunctionInstruction and
// Phi instructions aren't linked into the instruction-level flow graph.
not instr instanceof PhiInstruction and
not instr instanceof UnreachedInstruction
}
/**
* Holds if there are multiple (`n`) edges of kind `kind` from `source`,
* where `target` is among the targets of those edges.
*/
query predicate ambiguousSuccessors(Instruction source, EdgeKind kind, int n, Instruction target) {
n = strictcount(Instruction t | source.getSuccessor(kind) = t) and
n > 1 and
source.getSuccessor(kind) = target
}
/**
* Holds if `instr` in `f` is part of a loop even though the AST of `f`
* contains no element that can cause loops.
*/
query predicate unexplainedLoop(Language::Function f, Instruction instr) {
exists(IRBlock block |
instr.getBlock() = block and
block.getEnclosingFunction() = f and
block.getASuccessor+() = block
) and
not Language::hasPotentialLoop(f)
}
/**
* Holds if a `Phi` instruction is present in a block with fewer than two
* predecessors.
*/
query predicate unnecessaryPhiInstruction(PhiInstruction instr) {
count(instr.getBlock().getAPredecessor()) < 2
}
/**
* Holds if operand `operand` consumes a value that was defined in
* a different function.
*/
query predicate operandAcrossFunctions(Operand operand, Instruction instr, Instruction defInstr) {
operand.getUse() = instr and
operand.getAnyDef() = defInstr and
instr.getEnclosingIRFunction() != defInstr.getEnclosingIRFunction()
}
/**
* Holds if instruction `instr` is not in exactly one block.
*/
query predicate instructionWithoutUniqueBlock(Instruction instr, int blockCount) {
blockCount = count(instr.getBlock()) and
blockCount != 1
}
private predicate forwardEdge(IRBlock b1, IRBlock b2) {
b1.getASuccessor() = b2 and
not b1.getBackEdgeSuccessor(_) = b2
}
/**
* Holds if `f` contains a loop in which no edge is a back edge.
*
* This check ensures we don't have too _few_ back edges.
*/
query predicate containsLoopOfForwardEdges(IRFunction f) {
exists(IRBlock block |
forwardEdge+(block, block) and
block.getEnclosingIRFunction() = f
)
}
/**
* Holds if `block` is reachable from its function entry point but would not
* be reachable by traversing only forward edges. This check is skipped for
* functions containing `goto` statements as the property does not generally
* hold there.
*
* This check ensures we don't have too _many_ back edges.
*/
query predicate lostReachability(IRBlock block) {
exists(IRFunction f, IRBlock entry |
entry = f.getEntryBlock() and
entry.getASuccessor+() = block and
not forwardEdge+(entry, block) and
not Language::hasGoto(f.getFunction())
)
}
/**
* Holds if the number of back edges differs between the `Instruction` graph
* and the `IRBlock` graph.
*/
query predicate backEdgeCountMismatch(Language::Function f, int fromInstr, int fromBlock) {
fromInstr =
count(Instruction i1, Instruction i2 |
i1.getEnclosingFunction() = f and i1.getBackEdgeSuccessor(_) = i2
) and
fromBlock =
count(IRBlock b1, IRBlock b2 |
b1.getEnclosingFunction() = f and b1.getBackEdgeSuccessor(_) = b2
) and
fromInstr != fromBlock
}
/**
* Gets the point in the function at which the specified operand is evaluated. For most operands,
* this is at the instruction that consumes the use. For a `PhiInputOperand`, the effective point
* of evaluation is at the end of the corresponding predecessor block.
*/
private predicate pointOfEvaluation(Operand operand, IRBlock block, int index) {
block = operand.(PhiInputOperand).getPredecessorBlock() and
index = block.getInstructionCount()
or
exists(Instruction use |
use = operand.(NonPhiOperand).getUse() and
block.getInstruction(index) = use
)
}
/**
* Holds if `useOperand` has a definition that does not dominate the use.
*/
query predicate useNotDominatedByDefinition(
Operand useOperand, string message, IRFunction func, string funcText
) {
exists(IRBlock useBlock, int useIndex, Instruction defInstr, IRBlock defBlock, int defIndex |
not useOperand.getUse() instanceof UnmodeledUseInstruction and
not defInstr instanceof UnmodeledDefinitionInstruction and
pointOfEvaluation(useOperand, useBlock, useIndex) and
defInstr = useOperand.getAnyDef() and
(
defInstr instanceof PhiInstruction and
defBlock = defInstr.getBlock() and
defIndex = -1
or
defBlock.getInstruction(defIndex) = defInstr
) and
not (
defBlock.strictlyDominates(useBlock)
or
defBlock = useBlock and
defIndex < useIndex
) and
message =
"Operand '" + useOperand.toString() +
"' is not dominated by its definition in function '$@'." and
func = useOperand.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
query predicate switchInstructionWithoutDefaultEdge(
SwitchInstruction switchInstr, string message, IRFunction func, string funcText
) {
not exists(switchInstr.getDefaultSuccessor()) and
message =
"SwitchInstruction " + switchInstr.toString() + " without a DefaultEdge in function '$@'." and
func = switchInstr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
}
/**
* Holds if `instr` is on the chain of chi/phi instructions for all aliased
* memory.
*/
private predicate isOnAliasedDefinitionChain(Instruction instr) {
instr instanceof AliasedDefinitionInstruction
or
isOnAliasedDefinitionChain(instr.(ChiInstruction).getTotal())
or
isOnAliasedDefinitionChain(instr.(PhiInstruction).getAnInputOperand().getAnyDef())
}
private predicate shouldBeConflated(Instruction instr) {
isOnAliasedDefinitionChain(instr)
or
instr instanceof UnmodeledDefinitionInstruction
or
instr.getOpcode() instanceof Opcode::InitializeNonLocal
}
query predicate notMarkedAsConflated(Instruction instr) {
shouldBeConflated(instr) and
not instr.isResultConflated()
}
query predicate wronglyMarkedAsConflated(Instruction instr) {
instr.isResultConflated() and
not shouldBeConflated(instr)
}
query predicate invalidOverlap(
MemoryOperand useOperand, string message, IRFunction func, string funcText
) {
exists(Overlap overlap |
overlap = useOperand.getDefinitionOverlap() and
overlap instanceof MayPartiallyOverlap and
message =
"MemoryOperand '" + useOperand.toString() + "' has a `getDefinitionOverlap()` of '" +
overlap.toString() + "'." and
func = useOperand.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
}
import InstructionSanity
import IRTypeSanity

@@ -23,8 +23,7 @@ class IRVariable extends TIRVariable {
IRVariable() {
this = TIRUserVariable(_, _, func) or
this = TIRTempVariable(func, _, _, _) or
this = TIRStringLiteral(func, _, _, _) or
this = TIRDynamicInitializationFlag(func, _, _)
this = TIRStringLiteral(func, _, _, _)
}
string toString() { none() }
@@ -150,8 +149,7 @@ class IRGeneratedVariable extends IRVariable {
IRGeneratedVariable() {
this = TIRTempVariable(func, ast, _, type) or
this = TIRStringLiteral(func, ast, type, _) or
this = TIRDynamicInitializationFlag(func, ast, type)
this = TIRStringLiteral(func, ast, type, _)
}
final override Language::LanguageType getLanguageType() { result = type }
@@ -178,7 +176,7 @@ IRTempVariable getIRTempVariable(Language::AST ast, TempVariableTag tag) {
/**
* A temporary variable introduced by IR construction. The most common examples are the variable
* generated to hold the return value of a function, or the variable generated to hold the result of
* generated to hold the return value of a function, or the variable generated to hold the result of
* a condition operator (`a ? b : c`).
*/
class IRTempVariable extends IRGeneratedVariable, IRAutomaticVariable, TIRTempVariable {
@@ -210,17 +208,7 @@ class IRReturnVariable extends IRTempVariable {
class IRThrowVariable extends IRTempVariable {
IRThrowVariable() { tag = ThrowTempVar() }
final override string getBaseString() { result = "#throw" }
}
/**
* A temporary variable generated to hold the contents of all arguments passed to the `...` of a
* function that accepts a variable number of arguments.
*/
class IREllipsisVariable extends IRTempVariable {
IREllipsisVariable() { tag = EllipsisTempVar() }
final override string toString() { result = "#ellipsis" }
override string getBaseString() { result = "#throw" }
}
/**
@@ -238,30 +226,7 @@ class IRStringLiteral extends IRGeneratedVariable, TIRStringLiteral {
result = "String: " + getLocationString() + "=" + Language::getStringLiteralText(literal)
}
final override string getBaseString() { result = "#string" }
override string getBaseString() { result = "#string" }
final Language::StringLiteral getLiteral() { result = literal }
}
/**
* A variable generated to track whether a specific non-stack variable has been initialized. This is
* used to model the runtime initialization of static local variables in C++, as well as static
* fields in C#.
*/
class IRDynamicInitializationFlag extends IRGeneratedVariable, TIRDynamicInitializationFlag {
Language::Variable var;
IRDynamicInitializationFlag() {
this = TIRDynamicInitializationFlag(func, var, type) and ast = var
}
final override string toString() { result = var.toString() + "#init" }
final Language::Variable getVariable() { result = var }
final override string getUniqueId() {
result = "Init: " + getVariable().toString() + " " + getVariable().getLocation().toString()
}
final override string getBaseString() { result = "#init:" + var.toString() + ":" }
}

@@ -10,14 +10,269 @@ import Imports::MemoryAccessKind
import Imports::Opcode
private import Imports::OperandTag
module InstructionSanity {
/**
* Holds if instruction `instr` is missing an expected operand with tag `tag`.
*/
query predicate missingOperand(Instruction instr, string message, IRFunction func, string funcText) {
exists(OperandTag tag |
instr.getOpcode().hasOperand(tag) and
not exists(NonPhiOperand operand |
operand = instr.getAnOperand() and
operand.getOperandTag() = tag
) and
message =
"Instruction '" + instr.getOpcode().toString() +
"' is missing an expected operand with tag '" + tag.toString() + "' in function '$@'." and
func = instr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
/**
* Holds if instruction `instr` has an unexpected operand with tag `tag`.
*/
query predicate unexpectedOperand(Instruction instr, OperandTag tag) {
exists(NonPhiOperand operand |
operand = instr.getAnOperand() and
operand.getOperandTag() = tag
) and
not instr.getOpcode().hasOperand(tag) and
not (instr instanceof CallInstruction and tag instanceof ArgumentOperandTag) and
not (
instr instanceof BuiltInOperationInstruction and tag instanceof PositionalArgumentOperandTag
) and
not (instr instanceof InlineAsmInstruction and tag instanceof AsmOperandTag)
}
/**
* Holds if instruction `instr` has multiple operands with tag `tag`.
*/
query predicate duplicateOperand(
Instruction instr, string message, IRFunction func, string funcText
) {
exists(OperandTag tag, int operandCount |
operandCount =
strictcount(NonPhiOperand operand |
operand = instr.getAnOperand() and
operand.getOperandTag() = tag
) and
operandCount > 1 and
not tag instanceof UnmodeledUseOperandTag and
message =
"Instruction has " + operandCount + " operands with tag '" + tag.toString() + "'" +
" in function '$@'." and
func = instr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
/**
* Holds if `Phi` instruction `instr` is missing an operand corresponding to
* the predecessor block `pred`.
*/
query predicate missingPhiOperand(PhiInstruction instr, IRBlock pred) {
pred = instr.getBlock().getAPredecessor() and
not exists(PhiInputOperand operand |
operand = instr.getAnOperand() and
operand.getPredecessorBlock() = pred
)
}
query predicate missingOperandType(Operand operand, string message) {
exists(Language::Function func, Instruction use |
not exists(operand.getType()) and
use = operand.getUse() and
func = use.getEnclosingFunction() and
message =
"Operand '" + operand.toString() + "' of instruction '" + use.getOpcode().toString() +
"' missing type in function '" + Language::getIdentityString(func) + "'."
)
}
query predicate duplicateChiOperand(
ChiInstruction chi, string message, IRFunction func, string funcText
) {
chi.getTotal() = chi.getPartial() and
message =
"Chi instruction for " + chi.getPartial().toString() +
" has duplicate operands in function $@" and
func = chi.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
}
query predicate sideEffectWithoutPrimary(
SideEffectInstruction instr, string message, IRFunction func, string funcText
) {
not exists(instr.getPrimaryInstruction()) and
message = "Side effect instruction missing primary instruction in function $@" and
func = instr.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
}
/**
* Holds if an instruction, other than `ExitFunction`, has no successors.
*/
query predicate instructionWithoutSuccessor(Instruction instr) {
not exists(instr.getASuccessor()) and
not instr instanceof ExitFunctionInstruction and
// Phi instructions aren't linked into the instruction-level flow graph.
not instr instanceof PhiInstruction and
not instr instanceof UnreachedInstruction
}
/**
* Holds if there are multiple (`n`) edges of kind `kind` from `source`,
* where `target` is among the targets of those edges.
*/
query predicate ambiguousSuccessors(Instruction source, EdgeKind kind, int n, Instruction target) {
n = strictcount(Instruction t | source.getSuccessor(kind) = t) and
n > 1 and
source.getSuccessor(kind) = target
}
/**
* Holds if `instr` in `f` is part of a loop even though the AST of `f`
* contains no element that can cause loops.
*/
query predicate unexplainedLoop(Language::Function f, Instruction instr) {
exists(IRBlock block |
instr.getBlock() = block and
block.getEnclosingFunction() = f and
block.getASuccessor+() = block
) and
not Language::hasPotentialLoop(f)
}
/**
* Holds if a `Phi` instruction is present in a block with fewer than two
* predecessors.
*/
query predicate unnecessaryPhiInstruction(PhiInstruction instr) {
count(instr.getBlock().getAPredecessor()) < 2
}
/**
* Holds if operand `operand` consumes a value that was defined in
* a different function.
*/
query predicate operandAcrossFunctions(Operand operand, Instruction instr, Instruction defInstr) {
operand.getUse() = instr and
operand.getAnyDef() = defInstr and
instr.getEnclosingIRFunction() != defInstr.getEnclosingIRFunction()
}
/**
* Holds if instruction `instr` is not in exactly one block.
*/
query predicate instructionWithoutUniqueBlock(Instruction instr, int blockCount) {
blockCount = count(instr.getBlock()) and
blockCount != 1
}
private predicate forwardEdge(IRBlock b1, IRBlock b2) {
b1.getASuccessor() = b2 and
not b1.getBackEdgeSuccessor(_) = b2
}
/**
* Holds if `f` contains a loop in which no edge is a back edge.
*
* This check ensures we don't have too _few_ back edges.
*/
query predicate containsLoopOfForwardEdges(IRFunction f) {
exists(IRBlock block |
forwardEdge+(block, block) and
block.getEnclosingIRFunction() = f
)
}
/**
* Holds if `block` is reachable from its function entry point but would not
* be reachable by traversing only forward edges. This check is skipped for
* functions containing `goto` statements as the property does not generally
* hold there.
*
* This check ensures we don't have too _many_ back edges.
*/
query predicate lostReachability(IRBlock block) {
exists(IRFunction f, IRBlock entry |
entry = f.getEntryBlock() and
entry.getASuccessor+() = block and
not forwardEdge+(entry, block) and
not Language::hasGoto(f.getFunction())
)
}
/**
* Holds if the number of back edges differs between the `Instruction` graph
* and the `IRBlock` graph.
*/
query predicate backEdgeCountMismatch(Language::Function f, int fromInstr, int fromBlock) {
fromInstr =
count(Instruction i1, Instruction i2 |
i1.getEnclosingFunction() = f and i1.getBackEdgeSuccessor(_) = i2
) and
fromBlock =
count(IRBlock b1, IRBlock b2 |
b1.getEnclosingFunction() = f and b1.getBackEdgeSuccessor(_) = b2
) and
fromInstr != fromBlock
}
/**
* Gets the point in the function at which the specified operand is evaluated. For most operands,
* this is at the instruction that consumes the use. For a `PhiInputOperand`, the effective point
* of evaluation is at the end of the corresponding predecessor block.
*/
private predicate pointOfEvaluation(Operand operand, IRBlock block, int index) {
block = operand.(PhiInputOperand).getPredecessorBlock() and
index = block.getInstructionCount()
or
exists(Instruction use |
use = operand.(NonPhiOperand).getUse() and
block.getInstruction(index) = use
)
}
/**
* Holds if `useOperand` has a definition that does not dominate the use.
*/
query predicate useNotDominatedByDefinition(
Operand useOperand, string message, IRFunction func, string funcText
) {
exists(IRBlock useBlock, int useIndex, Instruction defInstr, IRBlock defBlock, int defIndex |
not useOperand.getUse() instanceof UnmodeledUseInstruction and
not defInstr instanceof UnmodeledDefinitionInstruction and
pointOfEvaluation(useOperand, useBlock, useIndex) and
defInstr = useOperand.getAnyDef() and
(
defInstr instanceof PhiInstruction and
defBlock = defInstr.getBlock() and
defIndex = -1
or
defBlock.getInstruction(defIndex) = defInstr
) and
not (
defBlock.strictlyDominates(useBlock)
or
defBlock = useBlock and
defIndex < useIndex
) and
message =
"Operand '" + useOperand.toString() +
"' is not dominated by its definition in function '$@'." and
func = useOperand.getEnclosingIRFunction() and
funcText = Language::getIdentityString(func.getFunction())
)
}
}
/**
* Gets an `Instruction` that is contained in `IRFunction`, and has a location with the specified
* `File` and line number. Used for assigning register names when printing IR.
*/
private Instruction getAnInstructionAtLine(IRFunction irFunc, Language::File file, int line) {
exists(IRConfiguration::IRConfiguration config |
config.shouldEvaluateDebugStringsForFunction(irFunc.getFunction())
) and
exists(Language::Location location |
irFunc = result.getEnclosingIRFunction() and
location = result.getLocation() and
@@ -42,12 +297,6 @@ class Instruction extends Construction::TInstruction {
result = getResultString() + " = " + getOperationString() + " " + getOperandsString()
}
private predicate shouldGenerateDumpStrings() {
exists(IRConfiguration::IRConfiguration config |
config.shouldEvaluateDebugStringsForFunction(this.getEnclosingFunction())
)
}
/**
* Gets a string describing the operation of this instruction. This includes
* the opcode and the immediate value, if any. For example:
@@ -55,7 +304,6 @@ class Instruction extends Construction::TInstruction {
* VariableAddress[x]
*/
final string getOperationString() {
shouldGenerateDumpStrings() and
if exists(getImmediateString())
then result = getOperationPrefix() + getOpcode().toString() + "[" + getImmediateString() + "]"
else result = getOperationPrefix() + getOpcode().toString()
@@ -67,12 +315,10 @@ class Instruction extends Construction::TInstruction {
string getImmediateString() { none() }
private string getOperationPrefix() {
shouldGenerateDumpStrings() and
if this instanceof SideEffectInstruction then result = "^" else result = ""
}
private string getResultPrefix() {
shouldGenerateDumpStrings() and
if getResultIRType() instanceof IRVoidType
then result = "v"
else
@@ -86,7 +332,6 @@ class Instruction extends Construction::TInstruction {
* used by debugging and printing code only.
*/
int getDisplayIndexInBlock() {
shouldGenerateDumpStrings() and
exists(IRBlock block |
this = block.getInstruction(result)
or
@@ -100,7 +345,6 @@ class Instruction extends Construction::TInstruction {
}
private int getLineRank() {
shouldGenerateDumpStrings() and
this =
rank[result](Instruction instr |
instr =
@@ -119,7 +363,6 @@ class Instruction extends Construction::TInstruction {
* Example: `r1_1`
*/
string getResultId() {
shouldGenerateDumpStrings() and
result = getResultPrefix() + getAST().getLocation().getStartLine() + "_" + getLineRank()
}
@@ -131,7 +374,6 @@ class Instruction extends Construction::TInstruction {
* Example: `r1_1(int*)`
*/
final string getResultString() {
shouldGenerateDumpStrings() and
result = getResultId() + "(" + getResultLanguageType().getDumpString() + ")"
}
@@ -142,7 +384,6 @@ class Instruction extends Construction::TInstruction {
* Example: `func:r3_4, this:r3_5`
*/
string getOperandsString() {
shouldGenerateDumpStrings() and
result =
concat(Operand operand |
operand = getAnOperand()
@@ -338,17 +579,6 @@ class Instruction extends Construction::TInstruction {
Construction::hasModeledMemoryResult(this)
}
/**
* Holds if this is an instruction with a memory result that represents a
* conflation of more than one memory allocation.
*
* This happens in practice when dereferencing a pointer that cannot be
* tracked back to a single local allocation. Such memory is instead modeled
* as originating on the `AliasedDefinitionInstruction` at the entry of the
* function.
*/
final predicate isResultConflated() { Construction::hasConflatedMemoryResult(this) }
/**
* Gets the successor of this instruction along the control flow edge
* specified by `kind`.
@@ -525,7 +755,7 @@ class ReturnValueInstruction extends ReturnInstruction {
final Instruction getReturnValue() { result = getReturnValueOperand().getDef() }
}
class ReturnIndirectionInstruction extends VariableInstruction {
class ReturnIndirectionInstruction extends Instruction {
ReturnIndirectionInstruction() { getOpcode() instanceof Opcode::ReturnIndirection }
final SideEffectOperand getSideEffectOperand() { result = getAnOperand() }
@@ -535,12 +765,6 @@ class ReturnIndirectionInstruction extends VariableInstruction {
final AddressOperand getSourceAddressOperand() { result = getAnOperand() }
final Instruction getSourceAddress() { result = getSourceAddressOperand().getDef() }
/**
* Gets the parameter for which this instruction reads the final pointed-to value within the
* function.
*/
final Language::Parameter getParameter() { result = var.(IRUserVariable).getVariable() }
}
class CopyInstruction extends Instruction {
@@ -968,11 +1192,6 @@ class CallInstruction extends Instruction {
final Instruction getPositionalArgument(int index) {
result = getPositionalArgumentOperand(index).getDef()
}
/**
* Gets the number of arguments of the call, including the `this` pointer, if any.
*/
final int getNumberOfArguments() { result = count(this.getAnArgumentOperand()) }
}
/**
@@ -1121,26 +1340,6 @@ class SizedBufferMayWriteSideEffectInstruction extends WriteSideEffectInstructio
Instruction getSizeDef() { result = getAnOperand().(BufferSizeOperand).getDef() }
}
/**
* An instruction representing the initial value of newly allocated memory, e.g. the result of a
* call to `malloc`.
*/
class InitializeDynamicAllocationInstruction extends SideEffectInstruction {
InitializeDynamicAllocationInstruction() {
getOpcode() instanceof Opcode::InitializeDynamicAllocation
}
/**
* Gets the address of the allocation this instruction is initializing.
*/
final AddressOperand getAllocationAddressOperand() { result = getAnOperand() }
/**
* Gets the operand for the allocation this instruction is initializing.
*/
final Instruction getAllocationAddress() { result = getAllocationAddressOperand().getDef() }
}
/**
* An instruction representing a GNU or MSVC inline assembly statement.
*/

@@ -11,19 +11,13 @@ cached
private newtype TOperand =
TRegisterOperand(Instruction useInstr, RegisterOperandTag tag, Instruction defInstr) {
defInstr = Construction::getRegisterOperandDefinition(useInstr, tag) and
not Construction::isInCycle(useInstr) and
strictcount(Construction::getRegisterOperandDefinition(useInstr, tag)) = 1
not Construction::isInCycle(useInstr)
} or
TNonPhiMemoryOperand(
Instruction useInstr, MemoryOperandTag tag, Instruction defInstr, Overlap overlap
) {
defInstr = Construction::getMemoryOperandDefinition(useInstr, tag, overlap) and
not Construction::isInCycle(useInstr) and
(
strictcount(Construction::getMemoryOperandDefinition(useInstr, tag, _)) = 1
or
tag instanceof UnmodeledUseOperandTag
)
not Construction::isInCycle(useInstr)
} or
TPhiOperand(
PhiInstruction useInstr, Instruction defInstr, IRBlock predecessorBlock, Overlap overlap
@@ -384,8 +378,6 @@ class PositionalArgumentOperand extends ArgumentOperand {
class SideEffectOperand extends TypedOperand {
override SideEffectOperandTag tag;
override string toString() { result = "SideEffect" }
}
/**

@@ -18,19 +18,19 @@ class PrintIRConfiguration extends TPrintIRConfiguration {
predicate shouldPrintFunction(Language::Function func) { any() }
}
/**
* Override of `IRConfiguration` to only evaluate debug strings for the functions that are to be dumped.
*/
private class FilteredIRConfiguration extends IRConfiguration {
override predicate shouldEvaluateDebugStringsForFunction(Language::Function func) {
shouldPrintFunction(func)
}
}
private predicate shouldPrintFunction(Language::Function func) {
exists(PrintIRConfiguration config | config.shouldPrintFunction(func))
}
/**
* Override of `IRConfiguration` to only create IR for the functions that are to be dumped.
*/
private class FilteredIRConfiguration extends IRConfiguration {
override predicate shouldCreateIRForFunction(Language::Function func) {
shouldPrintFunction(func)
}
}
private string getAdditionalInstructionProperty(Instruction instr, string key) {
exists(IRPropertyProvider provider | result = provider.getInstructionProperty(instr, key))
}
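// Illustrative sketch, not part of this change: a query can restrict which functions are dumped
// by overriding `PrintIRConfiguration.shouldPrintFunction`. The import paths, the class name
// `MainOnlyPrintIRConfiguration`, and the function name "main" are assumptions chosen for the
// example and may differ between library versions.
import cpp
import semmle.code.cpp.ir.PrintIR
class MainOnlyPrintIRConfiguration extends PrintIRConfiguration {
// Only functions named "main" are printed; all others are filtered out.
override predicate shouldPrintFunction(Function func) { func.getName() = "main" }
}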

View File

@@ -1,17 +0,0 @@
private import internal.ValueNumberingImports
private import ValueNumbering
/**
* Provides additional information about value numbering in IR dumps.
*/
class ValueNumberPropertyProvider extends IRPropertyProvider {
override string getInstructionProperty(Instruction instr, string key) {
exists(ValueNumber vn |
vn = valueNumber(instr) and
key = "valnum" and
if strictcount(vn.getAnInstruction()) > 1
then result = vn.getDebugString()
else result = "unique"
)
}
}

View File

@@ -1,31 +1,66 @@
private import internal.ValueNumberingInternal
private import internal.ValueNumberingImports
private import IR
/**
* Provides additional information about value numbering in IR dumps.
*/
class ValueNumberPropertyProvider extends IRPropertyProvider {
override string getInstructionProperty(Instruction instr, string key) {
exists(ValueNumber vn |
vn = valueNumber(instr) and
key = "valnum" and
if strictcount(vn.getAnInstruction()) > 1 then result = vn.toString() else result = "unique"
)
}
}
newtype TValueNumber =
TVariableAddressValueNumber(IRFunction irFunc, IRVariable var) {
variableAddressValueNumber(_, irFunc, var)
} or
TInitializeParameterValueNumber(IRFunction irFunc, IRVariable var) {
initializeParameterValueNumber(_, irFunc, var)
} or
TInitializeThisValueNumber(IRFunction irFunc) { initializeThisValueNumber(_, irFunc) } or
TConstantValueNumber(IRFunction irFunc, IRType type, string value) {
constantValueNumber(_, irFunc, type, value)
} or
TStringConstantValueNumber(IRFunction irFunc, IRType type, string value) {
stringConstantValueNumber(_, irFunc, type, value)
} or
TFieldAddressValueNumber(IRFunction irFunc, Language::Field field, ValueNumber objectAddress) {
fieldAddressValueNumber(_, irFunc, field, objectAddress)
} or
TBinaryValueNumber(
IRFunction irFunc, Opcode opcode, IRType type, ValueNumber leftOperand, ValueNumber rightOperand
) {
binaryValueNumber(_, irFunc, opcode, type, leftOperand, rightOperand)
} or
TPointerArithmeticValueNumber(
IRFunction irFunc, Opcode opcode, IRType type, int elementSize, ValueNumber leftOperand,
ValueNumber rightOperand
) {
pointerArithmeticValueNumber(_, irFunc, opcode, type, elementSize, leftOperand, rightOperand)
} or
TUnaryValueNumber(IRFunction irFunc, Opcode opcode, IRType type, ValueNumber operand) {
unaryValueNumber(_, irFunc, opcode, type, operand)
} or
TInheritanceConversionValueNumber(
IRFunction irFunc, Opcode opcode, Language::Class baseClass, Language::Class derivedClass,
ValueNumber operand
) {
inheritanceConversionValueNumber(_, irFunc, opcode, baseClass, derivedClass, operand)
} or
TUniqueValueNumber(IRFunction irFunc, Instruction instr) { uniqueValueNumber(instr, irFunc) }
/**
* The value number assigned to a particular set of instructions that produce equivalent results.
*/
class ValueNumber extends TValueNumber {
final string toString() { result = "GVN" }
final string toString() { result = getExampleInstruction().getResultId() }
final string getDebugString() { result = strictconcat(getAnInstruction().getResultId(), ", ") }
final Language::Location getLocation() {
if
exists(Instruction i |
i = getAnInstruction() and not i.getLocation() instanceof Language::UnknownLocation
)
then
result =
min(Language::Location l |
l = getAnInstruction().getLocation() and not l instanceof Language::UnknownLocation
|
l
order by
l.getFile().getAbsolutePath(), l.getStartLine(), l.getStartColumn(), l.getEndLine(),
l.getEndColumn()
)
else result instanceof Language::UnknownDefaultLocation
}
final Language::Location getLocation() { result = getExampleInstruction().getLocation() }
/**
* Gets the instructions that have been assigned this value number. This will always produce at
@@ -50,39 +85,236 @@ class ValueNumber extends TValueNumber {
* Gets an `Operand` whose definition is exact and has this value number.
*/
final Operand getAUse() { this = valueNumber(result.getDef()) }
}
final string getKind() {
this instanceof TVariableAddressValueNumber and result = "VariableAddress"
or
this instanceof TInitializeParameterValueNumber and result = "InitializeParameter"
or
this instanceof TInitializeThisValueNumber and result = "InitializeThis"
or
this instanceof TStringConstantValueNumber and result = "StringConstant"
or
this instanceof TFieldAddressValueNumber and result = "FieldAddress"
or
this instanceof TBinaryValueNumber and result = "Binary"
or
this instanceof TPointerArithmeticValueNumber and result = "PointerArithmetic"
or
this instanceof TUnaryValueNumber and result = "Unary"
or
this instanceof TInheritanceConversionValueNumber and result = "InheritanceConversion"
or
this instanceof TLoadTotalOverlapValueNumber and result = "LoadTotalOverlap"
or
this instanceof TUniqueValueNumber and result = "Unique"
/**
* A `CopyInstruction` whose source operand's value is congruent to the definition of that source
* operand.
* For example:
* ```
* Point p = { 1, 2 };
* Point q = p;
* int a = p.x;
* ```
* The use of `p` on line 2 is linked to the definition of `p` on line 1, and is congruent to that
* definition because it accesses the exact same memory.
* The use of `p.x` on line 3 is linked to the definition of `p` on line 1 as well, but is not
* congruent to that definition because `p.x` accesses only a subset of the memory defined by `p`.
*/
private class CongruentCopyInstruction extends CopyInstruction {
CongruentCopyInstruction() {
this.getSourceValueOperand().getDefinitionOverlap() instanceof MustExactlyOverlap
}
}
/**
* Holds if this library knows how to assign a value number to the specified instruction, other than
* a `unique` value number that is never shared by multiple instructions.
*/
private predicate numberableInstruction(Instruction instr) {
instr instanceof VariableAddressInstruction
or
instr instanceof InitializeParameterInstruction
or
instr instanceof InitializeThisInstruction
or
instr instanceof ConstantInstruction
or
instr instanceof StringConstantInstruction
or
instr instanceof FieldAddressInstruction
or
instr instanceof BinaryInstruction
or
instr instanceof UnaryInstruction and not instr instanceof CopyInstruction
or
instr instanceof PointerArithmeticInstruction
or
instr instanceof CongruentCopyInstruction
}
private predicate variableAddressValueNumber(
VariableAddressInstruction instr, IRFunction irFunc, IRVariable var
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getIRVariable() = var
}
private predicate initializeParameterValueNumber(
InitializeParameterInstruction instr, IRFunction irFunc, IRVariable var
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getIRVariable() = var
}
private predicate initializeThisValueNumber(InitializeThisInstruction instr, IRFunction irFunc) {
instr.getEnclosingIRFunction() = irFunc
}
private predicate constantValueNumber(
ConstantInstruction instr, IRFunction irFunc, IRType type, string value
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getResultIRType() = type and
instr.getValue() = value
}
private predicate stringConstantValueNumber(
StringConstantInstruction instr, IRFunction irFunc, IRType type, string value
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getResultIRType() = type and
instr.getValue().getValue() = value
}
private predicate fieldAddressValueNumber(
FieldAddressInstruction instr, IRFunction irFunc, Language::Field field, ValueNumber objectAddress
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getField() = field and
valueNumber(instr.getObjectAddress()) = objectAddress
}
private predicate binaryValueNumber(
BinaryInstruction instr, IRFunction irFunc, Opcode opcode, IRType type, ValueNumber leftOperand,
ValueNumber rightOperand
) {
instr.getEnclosingIRFunction() = irFunc and
not instr instanceof PointerArithmeticInstruction and
instr.getOpcode() = opcode and
instr.getResultIRType() = type and
valueNumber(instr.getLeft()) = leftOperand and
valueNumber(instr.getRight()) = rightOperand
}
private predicate pointerArithmeticValueNumber(
PointerArithmeticInstruction instr, IRFunction irFunc, Opcode opcode, IRType type,
int elementSize, ValueNumber leftOperand, ValueNumber rightOperand
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getOpcode() = opcode and
instr.getResultIRType() = type and
instr.getElementSize() = elementSize and
valueNumber(instr.getLeft()) = leftOperand and
valueNumber(instr.getRight()) = rightOperand
}
private predicate unaryValueNumber(
UnaryInstruction instr, IRFunction irFunc, Opcode opcode, IRType type, ValueNumber operand
) {
instr.getEnclosingIRFunction() = irFunc and
not instr instanceof InheritanceConversionInstruction and
not instr instanceof CopyInstruction and
instr.getOpcode() = opcode and
instr.getResultIRType() = type and
valueNumber(instr.getUnary()) = operand
}
private predicate inheritanceConversionValueNumber(
InheritanceConversionInstruction instr, IRFunction irFunc, Opcode opcode,
Language::Class baseClass, Language::Class derivedClass, ValueNumber operand
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getOpcode() = opcode and
instr.getBaseClass() = baseClass and
instr.getDerivedClass() = derivedClass and
valueNumber(instr.getUnary()) = operand
}
/**
* Holds if `instr` should be assigned a unique value number because this library does not know how
* to determine if two instances of that instruction are equivalent.
*/
private predicate uniqueValueNumber(Instruction instr, IRFunction irFunc) {
instr.getEnclosingIRFunction() = irFunc and
not instr.getResultIRType() instanceof IRVoidType and
not numberableInstruction(instr)
}
/**
* Gets the value number assigned to `instr`, if any. Returns at most one result.
*/
ValueNumber valueNumber(Instruction instr) { result = tvalueNumber(instr) }
cached
ValueNumber valueNumber(Instruction instr) {
result = nonUniqueValueNumber(instr)
or
exists(IRFunction irFunc |
uniqueValueNumber(instr, irFunc) and
result = TUniqueValueNumber(irFunc, instr)
)
}
/**
* Gets the value number assigned to the exact definition of `op`, if any.
* Returns at most one result.
*/
ValueNumber valueNumberOfOperand(Operand op) { result = tvalueNumberOfOperand(op) }
ValueNumber valueNumberOfOperand(Operand op) { result = valueNumber(op.getDef()) }
/**
* Gets the value number assigned to `instr`, if any, unless that instruction is assigned a unique
* value number.
*/
private ValueNumber nonUniqueValueNumber(Instruction instr) {
exists(IRFunction irFunc |
irFunc = instr.getEnclosingIRFunction() and
(
exists(IRVariable var |
variableAddressValueNumber(instr, irFunc, var) and
result = TVariableAddressValueNumber(irFunc, var)
)
or
exists(IRVariable var |
initializeParameterValueNumber(instr, irFunc, var) and
result = TInitializeParameterValueNumber(irFunc, var)
)
or
initializeThisValueNumber(instr, irFunc) and
result = TInitializeThisValueNumber(irFunc)
or
exists(IRType type, string value |
constantValueNumber(instr, irFunc, type, value) and
result = TConstantValueNumber(irFunc, type, value)
)
or
exists(IRType type, string value |
stringConstantValueNumber(instr, irFunc, type, value) and
result = TStringConstantValueNumber(irFunc, type, value)
)
or
exists(Language::Field field, ValueNumber objectAddress |
fieldAddressValueNumber(instr, irFunc, field, objectAddress) and
result = TFieldAddressValueNumber(irFunc, field, objectAddress)
)
or
exists(Opcode opcode, IRType type, ValueNumber leftOperand, ValueNumber rightOperand |
binaryValueNumber(instr, irFunc, opcode, type, leftOperand, rightOperand) and
result = TBinaryValueNumber(irFunc, opcode, type, leftOperand, rightOperand)
)
or
exists(Opcode opcode, IRType type, ValueNumber operand |
unaryValueNumber(instr, irFunc, opcode, type, operand) and
result = TUnaryValueNumber(irFunc, opcode, type, operand)
)
or
exists(
Opcode opcode, Language::Class baseClass, Language::Class derivedClass, ValueNumber operand
|
inheritanceConversionValueNumber(instr, irFunc, opcode, baseClass, derivedClass, operand) and
result = TInheritanceConversionValueNumber(irFunc, opcode, baseClass, derivedClass, operand)
)
or
exists(
Opcode opcode, IRType type, int elementSize, ValueNumber leftOperand,
ValueNumber rightOperand
|
pointerArithmeticValueNumber(instr, irFunc, opcode, type, elementSize, leftOperand,
rightOperand) and
result =
TPointerArithmeticValueNumber(irFunc, opcode, type, elementSize, leftOperand, rightOperand)
)
or
// The value number of a copy is just the value number of its source value.
result = valueNumber(instr.(CongruentCopyInstruction).getSourceValue())
)
)
}
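// Illustrative sketch, not part of this change: `valueNumber` can be used to find pairs of
// distinct instructions that are assigned the same value number, that is, instructions this
// library considers to produce equivalent results. The import paths are assumptions and may
// differ between library versions; `valueNumber` is the predicate defined above.
import cpp
import semmle.code.cpp.ir.IR
import semmle.code.cpp.ir.ValueNumbering
from Instruction first, Instruction second
where
// Distinct instructions that share a value number.
first != second and
valueNumber(first) = valueNumber(second)
select first, second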

View File

@@ -1,3 +1,2 @@
import semmle.code.cpp.ir.implementation.aliased_ssa.IR
import semmle.code.cpp.ir.internal.Overlap
import semmle.code.cpp.ir.internal.IRCppLanguage as Language

View File

@@ -1,321 +1 @@
private import ValueNumberingImports
newtype TValueNumber =
TVariableAddressValueNumber(IRFunction irFunc, Language::AST ast) {
variableAddressValueNumber(_, irFunc, ast)
} or
TInitializeParameterValueNumber(IRFunction irFunc, Language::AST var) {
initializeParameterValueNumber(_, irFunc, var)
} or
TInitializeThisValueNumber(IRFunction irFunc) { initializeThisValueNumber(_, irFunc) } or
TConstantValueNumber(IRFunction irFunc, IRType type, string value) {
constantValueNumber(_, irFunc, type, value)
} or
TStringConstantValueNumber(IRFunction irFunc, IRType type, string value) {
stringConstantValueNumber(_, irFunc, type, value)
} or
TFieldAddressValueNumber(IRFunction irFunc, Language::Field field, TValueNumber objectAddress) {
fieldAddressValueNumber(_, irFunc, field, objectAddress)
} or
TBinaryValueNumber(
IRFunction irFunc, Opcode opcode, TValueNumber leftOperand, TValueNumber rightOperand
) {
binaryValueNumber(_, irFunc, opcode, leftOperand, rightOperand)
} or
TPointerArithmeticValueNumber(
IRFunction irFunc, Opcode opcode, int elementSize, TValueNumber leftOperand,
TValueNumber rightOperand
) {
pointerArithmeticValueNumber(_, irFunc, opcode, elementSize, leftOperand, rightOperand)
} or
TUnaryValueNumber(IRFunction irFunc, Opcode opcode, TValueNumber operand) {
unaryValueNumber(_, irFunc, opcode, operand)
} or
TInheritanceConversionValueNumber(
IRFunction irFunc, Opcode opcode, Language::Class baseClass, Language::Class derivedClass,
TValueNumber operand
) {
inheritanceConversionValueNumber(_, irFunc, opcode, baseClass, derivedClass, operand)
} or
TLoadTotalOverlapValueNumber(
IRFunction irFunc, IRType type, TValueNumber memOperand, TValueNumber operand
) {
loadTotalOverlapValueNumber(_, irFunc, type, memOperand, operand)
} or
TUniqueValueNumber(IRFunction irFunc, Instruction instr) { uniqueValueNumber(instr, irFunc) }
/**
* A `CopyInstruction` whose source operand's value is congruent to the definition of that source
* operand.
* For example:
* ```
* Point p = { 1, 2 };
* Point q = p;
* int a = p.x;
* ```
* The use of `p` on line 2 is linked to the definition of `p` on line 1, and is congruent to that
* definition because it accesses the exact same memory.
* The use of `p.x` on line 3 is linked to the definition of `p` on line 1 as well, but is not
* congruent to that definition because `p.x` accesses only a subset of the memory defined by `p`.
*/
class CongruentCopyInstruction extends CopyInstruction {
CongruentCopyInstruction() {
this.getSourceValueOperand().getDefinitionOverlap() instanceof MustExactlyOverlap
}
}
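/**
* A `LoadInstruction` whose source value operand has a definition overlap of `MustTotallyOverlap`.
*/
class LoadTotalOverlapInstruction extends LoadInstruction {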
LoadTotalOverlapInstruction() {
this.getSourceValueOperand().getDefinitionOverlap() instanceof MustTotallyOverlap
}
}
/**
* Holds if this library knows how to assign a value number to the specified instruction, other than
* a `unique` value number that is never shared by multiple instructions.
*/
private predicate numberableInstruction(Instruction instr) {
instr instanceof VariableAddressInstruction
or
instr instanceof InitializeParameterInstruction
or
instr instanceof InitializeThisInstruction
or
instr instanceof ConstantInstruction
or
instr instanceof StringConstantInstruction
or
instr instanceof FieldAddressInstruction
or
instr instanceof BinaryInstruction
or
instr instanceof UnaryInstruction and not instr instanceof CopyInstruction
or
instr instanceof PointerArithmeticInstruction
or
instr instanceof CongruentCopyInstruction
or
instr instanceof LoadTotalOverlapInstruction
}
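/**
* Holds if `instr` is of a numberable kind but lacks a single, unambiguous key (AST element, result
* type, or field), so that it is assigned a unique value number instead.
*/
private predicate filteredNumberableInstruction(Instruction instr) {
// Use `count` rather than `strictcount` so that a missing AST element (a count of 0) is handled.
// The `instanceof` check is kept separate from the inline cast so that a failed cast, which
// also yields a count of 0, does not match.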
instr instanceof VariableAddressInstruction and
count(instr.(VariableAddressInstruction).getIRVariable().getAST()) != 1
or
instr instanceof ConstantInstruction and
count(instr.getResultIRType()) != 1
or
instr instanceof FieldAddressInstruction and
count(instr.(FieldAddressInstruction).getField()) != 1
}
private predicate variableAddressValueNumber(
VariableAddressInstruction instr, IRFunction irFunc, Language::AST ast
) {
instr.getEnclosingIRFunction() = irFunc and
// The underlying AST element is used as value-numbering key instead of the
// `IRVariable` to work around a problem where a variable or expression with
// multiple types gives rise to multiple `IRVariable`s.
instr.getIRVariable().getAST() = ast and
strictcount(instr.getIRVariable().getAST()) = 1
}
private predicate initializeParameterValueNumber(
InitializeParameterInstruction instr, IRFunction irFunc, Language::AST var
) {
instr.getEnclosingIRFunction() = irFunc and
// The underlying AST element is used as value-numbering key instead of the
// `IRVariable` to work around a problem where a variable or expression with
// multiple types gives rise to multiple `IRVariable`s.
instr.getIRVariable().getAST() = var
}
private predicate initializeThisValueNumber(InitializeThisInstruction instr, IRFunction irFunc) {
instr.getEnclosingIRFunction() = irFunc
}
private predicate constantValueNumber(
ConstantInstruction instr, IRFunction irFunc, IRType type, string value
) {
instr.getEnclosingIRFunction() = irFunc and
strictcount(instr.getResultIRType()) = 1 and
instr.getResultIRType() = type and
instr.getValue() = value
}
private predicate stringConstantValueNumber(
StringConstantInstruction instr, IRFunction irFunc, IRType type, string value
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getResultIRType() = type and
instr.getValue().getValue() = value
}
private predicate fieldAddressValueNumber(
FieldAddressInstruction instr, IRFunction irFunc, Language::Field field,
TValueNumber objectAddress
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getField() = field and
strictcount(instr.getField()) = 1 and
tvalueNumber(instr.getObjectAddress()) = objectAddress
}
private predicate binaryValueNumber(
BinaryInstruction instr, IRFunction irFunc, Opcode opcode, TValueNumber leftOperand,
TValueNumber rightOperand
) {
instr.getEnclosingIRFunction() = irFunc and
not instr instanceof PointerArithmeticInstruction and
instr.getOpcode() = opcode and
tvalueNumber(instr.getLeft()) = leftOperand and
tvalueNumber(instr.getRight()) = rightOperand
}
private predicate pointerArithmeticValueNumber(
PointerArithmeticInstruction instr, IRFunction irFunc, Opcode opcode, int elementSize,
TValueNumber leftOperand, TValueNumber rightOperand
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getOpcode() = opcode and
instr.getElementSize() = elementSize and
tvalueNumber(instr.getLeft()) = leftOperand and
tvalueNumber(instr.getRight()) = rightOperand
}
private predicate unaryValueNumber(
UnaryInstruction instr, IRFunction irFunc, Opcode opcode, TValueNumber operand
) {
instr.getEnclosingIRFunction() = irFunc and
not instr instanceof InheritanceConversionInstruction and
not instr instanceof CopyInstruction and
not instr instanceof FieldAddressInstruction and
instr.getOpcode() = opcode and
tvalueNumber(instr.getUnary()) = operand
}
private predicate inheritanceConversionValueNumber(
InheritanceConversionInstruction instr, IRFunction irFunc, Opcode opcode,
Language::Class baseClass, Language::Class derivedClass, TValueNumber operand
) {
instr.getEnclosingIRFunction() = irFunc and
instr.getOpcode() = opcode and
instr.getBaseClass() = baseClass and
instr.getDerivedClass() = derivedClass and
tvalueNumber(instr.getUnary()) = operand
}
private predicate loadTotalOverlapValueNumber(
LoadTotalOverlapInstruction instr, IRFunction irFunc, IRType type, TValueNumber memOperand,
TValueNumber operand
) {
instr.getEnclosingIRFunction() = irFunc and
tvalueNumber(instr.getAnOperand().(MemoryOperand).getAnyDef()) = memOperand and
tvalueNumberOfOperand(instr.getAnOperand().(AddressOperand)) = operand and
instr.getResultIRType() = type
}
/**
* Holds if `instr` should be assigned a unique value number because this library does not know how
* to determine if two instances of that instruction are equivalent.
*/
private predicate uniqueValueNumber(Instruction instr, IRFunction irFunc) {
instr.getEnclosingIRFunction() = irFunc and
not instr.getResultIRType() instanceof IRVoidType and
(
not numberableInstruction(instr)
or
filteredNumberableInstruction(instr)
)
}
/**
* Gets the value number assigned to `instr`, if any. Returns at most one result.
*/
cached
TValueNumber tvalueNumber(Instruction instr) {
result = nonUniqueValueNumber(instr)
or
exists(IRFunction irFunc |
uniqueValueNumber(instr, irFunc) and
result = TUniqueValueNumber(irFunc, instr)
)
}
/**
* Gets the value number assigned to the exact definition of `op`, if any.
* Returns at most one result.
*/
TValueNumber tvalueNumberOfOperand(Operand op) { result = tvalueNumber(op.getDef()) }
/**
* Gets the value number assigned to `instr`, if any, unless that instruction is assigned a unique
* value number.
*/
private TValueNumber nonUniqueValueNumber(Instruction instr) {
exists(IRFunction irFunc |
irFunc = instr.getEnclosingIRFunction() and
(
exists(Language::AST ast |
variableAddressValueNumber(instr, irFunc, ast) and
result = TVariableAddressValueNumber(irFunc, ast)
)
or
exists(Language::AST var |
initializeParameterValueNumber(instr, irFunc, var) and
result = TInitializeParameterValueNumber(irFunc, var)
)
or
initializeThisValueNumber(instr, irFunc) and
result = TInitializeThisValueNumber(irFunc)
or
exists(string value, IRType type |
constantValueNumber(instr, irFunc, type, value) and
result = TConstantValueNumber(irFunc, type, value)
)
or
exists(IRType type, string value |
stringConstantValueNumber(instr, irFunc, type, value) and
result = TStringConstantValueNumber(irFunc, type, value)
)
or
exists(Language::Field field, TValueNumber objectAddress |
fieldAddressValueNumber(instr, irFunc, field, objectAddress) and
result = TFieldAddressValueNumber(irFunc, field, objectAddress)
)
or
exists(Opcode opcode, TValueNumber leftOperand, TValueNumber rightOperand |
binaryValueNumber(instr, irFunc, opcode, leftOperand, rightOperand) and
result = TBinaryValueNumber(irFunc, opcode, leftOperand, rightOperand)
)
or
exists(Opcode opcode, TValueNumber operand |
unaryValueNumber(instr, irFunc, opcode, operand) and
result = TUnaryValueNumber(irFunc, opcode, operand)
)
or
exists(
Opcode opcode, Language::Class baseClass, Language::Class derivedClass, TValueNumber operand
|
inheritanceConversionValueNumber(instr, irFunc, opcode, baseClass, derivedClass, operand) and
result = TInheritanceConversionValueNumber(irFunc, opcode, baseClass, derivedClass, operand)
)
or
exists(Opcode opcode, int elementSize, TValueNumber leftOperand, TValueNumber rightOperand |
pointerArithmeticValueNumber(instr, irFunc, opcode, elementSize, leftOperand, rightOperand) and
result =
TPointerArithmeticValueNumber(irFunc, opcode, elementSize, leftOperand, rightOperand)
)
or
exists(IRType type, TValueNumber memOperand, TValueNumber operand |
loadTotalOverlapValueNumber(instr, irFunc, type, memOperand, operand) and
result = TLoadTotalOverlapValueNumber(irFunc, type, memOperand, operand)
)
or
// The value number of a copy is just the value number of its source value.
result = tvalueNumber(instr.(CongruentCopyInstruction).getSourceValue())
)
)
}
import semmle.code.cpp.ir.implementation.aliased_ssa.IR as IR

View File

@@ -7,9 +7,6 @@ private newtype TAllocation =
TVariableAllocation(IRVariable var) or
TIndirectParameterAllocation(IRAutomaticUserVariable var) {
exists(InitializeIndirectionInstruction instr | instr.getIRVariable() = var)
} or
TDynamicAllocation(CallInstruction call) {
exists(InitializeDynamicAllocationInstruction instr | instr.getPrimaryInstruction() = call)
}
/**
@@ -98,29 +95,3 @@ class IndirectParameterAllocation extends Allocation, TIndirectParameterAllocati
final override predicate alwaysEscapes() { none() }
}
class DynamicAllocation extends Allocation, TDynamicAllocation {
CallInstruction call;
DynamicAllocation() { this = TDynamicAllocation(call) }
final override string toString() {
result = call.toString() + " at " + call.getLocation() // This isn't performant, but it's only used in test/dump code right now.
}
final override CallInstruction getABaseInstruction() { result = call }
final override IRFunction getEnclosingIRFunction() { result = call.getEnclosingIRFunction() }
final override Language::Location getLocation() { result = call.getLocation() }
final override string getUniqueId() { result = call.getUniqueId() }
final override IRType getIRType() { result instanceof IRUnknownType }
final override predicate isReadOnly() { none() }
final override predicate isAlwaysAllocatedOnStack() { none() }
final override predicate alwaysEscapes() { none() }
}

View File

@@ -68,12 +68,8 @@ private newtype TMemoryLocation =
) and
languageType = type.getCanonicalLanguageType()
} or
TEntireAllocationMemoryLocation(Allocation var, boolean isMayAccess) {
(
var instanceof IndirectParameterAllocation or
var instanceof DynamicAllocation
) and
(isMayAccess = false or isMayAccess = true)
TEntireAllocationMemoryLocation(IndirectParameterAllocation var, boolean isMayAccess) {
isMayAccess = false or isMayAccess = true
} or
TUnknownMemoryLocation(IRFunction irFunc, boolean isMayAccess) {
isMayAccess = false or isMayAccess = true
@@ -224,12 +220,9 @@ class VariableMemoryLocation extends TVariableMemoryLocation, AllocationMemoryLo
/**
* Holds if this memory location covers the entire variable.
*/
final predicate coversEntireVariable() { varIRTypeHasBitRange(startBitOffset, endBitOffset) }
pragma[noinline]
private predicate varIRTypeHasBitRange(int start, int end) {
start = 0 and
end = var.getIRType().getByteSize() * 8
final predicate coversEntireVariable() {
startBitOffset = 0 and
endBitOffset = var.getIRType().getByteSize() * 8
}
}
@@ -305,7 +298,7 @@ class AllNonLocalMemory extends TAllNonLocalMemory, MemoryLocation {
final override string toStringInternal() { result = "{AllNonLocal}" }
final override AliasedVirtualVariable getVirtualVariable() { result.getIRFunction() = irFunc }
final override VirtualVariable getVirtualVariable() { result = TAllAliasedMemory(irFunc, false) }
final override Language::LanguageType getType() {
result = any(IRUnknownType type).getCanonicalLanguageType()
@@ -318,14 +311,6 @@ class AllNonLocalMemory extends TAllNonLocalMemory, MemoryLocation {
final override string getUniqueId() { result = "{AllNonLocal}" }
final override predicate isMayAccess() { isMayAccess = true }
override predicate canDefineReadOnly() {
// A "must" access that defines all non-local memory appears only on the `InitializeNonLocal`
// instruction, which provides the initial definition for all memory outside of the current
// function's stack frame. This memory includes string literals and other read-only globals, so
// we allow such an access to be the definition for a use of a read-only location.
not isMayAccess()
}
}
/**
@@ -354,9 +339,18 @@ class AllAliasedMemory extends TAllAliasedMemory, MemoryLocation {
final override predicate isMayAccess() { isMayAccess = true }
}
/** A virtual variable that groups all escaped memory within a function. */
class AliasedVirtualVariable extends AllAliasedMemory, VirtualVariable {
AliasedVirtualVariable() { not isMayAccess() }
override predicate canDefineReadOnly() {
// A must-def of all aliased memory is only used in two places:
// 1. In the prologue of the function, to provide a definition for all memory defined before the
// function was called. In this case, it needs to provide a definition even for read-only
// non-local variables.
// 2. As the result of a `Chi` instruction. These don't participate in overlap analysis, so it's
// OK if we let this predicate hold in that case.
any()
}
}
/**
@@ -411,16 +405,10 @@ private Overlap getExtentOverlap(MemoryLocation def, MemoryLocation use) {
use instanceof AllNonLocalMemory and
result instanceof MustExactlyOverlap
or
not use instanceof AllNonLocalMemory and
not use.isAlwaysAllocatedOnStack() and
if use instanceof VariableMemoryLocation
then
// AllNonLocalMemory totally overlaps any non-local variable.
result instanceof MustTotallyOverlap
else
// AllNonLocalMemory may partially overlap any other location within the same virtual
// variable, except a stack variable.
result instanceof MayPartiallyOverlap
// AllNonLocalMemory may partially overlap any location within the same virtual variable,
// except a local variable.
result instanceof MayPartiallyOverlap and
not use.isAlwaysAllocatedOnStack()
)
or
def.getVirtualVariable() = use.getVirtualVariable() and
@@ -430,18 +418,10 @@ private Overlap getExtentOverlap(MemoryLocation def, MemoryLocation use) {
use instanceof EntireAllocationMemoryLocation and
result instanceof MustExactlyOverlap
or
// EntireAllocationMemoryLocation totally overlaps any location within the same virtual
// variable.
not use instanceof EntireAllocationMemoryLocation and
if def.getAllocation() = use.getAllocation()
then
// EntireAllocationMemoryLocation totally overlaps any location within
// the same allocation.
result instanceof MustTotallyOverlap
else (
// There is no overlap with a location that's known to belong to a
// different allocation, but all other locations may partially overlap.
not exists(use.getAllocation()) and
result instanceof MayPartiallyOverlap
)
result instanceof MustTotallyOverlap
)
or
exists(VariableMemoryLocation defVariableLocation |

View File

@@ -1,3 +1,2 @@
import semmle.code.cpp.ir.internal.IRCppLanguage as Language
import SSAConstruction as Construction
import semmle.code.cpp.ir.implementation.IRConfiguration as IRConfiguration

View File

@@ -3,4 +3,3 @@ import semmle.code.cpp.ir.implementation.IRType as IRType
import semmle.code.cpp.ir.implementation.MemoryAccessKind as MemoryAccessKind
import semmle.code.cpp.ir.implementation.Opcode as Opcode
import semmle.code.cpp.ir.implementation.internal.OperandTag as OperandTag
import semmle.code.cpp.ir.internal.Overlap as Overlap

View File

@@ -65,29 +65,6 @@ private module Cached {
instruction instanceof ChiInstruction // Chis always have modeled results
}
cached
predicate hasConflatedMemoryResult(Instruction instruction) {
instruction instanceof UnmodeledDefinitionInstruction
or
instruction instanceof AliasedDefinitionInstruction
or
instruction.getOpcode() instanceof Opcode::InitializeNonLocal
or
// Chi instructions track virtual variables, and therefore a chi instruction is
// conflated if it's associated with the aliased virtual variable.
exists(OldInstruction oldInstruction | instruction = Chi(oldInstruction) |
Alias::getResultMemoryLocation(oldInstruction).getVirtualVariable() instanceof
Alias::AliasedVirtualVariable
)
or
// Phi instructions track locations, and therefore a phi instruction is
// conflated if it's associated with a conflated location.
exists(Alias::MemoryLocation location |
instruction = Phi(_, location) and
not exists(location.getAllocation())
)
}
cached
Instruction getRegisterOperandDefinition(Instruction instruction, RegisterOperandTag tag) {
exists(OldInstruction oldInstruction, OldIR::RegisterOperand oldOperand |
@@ -107,20 +84,19 @@ private module Cached {
oldOperand instanceof OldIR::NonPhiMemoryOperand and
exists(
OldBlock useBlock, int useRank, Alias::MemoryLocation useLocation,
Alias::MemoryLocation defLocation, OldBlock defBlock, int defRank, int defOffset,
Alias::MemoryLocation actualDefLocation
Alias::MemoryLocation defLocation, OldBlock defBlock, int defRank, int defOffset
|
useLocation = Alias::getOperandMemoryLocation(oldOperand) and
hasUseAtRank(useLocation, useBlock, useRank, oldInstruction) and
definitionReachesUse(useLocation, defBlock, defRank, useBlock, useRank) and
hasDefinitionAtRank(useLocation, defLocation, defBlock, defRank, defOffset) and
instr = getDefinitionOrChiInstruction(defBlock, defOffset, defLocation, actualDefLocation) and
overlap = Alias::getOverlap(actualDefLocation, useLocation)
instr = getDefinitionOrChiInstruction(defBlock, defOffset, defLocation, _) and
overlap = Alias::getOverlap(defLocation, useLocation)
)
}
cached
private Instruction getMemoryOperandDefinition0(
Instruction getMemoryOperandDefinition(
Instruction instruction, MemoryOperandTag tag, Overlap overlap
) {
exists(OldInstruction oldInstruction, OldIR::NonPhiMemoryOperand oldOperand |
@@ -166,19 +142,6 @@ private module Cached {
overlap instanceof MustExactlyOverlap
}
cached
Instruction getMemoryOperandDefinition(
Instruction instruction, MemoryOperandTag tag, Overlap overlap
) {
// getMemoryOperandDefinition0 currently has a bug where it can match with multiple overlaps.
// This predicate ensures that the chosen overlap is the most conservative if there's any doubt.
result = getMemoryOperandDefinition0(instruction, tag, overlap) and
not (
overlap instanceof MustExactlyOverlap and
exists(MustTotallyOverlap o | exists(getMemoryOperandDefinition0(instruction, tag, o)))
)
}
/**
* Holds if `instr` is part of a cycle in the operand graph that doesn't go
* through a phi instruction and therefore should be impossible.
@@ -689,18 +652,17 @@ module DefUse {
private predicate definitionReachesRank(
Alias::MemoryLocation useLocation, OldBlock block, int defRank, int reachesRank
) {
// The def always reaches the next use, even if there is also a def on the
// use instruction.
hasDefinitionAtRank(useLocation, _, block, defRank, _) and
reachesRank = defRank + 1
or
// If the def reached the previous rank, it also reaches the current rank,
// unless there was another def at the previous rank.
exists(int prevRank |
reachesRank = prevRank + 1 and
definitionReachesRank(useLocation, block, defRank, prevRank) and
not prevRank = exitRank(useLocation, block) and
not hasDefinitionAtRank(useLocation, _, block, prevRank, _)
reachesRank <= exitRank(useLocation, block) and // Without this, the predicate would be infinite.
(
// The def always reaches the next use, even if there is also a def on the
// use instruction.
reachesRank = defRank + 1
or
// If the def reached the previous rank, it also reaches the current rank,
// unless there was another def at the previous rank.
definitionReachesRank(useLocation, block, defRank, reachesRank - 1) and
not hasDefinitionAtRank(useLocation, _, block, reachesRank - 1, _)
)
}
@@ -797,21 +759,7 @@ module DefUse {
then defLocation = useLocation
else (
definitionHasPhiNode(defLocation, block) and
defLocation = useLocation.getVirtualVariable() and
// Handle the unusual case where a virtual variable does not overlap one of its member
// locations. For example, a definition of the virtual variable representing all aliased
// memory does not overlap a use of a string literal, because the contents of a string
// literal can never be redefined. The string literal's location could still be a member of
// the `AliasedVirtualVariable` due to something like:
// ```
// char s[10];
// strcpy(s, p);
// const char* p = b ? "SomeLiteral" : s;
// return p[3];
// ```
// In the above example, `p[3]` may access either the string literal or the local variable
// `s`, so both of those locations must be members of the `AliasedVirtualVariable`.
exists(Alias::getOverlap(defLocation, useLocation))
defLocation = useLocation.getVirtualVariable()
)
)
or

@@ -10,11 +10,6 @@ newtype TIRVariable =
) {
Construction::hasTempVariable(func, ast, tag, type)
} or
TIRDynamicInitializationFlag(
Language::Function func, Language::Variable var, Language::LanguageType type
) {
Construction::hasDynamicInitializationFlag(func, var, type)
} or
TIRStringLiteral(
Language::Function func, Language::AST ast, Language::LanguageType type,
Language::StringLiteral literal

@@ -27,9 +27,6 @@ class IRBlockBase extends TIRBlock {
* by debugging and printing code only.
*/
int getDisplayIndex() {
exists(IRConfiguration::IRConfiguration config |
config.shouldEvaluateDebugStringsForFunction(this.getEnclosingFunction())
) and
this =
rank[result + 1](IRBlock funcBlock |
funcBlock.getEnclosingFunction() = getEnclosingFunction()

Some files were not shown because too many files have changed in this diff.