Compare commits

..

6 Commits

Author SHA1 Message Date
Paolo Tranquilli
a4c505907f Bazel/rust: use patched tree-sitter 2024-05-08 12:03:51 +02:00
Paolo Tranquilli
f07d15de0c Merge branch 'main' into treesitter/bump 2024-05-08 05:40:20 +02:00
Tom Hvitved
796297eb19 Ruby: Update expected test output 2024-05-07 15:41:38 +02:00
Tom Hvitved
114869d52d Python: Also use tree-sitter 0.22.6 2024-05-07 10:40:01 +02:00
Tom Hvitved
67dbcb4197 Ruby: Bump codeql-extractor SHA 2024-05-07 10:32:26 +02:00
Tom Hvitved
68d1cbb2d5 Tree-sitter: Bump to 0.22.6 2024-05-07 10:26:08 +02:00
2066 changed files with 67217 additions and 88953 deletions

View File

@@ -10,19 +10,15 @@ common --override_module=semmle_code=%workspace%/misc/bazel/semmle_code_stub
build --repo_env=CC=clang --repo_env=CXX=clang++
# we use transitions that break builds of `...`, so for `test` to work with that we need the following
test --build_tests_only
build:linux --cxxopt=-std=c++20
build:macos --cxxopt=-std=c++20 --cpu=darwin_x86_64
build:windows --cxxopt=/std:c++20 --cxxopt=/Zc:preprocessor
# this requires developer mode, but is required to have pack installer functioning
startup --windows_enable_symlinks
common --enable_runfiles
# with the above, we can avoid building python zips which is the default on windows as that's expensive
build --nobuild_python_zip
common --registry=file:///%workspace%/misc/bazel/registry
common --registry=https://bcr.bazel.build
common --@rules_dotnet//dotnet/settings:strict_deps=false
try-import %workspace%/local.bazelrc

View File

@@ -2,9 +2,3 @@
common --registry=file:///%workspace%/ql/misc/bazel/registry
common --registry=https://bcr.bazel.build
# See bazelbuild/rules_dotnet#413: strict_deps in C# also appliy to 3rd-party deps, and when we pull
# in (for example) the xunit package, there's no code in this at all, it just depends transitively on
# its implementation packages without providing any code itself.
# We either can depend on internal implementation details, or turn of strict deps.
common --@rules_dotnet//dotnet/settings:strict_deps=false

View File

@@ -1 +1 @@
7.1.2
7.1.0

34
.gitattributes vendored
View File

@@ -50,40 +50,26 @@
*.dll -text
*.pdb -text
/java/ql/test/stubs/**/*.java linguist-generated=true
/java/ql/test/experimental/stubs/**/*.java linguist-generated=true
/java/kotlin-extractor/deps/*.jar filter=lfs diff=lfs merge=lfs -text
java/ql/test/stubs/**/*.java linguist-generated=true
java/ql/test/experimental/stubs/**/*.java linguist-generated=true
# Force git not to modify line endings for go or html files under the go/ql directory
/go/ql/**/*.go -text
/go/ql/**/*.html -text
go/ql/**/*.go -text
go/ql/**/*.html -text
# Force git not to modify line endings for go dbschemes
/go/*.dbscheme -text
go/*.dbscheme -text
# Preserve unusual line ending from codeql-go merge
/go/extractor/opencsv/CSVReader.java -text
go/extractor/opencsv/CSVReader.java -text
# For some languages, upgrade script testing references really old dbscheme
# files from legacy upgrades that have CRLF line endings. Since upgrade
# resolution relies on object hashes, we must suppress line ending conversion
# for those testing dbscheme files.
/*/ql/lib/upgrades/initial/*.dbscheme -text
*/ql/lib/upgrades/initial/*.dbscheme -text
# Auto-generated modeling for Python
/python/ql/lib/semmle/python/frameworks/data/internal/subclass-capture/*.yml linguist-generated=true
python/ql/lib/semmle/python/frameworks/data/internal/subclass-capture/*.yml linguist-generated=true
# auto-generated bazel lock file
/ruby/extractor/cargo-bazel-lock.json linguist-generated=true
/ruby/extractor/cargo-bazel-lock.json -merge
# auto-generated files for the C# build
/csharp/paket.lock linguist-generated=true
# needs eol=crlf, as `paket` touches this file and saves it as crlf
/csharp/.paket/Paket.Restore.targets linguist-generated=true eol=crlf
/csharp/paket.main.bzl linguist-generated=true
/csharp/paket.main_extension.bzl linguist-generated=true
# ripunzip tool
/misc/ripunzip/ripunzip-* filter=lfs diff=lfs merge=lfs -text
# swift prebuilt resources
/swift/third_party/resource-dir/*.zip filter=lfs diff=lfs merge=lfs -text
ruby/extractor/cargo-bazel-lock.json linguist-generated=true
ruby/extractor/cargo-bazel-lock.json -merge

View File

@@ -1,74 +0,0 @@
name: Build runzip
on:
workflow_dispatch:
inputs:
ripunzip-version:
description: "what reference to checktout from google/runzip"
required: false
default: v1.2.1
openssl-version:
description: "what reference to checkout from openssl/openssl for Linux"
required: false
default: openssl-3.3.0
jobs:
build:
strategy:
fail-fast: false
matrix:
os: [ubuntu-20.04, macos-12, windows-2019]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
with:
repository: google/ripunzip
ref: ${{ inputs.ripunzip-version }}
# we need to avoid ripunzip dynamically linking into libssl
# see https://github.com/sfackler/rust-openssl/issues/183
- if: runner.os == 'Linux'
name: checkout openssl
uses: actions/checkout@v4
with:
repository: openssl/openssl
path: openssl
ref: ${{ inputs.openssl-version }}
- if: runner.os == 'Linux'
name: build and install openssl with fPIC
shell: bash
working-directory: openssl
run: |
./config -fPIC --prefix=$HOME/.local --openssldir=$HOME/.local/ssl
make -j $(nproc)
make install_sw -j $(nproc)
- if: runner.os == 'Linux'
name: build (linux)
shell: bash
run: |
env OPENSSL_LIB_DIR=$HOME/.local/lib64 OPENSSL_INCLUDE_DIR=$HOME/.local/include OPENSSL_STATIC=yes cargo build --release
mv target/release/ripunzip ripunzip-linux
- if: runner.os == 'Windows'
name: build (windows)
shell: bash
run: |
cargo build --release
mv target/release/ripunzip ripunzip-windows
- name: build (macOS)
if: runner.os == 'macOS'
shell: bash
run: |
rustup target install x86_64-apple-darwin
rustup target install aarch64-apple-darwin
cargo build --target x86_64-apple-darwin --release
cargo build --target aarch64-apple-darwin --release
lipo -create -output ripunzip-macos \
-arch x86_64 target/x86_64-apple-darwin/release/ripunzip \
-arch arm64 target/aarch64-apple-darwin/release/ripunzip
- uses: actions/upload-artifact@v4
with:
name: ripunzip-${{ runner.os }}
path: ripunzip-*
- name: Check built binary
shell: bash
run: |
./ripunzip-* --version

View File

@@ -56,9 +56,7 @@ jobs:
# uses a compiled language
- run: |
cd csharp
dotnet tool restore
dotnet build .
dotnet build csharp
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@main

View File

@@ -65,7 +65,7 @@ jobs:
key: csharp-qltest-${{ matrix.slice }}
- name: Run QL tests
run: |
codeql test run --threads=0 --ram 50000 --slice ${{ matrix.slice }} --search-path "${{ github.workspace }}" --check-databases --check-undefined-labels --check-repeated-labels --check-redefined-labels --consistency-queries ql/consistency-queries ql/test --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}"
codeql test run --threads=0 --ram 50000 --slice ${{ matrix.slice }} --search-path extractor-pack --check-databases --check-undefined-labels --check-repeated-labels --check-redefined-labels --consistency-queries ql/consistency-queries ql/test --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}"
env:
GITHUB_TOKEN: ${{ github.token }}
unit-tests:
@@ -81,11 +81,10 @@ jobs:
dotnet-version: 8.0.101
- name: Extractor unit tests
run: |
dotnet tool restore
dotnet test -p:RuntimeFrameworkVersion=8.0.1 extractor/Semmle.Util.Tests
dotnet test -p:RuntimeFrameworkVersion=8.0.1 extractor/Semmle.Extraction.Tests
dotnet test -p:RuntimeFrameworkVersion=8.0.1 autobuilder/Semmle.Autobuild.CSharp.Tests
dotnet test -p:RuntimeFrameworkVersion=8.0.1 autobuilder/Semmle.Autobuild.Cpp.Tests
dotnet test -p:RuntimeFrameworkVersion=8.0.1 "${{ github.workspace }}/cpp/autobuilder/Semmle.Autobuild.Cpp.Tests"
shell: bash
stubgentest:
runs-on: ubuntu-latest
@@ -101,6 +100,6 @@ jobs:
# Update existing stubs in the repo with the freshly generated ones
mv "$STUBS_PATH/output/stubs/_frameworks" ql/test/resources/stubs/
git status
codeql test run --threads=0 --search-path "${{ github.workspace }}" --check-databases --check-undefined-labels --check-repeated-labels --check-redefined-labels --consistency-queries ql/consistency-queries -- ql/test/library-tests/dataflow/flowsources/aspremote
codeql test run --threads=0 --search-path extractor-pack --check-databases --check-undefined-labels --check-repeated-labels --check-redefined-labels --consistency-queries ql/consistency-queries -- ql/test/library-tests/dataflow/flowsources/aspremote
env:
GITHUB_TOKEN: ${{ github.token }}

View File

@@ -7,9 +7,8 @@ on:
- .github/workflows/go-tests-other-os.yml
- .github/actions/**
- codeql-workspace.yml
- MODULE.bazel
- .bazelrc
- misc/bazel/**
env:
GO_VERSION: '~1.22.0'
permissions:
contents: read
@@ -19,17 +18,72 @@ jobs:
name: Test MacOS
runs-on: macos-latest
steps:
- name: Set up Go ${{ env.GO_VERSION }}
uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
cache: false
id: go
- name: Check out code
uses: actions/checkout@v4
- name: Run tests
uses: ./go/actions/test
- name: Set up CodeQL CLI
uses: ./.github/actions/fetch-codeql
- name: Enable problem matchers in repository
shell: bash
run: 'find .github/problem-matchers -name \*.json -exec echo "::add-matcher::{}" \;'
- name: Build
run: |
cd go
make
- name: Cache compilation cache
id: query-cache
uses: ./.github/actions/cache-query-compilation
with:
key: go-qltest
- name: Test
run: |
cd go
make test cache="${{ steps.query-cache.outputs.cache-dir }}"
test-win:
if: github.repository_owner == 'github'
name: Test Windows
runs-on: windows-latest-xl
steps:
- name: Set up Go ${{ env.GO_VERSION }}
uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
cache: false
id: go
- name: Check out code
uses: actions/checkout@v4
- name: Run tests
uses: ./go/actions/test
- name: Set up CodeQL CLI
uses: ./.github/actions/fetch-codeql
- name: Enable problem matchers in repository
shell: bash
run: 'find .github/problem-matchers -name \*.json -exec echo "::add-matcher::{}" \;'
- name: Build
run: |
cd go
make
- name: Cache compilation cache
id: query-cache
uses: ./.github/actions/cache-query-compilation
with:
key: go-qltest
- name: Test
run: |
cd go
make test cache="${{ steps.query-cache.outputs.cache-dir }}"

View File

@@ -15,9 +15,9 @@ on:
- .github/workflows/go-tests.yml
- .github/actions/**
- codeql-workspace.yml
- MODULE.bazel
- .bazelrc
- misc/bazel/**
env:
GO_VERSION: '~1.22.0'
permissions:
contents: read
@@ -28,9 +28,51 @@ jobs:
name: Test Linux (Ubuntu)
runs-on: ubuntu-latest-xl
steps:
- name: Set up Go ${{ env.GO_VERSION }}
uses: actions/setup-go@v5
with:
go-version: ${{ env.GO_VERSION }}
cache: false
id: go
- name: Check out code
uses: actions/checkout@v4
- name: Run tests
uses: ./go/actions/test
- name: Set up CodeQL CLI
uses: ./.github/actions/fetch-codeql
- name: Enable problem matchers in repository
shell: bash
run: 'find .github/problem-matchers -name \*.json -exec echo "::add-matcher::{}" \;'
- name: Build
run: |
cd go
make
- name: Check that all Go code is autoformatted
run: |
cd go
make check-formatting
- name: Compile qhelp files to markdown
run: |
cd go
env QHELP_OUT_DIR=qhelp-out make qhelp-to-markdown
- name: Upload qhelp markdown
uses: actions/upload-artifact@v3
with:
run-code-checks: true
name: qhelp-markdown
path: go/qhelp-out/**/*.md
- name: Cache compilation cache
id: query-cache
uses: ./.github/actions/cache-query-compilation
with:
key: go-qltest
- name: Test
run: |
cd go
make test cache="${{ steps.query-cache.outputs.cache-dir }}"

View File

@@ -1,28 +0,0 @@
name: "Kotlin Build"
on:
pull_request:
paths:
- "java/kotlin-extractor/**"
- "misc/bazel/**"
- "misc/codegen/**"
- "*.bazel*"
- .github/workflows/kotlin-build.yml
branches:
- main
- rc/*
- codeql-cli-*
permissions:
contents: read
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
bazel query //java/kotlin-extractor/...
# only build the default version as a quick check that we can build from `codeql`
# the full official build will be checked by QLucie
bazel build //java/kotlin-extractor

View File

@@ -49,20 +49,20 @@ jobs:
key: ${{ runner.os }}-${{ steps.os_version.outputs.version }}-rust-cargo-${{ hashFiles('ql/**/Cargo.lock') }}
- name: Release build
if: steps.cache-extractor.outputs.cache-hit != 'true'
run: cd ql; ./scripts/create-extractor-pack.sh
run: cd ql; ./scripts/create-extractor-pack.sh
env:
GH_TOKEN: ${{ github.token }}
GH_TOKEN: ${{ github.token }}
- name: Cache compilation cache
id: query-cache
uses: ./.github/actions/cache-query-compilation
with:
with:
key: run-ql-for-ql
- name: Make database and analyze
run: |
./ql/target/release/buramu | tee deprecated.blame # Add a blame file for the extractor to parse.
${CODEQL} database create -l=ql ${DB} --search-path "${{ github.workspace }}"
${CODEQL} database create -l=ql --search-path ql/extractor-pack ${DB}
${CODEQL} database analyze -j0 --format=sarif-latest --output=ql-for-ql.sarif ${DB} ql/ql/src/codeql-suites/ql-code-scanning.qls --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}"
env:
env:
CODEQL: ${{ steps.find-codeql.outputs.codeql-path }}
DB: ${{ runner.temp }}/DB
LGTM_INDEX_FILTERS: |

View File

@@ -53,8 +53,8 @@ jobs:
- name: Create database
run: |
"${CODEQL}" database create \
--search-path "${{ github.workspace }}"
--threads 4 \
--search-path "ql/extractor-pack" \
--threads 4 \
--language ql --source-root "${{ github.workspace }}/repo" \
"${{ runner.temp }}/database"
env:

View File

@@ -49,15 +49,15 @@ jobs:
- name: Cache compilation cache
id: query-cache
uses: ./.github/actions/cache-query-compilation
with:
with:
key: ql-for-ql-tests
- name: Run QL tests
run: |
"${CODEQL}" test run --check-databases --check-unused-labels --check-repeated-labels --check-redefined-labels --check-use-before-definition --search-path "${{ github.workspace }}" --consistency-queries ql/ql/consistency-queries --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}" ql/ql/test
"${CODEQL}" test run --check-databases --check-unused-labels --check-repeated-labels --check-redefined-labels --check-use-before-definition --search-path "${{ github.workspace }}/ql/extractor-pack" --consistency-queries ql/ql/consistency-queries --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}" ql/ql/test
env:
CODEQL: ${{ steps.find-codeql.outputs.codeql-path }}
other-os:
other-os:
strategy:
matrix:
os: [macos-latest, windows-latest]
@@ -65,7 +65,7 @@ jobs:
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- name: Install GNU tar
- name: Install GNU tar
if: runner.os == 'macOS'
run: |
brew install gnu-tar
@@ -100,7 +100,7 @@ jobs:
- name: Run a single QL tests - Unix
if: runner.os != 'Windows'
run: |
"${CODEQL}" test run --check-databases --search-path "${{ github.workspace }}" ql/ql/test/queries/style/DeadCode/DeadCode.qlref
"${CODEQL}" test run --check-databases --search-path "${{ github.workspace }}/ql/extractor-pack" ql/ql/test/queries/style/DeadCode/DeadCode.qlref
env:
CODEQL: ${{ steps.find-codeql.outputs.codeql-path }}
- name: Run a single QL tests - Windows
@@ -108,4 +108,5 @@ jobs:
shell: pwsh
run: |
$Env:PATH += ";$(dirname ${{ steps.find-codeql.outputs.codeql-path }})"
codeql test run --check-databases --search-path "${{ github.workspace }}" ql/ql/test/queries/style/DeadCode/DeadCode.qlref
codeql test run --check-databases --search-path "${{ github.workspace }}/ql/extractor-pack" ql/ql/test/queries/style/DeadCode/DeadCode.qlref

View File

@@ -44,7 +44,7 @@ jobs:
- name: Create database
run: |
codeql database create \
--search-path "${{ github.workspace }}" \
--search-path "${{ github.workspace }}/ruby/extractor-pack" \
--threads 4 \
--language ruby --source-root "${{ github.workspace }}/repo" \
"${{ runner.temp }}/database"

View File

@@ -64,10 +64,10 @@ jobs:
- name: Cache compilation cache
id: query-cache
uses: ./.github/actions/cache-query-compilation
with:
with:
key: ruby-qltest
- name: Run QL tests
run: |
codeql test run --threads=0 --ram 50000 --search-path "${{ github.workspace }}" --check-databases --check-undefined-labels --check-unused-labels --check-repeated-labels --check-redefined-labels --check-use-before-definition --consistency-queries ql/consistency-queries ql/test --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}"
codeql test run --threads=0 --ram 50000 --search-path "${{ github.workspace }}/ruby/extractor-pack" --check-databases --check-undefined-labels --check-unused-labels --check-repeated-labels --check-redefined-labels --check-use-before-definition --consistency-queries ql/consistency-queries ql/test --compilation-cache "${{ steps.query-cache.outputs.cache-dir }}"
env:
GITHUB_TOKEN: ${{ github.token }}

View File

@@ -68,6 +68,21 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: ./swift/actions/run-ql-tests
integration-tests-linux:
if: github.repository_owner == 'github'
needs: build-and-test-linux
runs-on: ubuntu-latest-xl
steps:
- uses: actions/checkout@v4
- uses: ./swift/actions/run-integration-tests
integration-tests-macos:
if: ${{ github.repository_owner == 'github' && github.event_name == 'pull_request' }}
needs: build-and-test-macos
runs-on: macos-12-xl
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: ./swift/actions/run-integration-tests
clang-format:
if : ${{ github.event_name == 'pull_request' }}
runs-on: ubuntu-latest

View File

@@ -1,23 +0,0 @@
name: "Test zipmerge code"
on:
pull_request:
paths:
- "misc/bazel/internal/zipmerge/**"
- "MODULE.bazel"
- ".bazelrc*"
branches:
- main
- "rc/*"
permissions:
contents: read
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: |
bazel test //misc/bazel/internal/zipmerge:test --test_output=all

3
.gitignore vendored
View File

@@ -62,6 +62,3 @@ node_modules/
# Temporary folders for working with generated models
.model-temp
# bazel-built in-tree extractor packs
/*/extractor-pack

View File

@@ -2,6 +2,4 @@
# codeql is publicly forked by many users, and we don't want any LFS file polluting their working
# copies. We therefore exclude everything by default.
# For files required by bazel builds, use rules in `misc/bazel/lfs.bzl` to download them on demand.
# we go for `fetchinclude` to something not exsiting rather than `fetchexclude = *` because the
# former is easier to override (with `git -c` or a local git config) to fetch something specific
fetchinclude = /nothing

View File

@@ -29,13 +29,12 @@ repos:
entry: bazel run //misc/bazel:buildifier
pass_filenames: false
# DISABLED: can be enabled by copying this config and installing `pre-commit` with `--config` on the copy
# - id: go-gen
# name: Check checked in generated files in go
# files: ^go/.*
# language: system
# entry: bazel run //go:gen
# pass_filenames: false
- id: go-gen
name: Check checked in generated files in go
files: ^go/.*
language: system
entry: bazel run //go:gen
pass_filenames: false
- id: codeql-format
name: Fix QL file formatting

View File

@@ -1 +0,0 @@
exports_files(["LICENSE"])

View File

@@ -1,7 +1,6 @@
/cpp/ @github/codeql-c-analysis
/cpp/autobuilder/ @github/codeql-c-extractor
/csharp/ @github/codeql-csharp
/csharp/autobuilder/Semmle.Autobuild.Cpp @github/codeql-c-extractor
/csharp/autobuilder/Semmle.Autobuild.Cpp.Tests @github/codeql-c-extractor
/go/ @github/codeql-go
/java/ @github/codeql-java
/javascript/ @github/codeql-javascript

View File

@@ -4,8 +4,6 @@ We welcome contributions to our CodeQL libraries and queries. Got an idea for a
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [CodeQL queries](https://codeql.github.com/docs/writing-codeql-queries/codeql-queries) on [codeql.github.com](https://codeql.github.com).
Note that the CodeQL for Visual Studio Code documentation has been migrated to https://docs.github.com/en/code-security/codeql-for-vs-code/, but you can still contribute to it via a different repository. For more information, see [Contributing to GitHub Docs documentation](https://docs.github.com/en/contributing)."
## Change notes
Any nontrivial user-visible change to a query pack or library pack should have a change note. For details on how to add a change note for your change, see [this guide](docs/change-notes.md).
@@ -45,7 +43,7 @@ If you have an idea for a query that you would like to share with other CodeQL u
3. **Formatting**
- The queries and libraries must be autoformatted, for example using the "Format Document" command in [CodeQL for Visual Studio Code](https://docs.github.com/en/code-security/codeql-for-vs-code/).
- The queries and libraries must be autoformatted, for example using the "Format Document" command in [CodeQL for Visual Studio Code](https://codeql.github.com/docs/codeql-for-visual-studio-code/about-codeql-for-visual-studio-code).
If you prefer, you can either:
1. install the [pre-commit framework](https://pre-commit.com/) and install the configured hooks on this repo via `pre-commit install`, or

View File

@@ -22,22 +22,10 @@ bazel_dep(name = "bazel_skylib", version = "1.5.0")
bazel_dep(name = "abseil-cpp", version = "20240116.0", repo_name = "absl")
bazel_dep(name = "nlohmann_json", version = "3.11.3", repo_name = "json")
bazel_dep(name = "fmt", version = "10.0.0")
bazel_dep(name = "rules_kotlin", version = "1.9.4-codeql.1")
bazel_dep(name = "gazelle", version = "0.36.0")
bazel_dep(name = "rules_dotnet", version = "0.15.1")
bazel_dep(name = "googletest", version = "1.14.0.bcr.1")
bazel_dep(name = "buildifier_prebuilt", version = "6.4.0", dev_dependency = True)
dotnet = use_extension("@rules_dotnet//dotnet:extensions.bzl", "dotnet")
dotnet.toolchain(dotnet_version = "8.0.101")
use_repo(dotnet, "dotnet_toolchains")
register_toolchains("@dotnet_toolchains//:all")
csharp_main_extension = use_extension("//csharp:paket.main_extension.bzl", "main_extension")
use_repo(csharp_main_extension, "paket.main")
pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
hub_name = "codegen_deps",
@@ -66,84 +54,9 @@ node.toolchain(
)
use_repo(node, "nodejs", "nodejs_toolchains")
kotlin_extractor_deps = use_extension("//java/kotlin-extractor:deps.bzl", "kotlin_extractor_deps")
# following list can be kept in sync by running `bazel mod tidy` in `codeql`
use_repo(
kotlin_extractor_deps,
"codeql_kotlin_defaults",
"codeql_kotlin_embeddable",
"kotlin-compiler-1.5.0",
"kotlin-compiler-1.5.10",
"kotlin-compiler-1.5.20",
"kotlin-compiler-1.5.30",
"kotlin-compiler-1.6.0",
"kotlin-compiler-1.6.20",
"kotlin-compiler-1.7.0",
"kotlin-compiler-1.7.20",
"kotlin-compiler-1.8.0",
"kotlin-compiler-1.9.0-Beta",
"kotlin-compiler-1.9.20-Beta",
"kotlin-compiler-2.0.0-RC1",
"kotlin-compiler-embeddable-1.5.0",
"kotlin-compiler-embeddable-1.5.10",
"kotlin-compiler-embeddable-1.5.20",
"kotlin-compiler-embeddable-1.5.30",
"kotlin-compiler-embeddable-1.6.0",
"kotlin-compiler-embeddable-1.6.20",
"kotlin-compiler-embeddable-1.7.0",
"kotlin-compiler-embeddable-1.7.20",
"kotlin-compiler-embeddable-1.8.0",
"kotlin-compiler-embeddable-1.9.0-Beta",
"kotlin-compiler-embeddable-1.9.20-Beta",
"kotlin-compiler-embeddable-2.0.0-RC1",
"kotlin-stdlib-1.5.0",
"kotlin-stdlib-1.5.10",
"kotlin-stdlib-1.5.20",
"kotlin-stdlib-1.5.30",
"kotlin-stdlib-1.6.0",
"kotlin-stdlib-1.6.20",
"kotlin-stdlib-1.7.0",
"kotlin-stdlib-1.7.20",
"kotlin-stdlib-1.8.0",
"kotlin-stdlib-1.9.0-Beta",
"kotlin-stdlib-1.9.20-Beta",
"kotlin-stdlib-2.0.0-RC1",
)
go_sdk = use_extension("@rules_go//go:extensions.bzl", "go_sdk")
go_sdk.download(version = "1.22.2")
lfs_files = use_repo_rule("//misc/bazel:lfs.bzl", "lfs_files")
lfs_files(
name = "ripunzip-linux",
srcs = ["//misc/ripunzip:ripunzip-linux"],
executable = True,
)
lfs_files(
name = "ripunzip-windows",
srcs = ["//misc/ripunzip:ripunzip-windows.exe"],
executable = True,
)
lfs_files(
name = "ripunzip-macos",
srcs = ["//misc/ripunzip:ripunzip-macos"],
executable = True,
)
lfs_files(
name = "swift-resource-dir-linux",
srcs = ["//swift/third_party/resource-dir:resource-dir-linux.zip"],
)
lfs_files(
name = "swift-resource-dir-macos",
srcs = ["//swift/third_party/resource-dir:resource-dir-macos.zip"],
)
register_toolchains(
"@nodejs_toolchains//:all",
)

View File

@@ -4,7 +4,7 @@ This open source repository contains the standard CodeQL libraries and queries t
## How do I learn CodeQL and run queries?
There is extensive documentation about the [CodeQL language](https://codeql.github.com/docs/), writing CodeQL using the [CodeQL extension for Visual Studio Code](https://docs.github.com/en/code-security/codeql-for-vs-code/) and using the [CodeQL CLI](https://docs.github.com/en/code-security/codeql-cli).
There is [extensive documentation](https://codeql.github.com/docs/) on getting started with writing CodeQL using the [CodeQL extension for Visual Studio Code](https://codeql.github.com/docs/codeql-for-visual-studio-code/) and the [CodeQL CLI](https://codeql.github.com/docs/codeql-cli/).
## Contributing

View File

@@ -6,16 +6,19 @@ provide:
- "*/ql/consistency-queries/qlpack.yml"
- "*/ql/automodel/src/qlpack.yml"
- "*/ql/automodel/test/qlpack.yml"
- "*/extractor-pack/codeql-extractor.yml"
- "python/extractor/qlpack.yml"
- "shared/**/qlpack.yml"
- "cpp/ql/test/query-tests/Security/CWE/CWE-190/semmle/tainted/qlpack.yml"
- "go/ql/config/legacy-support/qlpack.yml"
- "go/build/codeql-extractor-go/codeql-extractor.yml"
- "csharp/ql/campaigns/Solorigate/lib/qlpack.yml"
- "csharp/ql/campaigns/Solorigate/src/qlpack.yml"
- "csharp/ql/campaigns/Solorigate/test/qlpack.yml"
- "misc/legacy-support/*/qlpack.yml"
- "misc/suite-helpers/qlpack.yml"
- "ruby/extractor-pack/codeql-extractor.yml"
- "swift/extractor-pack/codeql-extractor.yml"
- "ql/extractor-pack/codeql-extractor.yml"
- ".github/codeql/extensions/**/codeql-pack.yml"
versionPolicies:

View File

@@ -28,7 +28,6 @@
"/*- Yaml dbscheme -*/",
"/*- Blame dbscheme -*/",
"/*- JSON dbscheme -*/",
"/*- Python dbscheme -*/",
"/*- Empty location -*/"
"/*- Python dbscheme -*/"
]
}

View File

@@ -364,9 +364,5 @@
"Python model summaries test extension": [
"python/ql/test/library-tests/dataflow/model-summaries/InlineTaintTest.ext.yml",
"python/ql/test/library-tests/dataflow/model-summaries/NormalDataflowTest.ext.yml"
],
"shared tree-sitter extractor cargo.toml": [
"shared/tree-sitter-extractor/Cargo.toml",
"ruby/extractor/codeql-extractor-fake-crate/Cargo.toml"
]
}

13
cpp/autobuilder/.gitignore vendored Normal file
View File

@@ -0,0 +1,13 @@
obj/
TestResults/
*.manifest
*.pdb
*.suo
*.mdb
*.vsmdi
csharp.log
**/bin/Debug
**/bin/Release
*.tlog
.vs
*.user

View File

@@ -1 +0,0 @@
The Windows autobuilder that used to live in this directory moved to `csharp/autobuilder/Semmle.Autobuild.Cpp`.

View File

@@ -200,9 +200,9 @@ namespace Semmle.Autobuild.Cpp.Tests
internal class TestDiagnosticWriter : IDiagnosticsWriter
{
public IList<Semmle.Util.DiagnosticMessage> Diagnostics { get; } = new List<Semmle.Util.DiagnosticMessage>();
public IList<DiagnosticMessage> Diagnostics { get; } = new List<DiagnosticMessage>();
public void AddEntry(Semmle.Util.DiagnosticMessage message) => this.Diagnostics.Add(message);
public void AddEntry(DiagnosticMessage message) => this.Diagnostics.Add(message);
public void Dispose() { }
}

View File

@@ -0,0 +1,26 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net8.0</TargetFramework>
<GenerateAssemblyInfo>false</GenerateAssemblyInfo>
<RuntimeIdentifiers>win-x64;linux-x64;osx-x64</RuntimeIdentifiers>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="System.IO.FileSystem" Version="4.3.0" />
<PackageReference Include="System.IO.FileSystem.Primitives" Version="4.3.0" />
<PackageReference Include="xunit" Version="2.6.2" />
<PackageReference Include="xunit.runner.visualstudio" Version="2.5.4">
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers</IncludeAssets>
</PackageReference>
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.8.0" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\Semmle.Autobuild.Cpp\Semmle.Autobuild.Cpp.csproj" />
<ProjectReference Include="..\..\..\csharp\autobuilder\Semmle.Autobuild.Shared\Semmle.Autobuild.Shared.csproj" />
</ItemGroup>
</Project>

View File

@@ -0,0 +1,32 @@
using System.Reflection;
using System.Runtime.InteropServices;
// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
[assembly: AssemblyTitle("Semmle.Autobuild.Cpp")]
[assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("GitHub")]
[assembly: AssemblyProduct("CodeQL autobuilder for C++")]
[assembly: AssemblyCopyright("Copyright © GitHub 2020")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]
// Setting ComVisible to false makes the types in this assembly not visible
// to COM components. If you need to access a type in this assembly from
// COM, set the ComVisible attribute to true on that type.
[assembly: ComVisible(false)]
// Version information for an assembly consists of the following four values:
//
// Major Version
// Minor Version
// Build Number
// Revision
//
// You can specify all the values or you can default the Build and Revision Numbers
// by using the '*' as shown below:
// [assembly: AssemblyVersion("1.0.*")]
[assembly: AssemblyVersion("1.0.0.0")]
[assembly: AssemblyFileVersion("1.0.0.0")]

View File

@@ -0,0 +1,28 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
<AssemblyName>Semmle.Autobuild.Cpp</AssemblyName>
<RootNamespace>Semmle.Autobuild.Cpp</RootNamespace>
<ApplicationIcon />
<OutputType>Exe</OutputType>
<StartupObject />
<GenerateAssemblyInfo>false</GenerateAssemblyInfo>
<RuntimeIdentifiers>win-x64;linux-x64;osx-x64</RuntimeIdentifiers>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<Folder Include="Properties\" />
</ItemGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Build" Version="17.8.3" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\..\..\csharp\extractor\Semmle.Util\Semmle.Util.csproj" />
<ProjectReference Include="..\..\..\csharp\autobuilder\Semmle.Autobuild.Shared\Semmle.Autobuild.Shared.csproj" />
</ItemGroup>
</Project>

View File

@@ -1,4 +1,4 @@
description: Revert support for repeated initializers, which are allowed in C with designated initializers.
compatibility: full
aggregate_field_init.rel: reorder aggregate_field_init.rel (@aggregateliteral aggregate, @expr initializer, @membervariable field, int position) aggregate initializer field
aggregate_array_init.rel: reorder aggregate_array_init.rel (@aggregateliteral aggregate, @expr initializer, int element_index, int position) aggregate initializer element_index
aggregate_field_init.rel: reorder aggregate_field_init.rel (int aggregate, int initializer, int field, int position) aggregate initializer field
aggregate_array_init.rel: reorder aggregate_array_init.rel (int aggregate, int initializer, int element_index, int position) aggregate initializer element_index

View File

@@ -6,7 +6,7 @@ pkg_files(
["**"],
exclude = ["BUILD.bazel"],
),
prefix = "downgrades",
prefix = "cpp/downgrades",
strip_prefix = strip_prefix.from_pkg(),
visibility = ["//cpp:__pkg__"],
)

View File

@@ -5,9 +5,11 @@ package(default_visibility = ["//cpp:__pkg__"])
pkg_files(
name = "dbscheme",
srcs = ["semmlecode.cpp.dbscheme"],
prefix = "cpp",
)
pkg_files(
name = "dbscheme-stats",
srcs = ["semmlecode.cpp.dbscheme.stats"],
prefix = "cpp",
)

View File

@@ -1,27 +1,3 @@
## 1.1.1
No user-facing changes.
## 1.1.0
### New Features
* Data models can now be added with data extensions. In this way source, sink and summary models can be added in extension `.model.yml` files, rather than by writing classes in QL code. New models should be added in the `lib/ext` folder.
### Minor Analysis Improvements
* A partial model for the `Boost.Asio` network library has been added. This includes sources, sinks and summaries for certain functions in `Boost.Asio`, such as `read_until` and `write`.
## 1.0.0
### Breaking Changes
* CodeQL package management is now generally available, and all GitHub-produced CodeQL packages have had their version numbers increased to 1.0.0.
## 0.13.1
No user-facing changes.
## 0.13.0
### Breaking Changes

View File

@@ -1,3 +0,0 @@
## 0.13.1
No user-facing changes.

View File

@@ -1,5 +0,0 @@
## 1.0.0
### Breaking Changes
* CodeQL package management is now generally available, and all GitHub-produced CodeQL packages have had their version numbers increased to 1.0.0.

View File

@@ -1,9 +0,0 @@
## 1.1.0
### New Features
* Data models can now be added with data extensions. In this way source, sink and summary models can be added in extension `.model.yml` files, rather than by writing classes in QL code. New models should be added in the `lib/ext` folder.
### Minor Analysis Improvements
* A partial model for the `Boost.Asio` network library has been added. This includes sources, sinks and summaries for certain functions in `Boost.Asio`, such as `read_until` and `write`.

View File

@@ -1,3 +0,0 @@
## 1.1.1
No user-facing changes.

View File

@@ -1,2 +1,2 @@
---
lastReleaseVersion: 1.1.1
lastReleaseVersion: 0.13.0

View File

@@ -1,26 +0,0 @@
extensions:
# partial model of the Boost::Asio network library
extensions:
- addsTo:
pack: codeql/cpp-all
extensible: sourceModel
data: # namespace, type, subtypes, name, signature, ext, output, kind, provenance
- ["boost::asio", "", False, "read", "", "", "Argument[*1]", "remote", "manual"]
- ["boost::asio", "", False, "read_at", "", "", "Argument[*2]", "remote", "manual"]
- ["boost::asio", "", False, "read_until", "", "", "Argument[*1]", "remote", "manual"]
- ["boost::asio", "", False, "async_read", "", "", "Argument[*1]", "remote", "manual"]
- ["boost::asio", "", False, "async_read_at", "", "", "Argument[*2]", "remote", "manual"]
- ["boost::asio", "", False, "async_read_until", "", "", "Argument[*1]", "remote", "manual"]
- addsTo:
pack: codeql/cpp-all
extensible: sinkModel
data: # namespace, type, subtypes, name, signature, ext, input, kind, provenance
- ["boost::asio", "", False, "write", "", "", "Argument[*1]", "remote-sink", "manual"]
- ["boost::asio", "", False, "write_at", "", "", "Argument[*2]", "remote-sink", "manual"]
- ["boost::asio", "", False, "async_write", "", "", "Argument[*1]", "remote-sink", "manual"]
- ["boost::asio", "", False, "async_write_at", "", "", "Argument[*2]", "remote-sink", "manual"]
- addsTo:
pack: codeql/cpp-all
extensible: summaryModel
data: # namespace, type, subtypes, name, signature, ext, input, output, kind, provenance
- ["boost::asio", "", False, "buffer", "", "", "Argument[*0]", "ReturnValue", "taint", "manual"]

View File

@@ -1,15 +0,0 @@
extensions:
# Make sure that the extensible model predicates have at least one definition
# to avoid errors about undefined extensionals.
- addsTo:
pack: codeql/cpp-all
extensible: sourceModel
data: []
- addsTo:
pack: codeql/cpp-all
extensible: sinkModel
data: []
- addsTo:
pack: codeql/cpp-all
extensible: summaryModel
data: []

View File

@@ -1,5 +1,5 @@
name: codeql/cpp-all
version: 1.1.1
version: 0.13.1-dev
groups: cpp
dbscheme: semmlecode.cpp.dbscheme
extractor: cpp
@@ -14,6 +14,4 @@ dependencies:
codeql/tutorial: ${workspace}
codeql/util: ${workspace}
codeql/xml: ${workspace}
dataExtensions:
- ext/*.model.yml
warnOnImplicitThis: true

View File

@@ -410,10 +410,6 @@ class LocalVariable extends LocalScopeVariable, @localvariable {
or
orphaned_variables(underlyingElement(this), unresolveElement(result))
}
override predicate isStatic() {
super.isStatic() or orphaned_variables(underlyingElement(this), _)
}
}
/**

View File

@@ -375,33 +375,6 @@ cached
class IRGuardCondition extends Instruction {
Instruction branch;
/*
* An `IRGuardCondition` supports reasoning about four different kinds of
* relations:
* 1. A unary equality relation of the form `e == k`
* 2. A binary equality relation of the form `e1 == e2 + k`
* 3. A unary inequality relation of the form `e < k`
* 4. A binary inequality relation of the form `e1 < e2 + k`
*
* where `k` is a constant.
*
* Furthermore, the unary relations (i.e., case 1 and case 3) are also
* inferred from `switch` statement guards: equality relations are inferred
* from the unique `case` statement, if any, and inequality relations are
* inferred from the [case range](https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html)
* gcc extension.
*
* The implementation of all four follows the same structure: Each relation
* has a cached user-facing predicate that. For example,
* `GuardCondition::comparesEq` calls `compares_eq`. This predicate has
* several cases that recursively decompose the relation to bring it to a
* canonical form (i.e., a relation of the form `e1 == e2 + k`). The base
* case for this relation (i.e., `simple_comparison_eq`) handles
* `CompareEQInstruction`s and `CompareNEInstruction`, and recursive
* predicates (e.g., `complex_eq`) rewrites larger expressions such as
* `e1 + k1 == e2 + k2` into canonical the form `e1 == e2 + (k2 - k1)`.
*/
cached
IRGuardCondition() { branch = getBranchForCondition(this) }
@@ -592,7 +565,7 @@ class IRGuardCondition extends Instruction {
/** Holds if (determined by this guard) `op == k` evaluates to `areEqual` if this expression evaluates to `value`. */
cached
predicate comparesEq(Operand op, int k, boolean areEqual, AbstractValue value) {
unary_compares_eq(this, op, k, areEqual, false, value)
compares_eq(this, op, k, areEqual, value)
}
/**
@@ -613,7 +586,7 @@ class IRGuardCondition extends Instruction {
cached
predicate ensuresEq(Operand op, int k, IRBlock block, boolean areEqual) {
exists(AbstractValue value |
unary_compares_eq(this, op, k, areEqual, false, value) and this.valueControls(block, value)
compares_eq(this, op, k, areEqual, value) and this.valueControls(block, value)
)
}
@@ -638,7 +611,7 @@ class IRGuardCondition extends Instruction {
cached
predicate ensuresEqEdge(Operand op, int k, IRBlock pred, IRBlock succ, boolean areEqual) {
exists(AbstractValue value |
unary_compares_eq(this, op, k, areEqual, false, value) and
compares_eq(this, op, k, areEqual, value) and
this.valueControlsEdge(pred, succ, value)
)
}
@@ -764,68 +737,26 @@ private predicate compares_eq(
)
}
/**
* Holds if `op == k` is `areEqual` given that `test` is equal to `value`.
*
* Many internal predicates in this file have a `inNonZeroCase` column.
* Ideally, the `k` column would be a type such as `Option<int>::Option`, to
* represent whether we have a concrete value `k` such that `op == k`, or whether
* we only know that `op != 0`.
* However, cannot instantiate `Option` with an infinite type. Thus the boolean
* `inNonZeroCase` is used to distinquish the `Some` (where we have a concrete
* value `k`) and `None` cases (where we only know that `op != 0`).
*
* Thus, if `inNonZeroCase = true` then `op != 0` and the value of `k` is
* meaningless.
*
* To see why `inNonZeroCase` is needed consider the following C program:
* ```c
* char* p = ...;
* if(p) {
* use(p);
* }
* ```
* in C++ there would be an int-to-bool conversion on `p`. However, since C
* does not have booleans there is no conversion. We want to be able to
* conclude that `p` is non-zero in the true branch, so we need to give `k`
* some value. However, simply setting `k = 1` would make the rest of the
* analysis think that `k == 1` holds inside the branch. So we distinquish
* between the above case and
* ```c
* if(p == 1) {
* use(p)
* }
* ```
* by setting `inNonZeroCase` to `true` in the former case, but not in the
* latter.
*/
private predicate unary_compares_eq(
Instruction test, Operand op, int k, boolean areEqual, boolean inNonZeroCase, AbstractValue value
/** Holds if `op == k` is `areEqual` given that `test` is equal to `value`. */
private predicate compares_eq(
Instruction test, Operand op, int k, boolean areEqual, AbstractValue value
) {
/* The simple case where the test *is* the comparison so areEqual = testIsTrue xor eq. */
exists(AbstractValue v |
unary_simple_comparison_eq(test, k, inNonZeroCase, v) and op.getDef() = test
|
exists(AbstractValue v | simple_comparison_eq(test, op, k, v) |
areEqual = true and value = v
or
areEqual = false and value = v.getDualValue()
)
or
unary_complex_eq(test, op, k, areEqual, inNonZeroCase, value)
complex_eq(test, op, k, areEqual, value)
or
/* (x is true => (op == k)) => (!x is false => (op == k)) */
exists(AbstractValue dual, boolean inNonZeroCase0 |
value = dual.getDualValue() and
unary_compares_eq(test.(LogicalNotInstruction).getUnary(), op, k, inNonZeroCase0, areEqual, dual)
|
k = 0 and inNonZeroCase = inNonZeroCase0
or
k != 0 and inNonZeroCase = true
exists(AbstractValue dual | value = dual.getDualValue() |
compares_eq(test.(LogicalNotInstruction).getUnary(), op, k, areEqual, dual)
)
or
// ((test is `areEqual` => op == const + k2) and const == `k1`) =>
// test is `areEqual` => op == k1 + k2
inNonZeroCase = false and
exists(int k1, int k2, ConstantInstruction const |
compares_eq(test, op, const.getAUse(), k2, areEqual, value) and
int_value(const) = k1 and
@@ -850,63 +781,35 @@ private predicate simple_comparison_eq(
value.(BooleanValue).getValue() = false
}
/**
* Rearrange various simple comparisons into `op == k` form.
*/
private predicate unary_simple_comparison_eq(
Instruction test, int k, boolean inNonZeroCase, AbstractValue value
) {
/** Rearrange various simple comparisons into `op == k` form. */
private predicate simple_comparison_eq(Instruction test, Operand op, int k, AbstractValue value) {
exists(SwitchInstruction switch, CaseEdge case |
test = switch.getExpression() and
op.getDef() = test and
case = value.(MatchValue).getCase() and
exists(switch.getSuccessor(case)) and
case.getValue().toInt() = k and
inNonZeroCase = false
case.getValue().toInt() = k
)
or
// Any instruction with an integral type could potentially be part of a
// check for nullness when used in a guard. So we include all integral
// typed instructions here. However, since some of these instructions are
// already included as guards in other cases, we exclude those here.
// These are instructions that compute a binary equality or inequality
// relation. For example, the following:
// ```cpp
// if(a == b + 42) { ... }
// ```
// generates the following IR:
// ```
// r1(glval<int>) = VariableAddress[a] :
// r2(int) = Load[a] : &:r1, m1
// r3(glval<int>) = VariableAddress[b] :
// r4(int) = Load[b] : &:r3, m2
// r5(int) = Constant[42] :
// r6(int) = Add : r4, r5
// r7(bool) = CompareEQ : r2, r6
// v1(void) = ConditionalBranch : r7
// ```
// and since `r7` is an integral typed instruction this predicate could
// include a case for when `r7` evaluates to true (in which case we would
// infer that `r6` was non-zero, and a case for when `r7` evaluates to false
// (in which case we would infer that `r6` was zero).
// However, since `a == b + 42` is already supported when reasoning about
// binary equalities we exclude those cases here.
not test.isGLValue() and
not simple_comparison_eq(test, _, _, _, _) and
not simple_comparison_lt(test, _, _, _) and
not test = any(SwitchInstruction switch).getExpression() and
(
test.getResultIRType() instanceof IRAddressType or
test.getResultIRType() instanceof IRIntegerType or
test.getResultIRType() instanceof IRBooleanType
) and
(
k = 1 and
value.(BooleanValue).getValue() = true and
inNonZeroCase = true
or
// There's no implicit CompareInstruction in files compiled as C since C
// doesn't have implicit boolean conversions. So instead we check whether
// there's a branch on a value of pointer or integer type.
exists(ConditionalBranchInstruction branch, IRType type |
not test instanceof CompareInstruction and
type = test.getResultIRType() and
(type instanceof IRAddressType or type instanceof IRIntegerType) and
test = branch.getCondition() and
op.getDef() = test
|
// We'd like to also include a case such as:
// ```
// k = 1 and
// value.(BooleanValue).getValue() = true
// ```
// but all we know is that the value is non-zero in the true branch.
// So we can only conclude something in the false branch.
k = 0 and
value.(BooleanValue).getValue() = false and
inNonZeroCase = false
value.(BooleanValue).getValue() = false
)
}
@@ -918,12 +821,12 @@ private predicate complex_eq(
add_eq(cmp, left, right, k, areEqual, value)
}
private predicate unary_complex_eq(
Instruction test, Operand op, int k, boolean areEqual, boolean inNonZeroCase, AbstractValue value
private predicate complex_eq(
Instruction test, Operand op, int k, boolean areEqual, AbstractValue value
) {
unary_sub_eq(test, op, k, areEqual, inNonZeroCase, value)
sub_eq(test, op, k, areEqual, value)
or
unary_add_eq(test, op, k, areEqual, inNonZeroCase, value)
add_eq(test, op, k, areEqual, value)
}
/*
@@ -952,8 +855,7 @@ private predicate compares_lt(
/** Holds if `op < k` evaluates to `isLt` given that `test` evaluates to `value`. */
private predicate compares_lt(Instruction test, Operand op, int k, boolean isLt, AbstractValue value) {
unary_simple_comparison_lt(test, k, isLt, value) and
op.getDef() = test
simple_comparison_lt(test, op, k, isLt, value)
or
complex_lt(test, op, k, isLt, value)
or
@@ -1000,11 +902,12 @@ private predicate simple_comparison_lt(CompareInstruction cmp, Operand left, Ope
}
/** Rearrange various simple comparisons into `op < k` form. */
private predicate unary_simple_comparison_lt(
Instruction test, int k, boolean isLt, AbstractValue value
private predicate simple_comparison_lt(
Instruction test, Operand op, int k, boolean isLt, AbstractValue value
) {
exists(SwitchInstruction switch, CaseEdge case |
test = switch.getExpression() and
op.getDef() = test and
case = value.(MatchValue).getCase() and
exists(switch.getSuccessor(case)) and
case.getMaxValue() > case.getMinValue()
@@ -1187,20 +1090,16 @@ private predicate sub_eq(
}
// op - x == c => op == (c+x)
private predicate unary_sub_eq(
Instruction test, Operand op, int k, boolean areEqual, boolean inNonZeroCase, AbstractValue value
) {
inNonZeroCase = false and
private predicate sub_eq(Instruction test, Operand op, int k, boolean areEqual, AbstractValue value) {
exists(SubInstruction sub, int c, int x |
unary_compares_eq(test, sub.getAUse(), c, areEqual, inNonZeroCase, value) and
compares_eq(test, sub.getAUse(), c, areEqual, value) and
op = sub.getLeftOperand() and
x = int_value(sub.getRight()) and
k = c + x
)
or
inNonZeroCase = false and
exists(PointerSubInstruction sub, int c, int x |
unary_compares_eq(test, sub.getAUse(), c, areEqual, inNonZeroCase, value) and
compares_eq(test, sub.getAUse(), c, areEqual, value) and
op = sub.getLeftOperand() and
x = int_value(sub.getRight()) and
k = c + x
@@ -1254,13 +1153,11 @@ private predicate add_eq(
}
// left + x == right + c => left == right + (c-x)
private predicate unary_add_eq(
Instruction test, Operand left, int k, boolean areEqual, boolean inNonZeroCase,
AbstractValue value
private predicate add_eq(
Instruction test, Operand left, int k, boolean areEqual, AbstractValue value
) {
inNonZeroCase = false and
exists(AddInstruction lhs, int c, int x |
unary_compares_eq(test, lhs.getAUse(), c, areEqual, inNonZeroCase, value) and
compares_eq(test, lhs.getAUse(), c, areEqual, value) and
(
left = lhs.getLeftOperand() and x = int_value(lhs.getRight())
or
@@ -1269,9 +1166,8 @@ private predicate unary_add_eq(
k = c - x
)
or
inNonZeroCase = false and
exists(PointerAddInstruction lhs, int c, int x |
unary_compares_eq(test, lhs.getAUse(), c, areEqual, inNonZeroCase, value) and
compares_eq(test, lhs.getAUse(), c, areEqual, value) and
(
left = lhs.getLeftOperand() and x = int_value(lhs.getRight())
or

View File

@@ -78,7 +78,6 @@ private import internal.FlowSummaryImpl
private import internal.FlowSummaryImpl::Public
private import internal.FlowSummaryImpl::Private
private import internal.FlowSummaryImpl::Private::External
private import internal.ExternalFlowExtensions as Extensions
private import codeql.mad.ModelValidation as SharedModelVal
private import codeql.util.Unit
@@ -139,9 +138,6 @@ predicate sourceModel(
row.splitAt(";", 7) = kind
) and
provenance = "manual"
or
Extensions::sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance,
_)
}
/** Holds if a sink model exists for the given parameters. */
@@ -162,8 +158,6 @@ predicate sinkModel(
row.splitAt(";", 7) = kind
) and
provenance = "manual"
or
Extensions::sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance, _)
}
/** Holds if a summary model exists for the given parameters. */
@@ -185,9 +179,6 @@ predicate summaryModel(
row.splitAt(";", 8) = kind
) and
provenance = "manual"
or
Extensions::summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind,
provenance, _)
}
private predicate relevantNamespace(string namespace) {
@@ -212,10 +203,8 @@ private predicate canonicalNamespaceLink(string namespace, string subns) {
}
/**
* Holds if MaD framework coverage of `namespace` is `n` api endpoints of the
* kind `(kind, part)`, and `namespaces` is the number of subnamespaces of
* `namespace` which have MaD framework coverage (including `namespace`
* itself).
* Holds if CSV framework coverage of `namespace` is `n` api endpoints of the
* kind `(kind, part)`.
*/
predicate modelCoverage(string namespace, int namespaces, string kind, string part, int n) {
namespaces = strictcount(string subns | canonicalNamespaceLink(namespace, subns)) and
@@ -332,10 +321,10 @@ module CsvValidation {
or
summaryModel(namespace, type, _, name, signature, ext, _, _, _, _) and pred = "summary"
|
not namespace.regexpMatch("[a-zA-Z0-9_\\.:]*") and
not namespace.regexpMatch("[a-zA-Z0-9_\\.]+") and
result = "Dubious namespace \"" + namespace + "\" in " + pred + " model."
or
not type.regexpMatch("[a-zA-Z0-9_<>,\\+]*") and
not type.regexpMatch("[a-zA-Z0-9_<>,\\+]+") and
result = "Dubious type \"" + type + "\" in " + pred + " model."
or
not name.regexpMatch("[a-zA-Z0-9_<>,]*") and

View File

@@ -9,7 +9,7 @@ private import DataFlowUtil
/**
* Gets a function that might be called by `call`.
*/
DataFlowCallable viableCallable(DataFlowCall call) {
Function viableCallable(DataFlowCall call) {
result = call.(Call).getTarget()
or
// If the target of the call does not have a body in the snapshot, it might

View File

@@ -242,17 +242,7 @@ class CastNode extends Node {
CastNode() { none() } // stub implementation
}
class DataFlowCallable extends Function {
/** Gets a best-effort total ordering. */
int totalorder() {
this =
rank[result](DataFlowCallable c, string file, int startline, int startcolumn |
c.getLocation().hasLocationInfo(file, startline, startcolumn, _, _)
|
c order by file, startline, startcolumn
)
}
}
class DataFlowCallable = Function;
class DataFlowExpr = Expr;
@@ -271,28 +261,10 @@ class DataFlowCall extends Expr instanceof Call {
ExprNode getNode() { result.getExpr() = this }
/** Gets the enclosing callable of this call. */
DataFlowCallable getEnclosingCallable() { result = this.getEnclosingFunction() }
/** Gets a best-effort total ordering. */
int totalorder() {
this =
rank[result](DataFlowCall c, int startline, int startcolumn |
c.getLocation().hasLocationInfo(_, startline, startcolumn, _, _)
|
c order by startline, startcolumn
)
}
Function getEnclosingCallable() { result = this.getEnclosingFunction() }
}
class NodeRegion instanceof Unit {
string toString() { result = "NodeRegion" }
predicate contains(Node n) { none() }
int totalOrder() { result = 1 }
}
predicate isUnreachableInCall(NodeRegion nr, DataFlowCall call) { none() } // stub implementation
predicate isUnreachableInCall(Node n, DataFlowCall call) { none() } // stub implementation
/**
* Holds if access paths with `c` at their head always should be tracked at high

View File

@@ -1,27 +0,0 @@
/**
* This module provides extensible predicates for defining MaD models.
*/
/**
* Holds if an external source model exists for the given parameters.
*/
extensible predicate sourceModel(
string namespace, string type, boolean subtypes, string name, string signature, string ext,
string output, string kind, string provenance, QlBuiltins::ExtensionId madId
);
/**
* Holds if an external sink model exists for the given parameters.
*/
extensible predicate sinkModel(
string namespace, string type, boolean subtypes, string name, string signature, string ext,
string input, string kind, string provenance, QlBuiltins::ExtensionId madId
);
/**
* Holds if an external summary model exists for the given parameters.
*/
extensible predicate summaryModel(
string namespace, string type, boolean subtypes, string name, string signature, string ext,
string input, string output, string kind, string provenance, QlBuiltins::ExtensionId madId
);

View File

@@ -1062,16 +1062,6 @@ class DataFlowCallable extends TDataFlowCallable {
result = this.asSummarizedCallable() or // SummarizedCallable = Function (in CPP)
result = this.asSourceCallable()
}
/** Gets a best-effort total ordering. */
int totalorder() {
this =
rank[result](DataFlowCallable c, string file, int startline, int startcolumn |
c.getLocation().hasLocationInfo(file, startline, startcolumn, _, _)
|
c order by file, startline, startcolumn
)
}
}
/**
@@ -1169,16 +1159,6 @@ class DataFlowCall extends TDataFlowCall {
* Gets the location of this call.
*/
Location getLocation() { none() }
/** Gets a best-effort total ordering. */
int totalorder() {
this =
rank[result](DataFlowCall c, int startline, int startcolumn |
c.getLocation().hasLocationInfo(_, startline, startcolumn, _, _)
|
c order by startline, startcolumn
)
}
}
/**
@@ -1267,53 +1247,43 @@ module IsUnreachableInCall {
any(G::IRGuardCondition guard).ensuresLt(left, right, k, block, areEqual)
}
class NodeRegion instanceof IRBlock {
string toString() { result = "NodeRegion" }
predicate contains(Node n) { this = n.getBasicBlock() }
int totalOrder() {
this =
rank[result](IRBlock b, int startline, int startcolumn |
b.getLocation().hasLocationInfo(_, startline, startcolumn, _, _)
|
b order by startline, startcolumn
)
}
}
predicate isUnreachableInCall(NodeRegion block, DataFlowCall call) {
predicate isUnreachableInCall(Node n, DataFlowCall call) {
exists(
InstructionDirectParameterNode paramNode, ConstantIntegralTypeArgumentNode arg,
IntegerConstantInstruction constant, int k, Operand left, Operand right, int argval
IntegerConstantInstruction constant, int k, Operand left, Operand right, IRBlock block
|
// arg flows into `paramNode`
DataFlowImplCommon::viableParamArg(call, pragma[only_bind_into](paramNode),
pragma[only_bind_into](arg)) and
DataFlowImplCommon::viableParamArg(call, paramNode, arg) and
left = constant.getAUse() and
right = valueNumber(paramNode.getInstruction()).getAUse() and
argval = arg.getValue()
block = n.getBasicBlock()
|
// and there's a guard condition which ensures that the result of `left == right + k` is `areEqual`
exists(boolean areEqual | ensuresEq(left, right, k, block, areEqual) |
exists(boolean areEqual |
ensuresEq(pragma[only_bind_into](left), pragma[only_bind_into](right),
pragma[only_bind_into](k), pragma[only_bind_into](block), areEqual)
|
// this block ensures that left = right + k, but it holds that `left != right + k`
areEqual = true and
constant.getValue().toInt() != argval + k
constant.getValue().toInt() != arg.getValue() + k
or
// this block ensures that or `left != right + k`, but it holds that `left = right + k`
areEqual = false and
constant.getValue().toInt() = argval + k
constant.getValue().toInt() = arg.getValue() + k
)
or
// or there's a guard condition which ensures that the result of `left < right + k` is `isLessThan`
exists(boolean isLessThan | ensuresLt(left, right, k, block, isLessThan) |
exists(boolean isLessThan |
ensuresLt(pragma[only_bind_into](left), pragma[only_bind_into](right),
pragma[only_bind_into](k), pragma[only_bind_into](block), isLessThan)
|
isLessThan = true and
// this block ensures that `left < right + k`, but it holds that `left >= right + k`
constant.getValue().toInt() >= argval + k
constant.getValue().toInt() >= arg.getValue() + k
or
// this block ensures that `left >= right + k`, but it holds that `left < right + k`
isLessThan = false and
constant.getValue().toInt() < argval + k
constant.getValue().toInt() < arg.getValue() + k
)
)
}
@@ -1336,8 +1306,6 @@ predicate nodeIsHidden(Node n) {
n instanceof FinalGlobalValue
or
n instanceof InitialGlobalValue
or
n instanceof SsaPhiInputNode
}
predicate neverSkipInPathGraph(Node n) {
@@ -1636,8 +1604,6 @@ private Instruction getAnInstruction(Node n) {
or
result = n.(SsaPhiNode).getPhiNode().getBasicBlock().getFirstInstruction()
or
result = n.(SsaPhiInputNode).getBasicBlock().getFirstInstruction()
or
n.(IndirectInstruction).hasInstructionAndIndirectionIndex(result, _)
or
not n instanceof IndirectInstruction and
@@ -1767,7 +1733,7 @@ module IteratorFlow {
crementCall = def.getValue().asInstruction().(StoreInstruction).getSourceValue() and
sv = def.getSourceVariable() and
bb.getInstruction(i) = crementCall and
Ssa::ssaDefReachesReadExt(sv, result.asDef(), bb, i)
Ssa::ssaDefReachesRead(sv, result.asDef(), bb, i)
)
}
@@ -1801,7 +1767,7 @@ module IteratorFlow {
isIteratorWrite(writeToDeref, address) and
operandForFullyConvertedCall(address, starCall) and
bbStar.getInstruction(iStar) = starCall and
Ssa::ssaDefReachesReadExt(_, def.asDef(), bbStar, iStar) and
Ssa::ssaDefReachesRead(_, def.asDef(), bbStar, iStar) and
ultimate = getAnUltimateDefinition*(def) and
beginStore = ultimate.getValue().asInstruction() and
operandForFullyConvertedCall(beginStore.getSourceValueOperand(), beginCall)

View File

@@ -17,7 +17,6 @@ private import SsaInternals as Ssa
private import DataFlowImplCommon as DataFlowImplCommon
private import codeql.util.Unit
private import Node0ToString
import ExprNodes
/**
* The IR dataflow graph consists of the following nodes:
@@ -46,7 +45,6 @@ private newtype TIRDataFlowNode =
or
Ssa::isModifiableByCall(operand, indirectionIndex)
} or
TSsaPhiInputNode(Ssa::PhiNode phi, IRBlock input) { phi.hasInputFromBlock(_, _, _, _, input) } or
TSsaPhiNode(Ssa::PhiNode phi) or
TSsaIteratorNode(IteratorFlow::IteratorFlowNode n) or
TRawIndirectOperand0(Node0Impl node, int indirectionIndex) {
@@ -116,13 +114,6 @@ predicate conversionFlow(
instrTo.(CheckedConvertOrNullInstruction).getUnaryOperand() = opFrom
or
instrTo.(InheritanceConversionInstruction).getUnaryOperand() = opFrom
or
exists(BuiltInInstruction builtIn |
builtIn = instrTo and
// __builtin_bit_cast
builtIn.getBuiltInOperation() instanceof BuiltInBitCast and
opFrom = builtIn.getAnOperand()
)
)
or
additional = true and
@@ -167,12 +158,6 @@ class Node extends TIRDataFlowNode {
/** Gets the operands corresponding to this node, if any. */
Operand asOperand() { result = this.(OperandNode).getOperand() }
/**
* Gets the operand that is indirectly tracked by this node behind `index`
* number of indirections.
*/
Operand asIndirectOperand(int index) { hasOperandAndIndex(this, result, index) }
/**
* Holds if this node is at index `i` in basic block `block`.
*
@@ -185,9 +170,6 @@ class Node extends TIRDataFlowNode {
or
this.(SsaPhiNode).getPhiNode().getBasicBlock() = block and i = -1
or
this.(SsaPhiInputNode).getBlock() = block and
i = block.getInstructionCount()
or
this.(RawIndirectOperand).getOperand().getUse() = block.getInstruction(i)
or
this.(RawIndirectInstruction).getInstruction() = block.getInstruction(i)
@@ -640,7 +622,7 @@ class SsaPhiNode extends Node, TSsaPhiNode {
final override Location getLocationImpl() { result = phi.getBasicBlock().getLocation() }
override string toStringImpl() { result = phi.toString() }
override string toStringImpl() { result = "Phi" }
/**
* Gets a node that is used as input to this phi node.
@@ -649,7 +631,7 @@ class SsaPhiNode extends Node, TSsaPhiNode {
*/
cached
final Node getAnInput(boolean fromBackEdge) {
result.(SsaPhiInputNode).getPhiNode() = phi and
localFlowStep(result, this) and
exists(IRBlock bPhi, IRBlock bResult |
bPhi = phi.getBasicBlock() and bResult = result.getBasicBlock()
|
@@ -672,58 +654,6 @@ class SsaPhiNode extends Node, TSsaPhiNode {
predicate isPhiRead() { phi.isPhiRead() }
}
/**
* INTERNAL: Do not use.
*
* A node that is used as an input to a phi node.
*
* This class exists to allow more powerful barrier guards. Consider this
* example:
*
* ```cpp
* int x = source();
* if(!safe(x)) {
* x = clear();
* }
* // phi node for x here
* sink(x);
* ```
*
* At the phi node for `x` it is neither the case that `x` is dominated by
* `safe(x)`, or is the case that the phi is dominated by a clearing of `x`.
*
* By inserting a "phi input" node as the last entry in the basic block that
* defines the inputs to the phi we can conclude that each of those inputs are
* safe to pass to `sink`.
*/
class SsaPhiInputNode extends Node, TSsaPhiInputNode {
Ssa::PhiNode phi;
IRBlock block;
SsaPhiInputNode() { this = TSsaPhiInputNode(phi, block) }
/** Gets the phi node associated with this node. */
Ssa::PhiNode getPhiNode() { result = phi }
/** Gets the basic block in which this input originates. */
IRBlock getBlock() { result = block }
override Declaration getEnclosingCallable() { result = this.getFunction() }
override Declaration getFunction() { result = phi.getBasicBlock().getEnclosingFunction() }
override DataFlowType getType() { result = this.getSourceVariable().getType() }
override predicate isGLValue() { phi.getSourceVariable().isGLValue() }
final override Location getLocationImpl() { result = block.getLastInstruction().getLocation() }
override string toStringImpl() { result = "Phi input" }
/** Gets the source variable underlying this phi node. */
Ssa::SourceVariable getSourceVariable() { result = phi.getSourceVariable() }
}
/**
* INTERNAL: do not use.
*
@@ -1297,6 +1227,466 @@ class UninitializedNode extends Node {
LocalVariable getLocalVariable() { result = v }
}
private module GetConvertedResultExpression {
private import semmle.code.cpp.ir.implementation.raw.internal.TranslatedExpr
private import semmle.code.cpp.ir.implementation.raw.internal.InstructionTag
private Operand getAnInitializeDynamicAllocationInstructionAddress() {
result = any(InitializeDynamicAllocationInstruction init).getAllocationAddressOperand()
}
/**
* Gets the expression that should be returned as the result expression from `instr`.
*
* Note that this predicate may return multiple results in cases where a conversion belongs to a
* different AST element than its operand.
*/
Expr getConvertedResultExpression(Instruction instr, int n) {
// Only fully converted instructions have a result for `asConvertedExpr`
not conversionFlow(unique(Operand op |
// The address operand of a `InitializeDynamicAllocationInstruction` is
// special: we need to handle it during dataflow (since it's
// effectively a store to an indirection), but it doesn't appear in
// source syntax, so dataflow node <-> expression conversion shouldn't
// care about it.
op = getAUse(instr) and not op = getAnInitializeDynamicAllocationInstructionAddress()
|
op
), _, false, false) and
result = getConvertedResultExpressionImpl(instr) and
n = 0
or
// If the conversion also has a result then we return multiple results
exists(Operand operand | conversionFlow(operand, instr, false, false) |
n = 1 and
result = getConvertedResultExpressionImpl(operand.getDef())
or
result = getConvertedResultExpression(operand.getDef(), n - 1)
)
}
private Expr getConvertedResultExpressionImpl0(Instruction instr) {
// IR construction inserts an additional cast to a `size_t` on the extent
// of a `new[]` expression. The resulting `ConvertInstruction` doesn't have
// a result for `getConvertedResultExpression`. We remap this here so that
// this `ConvertInstruction` maps to the result of the expression that
// represents the extent.
exists(TranslatedNonConstantAllocationSize tas |
result = tas.getExtent().getExpr() and
instr = tas.getInstruction(AllocationExtentConvertTag())
)
or
// There's no instruction that returns `ParenthesisExpr`, but some queries
// expect this
exists(TranslatedTransparentConversion ttc |
result = ttc.getExpr().(ParenthesisExpr) and
instr = ttc.getResult()
)
or
// Certain expressions generate `CopyValueInstruction`s only when they
// are needed. Examples of this include crement operations and compound
// assignment operations. For example:
// ```cpp
// int x = ...
// int y = x++;
// ```
// this generate IR like:
// ```
// r1(glval<int>) = VariableAddress[x] :
// r2(int) = Constant[0] :
// m3(int) = Store[x] : &:r1, r2
// r4(glval<int>) = VariableAddress[y] :
// r5(glval<int>) = VariableAddress[x] :
// r6(int) = Load[x] : &:r5, m3
// r7(int) = Constant[1] :
// r8(int) = Add : r6, r7
// m9(int) = Store[x] : &:r5, r8
// r11(int) = CopyValue : r6
// m12(int) = Store[y] : &:r4, r11
// ```
// When the `CopyValueInstruction` is not generated there is no instruction
// whose `getConvertedResultExpression` maps back to the expression. When
// such an instruction doesn't exist it means that the old value is not
// needed, and in that case the only value that will propagate forward in
// the program is the value that's been updated. So in those cases we just
// use the result of `node.asDefinition()` as the result of `node.asExpr()`.
exists(TranslatedCoreExpr tco |
tco.getInstruction(_) = instr and
tco.producesExprResult() and
result = asDefinitionImpl0(instr)
)
}
private Expr getConvertedResultExpressionImpl(Instruction instr) {
result = getConvertedResultExpressionImpl0(instr)
or
not exists(getConvertedResultExpressionImpl0(instr)) and
result = instr.getConvertedResultExpression()
}
/**
* Gets the result for `node.asDefinition()` (when `node` is the instruction
* node that wraps `store`) in the cases where `store.getAst()` should not be
* used to define the result of `node.asDefinition()`.
*/
private Expr asDefinitionImpl0(StoreInstruction store) {
// For an expression such as `i += 2` we pretend that the generated
// `StoreInstruction` contains the result of the expression even though
// this isn't totally aligned with the C/C++ standard.
exists(TranslatedAssignOperation tao |
store = tao.getInstruction(AssignmentStoreTag()) and
result = tao.getExpr()
)
or
// Similarly for `i++` and `++i` we pretend that the generated
// `StoreInstruction` is contains the result of the expression even though
// this isn't totally aligned with the C/C++ standard.
exists(TranslatedCrementOperation tco |
store = tco.getInstruction(CrementStoreTag()) and
result = tco.getExpr()
)
}
/**
* Holds if the expression returned by `store.getAst()` should not be
* returned as the result of `node.asDefinition()` when `node` is the
* instruction node that wraps `store`.
*/
private predicate excludeAsDefinitionResult(StoreInstruction store) {
// Exclude the store to the temporary generated by a ternary expression.
exists(TranslatedConditionalExpr tce |
store = tce.getInstruction(ConditionValueFalseStoreTag())
or
store = tce.getInstruction(ConditionValueTrueStoreTag())
)
}
/**
* Gets the expression that represents the result of `StoreInstruction` for
* dataflow purposes.
*
* For example, consider the following example
* ```cpp
* int x = 42; // 1
* x = 34; // 2
* ++x; // 3
* x++; // 4
* x += 1; // 5
* int y = x += 2; // 6
* ```
* For (1) the result is `42`.
* For (2) the result is `x = 34`.
* For (3) the result is `++x`.
* For (4) the result is `x++`.
* For (5) the result is `x += 1`.
* For (6) there are two results:
* - For the `StoreInstruction` generated by `x += 2` the result
* is `x += 2`
* - For the `StoreInstruction` generated by `int y = ...` the result
* is also `x += 2`
*/
Expr asDefinitionImpl(StoreInstruction store) {
not exists(asDefinitionImpl0(store)) and
not excludeAsDefinitionResult(store) and
result = store.getAst().(Expr).getUnconverted()
or
result = asDefinitionImpl0(store)
}
}
private import GetConvertedResultExpression
/** Holds if `node` is an `OperandNode` that should map `node.asExpr()` to `e`. */
predicate exprNodeShouldBeOperand(OperandNode node, Expr e, int n) {
not exprNodeShouldBeIndirectOperand(_, e, n) and
exists(Instruction def |
unique( | | getAUse(def)) = node.getOperand() and
e = getConvertedResultExpression(def, n)
)
}
/** Holds if `node` should be an `IndirectOperand` that maps `node.asIndirectExpr()` to `e`. */
private predicate indirectExprNodeShouldBeIndirectOperand(
IndirectOperand node, Expr e, int n, int indirectionIndex
) {
exists(Instruction def |
node.hasOperandAndIndirectionIndex(unique( | | getAUse(def)), indirectionIndex) and
e = getConvertedResultExpression(def, n)
)
}
/** Holds if `node` should be an `IndirectOperand` that maps `node.asExpr()` to `e`. */
private predicate exprNodeShouldBeIndirectOperand(IndirectOperand node, Expr e, int n) {
exists(ArgumentOperand operand |
// When an argument (qualifier or positional) is a prvalue and the
// parameter (qualifier or positional) is a (const) reference, IR
// construction introduces a temporary `IRVariable`. The `VariableAddress`
// instruction has the argument as its `getConvertedResultExpression`
// result. However, the instruction actually represents the _address_ of
// the argument. So to fix this mismatch, we have the indirection of the
// `VariableAddressInstruction` map to the expression.
node.hasOperandAndIndirectionIndex(operand, 1) and
e = getConvertedResultExpression(operand.getDef(), n) and
operand.getDef().(VariableAddressInstruction).getIRVariable() instanceof IRTempVariable
)
}
private predicate exprNodeShouldBeIndirectOutNode(IndirectArgumentOutNode node, Expr e, int n) {
exists(CallInstruction call |
call.getStaticCallTarget() instanceof Constructor and
e = getConvertedResultExpression(call, n) and
call.getThisArgumentOperand() = node.getAddressOperand()
)
}
/** Holds if `node` should be an instruction node that maps `node.asExpr()` to `e`. */
predicate exprNodeShouldBeInstruction(Node node, Expr e, int n) {
not exprNodeShouldBeOperand(_, e, n) and
not exprNodeShouldBeIndirectOutNode(_, e, n) and
not exprNodeShouldBeIndirectOperand(_, e, n) and
e = getConvertedResultExpression(node.asInstruction(), n)
}
/** Holds if `node` should be an `IndirectInstruction` that maps `node.asIndirectExpr()` to `e`. */
predicate indirectExprNodeShouldBeIndirectInstruction(
IndirectInstruction node, Expr e, int n, int indirectionIndex
) {
not indirectExprNodeShouldBeIndirectOperand(_, e, n, indirectionIndex) and
exists(Instruction instr |
node.hasInstructionAndIndirectionIndex(instr, indirectionIndex) and
e = getConvertedResultExpression(instr, n)
)
}
abstract private class ExprNodeBase extends Node {
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
abstract Expr getConvertedExpr(int n);
/** Gets the non-conversion expression corresponding to this node, if any. */
final Expr getExpr(int n) { result = this.getConvertedExpr(n).getUnconverted() }
}
/**
* Holds if there exists a dataflow node whose `asExpr(n)` should evaluate
* to `e`.
*/
private predicate exprNodeShouldBe(Expr e, int n) {
exprNodeShouldBeInstruction(_, e, n) or
exprNodeShouldBeOperand(_, e, n) or
exprNodeShouldBeIndirectOutNode(_, e, n) or
exprNodeShouldBeIndirectOperand(_, e, n)
}
private class InstructionExprNode extends ExprNodeBase, InstructionNode {
InstructionExprNode() {
exists(Expr e, int n |
exprNodeShouldBeInstruction(this, e, n) and
not exists(Expr conv |
exprNodeShouldBe(conv, n + 1) and
conv.getUnconverted() = e.getUnconverted()
)
)
}
final override Expr getConvertedExpr(int n) { exprNodeShouldBeInstruction(this, result, n) }
}
private class OperandExprNode extends ExprNodeBase, OperandNode {
OperandExprNode() {
exists(Expr e, int n |
exprNodeShouldBeOperand(this, e, n) and
not exists(Expr conv |
exprNodeShouldBe(conv, n + 1) and
conv.getUnconverted() = e.getUnconverted()
)
)
}
final override Expr getConvertedExpr(int n) { exprNodeShouldBeOperand(this, result, n) }
}
abstract private class IndirectExprNodeBase extends Node {
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
abstract Expr getConvertedExpr(int n, int indirectionIndex);
/** Gets the non-conversion expression corresponding to this node, if any. */
final Expr getExpr(int n, int indirectionIndex) {
result = this.getConvertedExpr(n, indirectionIndex).getUnconverted()
}
}
/** A signature for converting an indirect node to an expression. */
private signature module IndirectNodeToIndirectExprSig {
/** The indirect node class to be converted to an expression */
class IndirectNode;
/**
* Holds if the indirect expression at indirection index `indirectionIndex`
* of `node` is `e`. The integer `n` specifies how many conversions has been
* applied to `node`.
*/
predicate indirectNodeHasIndirectExpr(IndirectNode node, Expr e, int n, int indirectionIndex);
}
/**
* A module that implements the logic for deciding whether an indirect node
* should be an `IndirectExprNode`.
*/
private module IndirectNodeToIndirectExpr<IndirectNodeToIndirectExprSig Sig> {
import Sig
/**
* This predicate shifts the indirection index by one when `conv` is a
* `ReferenceDereferenceExpr`.
*
* This is necessary because `ReferenceDereferenceExpr` is a conversion
* in the AST, but appears as a `LoadInstruction` in the IR.
*/
bindingset[e, indirectionIndex]
private predicate adjustForReference(
Expr e, int indirectionIndex, Expr conv, int adjustedIndirectionIndex
) {
conv.(ReferenceDereferenceExpr).getExpr() = e and
adjustedIndirectionIndex = indirectionIndex - 1
or
not conv instanceof ReferenceDereferenceExpr and
conv = e and
adjustedIndirectionIndex = indirectionIndex
}
/** Holds if `node` should be an `IndirectExprNode`. */
predicate charpred(IndirectNode node) {
exists(Expr e, int n, int indirectionIndex |
indirectNodeHasIndirectExpr(node, e, n, indirectionIndex) and
not exists(Expr conv, int adjustedIndirectionIndex |
adjustForReference(e, indirectionIndex, conv, adjustedIndirectionIndex) and
indirectExprNodeShouldBe(conv, n + 1, adjustedIndirectionIndex)
)
)
}
}
private predicate indirectExprNodeShouldBe(Expr e, int n, int indirectionIndex) {
indirectExprNodeShouldBeIndirectOperand(_, e, n, indirectionIndex) or
indirectExprNodeShouldBeIndirectInstruction(_, e, n, indirectionIndex)
}
private module IndirectOperandIndirectExprNodeImpl implements IndirectNodeToIndirectExprSig {
class IndirectNode = IndirectOperand;
predicate indirectNodeHasIndirectExpr = indirectExprNodeShouldBeIndirectOperand/4;
}
module IndirectOperandToIndirectExpr =
IndirectNodeToIndirectExpr<IndirectOperandIndirectExprNodeImpl>;
private class IndirectOperandIndirectExprNode extends IndirectExprNodeBase instanceof IndirectOperand
{
IndirectOperandIndirectExprNode() { IndirectOperandToIndirectExpr::charpred(this) }
final override Expr getConvertedExpr(int n, int index) {
IndirectOperandToIndirectExpr::indirectNodeHasIndirectExpr(this, result, n, index)
}
}
private module IndirectInstructionIndirectExprNodeImpl implements IndirectNodeToIndirectExprSig {
class IndirectNode = IndirectInstruction;
predicate indirectNodeHasIndirectExpr = indirectExprNodeShouldBeIndirectInstruction/4;
}
module IndirectInstructionToIndirectExpr =
IndirectNodeToIndirectExpr<IndirectInstructionIndirectExprNodeImpl>;
private class IndirectInstructionIndirectExprNode extends IndirectExprNodeBase instanceof IndirectInstruction
{
IndirectInstructionIndirectExprNode() { IndirectInstructionToIndirectExpr::charpred(this) }
final override Expr getConvertedExpr(int n, int index) {
IndirectInstructionToIndirectExpr::indirectNodeHasIndirectExpr(this, result, n, index)
}
}
private class IndirectArgumentOutExprNode extends ExprNodeBase, IndirectArgumentOutNode {
IndirectArgumentOutExprNode() { exprNodeShouldBeIndirectOutNode(this, _, _) }
final override Expr getConvertedExpr(int n) { exprNodeShouldBeIndirectOutNode(this, result, n) }
}
private class IndirectOperandExprNode extends ExprNodeBase instanceof IndirectOperand {
IndirectOperandExprNode() { exprNodeShouldBeIndirectOperand(this, _, _) }
final override Expr getConvertedExpr(int n) { exprNodeShouldBeIndirectOperand(this, result, n) }
}
/**
* An expression, viewed as a node in a data flow graph.
*/
class ExprNode extends Node instanceof ExprNodeBase {
/**
* INTERNAL: Do not use.
*/
Expr getExpr(int n) { result = super.getExpr(n) }
/**
* Gets the non-conversion expression corresponding to this node, if any. If
* this node strictly (in the sense of `getConvertedExpr`) corresponds to a
* `Conversion`, then the result is that `Conversion`'s non-`Conversion` base
* expression.
*/
final Expr getExpr() { result = this.getExpr(_) }
/**
* INTERNAL: Do not use.
*/
Expr getConvertedExpr(int n) { result = super.getConvertedExpr(n) }
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
final Expr getConvertedExpr() { result = this.getConvertedExpr(_) }
}
/**
* An indirect expression, viewed as a node in a data flow graph.
*/
class IndirectExprNode extends Node instanceof IndirectExprNodeBase {
/**
* Gets the non-conversion expression corresponding to this node, if any. If
* this node strictly (in the sense of `getConvertedExpr`) corresponds to a
* `Conversion`, then the result is that `Conversion`'s non-`Conversion` base
* expression.
*/
final Expr getExpr(int indirectionIndex) { result = this.getExpr(_, indirectionIndex) }
/**
* INTERNAL: Do not use.
*/
Expr getExpr(int n, int indirectionIndex) { result = super.getExpr(n, indirectionIndex) }
/**
* INTERNAL: Do not use.
*/
Expr getConvertedExpr(int n, int indirectionIndex) {
result = super.getConvertedExpr(n, indirectionIndex)
}
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
Expr getConvertedExpr(int indirectionIndex) {
result = this.getConvertedExpr(_, indirectionIndex)
}
}
abstract private class AbstractParameterNode extends Node {
/**
* Holds if this node is the parameter of `f` at the specified position. The
@@ -1786,9 +2176,6 @@ private module Cached {
// Def-use/Use-use flow
Ssa::ssaFlow(nodeFrom, nodeTo)
or
// Phi input -> Phi
nodeFrom.(SsaPhiInputNode).getPhiNode() = nodeTo.(SsaPhiNode).getPhiNode()
or
IteratorFlow::localFlowStep(nodeFrom, nodeTo)
or
// Operand -> Instruction flow
@@ -2227,22 +2614,6 @@ class ContentSet instanceof Content {
}
}
pragma[nomagic]
private predicate guardControlsPhiInput(
IRGuardCondition g, boolean branch, Ssa::Definition def, IRBlock input, Ssa::PhiNode phi
) {
phi.hasInputFromBlock(def, _, _, _, input) and
(
g.controls(input, branch)
or
exists(EdgeKind kind |
g.getBlock() = input and
kind = getConditionalEdge(branch) and
input.getSuccessor(kind) = phi.getBasicBlock()
)
)
}
/**
* Holds if the guard `g` validates the expression `e` upon evaluating to `branch`.
*
@@ -2291,21 +2662,13 @@ module BarrierGuard<guardChecksSig/3 guardChecks> {
*
* NOTE: If an indirect expression is tracked, use `getAnIndirectBarrierNode` instead.
*/
Node getABarrierNode() {
ExprNode getABarrierNode() {
exists(IRGuardCondition g, Expr e, ValueNumber value, boolean edge |
e = value.getAnInstruction().getConvertedResultExpression() and
result.asConvertedExpr() = e and
result.getConvertedExpr() = e and
guardChecks(g, value.getAnInstruction().getConvertedResultExpression(), edge) and
g.controls(result.getBasicBlock(), edge)
)
or
exists(
IRGuardCondition g, boolean branch, Ssa::DefinitionExt def, IRBlock input, Ssa::PhiNode phi
|
guardChecks(g, def.getARead().asOperand().getDef().getConvertedResultExpression(), branch) and
guardControlsPhiInput(g, branch, def, input, phi) and
result = TSsaPhiInputNode(phi, input)
)
}
/**
@@ -2341,7 +2704,7 @@ module BarrierGuard<guardChecksSig/3 guardChecks> {
*
* NOTE: If a non-indirect expression is tracked, use `getABarrierNode` instead.
*/
Node getAnIndirectBarrierNode() { result = getAnIndirectBarrierNode(_) }
IndirectExprNode getAnIndirectBarrierNode() { result = getAnIndirectBarrierNode(_) }
/**
* Gets an indirect expression node with indirection index `indirectionIndex` that is
@@ -2377,23 +2740,13 @@ module BarrierGuard<guardChecksSig/3 guardChecks> {
*
* NOTE: If a non-indirect expression is tracked, use `getABarrierNode` instead.
*/
Node getAnIndirectBarrierNode(int indirectionIndex) {
IndirectExprNode getAnIndirectBarrierNode(int indirectionIndex) {
exists(IRGuardCondition g, Expr e, ValueNumber value, boolean edge |
e = value.getAnInstruction().getConvertedResultExpression() and
result.asIndirectConvertedExpr(indirectionIndex) = e and
result.getConvertedExpr(indirectionIndex) = e and
guardChecks(g, value.getAnInstruction().getConvertedResultExpression(), edge) and
g.controls(result.getBasicBlock(), edge)
)
or
exists(
IRGuardCondition g, boolean branch, Ssa::DefinitionExt def, IRBlock input, Ssa::PhiNode phi
|
guardChecks(g,
def.getARead().asIndirectOperand(indirectionIndex).getDef().getConvertedResultExpression(),
branch) and
guardControlsPhiInput(g, branch, def, input, phi) and
result = TSsaPhiInputNode(phi, input)
)
}
}
@@ -2402,14 +2755,6 @@ module BarrierGuard<guardChecksSig/3 guardChecks> {
*/
signature predicate instructionGuardChecksSig(IRGuardCondition g, Instruction instr, boolean branch);
private EdgeKind getConditionalEdge(boolean branch) {
branch = true and
result instanceof TrueEdge
or
branch = false and
result instanceof FalseEdge
}
/**
* Provides a set of barrier nodes for a guard that validates an instruction.
*
@@ -2418,20 +2763,12 @@ private EdgeKind getConditionalEdge(boolean branch) {
*/
module InstructionBarrierGuard<instructionGuardChecksSig/3 instructionGuardChecks> {
/** Gets a node that is safely guarded by the given guard check. */
Node getABarrierNode() {
ExprNode getABarrierNode() {
exists(IRGuardCondition g, ValueNumber value, boolean edge, Operand use |
instructionGuardChecks(g, value.getAnInstruction(), edge) and
use = value.getAnInstruction().getAUse() and
result.asOperand() = use and
g.controls(result.getBasicBlock(), edge)
)
or
exists(
IRGuardCondition g, boolean branch, Ssa::DefinitionExt def, IRBlock input, Ssa::PhiNode phi
|
instructionGuardChecks(g, def.getARead().asOperand().getDef(), branch) and
guardControlsPhiInput(g, branch, def, input, phi) and
result = TSsaPhiInputNode(phi, input)
g.controls(use.getDef().getBlock(), edge)
)
}
}

View File

@@ -1,479 +0,0 @@
/**
* Provides the classes `ExprNode` and `IndirectExprNode` for converting between `Expr` and `Node`.
*/
private import cpp
private import semmle.code.cpp.ir.IR
private import DataFlowUtil
private import DataFlowPrivate
private import semmle.code.cpp.ir.implementation.raw.internal.TranslatedExpr
private import semmle.code.cpp.ir.implementation.raw.internal.InstructionTag
cached
private module Cached {
private Operand getAnInitializeDynamicAllocationInstructionAddress() {
result = any(InitializeDynamicAllocationInstruction init).getAllocationAddressOperand()
}
/**
* Gets the expression that should be returned as the result expression from `instr`.
*
* Note that this predicate may return multiple results in cases where a conversion belongs to a
* different AST element than its operand.
*/
private Expr getConvertedResultExpression(Instruction instr, int n) {
// Only fully converted instructions have a result for `asConvertedExpr`
not conversionFlow(unique(Operand op |
// The address operand of a `InitializeDynamicAllocationInstruction` is
// special: we need to handle it during dataflow (since it's
// effectively a store to an indirection), but it doesn't appear in
// source syntax, so dataflow node <-> expression conversion shouldn't
// care about it.
op = getAUse(instr) and not op = getAnInitializeDynamicAllocationInstructionAddress()
|
op
), _, false, false) and
result = getConvertedResultExpressionImpl(instr) and
n = 0
or
// If the conversion also has a result then we return multiple results
exists(Operand operand | conversionFlow(operand, instr, false, false) |
n = 1 and
result = getConvertedResultExpressionImpl(operand.getDef())
or
result = getConvertedResultExpression(operand.getDef(), n - 1)
)
}
private Expr getConvertedResultExpressionImpl0(Instruction instr) {
// IR construction inserts an additional cast to a `size_t` on the extent
// of a `new[]` expression. The resulting `ConvertInstruction` doesn't have
// a result for `getConvertedResultExpression`. We remap this here so that
// this `ConvertInstruction` maps to the result of the expression that
// represents the extent.
exists(TranslatedNonConstantAllocationSize tas |
result = tas.getExtent().getExpr() and
instr = tas.getInstruction(AllocationExtentConvertTag())
)
or
// There's no instruction that returns `ParenthesisExpr`, but some queries
// expect this
exists(TranslatedTransparentConversion ttc |
result = ttc.getExpr().(ParenthesisExpr) and
instr = ttc.getResult()
)
or
// Certain expressions generate `CopyValueInstruction`s only when they
// are needed. Examples of this include crement operations and compound
// assignment operations. For example:
// ```cpp
// int x = ...
// int y = x++;
// ```
// this generate IR like:
// ```
// r1(glval<int>) = VariableAddress[x] :
// r2(int) = Constant[0] :
// m3(int) = Store[x] : &:r1, r2
// r4(glval<int>) = VariableAddress[y] :
// r5(glval<int>) = VariableAddress[x] :
// r6(int) = Load[x] : &:r5, m3
// r7(int) = Constant[1] :
// r8(int) = Add : r6, r7
// m9(int) = Store[x] : &:r5, r8
// r11(int) = CopyValue : r6
// m12(int) = Store[y] : &:r4, r11
// ```
// When the `CopyValueInstruction` is not generated there is no instruction
// whose `getConvertedResultExpression` maps back to the expression. When
// such an instruction doesn't exist it means that the old value is not
// needed, and in that case the only value that will propagate forward in
// the program is the value that's been updated. So in those cases we just
// use the result of `node.asDefinition()` as the result of `node.asExpr()`.
exists(TranslatedCoreExpr tco |
tco.getInstruction(_) = instr and
tco.producesExprResult() and
result = asDefinitionImpl0(instr)
)
}
private Expr getConvertedResultExpressionImpl(Instruction instr) {
result = getConvertedResultExpressionImpl0(instr)
or
not exists(getConvertedResultExpressionImpl0(instr)) and
result = instr.getConvertedResultExpression()
}
/**
* Gets the result for `node.asDefinition()` (when `node` is the instruction
* node that wraps `store`) in the cases where `store.getAst()` should not be
* used to define the result of `node.asDefinition()`.
*/
private Expr asDefinitionImpl0(StoreInstruction store) {
// For an expression such as `i += 2` we pretend that the generated
// `StoreInstruction` contains the result of the expression even though
// this isn't totally aligned with the C/C++ standard.
exists(TranslatedAssignOperation tao |
store = tao.getInstruction(AssignmentStoreTag()) and
result = tao.getExpr()
)
or
// Similarly for `i++` and `++i` we pretend that the generated
// `StoreInstruction` is contains the result of the expression even though
// this isn't totally aligned with the C/C++ standard.
exists(TranslatedCrementOperation tco |
store = tco.getInstruction(CrementStoreTag()) and
result = tco.getExpr()
)
}
/**
* Holds if the expression returned by `store.getAst()` should not be
* returned as the result of `node.asDefinition()` when `node` is the
* instruction node that wraps `store`.
*/
private predicate excludeAsDefinitionResult(StoreInstruction store) {
// Exclude the store to the temporary generated by a ternary expression.
exists(TranslatedConditionalExpr tce |
store = tce.getInstruction(ConditionValueFalseStoreTag())
or
store = tce.getInstruction(ConditionValueTrueStoreTag())
)
}
/**
* Gets the expression that represents the result of `StoreInstruction` for
* dataflow purposes.
*
* For example, consider the following example
* ```cpp
* int x = 42; // 1
* x = 34; // 2
* ++x; // 3
* x++; // 4
* x += 1; // 5
* int y = x += 2; // 6
* ```
* For (1) the result is `42`.
* For (2) the result is `x = 34`.
* For (3) the result is `++x`.
* For (4) the result is `x++`.
* For (5) the result is `x += 1`.
* For (6) there are two results:
* - For the `StoreInstruction` generated by `x += 2` the result
* is `x += 2`
* - For the `StoreInstruction` generated by `int y = ...` the result
* is also `x += 2`
*/
cached
Expr asDefinitionImpl(StoreInstruction store) {
not exists(asDefinitionImpl0(store)) and
not excludeAsDefinitionResult(store) and
result = store.getAst().(Expr).getUnconverted()
or
result = asDefinitionImpl0(store)
}
/** Holds if `node` is an `OperandNode` that should map `node.asExpr()` to `e`. */
private predicate exprNodeShouldBeOperand(OperandNode node, Expr e, int n) {
not exprNodeShouldBeIndirectOperand(_, e, n) and
exists(Instruction def |
unique( | | getAUse(def)) = node.getOperand() and
e = getConvertedResultExpression(def, n)
)
}
/** Holds if `node` should be an `IndirectOperand` that maps `node.asIndirectExpr()` to `e`. */
private predicate indirectExprNodeShouldBeIndirectOperand(
IndirectOperand node, Expr e, int n, int indirectionIndex
) {
exists(Instruction def |
node.hasOperandAndIndirectionIndex(unique( | | getAUse(def)), indirectionIndex) and
e = getConvertedResultExpression(def, n)
)
}
/** Holds if `node` should be an `IndirectOperand` that maps `node.asExpr()` to `e`. */
private predicate exprNodeShouldBeIndirectOperand(IndirectOperand node, Expr e, int n) {
exists(ArgumentOperand operand |
// When an argument (qualifier or positional) is a prvalue and the
// parameter (qualifier or positional) is a (const) reference, IR
// construction introduces a temporary `IRVariable`. The `VariableAddress`
// instruction has the argument as its `getConvertedResultExpression`
// result. However, the instruction actually represents the _address_ of
// the argument. So to fix this mismatch, we have the indirection of the
// `VariableAddressInstruction` map to the expression.
node.hasOperandAndIndirectionIndex(operand, 1) and
e = getConvertedResultExpression(operand.getDef(), n) and
operand.getDef().(VariableAddressInstruction).getIRVariable() instanceof IRTempVariable
)
}
private predicate exprNodeShouldBeIndirectOutNode(IndirectArgumentOutNode node, Expr e, int n) {
exists(CallInstruction call |
call.getStaticCallTarget() instanceof Constructor and
e = getConvertedResultExpression(call, n) and
call.getThisArgumentOperand() = node.getAddressOperand()
)
}
/** Holds if `node` should be an instruction node that maps `node.asExpr()` to `e`. */
private predicate exprNodeShouldBeInstruction(Node node, Expr e, int n) {
not exprNodeShouldBeOperand(_, e, n) and
not exprNodeShouldBeIndirectOutNode(_, e, n) and
not exprNodeShouldBeIndirectOperand(_, e, n) and
e = getConvertedResultExpression(node.asInstruction(), n)
}
/** Holds if `node` should be an `IndirectInstruction` that maps `node.asIndirectExpr()` to `e`. */
private predicate indirectExprNodeShouldBeIndirectInstruction(
IndirectInstruction node, Expr e, int n, int indirectionIndex
) {
not indirectExprNodeShouldBeIndirectOperand(_, e, n, indirectionIndex) and
exists(Instruction instr |
node.hasInstructionAndIndirectionIndex(instr, indirectionIndex) and
e = getConvertedResultExpression(instr, n)
)
}
abstract private class ExprNodeBase extends Node {
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
abstract Expr getConvertedExpr(int n);
/** Gets the non-conversion expression corresponding to this node, if any. */
final Expr getExpr(int n) { result = this.getConvertedExpr(n).getUnconverted() }
}
/**
* Holds if there exists a dataflow node whose `asExpr(n)` should evaluate
* to `e`.
*/
private predicate exprNodeShouldBe(Expr e, int n) {
exprNodeShouldBeInstruction(_, e, n) or
exprNodeShouldBeOperand(_, e, n) or
exprNodeShouldBeIndirectOutNode(_, e, n) or
exprNodeShouldBeIndirectOperand(_, e, n)
}
private class InstructionExprNode extends ExprNodeBase, InstructionNode {
InstructionExprNode() {
exists(Expr e, int n |
exprNodeShouldBeInstruction(this, e, n) and
not exists(Expr conv |
exprNodeShouldBe(conv, n + 1) and
conv.getUnconverted() = e.getUnconverted()
)
)
}
final override Expr getConvertedExpr(int n) { exprNodeShouldBeInstruction(this, result, n) }
}
private class OperandExprNode extends ExprNodeBase, OperandNode {
OperandExprNode() {
exists(Expr e, int n |
exprNodeShouldBeOperand(this, e, n) and
not exists(Expr conv |
exprNodeShouldBe(conv, n + 1) and
conv.getUnconverted() = e.getUnconverted()
)
)
}
final override Expr getConvertedExpr(int n) { exprNodeShouldBeOperand(this, result, n) }
}
abstract private class IndirectExprNodeBase extends Node {
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
abstract Expr getConvertedExpr(int n, int indirectionIndex);
/** Gets the non-conversion expression corresponding to this node, if any. */
final Expr getExpr(int n, int indirectionIndex) {
result = this.getConvertedExpr(n, indirectionIndex).getUnconverted()
}
}
/** A signature for converting an indirect node to an expression. */
private signature module IndirectNodeToIndirectExprSig {
/** The indirect node class to be converted to an expression */
class IndirectNode;
/**
* Holds if the indirect expression at indirection index `indirectionIndex`
* of `node` is `e`. The integer `n` specifies how many conversions has been
* applied to `node`.
*/
predicate indirectNodeHasIndirectExpr(IndirectNode node, Expr e, int n, int indirectionIndex);
}
/**
* A module that implements the logic for deciding whether an indirect node
* should be an `IndirectExprNode`.
*/
private module IndirectNodeToIndirectExpr<IndirectNodeToIndirectExprSig Sig> {
import Sig
/**
* This predicate shifts the indirection index by one when `conv` is a
* `ReferenceDereferenceExpr`.
*
* This is necessary because `ReferenceDereferenceExpr` is a conversion
* in the AST, but appears as a `LoadInstruction` in the IR.
*/
bindingset[e, indirectionIndex]
private predicate adjustForReference(
Expr e, int indirectionIndex, Expr conv, int adjustedIndirectionIndex
) {
conv.(ReferenceDereferenceExpr).getExpr() = e and
adjustedIndirectionIndex = indirectionIndex - 1
or
not conv instanceof ReferenceDereferenceExpr and
conv = e and
adjustedIndirectionIndex = indirectionIndex
}
/** Holds if `node` should be an `IndirectExprNode`. */
predicate charpred(IndirectNode node) {
exists(Expr e, int n, int indirectionIndex |
indirectNodeHasIndirectExpr(node, e, n, indirectionIndex) and
not exists(Expr conv, int adjustedIndirectionIndex |
adjustForReference(e, indirectionIndex, conv, adjustedIndirectionIndex) and
indirectExprNodeShouldBe(conv, n + 1, adjustedIndirectionIndex)
)
)
}
}
private predicate indirectExprNodeShouldBe(Expr e, int n, int indirectionIndex) {
indirectExprNodeShouldBeIndirectOperand(_, e, n, indirectionIndex) or
indirectExprNodeShouldBeIndirectInstruction(_, e, n, indirectionIndex)
}
private module IndirectOperandIndirectExprNodeImpl implements IndirectNodeToIndirectExprSig {
class IndirectNode = IndirectOperand;
predicate indirectNodeHasIndirectExpr = indirectExprNodeShouldBeIndirectOperand/4;
}
module IndirectOperandToIndirectExpr =
IndirectNodeToIndirectExpr<IndirectOperandIndirectExprNodeImpl>;
private class IndirectOperandIndirectExprNode extends IndirectExprNodeBase instanceof IndirectOperand
{
IndirectOperandIndirectExprNode() { IndirectOperandToIndirectExpr::charpred(this) }
final override Expr getConvertedExpr(int n, int index) {
IndirectOperandToIndirectExpr::indirectNodeHasIndirectExpr(this, result, n, index)
}
}
private module IndirectInstructionIndirectExprNodeImpl implements IndirectNodeToIndirectExprSig {
class IndirectNode = IndirectInstruction;
predicate indirectNodeHasIndirectExpr = indirectExprNodeShouldBeIndirectInstruction/4;
}
module IndirectInstructionToIndirectExpr =
IndirectNodeToIndirectExpr<IndirectInstructionIndirectExprNodeImpl>;
private class IndirectInstructionIndirectExprNode extends IndirectExprNodeBase instanceof IndirectInstruction
{
IndirectInstructionIndirectExprNode() { IndirectInstructionToIndirectExpr::charpred(this) }
final override Expr getConvertedExpr(int n, int index) {
IndirectInstructionToIndirectExpr::indirectNodeHasIndirectExpr(this, result, n, index)
}
}
private class IndirectArgumentOutExprNode extends ExprNodeBase, IndirectArgumentOutNode {
IndirectArgumentOutExprNode() { exprNodeShouldBeIndirectOutNode(this, _, _) }
final override Expr getConvertedExpr(int n) { exprNodeShouldBeIndirectOutNode(this, result, n) }
}
private class IndirectOperandExprNode extends ExprNodeBase instanceof IndirectOperand {
IndirectOperandExprNode() { exprNodeShouldBeIndirectOperand(this, _, _) }
final override Expr getConvertedExpr(int n) { exprNodeShouldBeIndirectOperand(this, result, n) }
}
/**
* An expression, viewed as a node in a data flow graph.
*/
cached
class ExprNode extends Node instanceof ExprNodeBase {
/**
* INTERNAL: Do not use.
*/
cached
Expr getExpr(int n) { result = super.getExpr(n) }
/**
* Gets the non-conversion expression corresponding to this node, if any. If
* this node strictly (in the sense of `getConvertedExpr`) corresponds to a
* `Conversion`, then the result is that `Conversion`'s non-`Conversion` base
* expression.
*/
cached
final Expr getExpr() { result = this.getExpr(_) }
/**
* INTERNAL: Do not use.
*/
cached
Expr getConvertedExpr(int n) { result = super.getConvertedExpr(n) }
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
cached
final Expr getConvertedExpr() { result = this.getConvertedExpr(_) }
}
/**
* An indirect expression, viewed as a node in a data flow graph.
*/
cached
class IndirectExprNode extends Node instanceof IndirectExprNodeBase {
/**
* Gets the non-conversion expression corresponding to this node, if any. If
* this node strictly (in the sense of `getConvertedExpr`) corresponds to a
* `Conversion`, then the result is that `Conversion`'s non-`Conversion` base
* expression.
*/
cached
final Expr getExpr(int indirectionIndex) { result = this.getExpr(_, indirectionIndex) }
/**
* INTERNAL: Do not use.
*/
cached
Expr getExpr(int n, int indirectionIndex) { result = super.getExpr(n, indirectionIndex) }
/**
* INTERNAL: Do not use.
*/
cached
Expr getConvertedExpr(int n, int indirectionIndex) {
result = super.getConvertedExpr(n, indirectionIndex)
}
/**
* Gets the expression corresponding to this node, if any. The returned
* expression may be a `Conversion`.
*/
cached
Expr getConvertedExpr(int indirectionIndex) {
result = this.getConvertedExpr(_, indirectionIndex)
}
}
}
import Cached

View File

@@ -546,7 +546,7 @@ module ProductFlow {
Flow1::PathGraph::edges(pred1, succ1, _, _) and
exists(ReturnKindExt returnKind |
succ1.getNode() = returnKind.getAnOutNode(call) and
paramReturnNode(_, pred1.asParameterReturnNode(), _, returnKind)
pred1.getNode().(ReturnNodeExt).getKind() = returnKind
)
}
@@ -574,7 +574,7 @@ module ProductFlow {
Flow2::PathGraph::edges(pred2, succ2, _, _) and
exists(ReturnKindExt returnKind |
succ2.getNode() = returnKind.getAnOutNode(call) and
paramReturnNode(_, pred2.asParameterReturnNode(), _, returnKind)
pred2.getNode().(ReturnNodeExt).getKind() = returnKind
)
}

View File

@@ -9,7 +9,6 @@ private import semmle.code.cpp.models.interfaces.PartialFlow as PartialFlow
private import semmle.code.cpp.models.interfaces.FunctionInputsAndOutputs as FIO
private import semmle.code.cpp.ir.internal.IRCppLanguage
private import semmle.code.cpp.ir.dataflow.internal.ModelUtil
private import semmle.code.cpp.ir.implementation.raw.internal.TranslatedInitialization
private import DataFlowPrivate
import SsaInternalsCommon
@@ -105,8 +104,8 @@ predicate hasRawIndirectInstruction(Instruction instr, int indirectionIndex) {
cached
private newtype TDefImpl =
TDefAddressImpl(BaseIRVariable v) or
TDirectDefImpl(Operand address, int indirectionIndex) {
isDef(_, _, address, _, _, indirectionIndex)
TDirectDefImpl(BaseSourceVariableInstruction base, Operand address, int indirectionIndex) {
isDef(_, _, address, base, _, indirectionIndex)
} or
TGlobalDefImpl(GlobalLikeVariable v, IRFunction f, int indirectionIndex) {
// Represents the initial "definition" of a global variable when entering
@@ -116,8 +115,8 @@ private newtype TDefImpl =
cached
private newtype TUseImpl =
TDirectUseImpl(Operand operand, int indirectionIndex) {
isUse(_, operand, _, _, indirectionIndex) and
TDirectUseImpl(BaseSourceVariableInstruction base, Operand operand, int indirectionIndex) {
isUse(_, operand, base, _, indirectionIndex) and
not isDef(true, _, operand, _, _, _)
} or
TGlobalUse(GlobalLikeVariable v, IRFunction f, int indirectionIndex) {
@@ -211,11 +210,19 @@ abstract class DefImpl extends TDefImpl {
*/
abstract int getIndirection();
/**
* Gets the instruction that computes the base of this definition or use.
* This is always a `VariableAddressInstruction` or an `CallInstruction`.
*/
abstract BaseSourceVariableInstruction getBase();
/**
* Gets the base source variable (i.e., the variable without
* any indirection) of this definition or use.
*/
abstract BaseSourceVariable getBaseSourceVariable();
final BaseSourceVariable getBaseSourceVariable() {
this.getBase().getBaseSourceVariable() = result
}
/** Gets the variable that is defined or used. */
SourceVariable getSourceVariable() {
@@ -275,11 +282,19 @@ abstract class UseImpl extends TUseImpl {
/** Gets the indirection index of this use. */
final int getIndirectionIndex() { result = indirectionIndex }
/**
* Gets the instruction that computes the base of this definition or use.
* This is always a `VariableAddressInstruction` or an `CallInstruction`.
*/
abstract BaseSourceVariableInstruction getBase();
/**
* Gets the base source variable (i.e., the variable without
* any indirection) of this definition or use.
*/
abstract BaseSourceVariable getBaseSourceVariable();
final BaseSourceVariable getBaseSourceVariable() {
this.getBase().getBaseSourceVariable() = result
}
/** Gets the variable that is defined or used. */
SourceVariable getSourceVariable() {
@@ -314,17 +329,6 @@ private predicate sourceVariableHasBaseAndIndex(SourceVariable v, BaseSourceVari
v.getIndirection() = ind
}
/**
* Gets the instruction that computes the address that's used to
* initialize `v`.
*/
private Instruction getInitializationTargetAddress(IRVariable v) {
exists(TranslatedVariableInitialization init |
init.getIRVariable() = v and
result = init.getTargetAddress()
)
}
/** An initial definition of an `IRVariable`'s address. */
private class DefAddressImpl extends DefImpl, TDefAddressImpl {
BaseIRVariable v;
@@ -343,15 +347,8 @@ private class DefAddressImpl extends DefImpl, TDefAddressImpl {
final override Node0Impl getValue() { none() }
final override predicate hasIndexInBlock(IRBlock block, int index) {
exists(IRVariable var | var = v.getIRVariable() |
block.getInstruction(index) = getInitializationTargetAddress(var)
or
// If there is no translatated element that does initialization of the
// variable we place the SSA definition at the entry block of the function.
not exists(getInitializationTargetAddress(var)) and
block = var.getEnclosingIRFunction().getEntryBlock() and
index = 0
)
block = v.getIRVariable().getEnclosingIRFunction().getEntryBlock() and
index = 0
}
override Cpp::Location getLocation() { result = v.getIRVariable().getLocation() }
@@ -361,13 +358,14 @@ private class DefAddressImpl extends DefImpl, TDefAddressImpl {
result.getIndirection() = 0
}
final override BaseSourceVariable getBaseSourceVariable() { result = v }
final override BaseSourceVariableInstruction getBase() { none() }
}
private class DirectDef extends DefImpl, TDirectDefImpl {
Operand address;
BaseSourceVariableInstruction base;
DirectDef() { this = TDirectDefImpl(address, indirectionIndex) }
DirectDef() { this = TDirectDefImpl(base, address, indirectionIndex) }
override Cpp::Location getLocation() { result = this.getAddressOperand().getUse().getLocation() }
@@ -379,36 +377,30 @@ private class DirectDef extends DefImpl, TDirectDefImpl {
override Operand getAddressOperand() { result = address }
private BaseSourceVariableInstruction getBase() {
isDef(_, _, address, result, _, indirectionIndex)
}
override BaseSourceVariableInstruction getBase() { result = base }
override BaseSourceVariable getBaseSourceVariable() {
result = this.getBase().getBaseSourceVariable()
}
override int getIndirection() { isDef(_, _, address, base, result, indirectionIndex) }
override int getIndirection() { isDef(_, _, address, _, result, indirectionIndex) }
override Node0Impl getValue() { isDef(_, result, address, base, _, _) }
override Node0Impl getValue() { isDef(_, result, address, _, _, _) }
override predicate isCertain() { isDef(true, _, address, _, _, indirectionIndex) }
override predicate isCertain() { isDef(true, _, address, base, _, indirectionIndex) }
}
private class DirectUseImpl extends UseImpl, TDirectUseImpl {
Operand operand;
BaseSourceVariableInstruction base;
DirectUseImpl() { this = TDirectUseImpl(operand, indirectionIndex) }
DirectUseImpl() { this = TDirectUseImpl(base, operand, indirectionIndex) }
override string toString() { result = "Use of " + this.getSourceVariable() }
final override predicate hasIndexInBlock(IRBlock block, int index) {
// See the comment in `ssa0`'s `OperandBasedUse` for an explanation of this
// predicate's implementation.
if this.getBase().getAst() = any(Cpp::PostfixCrementOperation c).getOperand()
if base.getAst() = any(Cpp::PostfixCrementOperation c).getOperand()
then
exists(Operand op, int indirection, Instruction base |
exists(Operand op, int indirection |
indirection = this.getIndirection() and
base = this.getBase() and
op =
min(Operand cand, int i |
isUse(_, cand, base, indirection, indirectionIndex) and
@@ -421,19 +413,15 @@ private class DirectUseImpl extends UseImpl, TDirectUseImpl {
else operand.getUse() = block.getInstruction(index)
}
private BaseSourceVariableInstruction getBase() { isUse(_, operand, result, _, indirectionIndex) }
override BaseSourceVariable getBaseSourceVariable() {
result = this.getBase().getBaseSourceVariable()
}
final override BaseSourceVariableInstruction getBase() { result = base }
final Operand getOperand() { result = operand }
final override Cpp::Location getLocation() { result = operand.getLocation() }
override int getIndirection() { isUse(_, operand, _, result, indirectionIndex) }
override int getIndirection() { isUse(_, operand, base, result, indirectionIndex) }
override predicate isCertain() { isUse(true, operand, _, _, indirectionIndex) }
override predicate isCertain() { isUse(true, operand, base, _, indirectionIndex) }
override Node getNode() { nodeHasOperand(result, operand, indirectionIndex) }
}
@@ -492,7 +480,13 @@ class FinalParameterUse extends UseImpl, TFinalParameterUse {
result instanceof UnknownDefaultLocation
}
override BaseIRVariable getBaseSourceVariable() { result.getIRVariable().getAst() = p }
override BaseSourceVariableInstruction getBase() {
exists(InitializeParameterInstruction init |
init.getParameter() = p and
// This is always a `VariableAddressInstruction`
result = init.getAnOperand().getDef()
)
}
}
/**
@@ -578,8 +572,8 @@ class GlobalUse extends UseImpl, TGlobalUse {
)
}
override BaseSourceVariable getBaseSourceVariable() {
baseSourceVariableIsGlobal(result, global, f)
override SourceVariable getSourceVariable() {
sourceVariableIsGlobal(result, global, f, this.getIndirection())
}
final override Cpp::Location getLocation() { result = f.getLocation() }
@@ -596,6 +590,8 @@ class GlobalUse extends UseImpl, TGlobalUse {
Type getUnderlyingType() { result = global.getUnderlyingType() }
override predicate isCertain() { any() }
override BaseSourceVariableInstruction getBase() { none() }
}
/**
@@ -625,8 +621,8 @@ class GlobalDefImpl extends DefImpl, TGlobalDefImpl {
}
/** Gets the global variable associated with this definition. */
override BaseSourceVariable getBaseSourceVariable() {
baseSourceVariableIsGlobal(result, global, f)
override SourceVariable getSourceVariable() {
sourceVariableIsGlobal(result, global, f, this.getIndirection())
}
override int getIndirection() { result = indirectionIndex }
@@ -649,6 +645,8 @@ class GlobalDefImpl extends DefImpl, TGlobalDefImpl {
override string toString() { result = "Def of " + this.getSourceVariable() }
override Location getLocation() { result = f.getLocation() }
override BaseSourceVariableInstruction getBase() { none() }
}
/**
@@ -657,9 +655,19 @@ class GlobalDefImpl extends DefImpl, TGlobalDefImpl {
*/
predicate adjacentDefRead(IRBlock bb1, int i1, SourceVariable sv, IRBlock bb2, int i2) {
adjacentDefReadExt(_, sv, bb1, i1, bb2, i2)
or
exists(PhiNode phi |
lastRefRedefExt(_, sv, bb1, i1, phi) and
phi.definesAt(sv, bb2, i2, _)
)
}
predicate useToNode(IRBlock bb, int i, SourceVariable sv, Node nodeTo) {
exists(Phi phi |
phi.asPhi().definesAt(sv, bb, i, _) and
nodeTo = phi.getNode()
)
or
exists(UseImpl use |
use.hasIndexInBlock(bb, i, sv) and
nodeTo = use.getNode()
@@ -713,26 +721,46 @@ predicate nodeToDefOrUse(Node node, SourceVariable sv, IRBlock bb, int i, boolea
*/
private predicate indirectConversionFlowStep(Node nFrom, Node nTo) {
not exists(SourceVariable sv, IRBlock bb2, int i2 |
useToNode(bb2, i2, sv, nTo) and
nodeToDefOrUse(nTo, sv, bb2, i2, _) and
adjacentDefRead(bb2, i2, sv, _, _)
) and
exists(Operand op1, Operand op2, int indirectionIndex, Instruction instr |
hasOperandAndIndex(nFrom, op1, pragma[only_bind_into](indirectionIndex)) and
hasOperandAndIndex(nTo, op2, pragma[only_bind_into](indirectionIndex)) and
instr = op2.getDef() and
conversionFlow(op1, instr, _, _)
(
exists(Operand op1, Operand op2, int indirectionIndex, Instruction instr |
hasOperandAndIndex(nFrom, op1, pragma[only_bind_into](indirectionIndex)) and
hasOperandAndIndex(nTo, op2, pragma[only_bind_into](indirectionIndex)) and
instr = op2.getDef() and
conversionFlow(op1, instr, _, _)
)
or
exists(Operand op1, Operand op2, int indirectionIndex, Instruction instr |
hasOperandAndIndex(nFrom, op1, pragma[only_bind_into](indirectionIndex)) and
hasOperandAndIndex(nTo, op2, indirectionIndex - 1) and
instr = op2.getDef() and
isDereference(instr, op1, _)
)
)
}
/**
* Holds if `node` is a phi input node that should receive flow from the
* definition to (or use of) `sv` at `(bb1, i1)`.
* The reason for this predicate is a bit annoying:
* We cannot mark a `PointerArithmeticInstruction` that computes an offset based on some SSA
* variable `x` as a use of `x` since this creates taint-flow in the following example:
* ```c
* int x = array[source]
* sink(*array)
* ```
* This is because `source` would flow from the operand of `PointerArithmeticInstruction` to the
* result of the instruction, and into the `IndirectOperand` that represents the value of `*array`.
* Then, via use-use flow, flow will arrive at `*array` in `sink(*array)`.
*
* So this predicate recurses back along conversions and `PointerArithmeticInstruction`s to find the
* first use that has provides use-use flow, and uses that target as the target of the `nodeFrom`.
*/
private predicate phiToNode(SsaPhiInputNode node, SourceVariable sv, IRBlock bb1, int i1) {
exists(PhiNode phi, IRBlock input |
phi.hasInputFromBlock(_, sv, bb1, i1, input) and
node.getPhiNode() = phi and
node.getBlock() = input
private predicate adjustForPointerArith(PostUpdateNode pun, SourceVariable sv, IRBlock bb2, int i2) {
exists(IRBlock bb1, int i1, Node adjusted |
indirectConversionFlowStep*(adjusted, pun.getPreUpdateNode()) and
nodeToDefOrUse(adjusted, sv, bb1, i1, _) and
adjacentDefRead(bb1, i1, sv, bb2, i2)
)
}
@@ -747,14 +775,10 @@ private predicate phiToNode(SsaPhiInputNode node, SourceVariable sv, IRBlock bb1
private predicate ssaFlowImpl(
IRBlock bb1, int i1, SourceVariable sv, Node nodeFrom, Node nodeTo, boolean uncertain
) {
nodeToDefOrUse(nodeFrom, sv, bb1, i1, uncertain) and
(
exists(IRBlock bb2, int i2 |
adjacentDefRead(bb1, i1, sv, bb2, i2) and
useToNode(bb2, i2, sv, nodeTo)
)
or
phiToNode(nodeTo, sv, bb1, i1)
exists(IRBlock bb2, int i2 |
nodeToDefOrUse(nodeFrom, sv, bb1, i1, uncertain) and
adjacentDefRead(bb1, i1, sv, bb2, i2) and
useToNode(bb2, i2, sv, nodeTo)
) and
nodeFrom != nodeTo
}
@@ -763,7 +787,7 @@ private predicate ssaFlowImpl(
private Node getAPriorDefinition(DefinitionExt next) {
exists(IRBlock bb, int i, SourceVariable sv |
lastRefRedefExt(_, pragma[only_bind_into](sv), pragma[only_bind_into](bb),
pragma[only_bind_into](i), _, next) and
pragma[only_bind_into](i), next) and
nodeToDefOrUse(result, sv, bb, i, _)
)
}
@@ -870,31 +894,9 @@ private predicate isArgumentOfCallable(DataFlowCall call, Node n) {
* Holds if there is use-use flow from `pun`'s pre-update node to `n`.
*/
private predicate postUpdateNodeToFirstUse(PostUpdateNode pun, Node n) {
// We cannot mark a `PointerArithmeticInstruction` that computes an offset
// based on some SSA
// variable `x` as a use of `x` since this creates taint-flow in the
// following example:
// ```c
// int x = array[source]
// sink(*array)
// ```
// This is because `source` would flow from the operand of `PointerArithmetic`
// instruction to the result of the instruction, and into the `IndirectOperand`
// that represents the value of `*array`. Then, via use-use flow, flow will
// arrive at `*array` in `sink(*array)`.
// So this predicate recurses back along conversions and `PointerArithmetic`
// instructions to find the first use that has provides use-use flow, and
// uses that target as the target of the `nodeFrom`.
exists(Node adjusted, IRBlock bb1, int i1, SourceVariable sv |
indirectConversionFlowStep*(adjusted, pun.getPreUpdateNode()) and
useToNode(bb1, i1, sv, adjusted)
|
exists(IRBlock bb2, int i2 |
adjacentDefRead(bb1, i1, sv, bb2, i2) and
useToNode(bb2, i2, sv, n)
)
or
phiToNode(n, sv, bb1, i1)
exists(SourceVariable sv, IRBlock bb2, int i2 |
adjustForPointerArith(pun, sv, bb2, i2) and
useToNode(bb2, i2, sv, n)
)
}
@@ -949,23 +951,19 @@ predicate postUpdateFlow(PostUpdateNode pun, Node nodeTo) {
/** Holds if `nodeTo` receives flow from the phi node `nodeFrom`. */
predicate fromPhiNode(SsaPhiNode nodeFrom, Node nodeTo) {
exists(PhiNode phi, SourceVariable sv, IRBlock bb1, int i1 |
exists(PhiNode phi, SourceVariable sv, IRBlock bb1, int i1, IRBlock bb2, int i2 |
phi = nodeFrom.getPhiNode() and
phi.definesAt(sv, bb1, i1, _)
|
exists(IRBlock bb2, int i2 |
adjacentDefRead(bb1, i1, sv, bb2, i2) and
useToNode(bb2, i2, sv, nodeTo)
)
or
phiToNode(nodeTo, sv, bb1, i1)
phi.definesAt(sv, bb1, i1, _) and
adjacentDefRead(bb1, i1, sv, bb2, i2) and
useToNode(bb2, i2, sv, nodeTo)
)
}
private predicate baseSourceVariableIsGlobal(
BaseIRVariable base, GlobalLikeVariable global, IRFunction func
private predicate sourceVariableIsGlobal(
SourceVariable sv, GlobalLikeVariable global, IRFunction func, int indirectionIndex
) {
exists(IRVariable irVar |
exists(IRVariable irVar, BaseIRVariable base |
sourceVariableHasBaseAndIndex(sv, base, indirectionIndex) and
irVar = base.getIRVariable() and
irVar.getEnclosingIRFunction() = func and
global = irVar.getAst() and
@@ -1032,26 +1030,22 @@ module SsaCached {
* Holds if the node at index `i` in `bb` is a last reference to SSA definition
* `def`. The reference is last because it can reach another write `next`,
* without passing through another read or write.
*
* The path from node `i` in `bb` to `next` goes via basic block `input`,
* which is either a predecessor of the basic block of `next`, or `input` =
* `bb` in case `next` occurs in basic block `bb`.
*/
cached
predicate lastRefRedefExt(
DefinitionExt def, SourceVariable sv, IRBlock bb, int i, IRBlock input, DefinitionExt next
DefinitionExt def, SourceVariable sv, IRBlock bb, int i, DefinitionExt next
) {
SsaImpl::lastRefRedefExt(def, sv, bb, i, input, next)
SsaImpl::lastRefRedefExt(def, sv, bb, i, next)
}
cached
Definition phiHasInputFromBlockExt(PhiNode phi, IRBlock bb) {
SsaImpl::phiHasInputFromBlockExt(phi, result, bb)
Definition phiHasInputFromBlock(PhiNode phi, IRBlock bb) {
SsaImpl::phiHasInputFromBlock(phi, result, bb)
}
cached
predicate ssaDefReachesReadExt(SourceVariable v, DefinitionExt def, IRBlock bb, int i) {
SsaImpl::ssaDefReachesReadExt(v, def, bb, i)
predicate ssaDefReachesRead(SourceVariable v, Definition def, IRBlock bb, int i) {
SsaImpl::ssaDefReachesRead(v, def, bb, i)
}
predicate variableRead = SsaInput::variableRead/4;
@@ -1203,11 +1197,11 @@ class Phi extends TPhi, SsaDef {
final override Location getLocation() { result = phi.getBasicBlock().getLocation() }
override string toString() { result = phi.toString() }
override string toString() { result = "Phi" }
SsaPhiInputNode getNode(IRBlock block) { result.getPhiNode() = phi and result.getBlock() = block }
SsaPhiNode getNode() { result.getPhiNode() = phi }
predicate hasInputFromBlock(Definition inp, IRBlock bb) { inp = phiHasInputFromBlockExt(phi, bb) }
predicate hasInputFromBlock(Definition inp, IRBlock bb) { inp = phiHasInputFromBlock(phi, bb) }
final Definition getAnInput() { this.hasInputFromBlock(result, _) }
}
@@ -1233,21 +1227,13 @@ class PhiNode extends SsaImpl::DefinitionExt {
*/
predicate isPhiRead() { this instanceof SsaImpl::PhiReadNode }
/**
* Holds if the node at index `i` in `bb` is a last reference to SSA
* definition `def` of `sv`. The reference is last because it can reach
* this phi node, without passing through another read or write.
*
* The path from node `i` in `bb` to this phi node goes via basic block
* `input`, which is either a predecessor of the basic block of this phi
* node, or `input` = `bb` in case this phi node occurs in basic block `bb`.
*/
predicate hasInputFromBlock(DefinitionExt def, SourceVariable sv, IRBlock bb, int i, IRBlock input) {
SsaCached::lastRefRedefExt(def, sv, bb, i, input, this)
/** Holds if `inp` is an input to this phi node along the edge originating in `bb`. */
predicate hasInputFromBlock(Definition inp, IRBlock bb) {
inp = SsaCached::phiHasInputFromBlock(this, bb)
}
/** Gets a definition that is an input to this phi node. */
final Definition getAnInput() { this.hasInputFromBlock(result, _, _, _, _) }
final Definition getAnInput() { this.hasInputFromBlock(result, _) }
}
/** An static single assignment (SSA) definition. */
@@ -1262,15 +1248,6 @@ class DefinitionExt extends SsaImpl::DefinitionExt {
result = this.getAPhiInputOrPriorDefinition*() and
not result instanceof PhiNode
}
/** Gets a node that represents a read of this SSA definition. */
Node getARead() {
exists(SourceVariable sv, IRBlock bb, int i | SsaCached::ssaDefReachesReadExt(sv, this, bb, i) |
useToNode(bb, i, sv, result)
or
phiToNode(result, sv, bb, i)
)
}
}
class Definition = SsaImpl::Definition;

View File

@@ -830,12 +830,6 @@ newtype TTranslatedElement =
not ignoreExpr(dc)
)
} or
// The set of destructors to invoke after a handler for a `try` statement. These
// need to be special cased because the destructors need to run following an
// `ExceptionEdge`, but not following a `GotoEdge` edge.
TTranslatedDestructorsAfterHandler(Handler handler) {
exists(handler.getAnImplicitDestructorCall())
} or
// A precise side effect of an argument to a `Call`
TTranslatedArgumentExprSideEffect(Call call, Expr expr, int n, SideEffectOpcode opcode) {
not ignoreExpr(expr) and

View File

@@ -1844,6 +1844,9 @@ class TranslatedAssignExpr extends TranslatedNonConstantExpr {
child = this.getRightOperand() and
result = this.getLeftOperand().getFirstInstruction(kind)
or
child = this.getRightOperand() and
result = this.getLeftOperand().getFirstInstruction(kind)
or
kind instanceof GotoEdge and
child = this.getLeftOperand() and
result = this.getInstruction(AssignmentStoreTag())
@@ -3208,20 +3211,9 @@ class TranslatedBuiltInOperation extends TranslatedNonConstantExpr {
final override Instruction getResult() { result = this.getInstruction(OnlyInstructionTag()) }
/**
* Gets the rnk'th (0-indexed) child for which a `TranslatedElement` exists.
*
* We use this predicate to filter out `TypeName` expressions that sometimes
* occur in builtin operations since the IR doesn't have an instruction to
* represent a reference to a type.
*/
private TranslatedElement getRankedChild(int rnk) {
result = rank[rnk + 1](int id, TranslatedElement te | te = this.getChild(id) | te order by id)
}
final override Instruction getFirstInstruction(EdgeKind kind) {
if exists(this.getRankedChild(0))
then result = this.getRankedChild(0).getFirstInstruction(kind)
if exists(this.getChild(0))
then result = this.getChild(0).getFirstInstruction(kind)
else (
kind instanceof GotoEdge and result = this.getInstruction(OnlyInstructionTag())
)
@@ -3241,11 +3233,11 @@ class TranslatedBuiltInOperation extends TranslatedNonConstantExpr {
}
final override Instruction getChildSuccessorInternal(TranslatedElement child, EdgeKind kind) {
exists(int id | child = this.getRankedChild(id) |
result = this.getRankedChild(id + 1).getFirstInstruction(kind)
exists(int id | child = this.getChild(id) |
result = this.getChild(id + 1).getFirstInstruction(kind)
or
kind instanceof GotoEdge and
not exists(this.getRankedChild(id + 1)) and
not exists(this.getChild(id + 1)) and
result = this.getInstruction(OnlyInstructionTag())
)
}
@@ -3260,7 +3252,7 @@ class TranslatedBuiltInOperation extends TranslatedNonConstantExpr {
tag = OnlyInstructionTag() and
exists(int index |
operandTag = positionalArgumentOperand(index) and
result = this.getRankedChild(index).(TranslatedExpr).getResult()
result = this.getChild(index).(TranslatedExpr).getResult()
)
}

View File

@@ -777,72 +777,6 @@ abstract class TranslatedHandler extends TranslatedStmt {
TranslatedStmt getBlock() { result = getTranslatedStmt(stmt.getBlock()) }
}
/**
* The IR translation of the destructor calls of the parent `TranslatedCatchByTypeHandler`.
*
* This object does not itself generate the destructor calls. Instead, its
* children provide the actual calls.
*/
class TranslatedDestructorsAfterHandler extends TranslatedElement,
TTranslatedDestructorsAfterHandler
{
Handler handler;
TranslatedDestructorsAfterHandler() { this = TTranslatedDestructorsAfterHandler(handler) }
override string toString() { result = "Destructor calls after handler: " + handler }
private TranslatedCall getTranslatedImplicitDestructorCall(int id) {
result.getExpr() = handler.getImplicitDestructorCall(id)
}
override Instruction getFirstInstruction(EdgeKind kind) {
result = this.getChild(0).getFirstInstruction(kind)
}
override Handler getAst() { result = handler }
override Instruction getInstructionSuccessorInternal(InstructionTag tag, EdgeKind kind) { none() }
override TranslatedElement getChild(int id) {
result = this.getTranslatedImplicitDestructorCall(id)
}
override predicate handlesDestructorsExplicitly() { any() }
override Declaration getFunction() { result = handler.getEnclosingFunction() }
override Instruction getChildSuccessorInternal(TranslatedElement child, EdgeKind kind) {
exists(int id | child = this.getChild(id) |
// Transition to the next child, if any.
result = this.getChild(id + 1).getFirstInstruction(kind)
or
// And otherwise go to the next handler, if any.
not exists(this.getChild(id + 1)) and
result =
getTranslatedStmt(handler)
.getParent()
.(TranslatedTryStmt)
.getNextHandler(getTranslatedStmt(handler), kind)
)
}
override TranslatedElement getLastChild() {
result =
this.getTranslatedImplicitDestructorCall(max(int id |
exists(handler.getImplicitDestructorCall(id))
))
}
override Instruction getALastInstructionInternal() {
result = this.getLastChild().getALastInstruction()
}
override predicate hasInstruction(Opcode opcode, InstructionTag tag, CppType resultType) {
none()
}
}
/**
* The IR translation of a C++ `catch` block that catches an exception with a
* specific type (e.g. `catch (const std::exception&)`).
@@ -856,14 +790,10 @@ class TranslatedCatchByTypeHandler extends TranslatedHandler {
resultType = getVoidType()
}
override predicate handlesDestructorsExplicitly() { any() }
override TranslatedElement getChildInternal(int id) {
result = super.getChildInternal(id)
or
id = 0 and result = this.getParameter()
or
id = 1 and result = this.getDestructors()
}
override Instruction getChildSuccessorInternal(TranslatedElement child, EdgeKind kind) {
@@ -880,9 +810,7 @@ class TranslatedCatchByTypeHandler extends TranslatedHandler {
result = this.getParameter().getFirstInstruction(kind)
or
kind instanceof ExceptionEdge and
if exists(this.getDestructors())
then result = this.getDestructors().getFirstInstruction(any(GotoEdge edge))
else result = this.getParent().(TranslatedTryStmt).getNextHandler(this, any(GotoEdge edge))
result = this.getParent().(TranslatedTryStmt).getNextHandler(this, any(GotoEdge edge))
)
}
@@ -894,8 +822,6 @@ class TranslatedCatchByTypeHandler extends TranslatedHandler {
private TranslatedParameter getParameter() {
result = getTranslatedParameter(stmt.getParameter())
}
private TranslatedDestructorsAfterHandler getDestructors() { result.getAst() = stmt }
}
/**
@@ -916,7 +842,9 @@ class TranslatedCatchAnyHandler extends TranslatedHandler {
}
}
abstract class TranslatedIfLikeStmt extends TranslatedStmt, ConditionContext {
class TranslatedIfStmt extends TranslatedStmt, ConditionContext {
override IfStmt stmt;
override Instruction getFirstInstruction(EdgeKind kind) {
if this.hasInitialization()
then result = this.getInitialization().getFirstInstruction(kind)
@@ -929,8 +857,6 @@ abstract class TranslatedIfLikeStmt extends TranslatedStmt, ConditionContext {
override TranslatedElement getLastChild() { result = this.getElse() or result = this.getThen() }
override predicate handlesDestructorsExplicitly() { any() }
override TranslatedElement getChildInternal(int id) {
id = 0 and result = this.getInitialization()
or
@@ -941,21 +867,25 @@ abstract class TranslatedIfLikeStmt extends TranslatedStmt, ConditionContext {
id = 3 and result = this.getElse()
}
abstract predicate hasInitialization();
private predicate hasInitialization() { exists(stmt.getInitialization()) }
abstract TranslatedStmt getInitialization();
private TranslatedStmt getInitialization() {
result = getTranslatedStmt(stmt.getInitialization())
}
abstract TranslatedCondition getCondition();
private TranslatedCondition getCondition() {
result = getTranslatedCondition(stmt.getCondition().getFullyConverted())
}
private Instruction getFirstConditionInstruction(EdgeKind kind) {
result = this.getCondition().getFirstInstruction(kind)
}
abstract TranslatedStmt getThen();
private TranslatedStmt getThen() { result = getTranslatedStmt(stmt.getThen()) }
abstract TranslatedStmt getElse();
private TranslatedStmt getElse() { result = getTranslatedStmt(stmt.getElse()) }
abstract predicate hasElse();
private predicate hasElse() { exists(stmt.getElse()) }
override Instruction getInstructionSuccessorInternal(InstructionTag tag, EdgeKind kind) { none() }
@@ -968,11 +898,7 @@ abstract class TranslatedIfLikeStmt extends TranslatedStmt, ConditionContext {
child = this.getCondition() and
if this.hasElse()
then result = this.getElse().getFirstInstruction(kind)
else (
if this.hasAnImplicitDestructorCall()
then result = this.getChild(this.getFirstDestructorCallIndex()).getFirstInstruction(kind)
else result = this.getParent().getChildSuccessor(this, kind)
)
else result = this.getParent().getChildSuccessor(this, kind)
}
override Instruction getChildSuccessorInternal(TranslatedElement child, EdgeKind kind) {
@@ -980,24 +906,7 @@ abstract class TranslatedIfLikeStmt extends TranslatedStmt, ConditionContext {
result = this.getFirstConditionInstruction(kind)
or
(child = this.getThen() or child = this.getElse()) and
(
if this.hasAnImplicitDestructorCall()
then result = this.getChild(this.getFirstDestructorCallIndex()).getFirstInstruction(kind)
else result = this.getParent().getChildSuccessor(this, kind)
)
or
exists(int destructorId |
destructorId >= this.getFirstDestructorCallIndex() and
child = this.getChild(destructorId) and
result = this.getChild(destructorId + 1).getFirstInstruction(kind)
)
or
exists(int lastDestructorIndex |
lastDestructorIndex =
max(int n | exists(this.getChild(n)) and n >= this.getFirstDestructorCallIndex()) and
child = this.getChild(lastDestructorIndex) and
result = this.getParent().getChildSuccessor(this, kind)
)
result = this.getParent().getChildSuccessor(this, kind)
}
override predicate hasInstruction(Opcode opcode, InstructionTag tag, CppType resultType) {
@@ -1005,44 +914,76 @@ abstract class TranslatedIfLikeStmt extends TranslatedStmt, ConditionContext {
}
}
class TranslatedIfStmt extends TranslatedIfLikeStmt {
override IfStmt stmt;
override predicate hasInitialization() { exists(stmt.getInitialization()) }
override TranslatedStmt getInitialization() {
result = getTranslatedStmt(stmt.getInitialization())
}
override TranslatedCondition getCondition() {
result = getTranslatedCondition(stmt.getCondition().getFullyConverted())
}
override TranslatedStmt getThen() { result = getTranslatedStmt(stmt.getThen()) }
override TranslatedStmt getElse() { result = getTranslatedStmt(stmt.getElse()) }
override predicate hasElse() { exists(stmt.getElse()) }
}
class TranslatedConstExprIfStmt extends TranslatedIfLikeStmt {
class TranslatedConstExprIfStmt extends TranslatedStmt, ConditionContext {
override ConstexprIfStmt stmt;
override predicate hasInitialization() { exists(stmt.getInitialization()) }
override Instruction getFirstInstruction(EdgeKind kind) {
if this.hasInitialization()
then result = this.getInitialization().getFirstInstruction(kind)
else result = this.getFirstConditionInstruction(kind)
}
override TranslatedStmt getInitialization() {
override TranslatedElement getChildInternal(int id) {
id = 0 and result = this.getInitialization()
or
id = 1 and result = this.getCondition()
or
id = 2 and result = this.getThen()
or
id = 3 and result = this.getElse()
}
private predicate hasInitialization() { exists(stmt.getInitialization()) }
private TranslatedStmt getInitialization() {
result = getTranslatedStmt(stmt.getInitialization())
}
override TranslatedCondition getCondition() {
private TranslatedCondition getCondition() {
result = getTranslatedCondition(stmt.getCondition().getFullyConverted())
}
override TranslatedStmt getThen() { result = getTranslatedStmt(stmt.getThen()) }
private Instruction getFirstConditionInstruction(EdgeKind kind) {
result = this.getCondition().getFirstInstruction(kind)
}
override TranslatedStmt getElse() { result = getTranslatedStmt(stmt.getElse()) }
private TranslatedStmt getThen() { result = getTranslatedStmt(stmt.getThen()) }
override predicate hasElse() { exists(stmt.getElse()) }
private TranslatedStmt getElse() { result = getTranslatedStmt(stmt.getElse()) }
private predicate hasElse() { exists(stmt.getElse()) }
override Instruction getInstructionSuccessorInternal(InstructionTag tag, EdgeKind kind) { none() }
override Instruction getChildTrueSuccessor(TranslatedCondition child, EdgeKind kind) {
child = this.getCondition() and
result = this.getThen().getFirstInstruction(kind)
}
override Instruction getChildFalseSuccessor(TranslatedCondition child, EdgeKind kind) {
child = this.getCondition() and
if this.hasElse()
then result = this.getElse().getFirstInstruction(kind)
else result = this.getParent().getChildSuccessor(this, kind)
}
override Instruction getChildSuccessorInternal(TranslatedElement child, EdgeKind kind) {
child = this.getInitialization() and
result = this.getFirstConditionInstruction(kind)
or
(child = this.getThen() or child = this.getElse()) and
result = this.getParent().getChildSuccessor(this, kind)
}
override predicate hasInstruction(Opcode opcode, InstructionTag tag, CppType resultType) {
none()
}
override Instruction getALastInstructionInternal() {
result = this.getThen().getALastInstruction()
or
result = this.getElse().getALastInstruction()
}
}
abstract class TranslatedLoop extends TranslatedStmt, ConditionContext {

View File

@@ -1,4 +1,4 @@
description: Removed unused column from the `folders` and `files` relations
compatibility: full
files.rel: reorder files.rel (@file id, string name, string simple, string ext, int fromSource) id name
folders.rel: reorder folders.rel (@folder id, string name, string simple) id name
files.rel: reorder files.rel (int id, string name, string simple, string ext, int fromSource) id name
folders.rel: reorder folders.rel (int id, string name, string simple) id name

View File

@@ -1,30 +1,3 @@
## 1.0.2
No user-facing changes.
## 1.0.1
### Minor Analysis Improvements
* The `cpp/dangerous-function-overflow` no longer produces a false positive alert when the `gets` function does not have exactly one parameter.
## 1.0.0
### Breaking Changes
* CodeQL package management is now generally available, and all GitHub-produced CodeQL packages have had their version numbers increased to 1.0.0.
### Minor Analysis Improvements
* The "Use of unique pointer after lifetime ends" query (`cpp/use-of-unique-pointer-after-lifetime-ends`) no longer reports an alert when the pointer is converted to a boolean
* The "Variable not initialized before use" query (`cpp/not-initialised`) no longer reports an alert on static variables.
## 0.9.12
### New Queries
* Added a new query, `cpp/iterator-to-expired-container`, to detect the creation of iterators owned by a temporary objects that are about to be destroyed.
## 0.9.11
### Minor Analysis Improvements

View File

@@ -14,32 +14,13 @@ the program, or security vulnerabilities, by allowing an attacker to overwrite a
</overview>
<recommendation>
<p>
Ensure that all execution paths deallocate the allocated memory at most once. In complex cases it may
help to reassign a pointer to a null value after deallocating it. This will prevent double-free vulnerabilities
since most deallocation functions will perform a null-pointer check before attempting to deallocate memory.
Ensure that all execution paths deallocate the allocated memory at most once. If possible, reassign
the pointer to a null value after deallocating it. This will prevent double-free vulnerabilities since
most deallocation functions will perform a null-pointer check before attempting to deallocate the memory.
</p>
</recommendation>
<example>
<p>
In the following example, <code>buff</code> is allocated and then freed twice:
</p>
<sample src="DoubleFreeBad.cpp" />
<p>
Reviewing the code above, the issue can be fixed by simply deleting the additional call to
<code>free(buff)</code>.
</p>
<sample src="DoubleFreeGood.cpp" />
<p>
In the next example, <code>task</code> may be deleted twice, if an exception occurs inside the <code>try</code>
block after the first <code>delete</code>:
</p>
<sample src="DoubleFreeBad2.cpp" />
<p>
The problem can be solved by assigning a null value to the pointer after the first <code>delete</code>, as
calling <code>delete</code> a second time on the null pointer is harmless.
</p>
<sample src="DoubleFreeGood2.cpp" />
<example><sample src="DoubleFree.cpp" />
</example>
<references>

View File

@@ -1,16 +0,0 @@
void g() {
MyTask *task = nullptr;
try
{
task = new MyTask;
...
delete task;
...
} catch (...) {
delete task; // BAD: potential double-free
}
}

View File

@@ -1,7 +0,0 @@
int* f() {
int *buff = malloc(SIZE*sizeof(int));
do_stuff(buff);
free(buff); // GOOD: buff is only freed once.
int *new_buffer = malloc(SIZE*sizeof(int));
return new_buffer;
}

View File

@@ -1,17 +0,0 @@
void g() {
MyTask *task = nullptr;
try
{
task = new MyTask;
...
delete task;
task = nullptr;
...
} catch (...) {
delete task; // GOOD: harmless if task is NULL
}
}

View File

@@ -54,7 +54,6 @@ predicate undefinedLocalUse(VariableAccess va) {
// it is hard to tell when a struct or array has been initialized, so we
// ignore them
not isAggregateType(lv.getUnderlyingType()) and
not lv.isStatic() and // static variables are initialized to zero or null by default
not lv.getType().hasName("va_list") and
va = lv.getAnAccess() and
noDefPath(lv, va) and
@@ -71,8 +70,7 @@ predicate uninitialisedGlobal(GlobalVariable gv) {
va = gv.getAnAccess() and
va.isRValue() and
not gv.hasInitializer() and
not gv.hasSpecifier("extern") and
not gv.isStatic() // static variables are initialized to zero or null by default
not gv.hasSpecifier("extern")
)
}

View File

@@ -42,7 +42,7 @@ in the previous example, one solution is to make the log message a trailing argu
<p>An alternative solution is to allow <code>log_with_timestamp</code> to accept format arguments:</p>
<sample src="NonConstantFormat-2-good.c" />
<p>In this formulation, the non-constant format string to <code>printf</code> has been replaced with
a non-constant format string to <code>vprintf</code>. The analysis will no longer consider the body of
a non-constant format string to <code>vprintf</code>. Semmle will no longer consider the body of
<code>log_with_timestamp</code> to be a problem, and will instead check that every call to
<code>log_with_timestamp</code> passes a constant format string.</p>

View File

@@ -22,8 +22,10 @@ function.
</example>
<references>
<li>CERT C Coding Standard: <a href="https://wiki.sei.cmu.edu/confluence/display/c/FIO47-C.+Use+valid+format+strings">FIO47-C. Use valid format strings</a>.</li>
<li>cplusplus.com: <a href="http://www.tutorialspoint.com/cplusplus/cpp_functions.htm">C++ Functions</a>.</li>
<li>Microsoft C Runtime Library Reference: <a href="https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/printf-printf-l-wprintf-wprintf-l">printf, wprintf</a>.</li>
</references>
</qhelp>

View File

@@ -19,8 +19,8 @@ contents.
</overview>
<recommendation>
<p>Review the format and arguments expected by the highlighted function calls. Update either
the format or the arguments so that the expected number of arguments are passed to the
<p>Review the format and arguments expected by the highlighted function calls. Update either
the format or the arguments so that the expected number of arguments are passed to the
function.
</p>
@@ -30,8 +30,11 @@ function.
</example>
<references>
<li>CERT C Coding Standard: <a href="https://wiki.sei.cmu.edu/confluence/display/c/FIO47-C.+Use+valid+format+strings">FIO47-C. Use valid format strings</a>.</li>
<li>CERT C Coding
Standard: <a href="https://www.securecoding.cert.org/confluence/display/c/FIO30-C.+Exclude+user+input+from+format+strings">FIO30-C. Exclude user input from format strings</a>.</li>
<li>cplusplus.com: <a href="http://www.tutorialspoint.com/cplusplus/cpp_functions.htm">C++ Functions</a>.</li>
<li>Microsoft C Runtime Library Reference: <a href="https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/printf-printf-l-wprintf-wprintf-l">printf, wprintf</a>.</li>
</references>
</qhelp>

View File

@@ -0,0 +1,4 @@
int main() {
printf("%s\n", 42); //printf will treat 42 as a char*, will most likely segfault
return 0;
}

View File

@@ -4,33 +4,29 @@
<qhelp>
<overview>
<p>Each call to the <code>printf</code> function or a related function should include
the type and sequence of arguments defined by the format. If the function is passed arguments
the type and sequence of arguments defined by the format. If the function is passed arguments
of a different type or in a different sequence then the arguments are reinterpreted to fit the type and sequence expected, resulting in unpredictable behavior.</p>
</overview>
<recommendation>
<p>Review the format and arguments expected by the highlighted function calls. Update either
the format or the arguments so that the expected type and sequence of arguments are passed to
<p>Review the format and arguments expected by the highlighted function calls. Update either
the format or the arguments so that the expected type and sequence of arguments are passed to
the function.
</p>
</recommendation>
<example>
<p>In the following example, the wrong format specifier is given for an integer format argument:</p>
<sample src="WrongTypeFormatArgumentsBad.cpp" />
<p>The corrected version uses <code>%i</code> as the format specifier for the integer format argument:</p>
<sample src="WrongTypeFormatArgumentsGood.cpp" />
<example><sample src="WrongTypeFormatArguments.cpp" />
</example>
<references>
<li>Microsoft Learn: <a href="https://learn.microsoft.com/en-us/cpp/c-runtime-library/format-specification-syntax-printf-and-wprintf-functions?view=msvc-170">Format specification syntax: printf and wprintf functions</a>.</li>
<li>cplusplus.com:<a href="https://cplusplus.com/reference/cstdio/printf/"></a>printf</li>
<li>CERT C Coding Standard: <a href="https://wiki.sei.cmu.edu/confluence/display/c/FIO47-C.+Use+valid+format+strings">FIO47-C. Use valid format strings</a>.</li>
<li>CERT C Coding
Standard: <a href="https://www.securecoding.cert.org/confluence/display/c/FIO30-C.+Exclude+user+input+from+format+strings">FIO30-C. Exclude user input from format strings</a>.</li>
<li>cplusplus.com: <a href="http://www.tutorialspoint.com/cplusplus/cpp_functions.htm">C++ Functions</a>.</li>
<li>CRT Alphabetical Function Reference: <a href="https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/printf-printf-l-wprintf-wprintf-l">printf, _printf_l, wprintf, _wprintf_l</a>.</li>
</references>
</qhelp>

View File

@@ -1,4 +0,0 @@
int main() {
printf("%s\n", 42); // BAD: printf will treat 42 as a char*, will most likely segfault
return 0;
}

View File

@@ -1,4 +0,0 @@
int main() {
printf("%i\n", 42); // GOOD: printf will treat 42 as an int
return 0;
}

View File

@@ -2,18 +2,19 @@
void f_warning(int i)
{
// BAD: the usage of the logical not operator in this case is unlikely to be correct
// The usage of the logical not operator in this case is unlikely to be correct
// as the output is being used as an operator for a bit-wise and operation
if (i & !FLAGS)
if (i & !FLAGS)
{
// code
}
}
void f_fixed(int i)
{
if (i & ~FLAGS) // GOOD: Changing the logical not operator for the bit-wise not operator would fix this logic
if (i & ~FLAGS) // Changing the logical not operator for the bit-wise not operator would fix this logic
{
// code
}
}
}

View File

@@ -16,13 +16,7 @@
<p>Carefully inspect the flagged expressions. Consider the intent in the code logic, and decide whether it is necessary to change the not operator.</p>
</recommendation>
<example>
<p>Here is an example of this issue and how it can be fixed:</p>
<sample src="IncorrectNotOperatorUsage.cpp" />
<p>In other cases, particularly when the expressions have <code>bool</code> type, the fix may instead be of the form <code>a &amp;&amp; !b</code>.</p>
</example>
<example><sample src="IncorrectNotOperatorUsage.cpp" /></example>
<references>
<li>

View File

@@ -37,19 +37,6 @@ class AllocaCall extends FunctionCall {
}
}
/**
* Gets an expression associated with a dataflow node.
*/
private Expr getExpr(DataFlow::Node node) {
result = node.asInstruction().getAst()
or
result = node.asOperand().getUse().getAst()
or
result = node.(DataFlow::RawIndirectInstruction).getInstruction().getAst()
or
result = node.(DataFlow::RawIndirectOperand).getOperand().getUse().getAst()
}
/**
* A loop that contains an `alloca` call.
*/
@@ -198,6 +185,19 @@ class LoopWithAlloca extends Stmt {
not this.conditionReachesWithoutUpdate(var, this.(Loop).getCondition())
}
/**
* Gets an expression associated with a dataflow node.
*/
private Expr getExpr(DataFlow::Node node) {
result = node.asInstruction().getAst()
or
result = node.asOperand().getUse().getAst()
or
result = node.(DataFlow::RawIndirectInstruction).getInstruction().getAst()
or
result = node.(DataFlow::RawIndirectOperand).getOperand().getUse().getAst()
}
/**
* Gets a definition that may be the most recent definition of the
* controlling variable `var` before this loop.
@@ -209,9 +209,8 @@ class LoopWithAlloca extends Stmt {
DataFlow::localFlow(result, DataFlow::exprNode(va)) and
// Phi nodes will be preceded by nodes that represent actual definitions
not result instanceof DataFlow::SsaPhiNode and
not result instanceof DataFlow::SsaPhiInputNode and
// A source is outside the loop if it's not inside the loop
not exists(Expr e | e = getExpr(result) | this = getAnEnclosingLoopOfExpr(e))
not exists(Expr e | e = this.getExpr(result) | this = getAnEnclosingLoopOfExpr(e))
)
}
@@ -222,9 +221,9 @@ class LoopWithAlloca extends Stmt {
private int getAControllingVarInitialValue(Variable var, DataFlow::Node source) {
source = this.getAPrecedingDef(var) and
(
result = getExpr(source).getValue().toInt()
result = this.getExpr(source).getValue().toInt()
or
result = getExpr(source).(Assignment).getRValue().getValue().toInt()
result = this.getExpr(source).(Assignment).getRValue().getValue().toInt()
)
}

View File

@@ -107,7 +107,7 @@ class SnprintfSizeExpr extends BufferAccess, FunctionCall {
}
class MemcmpSizeExpr extends BufferAccess, FunctionCall {
MemcmpSizeExpr() { this.getTarget().hasName("memcmp") }
MemcmpSizeExpr() { this.getTarget().hasName("Memcmp") }
override Expr getPointer() {
result = this.getArgument(0) or

View File

@@ -0,0 +1,2 @@
strncpy(dest, src, sizeof(src)); //wrong: size of dest should be used
strncpy(dest, src, strlen(src)); //wrong: size of dest should be used

View File

@@ -3,7 +3,7 @@
"qhelp.dtd">
<qhelp>
<overview>
<p>The standard library function <code>strncpy</code> copies a source string to a destination buffer. The third argument defines the maximum number of characters to copy and should be less than
<p>The standard library function <code>strncpy</code> copies a source string to a destination buffer. The third argument defines the maximum number of characters to copy and should be less than
or equal to the size of the destination buffer. Calls of the form <code>strncpy(dest, src, strlen(src))</code> or <code>strncpy(dest, src, sizeof(src))</code> incorrectly set the third argument to the size of the source buffer. Executing a call of this type may cause a buffer overflow. Buffer overflows can lead to anything from a segmentation fault to a security vulnerability.</p>
</overview>
@@ -12,20 +12,14 @@ or equal to the size of the destination buffer. Calls of the form <code>strncpy(
not the source buffer.</p>
</recommendation>
<example><sample src="StrncpyFlippedArgs.cpp" />
<example>
<p>In the following examples, the size of the source buffer is incorrectly used as a parameter to <code>strncpy</code>:</p>
<sample src="StrncpyFlippedArgsBad.cpp" />
<p>The corrected version uses the size of the destination buffer, or a variable containing the size of the destination buffer as the size parameter to <code>strncpy</code>:</p>
<sample src="StrncpyFlippedArgsGood.cpp" />
</example>
<references>
<li>cplusplus.com: <a href="https://cplusplus.com/reference/cstring/strncpy/">strncpy</a>.</li>
<li>cplusplus.com: <a href="http://www.cplusplus.com/reference/clibrary/cstring/strncpy/">strncpy</a>.</li>
<li>
I. Gerg. <em>An Overview and Example of the Buffer-Overflow Exploit</em>. IANewsletter vol 7 no 4. 2005.
</li>

View File

@@ -1,9 +0,0 @@
char src[256];
char dest1[128];
...
strncpy(dest1, src, sizeof(src)); // wrong: size of dest should be used
char *dest2 = (char *)malloc(sz1 + sz2 + sz3);
strncpy(dest2, src, strlen(src)); // wrong: size of dest should be used

View File

@@ -1,10 +0,0 @@
char src[256];
char dest1[128];
...
strncpy(dest1, src, sizeof(dest1)); // correct
size_t destSize = sz1 + sz2 + sz3;
char *dest2 = (char *)malloc(destSize);
strncpy(dest2, src, destSize); // correct

View File

@@ -0,0 +1,22 @@
int main(int argc, char** argv) {
char *userAndFile = argv[2];
{
char fileBuffer[FILENAME_MAX] = "/home/";
char *fileName = fileBuffer;
size_t len = strlen(fileName);
strncat(fileName+len, userAndFile, FILENAME_MAX-len-1);
// BAD: a string from the user is used in a filename
fopen(fileName, "wb+");
}
{
char fileBuffer[FILENAME_MAX] = "/home/";
char *fileName = fileBuffer;
size_t len = strlen(fileName);
// GOOD: use a fixed file
char* fixed = "jim/file.txt";
strncat(fileName+len, fixed, FILENAME_MAX-len-1);
fopen(fileName, "wb+");
}
}

View File

@@ -3,57 +3,36 @@
"qhelp.dtd">
<qhelp>
<overview>
<p>Accessing paths controlled by users can allow an attacker to access unexpected resources. This
<p>Accessing paths controlled by users can allow an attacker to access unexpected resources. This
can result in sensitive information being revealed or deleted, or an attacker being able to influence
behavior by modifying unexpected files.</p>
<p>Paths that are naively constructed from data controlled by a user may be absolute paths, or may contain
unexpected special characters such as "..". Such a path could point anywhere on the file system.</p>
<p>Paths that are naively constructed from data controlled by a user may contain unexpected special characters,
such as "..". Such a path may potentially point to any directory on the filesystem.</p>
</overview>
<recommendation>
<p>Validate user input before using it to construct a file path.</p>
<p>Validate user input before using it to construct a filepath. Ideally, follow these rules:</p>
<p>Common validation methods include checking that the normalized path is relative and does not contain
any ".." components, or checking that the path is contained within a safe folder. The method you should use depends
on how the path is used in the application, and whether the path should be a single path component.
</p>
<p>If the path should be a single path component (such as a file name), you can check for the existence
of any path separators ("/" or "\"), or ".." sequences in the input, and reject the input if any are found.
</p>
<p>
Note that removing "../" sequences is <i>not</i> sufficient, since the input could still contain a path separator
followed by "..". For example, the input ".../...//" would still result in the string "../" if only "../" sequences
are removed.
</p>
<p>Finally, the simplest (but most restrictive) option is to use an allow list of safe patterns and make sure that
the user input matches one of these patterns.</p>
<ul>
<li>Do not allow more than a single "." character.</li>
<li>Do not allow directory separators such as "/" or "\" (depending on the filesystem).</li>
<li>Do not rely on simply replacing problematic sequences such as "../". For example, after applying this filter to
".../...//" the resulting string would still be "../".</li>
<li>Ideally use a whitelist of known good patterns.</li>
</ul>
</recommendation>
<example>
<p>In this example, a file name is read from a user and then used to access a file.
However, a malicious user could enter a file name anywhere on the file system,
such as "/etc/passwd" or "../../../etc/passwd".</p>
<p>In this example, a username and file are read from the arguments to main and then used to access a file in the
user's home directory. However, a malicious user could enter a filename which contains special
characters. For example, the string "../../etc/passwd" will result in the code reading the file located at
"/home/[user]/../../etc/passwd", which is the system's password file. This could potentially allow them to
access all the system's passwords.</p>
<sample src="examples/TaintedPath.c" />
<p>
If the input should only be a file name, you can check that it doesn't contain any path separators or ".." sequences.
</p>
<sample src="examples/TaintedPathNormalize.c" />
<p>
If the input should be within a specific directory, you can check that the resolved path
is still contained within that directory.
</p>
<sample src="examples/TaintedPathFolder.c" />
<sample src="TaintedPath.c" />
</example>
<references>
@@ -62,7 +41,6 @@ is still contained within that directory.
OWASP:
<a href="https://owasp.org/www-community/attacks/Path_Traversal">Path Traversal</a>.
</li>
<li>Linux man pages: <a href="https://man7.org/linux/man-pages/man3/realpath.3.html">realpath(3)</a>.</li>
</references>
</qhelp>

View File

@@ -1,10 +0,0 @@
int main(int argc, char** argv) {
char *userAndFile = argv[2];
{
char fileBuffer[PATH_MAX];
snprintf(fileBuffer, sizeof(fileBuffer), "/home/%s", userAndFile);
// BAD: a string from the user is used in a filename
fopen(fileBuffer, "wb+");
}
}

View File

@@ -1,28 +0,0 @@
#include <stdio.h>
#include <string.h>
int main(int argc, char** argv) {
char *userAndFile = argv[2];
const char *baseDir = "/home/user/public/";
char fullPath[PATH_MAX];
// Attempt to concatenate the base directory and the user-supplied path
snprintf(fullPath, sizeof(fullPath), "%s%s", baseDir, userAndFile);
// Resolve the absolute path, normalizing any ".." or "."
char *resolvedPath = realpath(fullPath, NULL);
if (resolvedPath == NULL) {
perror("Error resolving path");
return 1;
}
// Check if the resolved path starts with the base directory
if (strncmp(baseDir, resolvedPath, strlen(baseDir)) != 0) {
free(resolvedPath);
return 1;
}
// GOOD: Path is within the intended directory
FILE *file = fopen(resolvedPath, "wb+");
free(resolvedPath);
}

View File

@@ -1,16 +0,0 @@
#include <stdio.h>
#include <string.h>
int main(int argc, char** argv) {
char *fileName = argv[2];
// Check for invalid sequences in the user input
if (strstr(fileName , "..") || strchr(fileName , '/') || strchr(fileName , '\\')) {
printf("Invalid filename.\n");
return 1;
}
char fileBuffer[PATH_MAX];
snprintf(fileBuffer, sizeof(fileBuffer), "/home/user/files/%s", fileName);
// GOOD: We know that the filename is safe and stays within the public folder
FILE *file = fopen(fileBuffer, "wb+");
}

View File

@@ -12,8 +12,8 @@ the required buffer size, but do not allocate space for the zero terminator.
</overview>
<recommendation>
<p>
The highlighted code segment creates a buffer without ensuring it's large enough to accommodate the copied data.
This leaves the code susceptible to a buffer overflow attack, which could lead to anything from program crashes to malicious code execution.
The expression highlighted by this rule creates a buffer that is of insufficient size to contain
the data being copied. This makes the code vulnerable to buffer overflow which can result in anything from a segmentation fault to a security vulnerability (particularly if the array is on stack-allocated memory).
</p>
<p>

View File

@@ -30,8 +30,6 @@ where
outlivesFullExpr(c) and
not c.isFromUninstantiatedTemplate(_) and
isUniquePointerDerefFunction(c.getTarget()) and
// Exclude cases where the pointer is implicitly converted to a non-pointer type
not c.getActualType() instanceof IntegralType and
isTemporary(c.getQualifier().getFullyConverted())
select c,
"The underlying unique pointer object is destroyed after the call to '" + c.getTarget() +

View File

@@ -17,6 +17,5 @@ import cpp
from FunctionCall call, Function target
where
call.getTarget() = target and
target.hasGlobalOrStdName("gets") and
target.getNumberOfParameters() = 1
target.hasGlobalOrStdName("gets")
select call, "'gets' does not guard against buffer overflow."

View File

@@ -1,5 +1,4 @@
## 0.9.12
### New Queries
---
category: newQuery
---
* Added a new query, `cpp/iterator-to-expired-container`, to detect the creation of iterators owned by a temporary objects that are about to be destroyed.

View File

@@ -1,10 +0,0 @@
## 1.0.0
### Breaking Changes
* CodeQL package management is now generally available, and all GitHub-produced CodeQL packages have had their version numbers increased to 1.0.0.
### Minor Analysis Improvements
* The "Use of unique pointer after lifetime ends" query (`cpp/use-of-unique-pointer-after-lifetime-ends`) no longer reports an alert when the pointer is converted to a boolean
* The "Variable not initialized before use" query (`cpp/not-initialised`) no longer reports an alert on static variables.

View File

@@ -1,5 +0,0 @@
## 1.0.1
### Minor Analysis Improvements
* The `cpp/dangerous-function-overflow` no longer produces a false positive alert when the `gets` function does not have exactly one parameter.

View File

@@ -1,3 +0,0 @@
## 1.0.2
No user-facing changes.

View File

@@ -1,2 +1,2 @@
---
lastReleaseVersion: 1.0.2
lastReleaseVersion: 0.9.11

View File

@@ -1,23 +0,0 @@
char * create (int arg) {
if (arg > 42) {
// this function may return NULL
return NULL;
}
char * r = malloc(arg);
snprintf(r, arg -1, "Hello");
return r;
}
void process(char *str) {
// str is dereferenced
if (str[0] == 'H') {
printf("Hello H\n");
}
}
void test(int arg) {
// first function returns a pointer that may be NULL
char *str = create(arg);
// str is not checked for nullness before being passed to process function
process(str);
}

View File

@@ -1,26 +0,0 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>This rule finds a dereference of a function parameter, whose value comes from another function call that may return NULL, without checks in the meantime.</p>
</overview>
<recommendation>
<p>A check should be added between the return of the function which may return NULL, and its use by the function dereferencing ths pointer.</p>
</recommendation>
<example>
<sample src="DerefNullResult.cpp" />
</example>
<references>
<li>
<a href="https://www.owasp.org/index.php/Null_Dereference">
Null Dereference
</a>
</li>
</references>
</qhelp>

Some files were not shown because too many files have changed in this diff Show More