mirror of
https://github.com/hohn/codeql-dataflow-sql-injection.git
synced 2025-12-16 18:23:05 +01:00
topic reordering
This commit is contained in:
committed by
=Michael Hohn
parent
62234f4d55
commit
12a90e9a54
@@ -3,12 +3,16 @@
|
|||||||
-->
|
-->
|
||||||
# CodeQL tutorial for C/C++: data flow and SQL injection
|
# CodeQL tutorial for C/C++: data flow and SQL injection
|
||||||
|
|
||||||
|
xx:
|
||||||
|
md_toc github < codeql-dataflow-sql-injection.md
|
||||||
|
|
||||||
|
- [CodeQL tutorial for C/C++: data flow and SQL injection](#codeql-tutorial-for-cc-data-flow-and-sql-injection)
|
||||||
- [Setup instructions](#setup-instructions)
|
- [Setup instructions](#setup-instructions)
|
||||||
- [Documentation links](#documentation-links)
|
- [Documentation links](#documentation-links)
|
||||||
|
- [The Problem in Action: running the code to see the problem](#the-problem-in-action-running-the-code-to-see-the-problem)
|
||||||
- [Problem statement](#problem-statement)
|
- [Problem statement](#problem-statement)
|
||||||
- [Tutorial, part 1: running the code to see the problem](#tutorial-part-1-running-the-code-to-see-the-problem)
|
|
||||||
- [Data flow overview and illustration](#data-flow-overview-and-illustration)
|
- [Data flow overview and illustration](#data-flow-overview-and-illustration)
|
||||||
- [Tutorial, part 3: recap, sources and sinks](#tutorial-part-3-recap-sources-and-sinks)
|
- [Tutorial: recap, sources and sinks](#tutorial-recap-sources-and-sinks)
|
||||||
- [Codeql recap](#codeql-recap)
|
- [Codeql recap](#codeql-recap)
|
||||||
- [Call to SQL query execution (the data sink)](#call-to-sql-query-execution-the-data-sink)
|
- [Call to SQL query execution (the data sink)](#call-to-sql-query-execution-the-data-sink)
|
||||||
- [Non-constant query strings and untrusted data (the data source)](#non-constant-query-strings-and-untrusted-data-the-data-source)
|
- [Non-constant query strings and untrusted data (the data source)](#non-constant-query-strings-and-untrusted-data-the-data-source)
|
||||||
@@ -16,7 +20,7 @@
|
|||||||
- [Taint flow configuration](#taint-flow-configuration)
|
- [Taint flow configuration](#taint-flow-configuration)
|
||||||
- [Path problem setup](#path-problem-setup)
|
- [Path problem setup](#path-problem-setup)
|
||||||
- [Path problem query format](#path-problem-query-format)
|
- [Path problem query format](#path-problem-query-format)
|
||||||
- [Tutorial, part 2: data flow details](#tutorial-part-2-data-flow-details)
|
- [Tutorial: data flow details](#tutorial-data-flow-details)
|
||||||
- [isSource predicate ](#issource-predicate-)
|
- [isSource predicate ](#issource-predicate-)
|
||||||
- [isSink predicate ](#issink-predicate-)
|
- [isSink predicate ](#issink-predicate-)
|
||||||
- [Additional data flow features: the isAdditionalTaintStep predicate](#additional-data-flow-features-the-isadditionaltaintstep-predicate)
|
- [Additional data flow features: the isAdditionalTaintStep predicate](#additional-data-flow-features-the-isadditionaltaintstep-predicate)
|
||||||
@@ -54,6 +58,78 @@ If you get stuck, try searching our documentation and blog posts for help and id
|
|||||||
- [Learning CodeQL for C/C++](https://help.semmle.com/QL/learn-ql/cpp/ql-for-cpp.html)
|
- [Learning CodeQL for C/C++](https://help.semmle.com/QL/learn-ql/cpp/ql-for-cpp.html)
|
||||||
- [Using the CodeQL extension for VS Code](https://help.semmle.com/codeql/codeql-for-vscode.html)
|
- [Using the CodeQL extension for VS Code](https://help.semmle.com/codeql/codeql-for-vscode.html)
|
||||||
|
|
||||||
|
## The Problem in Action: running the code to see the problem
|
||||||
|
|
||||||
|
This program can be compiled and linked, and a simple sqlite db created via
|
||||||
|
|
||||||
|
```sh
|
||||||
|
# Build
|
||||||
|
./build.sh
|
||||||
|
|
||||||
|
# Prepare db
|
||||||
|
./admin rm-db
|
||||||
|
./admin create-db
|
||||||
|
./admin show-db
|
||||||
|
```
|
||||||
|
|
||||||
|
Users can be added via `stdin` in several ways; the second is a pretend "server"
|
||||||
|
using the `echo` command.
|
||||||
|
|
||||||
|
```sh
|
||||||
|
# Add regular user interactively
|
||||||
|
./add-user 2>> users.log
|
||||||
|
First User
|
||||||
|
|
||||||
|
# Regular user via "external" process
|
||||||
|
echo "User Outside" | ./add-user 2>> users.log
|
||||||
|
```
|
||||||
|
|
||||||
|
Check the db and log:
|
||||||
|
```
|
||||||
|
# Check
|
||||||
|
./admin show-db
|
||||||
|
|
||||||
|
tail -4 users.log
|
||||||
|
```
|
||||||
|
|
||||||
|
Looks ok:
|
||||||
|
```
|
||||||
|
0:$ ./admin show-db
|
||||||
|
87797|First User
|
||||||
|
87808|User Outside
|
||||||
|
|
||||||
|
0:$ tail -4 users.log
|
||||||
|
[Tue Jul 21 14:15:46 2020] query: INSERT INTO users VALUES (87797, 'First User')
|
||||||
|
[Tue Jul 21 14:17:07 2020] query: INSERT INTO users VALUES (87808, 'User Outside')
|
||||||
|
```
|
||||||
|
|
||||||
|
But there may be bad input; this one guesses the table name and drops it:
|
||||||
|
```sh
|
||||||
|
# Add Johnny Droptable
|
||||||
|
./add-user 2>> users.log
|
||||||
|
Johnny'); DROP TABLE users; --
|
||||||
|
```
|
||||||
|
|
||||||
|
And then we have this:
|
||||||
|
```sh
|
||||||
|
# And the problem:
|
||||||
|
./admin show-db
|
||||||
|
0:$ ./admin show-db
|
||||||
|
Error: near line 2: no such table: users
|
||||||
|
```
|
||||||
|
|
||||||
|
What happened? The log shows that data was treated as command:
|
||||||
|
```
|
||||||
|
1:$ tail -4 users.log
|
||||||
|
[Tue Jul 21 14:15:46 2020] query: INSERT INTO users VALUES (87797, 'First User')
|
||||||
|
[Tue Jul 21 14:17:07 2020] query: INSERT INTO users VALUES (87808, 'User Outside')
|
||||||
|
[Tue Jul 21 14:18:25 2020] query: INSERT INTO users VALUES (87817, 'Johnny'); DROP TABLE users; --')
|
||||||
|
```
|
||||||
|
|
||||||
|
Looking ahead, we now *know* that there is unsafe external data (source)
|
||||||
|
which reaches (flow path) a database-writing command (sink). Thus, a query
|
||||||
|
written against this code should find at least one taint flow path.
|
||||||
|
|
||||||
## Problem statement
|
## Problem statement
|
||||||
|
|
||||||
Many security problems can be phrased in terms of _information flow_:
|
Many security problems can be phrased in terms of _information flow_:
|
||||||
@@ -70,9 +146,9 @@ data etc.
|
|||||||
|
|
||||||
We will use CodeQL to analyze the source code constructing a SQL
|
We will use CodeQL to analyze the source code constructing a SQL
|
||||||
query using string concatenation and then executing that query
|
query using string concatenation and then executing that query
|
||||||
string. The following example uses the `sqlite3` library and
|
string. The following example uses the `sqlite3` library; it
|
||||||
- receives user-provided data from `stdin`,
|
- receives user-provided data from `stdin` and keeps it in `buf`
|
||||||
- uses environment data in `id`,
|
- uses environment data and stores it in `id`,
|
||||||
- runs a query in `sqlite3_exec`
|
- runs a query in `sqlite3_exec`
|
||||||
|
|
||||||
This is intentionally simple code, but it has all the elements that have to be
|
This is intentionally simple code, but it has all the elements that have to be
|
||||||
@@ -150,8 +226,8 @@ int main(int argc, char* argv[]) {
|
|||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
In terms of sources, sinks, and information flow, the concrete problem is:
|
In terms of sources, sinks, and information flow, the concrete problem for codeql is:
|
||||||
1. specifying `stdin` as **source** using codeql,
|
1. specifying `buf` as **source**,
|
||||||
2. specifying the `query` argument to `sqlite3_exec()` as **sink**,
|
2. specifying the `query` argument to `sqlite3_exec()` as **sink**,
|
||||||
3. specifying some code-specific data flow steps for the codeql library,
|
3. specifying some code-specific data flow steps for the codeql library,
|
||||||
3. using the codeql taint flow library find taint flow paths (if there are any)
|
3. using the codeql taint flow library find taint flow paths (if there are any)
|
||||||
@@ -160,77 +236,7 @@ In terms of sources, sinks, and information flow, the concrete problem is:
|
|||||||
In the following, we go into more concrete detail and develop codedql scripts to
|
In the following, we go into more concrete detail and develop codedql scripts to
|
||||||
solve this problem.
|
solve this problem.
|
||||||
|
|
||||||
## Tutorial, part 1: running the code to see the problem
|
|
||||||
|
|
||||||
This program can be compiled and linked, and a simple sqlite db created via
|
|
||||||
|
|
||||||
```sh
|
|
||||||
# Build
|
|
||||||
./build.sh
|
|
||||||
|
|
||||||
# Prepare db
|
|
||||||
./admin rm-db
|
|
||||||
./admin create-db
|
|
||||||
./admin show-db
|
|
||||||
```
|
|
||||||
|
|
||||||
Users can be added via `stdin` in several ways; the second is a pretend "server"
|
|
||||||
using the `echo` command.
|
|
||||||
|
|
||||||
```sh
|
|
||||||
# Add regular user interactively
|
|
||||||
./add-user 2>> users.log
|
|
||||||
First User
|
|
||||||
|
|
||||||
# Regular user via "external" process
|
|
||||||
echo "User Outside" | ./add-user 2>> users.log
|
|
||||||
```
|
|
||||||
|
|
||||||
Check the db and log:
|
|
||||||
```
|
|
||||||
# Check
|
|
||||||
./admin show-db
|
|
||||||
|
|
||||||
tail -4 users.log
|
|
||||||
```
|
|
||||||
|
|
||||||
Looks ok:
|
|
||||||
```
|
|
||||||
0:$ ./admin show-db
|
|
||||||
87797|First User
|
|
||||||
87808|User Outside
|
|
||||||
|
|
||||||
0:$ tail -4 users.log
|
|
||||||
[Tue Jul 21 14:15:46 2020] query: INSERT INTO users VALUES (87797, 'First User')
|
|
||||||
[Tue Jul 21 14:17:07 2020] query: INSERT INTO users VALUES (87808, 'User Outside')
|
|
||||||
```
|
|
||||||
|
|
||||||
But there may be bad input; this one guesses the table name and drops it:
|
|
||||||
```sh
|
|
||||||
# Add Johnny Droptable
|
|
||||||
./add-user 2>> users.log
|
|
||||||
Johnny'); DROP TABLE users; --
|
|
||||||
```
|
|
||||||
|
|
||||||
And then we have this:
|
|
||||||
```sh
|
|
||||||
# And the problem:
|
|
||||||
./admin show-db
|
|
||||||
0:$ ./admin show-db
|
|
||||||
Error: near line 2: no such table: users
|
|
||||||
```
|
|
||||||
|
|
||||||
What happened? The log shows that data was treated as command:
|
|
||||||
```
|
|
||||||
1:$ tail -4 users.log
|
|
||||||
[Tue Jul 21 14:15:46 2020] query: INSERT INTO users VALUES (87797, 'First User')
|
|
||||||
[Tue Jul 21 14:17:07 2020] query: INSERT INTO users VALUES (87808, 'User Outside')
|
|
||||||
[Tue Jul 21 14:18:25 2020] query: INSERT INTO users VALUES (87817, 'Johnny'); DROP TABLE users; --')
|
|
||||||
```
|
|
||||||
|
|
||||||
Looking ahead, we now *know* that there is unsafe external data (source)
|
|
||||||
which reaches (flow path) a database-writing command (sink). Thus, a query
|
|
||||||
written against this code should find at least one taint flow path.
|
|
||||||
|
|
||||||
## Data flow overview and illustration
|
## Data flow overview and illustration
|
||||||
In the previous sections we identified the sources of problematic strings
|
In the previous sections we identified the sources of problematic strings
|
||||||
@@ -271,7 +277,7 @@ nodes, rather than for the full graph.
|
|||||||
To illustrate the dataflow for this problem, we have a [collection of slides](https://drive.google.com/file/d/1eEG0eGVDVEQh0C-0_4UIMcD23AWwnGtV/view?usp=sharing)
|
To illustrate the dataflow for this problem, we have a [collection of slides](https://drive.google.com/file/d/1eEG0eGVDVEQh0C-0_4UIMcD23AWwnGtV/view?usp=sharing)
|
||||||
for this workshop.
|
for this workshop.
|
||||||
|
|
||||||
## Tutorial, part 3: recap, sources and sinks
|
## Tutorial: recap, sources and sinks
|
||||||
XX:
|
XX:
|
||||||
<!--
|
<!--
|
||||||
!-- The complete project can be downloaded via this
|
!-- The complete project can be downloaded via this
|
||||||
@@ -332,7 +338,7 @@ from FunctionCall fc
|
|||||||
where fc.<tab>
|
where fc.<tab>
|
||||||
```
|
```
|
||||||
|
|
||||||
Now, we are looking for the call's `*target*`; completion shows `getTarget()`,
|
Now, we are looking for the call's *target*; completion shows `getTarget()`,
|
||||||
and we can finish that to
|
and we can finish that to
|
||||||
|
|
||||||
```ql
|
```ql
|
||||||
@@ -482,7 +488,7 @@ where
|
|||||||
select sink, source, sink, "Sqli flow from $@", source, "source"
|
select sink, source, sink, "Sqli flow from $@", source, "source"
|
||||||
```
|
```
|
||||||
|
|
||||||
## Tutorial, part 2: data flow details
|
## Tutorial: data flow details
|
||||||
With the dataflow configuration in place, we just need to provide the details for source(s), sink(s), and taint step(s).
|
With the dataflow configuration in place, we just need to provide the details for source(s), sink(s), and taint step(s).
|
||||||
|
|
||||||
### isSource predicate
|
### isSource predicate
|
||||||
|
|||||||
Reference in New Issue
Block a user