mirror of
https://github.com/hohn/codeql-dataflow-sql-injection.git
synced 2025-12-16 10:13:04 +01:00
Codeql Recap
This commit is contained in:
committed by
=Michael Hohn
parent
c0bedda060
commit
9a41879346
@@ -12,25 +12,26 @@ md_toc github < codeql-dataflow-sql-injection.md
|
|||||||
- [The Problem in Action](#the-problem-in-action)
|
- [The Problem in Action](#the-problem-in-action)
|
||||||
- [Problem Statement](#problem-statement)
|
- [Problem Statement](#problem-statement)
|
||||||
- [Data flow overview and illustration](#data-flow-overview-and-illustration)
|
- [Data flow overview and illustration](#data-flow-overview-and-illustration)
|
||||||
|
- [Codeql Recap](#codeql-recap)
|
||||||
|
- [from, where, select](#from-where-select)
|
||||||
|
- [Predicates](#predicates)
|
||||||
|
- [Existential quantifiers (local variables in queries)](#existential-quantifiers-local-variables-in-queries)
|
||||||
|
- [Classes](#classes)
|
||||||
- [Tutorial: Recap, Sources, Sinks and Flow Steps](#tutorial-recap-sources-sinks-and-flow-steps)
|
- [Tutorial: Recap, Sources, Sinks and Flow Steps](#tutorial-recap-sources-sinks-and-flow-steps)
|
||||||
- [Codeql Recap](#codeql-recap)
|
|
||||||
- [The Data Sink](#the-data-sink)
|
- [The Data Sink](#the-data-sink)
|
||||||
- [The Data Source](#the-data-source)
|
- [The Data Source](#the-data-source)
|
||||||
- [The Extra Flow Step](#the-extra-flow-step)
|
- [The Extra Flow Step](#the-extra-flow-step)
|
||||||
- [The CodeQL Data Flow Configuration](#the-codeql-data-flow-configuration)
|
- [The CodeQL Taint Flow Configuration](#the-codeql-taint-flow-configuration)
|
||||||
- [Taint Flow Configuration](#taint-flow-configuration)
|
- [Taint Flow Configuration](#taint-flow-configuration)
|
||||||
- [Path Problem Setup](#path-problem-setup)
|
- [Path Problem Setup](#path-problem-setup)
|
||||||
- [Path Problem Query Format](#path-problem-query-format)
|
- [Path Problem Query Format](#path-problem-query-format)
|
||||||
- [Tutorial: Data Flow Details](#tutorial-data-flow-details)
|
- [Tutorial: Taint Flow Details](#tutorial-taint-flow-details)
|
||||||
- [The isSink Predicate](#the-issink-predicate)
|
- [The isSink Predicate](#the-issink-predicate)
|
||||||
- [The Data Source](#the-data-source-1)
|
|
||||||
- [The Extra Flow Step](#the-extra-flow-step-1)
|
|
||||||
- [The isSource Predicate ](#the-issource-predicate-)
|
- [The isSource Predicate ](#the-issource-predicate-)
|
||||||
- [The isAdditionalTaintStep Predicate](#the-isadditionaltaintstep-predicate)
|
- [The isAdditionalTaintStep Predicate](#the-isadditionaltaintstep-predicate)
|
||||||
- [Complete query](#complete-query)
|
|
||||||
- [Appendix](#appendix)
|
- [Appendix](#appendix)
|
||||||
- [Test case: simple.cc](#test-case-simplecc)
|
- [The complete Query: SqlInjection.ql](#the-complete-query-sqlinjectionql)
|
||||||
- [bslstrings query and library: bslstrings.ql](#bslstrings-query-and-library-bslstringsql)
|
- [The Database Writer: add-user.c](#the-database-writer-add-userc)
|
||||||
|
|
||||||
|
|
||||||
## Setup Instructions
|
## Setup Instructions
|
||||||
@@ -283,7 +284,138 @@ nodes, rather than for the full graph.
|
|||||||
To illustrate the dataflow for this problem, we have a [collection of slides](https://drive.google.com/file/d/1eEG0eGVDVEQh0C-0_4UIMcD23AWwnGtV/view?usp=sharing)
|
To illustrate the dataflow for this problem, we have a [collection of slides](https://drive.google.com/file/d/1eEG0eGVDVEQh0C-0_4UIMcD23AWwnGtV/view?usp=sharing)
|
||||||
for this workshop.
|
for this workshop.
|
||||||
|
|
||||||
## Tutorial: Recap, Sources, Sinks and Flow Steps
|
## Codeql Recap
|
||||||
|
This is a brief review of codeql taken from the [full
|
||||||
|
introduction](https://git.io/JJqdS). For more details, see the [documentation
|
||||||
|
links](#documentation-links).
|
||||||
|
|
||||||
|
### from, where, select
|
||||||
|
Recall that codeql is a declarative language and a basic query is defined by a
|
||||||
|
_select_ clause, which specifies what the result of the query should be. For
|
||||||
|
example:
|
||||||
|
|
||||||
|
```ql
|
||||||
|
import cpp
|
||||||
|
|
||||||
|
select "hello world"
|
||||||
|
```
|
||||||
|
|
||||||
|
More complicated queries look like this:
|
||||||
|
```ql
|
||||||
|
from /* ... variable declarations ... */
|
||||||
|
where /* ... logical formulas ... */
|
||||||
|
select /* ... expressions ... */
|
||||||
|
```
|
||||||
|
|
||||||
|
The `from` clause specifies some variables that will be used in the query. The
|
||||||
|
`where` clause specifies some conditions on those variables in the form of logical
|
||||||
|
formulas. The `select` clauses speciifes what the results should be, and can refer
|
||||||
|
to variables defined in the `from` clause.
|
||||||
|
|
||||||
|
The `from` clause is defined as a series of variable declarations, where each
|
||||||
|
declaration has a _type_ and a _name_. For example:
|
||||||
|
|
||||||
|
```ql
|
||||||
|
from IfStmt ifStmt
|
||||||
|
select ifStmt
|
||||||
|
```
|
||||||
|
|
||||||
|
We are declaring a variable with the name `ifStmt` and the type `IfStmt` (from the
|
||||||
|
CodeQL standard library for analyzing C/C++). Variables represent a **set of
|
||||||
|
values**, initially constrained by the type of the variable. Here, the variable
|
||||||
|
`ifStmt` represents the set of all `if` statements in the C/C++ program, as we can
|
||||||
|
see if we run the query.
|
||||||
|
|
||||||
|
A query using all three clauses to find empty blocks:
|
||||||
|
```ql
|
||||||
|
from IfStmt ifStmt, Block block
|
||||||
|
where
|
||||||
|
ifStmt.getThen() = block and
|
||||||
|
block.getNumStmt() = 0
|
||||||
|
select ifStmt, "Empty if statement"
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
### Predicates
|
||||||
|
The other feature we will use are _predicates_. These provide a way to encapsulate
|
||||||
|
portions of logic in the program so that they can be reused. You can think of
|
||||||
|
them as a mini `from`-`where`-`select` query clause. Like a select clause they
|
||||||
|
also produce a set of "tuples" or rows in a result table.
|
||||||
|
|
||||||
|
We can introduce a new predicate in our query that identifies the set of empty
|
||||||
|
blocks in the program (for example, to reuse this feature in another query):
|
||||||
|
|
||||||
|
```ql
|
||||||
|
predicate isEmptyBlock(Block block) {
|
||||||
|
block.getNumStmt() = 0
|
||||||
|
}
|
||||||
|
|
||||||
|
from IfStmt ifStmt
|
||||||
|
where isEmptyBlock(ifStmt.getThen())
|
||||||
|
select ifStmt, "Empty if statement"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Existential quantifiers (local variables in queries)
|
||||||
|
Although the terminology may sound scary if you are not familiar with logic and
|
||||||
|
logic programming, *existential quantifiers* are simply ways to introduce
|
||||||
|
temporary variables with some associated conditions. The syntax for them is:
|
||||||
|
|
||||||
|
```ql
|
||||||
|
exists(<variable declarations> | <formula>)
|
||||||
|
```
|
||||||
|
|
||||||
|
They have a similar structure to the `from` and `where` clauses, where the first
|
||||||
|
part allows you to declare one or more variables, and the second formula
|
||||||
|
("conditions") that can be applied to those variables.
|
||||||
|
|
||||||
|
For example, we can use this to refactor the query
|
||||||
|
```ql
|
||||||
|
from IfStmt ifStmt, Block block
|
||||||
|
where
|
||||||
|
ifStmt.getThen() = block and
|
||||||
|
block.getNumStmt() = 0
|
||||||
|
select ifStmt, "Empty if statement"
|
||||||
|
```
|
||||||
|
|
||||||
|
to use a temporary variable for the empty block:
|
||||||
|
```ql
|
||||||
|
from IfStmt ifStmt
|
||||||
|
where
|
||||||
|
exists(Block block |
|
||||||
|
ifStmt.getThen() = block and
|
||||||
|
block.getNumStmt() = 0
|
||||||
|
)
|
||||||
|
select ifStmt, "Empty if statement"
|
||||||
|
```
|
||||||
|
|
||||||
|
This is frequently used to convert a query into a predicate.
|
||||||
|
|
||||||
|
### Classes
|
||||||
|
Classes are a way in which you can define new types within CodeQL, as well as
|
||||||
|
providing an easy way to reuse and structure code.
|
||||||
|
|
||||||
|
Like all types in CodeQL, classes represent a set of values. For example, the
|
||||||
|
`Block` type is, in fact, a class, and it represents the set of all blocks in the
|
||||||
|
program. You can also think of a class as defining a set of logical conditions
|
||||||
|
that specifies the set of values for that class.
|
||||||
|
|
||||||
|
For example, we can define a new CodeQL class to represent empty blocks:
|
||||||
|
```ql
|
||||||
|
class EmptyBlock extends Block {
|
||||||
|
EmptyBlock() {
|
||||||
|
this.getNumStmt() = 0
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
and use it in a query:
|
||||||
|
```ql
|
||||||
|
from IfStmt ifStmt, EmptyBlock block
|
||||||
|
where ifStmt.getThen() = block
|
||||||
|
select ifStmt, "Empty if statement"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Tutorial: Sources, Sinks and Flow Steps
|
||||||
XX:
|
XX:
|
||||||
<!--
|
<!--
|
||||||
!-- The complete project can be downloaded via this
|
!-- The complete project can be downloaded via this
|
||||||
@@ -297,19 +429,6 @@ autocomplete suggestions (Ctrl + Space) and the jump-to-definition command (F12
|
|||||||
VS Code) are good ways explore the libraries.
|
VS Code) are good ways explore the libraries.
|
||||||
|
|
||||||
|
|
||||||
### Codeql Recap
|
|
||||||
XX:
|
|
||||||
|
|
||||||
As quick test of your setup, import the ql cpp library and run the empty query
|
|
||||||
```ql
|
|
||||||
import cpp
|
|
||||||
select 1
|
|
||||||
```
|
|
||||||
|
|
||||||
We'll assume the `import cpp` is in the header of our query and not rewrite it
|
|
||||||
every time.
|
|
||||||
|
|
||||||
|
|
||||||
### The Data Sink
|
### The Data Sink
|
||||||
Now let's find the function `sqlite3_exec`. In CodeQL, this uses `Function`
|
Now let's find the function `sqlite3_exec`. In CodeQL, this uses `Function`
|
||||||
and a `getName()` attribute.
|
and a `getName()` attribute.
|
||||||
|
|||||||
Reference in New Issue
Block a user