Codeql Recap

This commit is contained in:
Michael Hohn
2020-07-22 15:45:21 -07:00
committed by =Michael Hohn
parent c0bedda060
commit 9a41879346


@@ -12,25 +12,26 @@ md_toc github < codeql-dataflow-sql-injection.md
- [The Problem in Action](#the-problem-in-action) - [The Problem in Action](#the-problem-in-action)
- [Problem Statement](#problem-statement) - [Problem Statement](#problem-statement)
- [Data flow overview and illustration](#data-flow-overview-and-illustration) - [Data flow overview and illustration](#data-flow-overview-and-illustration)
- [Codeql Recap](#codeql-recap)
- [from, where, select](#from-where-select)
- [Predicates](#predicates)
- [Existential quantifiers (local variables in queries)](#existential-quantifiers-local-variables-in-queries)
- [Classes](#classes)
- [Tutorial: Recap, Sources, Sinks and Flow Steps](#tutorial-recap-sources-sinks-and-flow-steps) - [Tutorial: Recap, Sources, Sinks and Flow Steps](#tutorial-recap-sources-sinks-and-flow-steps)
- [Codeql Recap](#codeql-recap)
- [The Data Sink](#the-data-sink) - [The Data Sink](#the-data-sink)
- [The Data Source](#the-data-source) - [The Data Source](#the-data-source)
- [The Extra Flow Step](#the-extra-flow-step) - [The Extra Flow Step](#the-extra-flow-step)
- [The CodeQL Data Flow Configuration](#the-codeql-data-flow-configuration) - [The CodeQL Taint Flow Configuration](#the-codeql-taint-flow-configuration)
- [Taint Flow Configuration](#taint-flow-configuration) - [Taint Flow Configuration](#taint-flow-configuration)
- [Path Problem Setup](#path-problem-setup) - [Path Problem Setup](#path-problem-setup)
- [Path Problem Query Format](#path-problem-query-format) - [Path Problem Query Format](#path-problem-query-format)
- [Tutorial: Data Flow Details](#tutorial-data-flow-details) - [Tutorial: Taint Flow Details](#tutorial-taint-flow-details)
- [The isSink Predicate](#the-issink-predicate) - [The isSink Predicate](#the-issink-predicate)
- [The Data Source](#the-data-source-1)
- [The Extra Flow Step](#the-extra-flow-step-1)
- [The isSource Predicate ](#the-issource-predicate-) - [The isSource Predicate ](#the-issource-predicate-)
- [The isAdditionalTaintStep Predicate](#the-isadditionaltaintstep-predicate) - [The isAdditionalTaintStep Predicate](#the-isadditionaltaintstep-predicate)
- [Complete query](#complete-query)
- [Appendix](#appendix) - [Appendix](#appendix)
- [Test case: simple.cc](#test-case-simplecc) - [The complete Query: SqlInjection.ql](#the-complete-query-sqlinjectionql)
- [bslstrings query and library: bslstrings.ql](#bslstrings-query-and-library-bslstringsql) - [The Database Writer: add-user.c](#the-database-writer-add-userc)
## Setup Instructions ## Setup Instructions
@@ -283,7 +284,138 @@ nodes, rather than for the full graph.
To illustrate the dataflow for this problem, we have a [collection of slides](https://drive.google.com/file/d/1eEG0eGVDVEQh0C-0_4UIMcD23AWwnGtV/view?usp=sharing)
for this workshop.
## Tutorial: Recap, Sources, Sinks and Flow Steps ## Codeql Recap
This is a brief review of CodeQL taken from the [full
introduction](https://git.io/JJqdS). For more details, see the [documentation
links](#documentation-links).
### from, where, select
Recall that CodeQL is a declarative language. A basic query is defined by a
_select_ clause, which specifies what the result of the query should be. For
example:
```ql
import cpp
select "hello world"
```
More complicated queries look like this:
```ql
from /* ... variable declarations ... */
where /* ... logical formulas ... */
select /* ... expressions ... */
```
The `from` clause declares the variables that will be used in the query. The
`where` clause specifies conditions on those variables in the form of logical
formulas. The `select` clause specifies what the results should be, and can refer
to variables defined in the `from` clause.

The `from` clause is a series of variable declarations, where each
declaration has a _type_ and a _name_. For example:
```ql
from IfStmt ifStmt
select ifStmt
```
We are declaring a variable with the name `ifStmt` and the type `IfStmt` (from the
CodeQL standard library for analyzing C/C++). Variables represent a **set of
values**, initially constrained by the type of the variable. Here, the variable
`ifStmt` represents the set of all `if` statements in the C/C++ program, as we can
see if we run the query.
A query using all three clauses to find `if` statements whose `then` branch is an
empty block:
```ql
from IfStmt ifStmt, Block block
where
  ifStmt.getThen() = block and
  block.getNumStmt() = 0
select ifStmt, "Empty if statement"
```
### Predicates
Another feature we will use is _predicates_. These provide a way to encapsulate
portions of logic in a query so that they can be reused. You can think of
them as a mini `from`-`where`-`select` query. Like a `select` clause, they
also produce a set of "tuples", or rows in a result table.

We can introduce a new predicate in our query that identifies the set of empty
blocks in the program (for example, to reuse this logic in another query):
```ql
predicate isEmptyBlock(Block block) {
  block.getNumStmt() = 0
}

from IfStmt ifStmt
where isEmptyBlock(ifStmt.getThen())
select ifStmt, "Empty if statement"
```
### Existential quantifiers (local variables in queries)
Although the terminology may sound scary if you are not familiar with logic and
logic programming, *existential quantifiers* are simply ways to introduce
temporary variables with some associated conditions. The syntax for them is:
```ql
exists(<variable declarations> | <formula>)
```
They have a similar structure to the `from` and `where` clauses: the first
part declares one or more variables, and the second part is a formula
(the "conditions") that constrains those variables.

For example, we can use this to refactor the query
```ql
from IfStmt ifStmt, Block block
where
  ifStmt.getThen() = block and
  block.getNumStmt() = 0
select ifStmt, "Empty if statement"
```
to use a temporary variable for the empty block:
```ql
from IfStmt ifStmt
where
  exists(Block block |
    ifStmt.getThen() = block and
    block.getNumStmt() = 0
  )
select ifStmt, "Empty if statement"
```
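This pattern is frequently used when converting a query into a predicate:
variables from the `from` clause that do not become predicate parameters are
declared inside an `exists` instead.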
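
As a sketch of that conversion, the query above could be wrapped in a predicate
along the following lines. The predicate name `hasEmptyThen` is made up for this
illustration and is not part of the CodeQL standard library. Note also that newer
releases of the C/C++ library may name the `Block` class `BlockStmt`; the snippet
keeps the name used throughout this section.

```ql
import cpp

/**
 * Hypothetical helper: holds if the `then` branch of `ifStmt` is an empty block.
 * `block` is only needed inside the body, so it is declared with `exists`
 * rather than being a parameter.
 */
predicate hasEmptyThen(IfStmt ifStmt) {
  exists(Block block |
    ifStmt.getThen() = block and
    block.getNumStmt() = 0
  )
}

from IfStmt ifStmt
where hasEmptyThen(ifStmt)
select ifStmt, "Empty if statement"
```

Only `ifStmt` remains in the `from` clause; the temporary `block` variable is
hidden inside the predicate.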
### Classes
Classes let you define new types within CodeQL, and they also
provide an easy way to reuse and structure code.

Like all types in CodeQL, classes represent a set of values. For example, the
`Block` type is, in fact, a class, and it represents the set of all blocks in the
program. You can also think of a class as defining a set of logical conditions
that specify the set of values for that class.

For example, we can define a new CodeQL class to represent empty blocks:
```ql
class EmptyBlock extends Block {
  EmptyBlock() {
    this.getNumStmt() = 0
  }
}
```
and use it in a query:
```ql
from IfStmt ifStmt, EmptyBlock block
where ifStmt.getThen() = block
select ifStmt, "Empty if statement"
```
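
The same idea can be applied one level up. The following is a minimal sketch,
assuming we want a type for the `if` statements themselves rather than for their
blocks. The class name `EmptyThenIfStmt` is hypothetical, and `Block` may be
called `BlockStmt` in newer versions of the C/C++ library.

```ql
import cpp

/** Hypothetical class: an `if` statement whose `then` branch is an empty block. */
class EmptyThenIfStmt extends IfStmt {
  EmptyThenIfStmt() {
    // The characteristic predicate restricts the set of `IfStmt` values
    // to those with an empty `then` block.
    exists(Block block |
      this.getThen() = block and
      block.getNumStmt() = 0
    )
  }
}

from EmptyThenIfStmt ifStmt
select ifStmt, "Empty if statement"
```

With the condition moved into the class, the query no longer needs a `where`
clause: the type of `ifStmt` already constrains the set of values.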
## Tutorial: Sources, Sinks and Flow Steps
XX: XX:
<!-- <!--
!-- The complete project can be downloaded via this !-- The complete project can be downloaded via this
@@ -297,19 +429,6 @@ autocomplete suggestions (Ctrl + Space) and the jump-to-definition command (F12
VS Code) are good ways to explore the libraries.
### Codeql Recap
XX:
As a quick test of your setup, import the ql cpp library and run the empty query
```ql
import cpp
select 1
```
We'll assume the `import cpp` is in the header of our query and not rewrite it
every time.
### The Data Sink ### The Data Sink
Now let's find the function `sqlite3_exec`. In CodeQL, this uses the `Function`
class and its `getName()` member predicate.