Using sqlite to illustrate models-as-data
This description reuses material from a CodeQL workshop.
Build the codeql database
To get started, build the codeql database (adjust paths to your setup):
# Build the db with source commit id.
SRCDIR=$(pwd)
DB="$SRCDIR/java-sqlite-$(cd "$SRCDIR" && git rev-parse --short HEAD).db"
echo "$DB"
test -d "$DB" && rm -fR "$DB"
mkdir -p "$DB"
# Use the correct codeql
export PATH="$(cd ../codeql && pwd):$PATH"
codeql database create --language=java -s . -j 8 -v "$DB" --command='./build.sh'
# Check for AddUser in the db
unzip -v "$DB/src.zip" | grep AddUser
Then add this database directory to your VS Code DATABASES tab.
Tests using a default query
You can run the stdlib query ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql, but it will return no results. It does point at the classes to inspect, in particular the source and sink classes. Run ./Illustrations.ql from the command line or from VS Code. Via the CLI:
# run query
codeql query run \
-v \
--database java-sqlite-e2e555c.db \
--output result.bqrs \
--threads=12 \
--ram=14000 \
Illustrations.ql
# format results
codeql bqrs decode --format=text result.bqrs | sed -n '/^Result set: #select/,$p'
This shows
Result set: #select
| ui | qsi |
+------+-------+
| args | query |
In the editor, these link to
main(ARGS) and conn.createStatement().executeUpdate(QUERY);
The second is correct, but System.console().readLine(); is not found.
Thus, SqlTainted.ql will not find anything.
TODO supplement sources via the model editor
We have no flow
- check source, sink
- we have a sink
- but ActiveThreatModelSource finds no source
- We can supplement in different ways
supplement codeql: Write full manual query: already in workshop
TODO supplement codeql: Add to FlowSource or a subclass
Note: this is one area that just has to be known. Browsing the source will not help you.
CodeQL reading hint:
class ActiveThreatModelSource extends DataFlow::Node
uses
this.(SourceNode).getThreatModel()
So following the cast (SourceNode) may be useful:
/**
 * A data flow source.
 */
abstract class SourceNode extends DataFlow::Node
Following the abstract class is promising:
abstract class RemoteFlowSource extends SourceNode
and others.
In ../ql/java/ql/lib/Customizations.qll, notice the comments mentioning RemoteFlowSource. Use the imports from ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql, but note that there are conflicts. You will use
private import semmle.code.java.dataflow.FlowSources
Follow this to FlowSources, and find the mentioned RemoteFlowSource
abstract class RemoteFlowSource extends SourceNode
Add the custom source. The modified ../ql/java/ql/lib/Customizations.qll is
import java
private import semmle.code.java.dataflow.FlowSources

/** Treats the result of any call to a method named "readLine" as a remote flow source. */
class ReadLine extends RemoteFlowSource {
  ReadLine() {
    exists(Call read |
      read.getCallee().getName() = "readLine" and
      read = this.asExpr()
    )
  }

  override string getSourceType() { result = "Console readline" }
}
Note that the predicate
module QueryInjectionFlowConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node src) { src instanceof ActiveThreatModelSource }
...;
}
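now also returns the readLine() result, although we extended RemoteFlowSource, not ActiveThreatModelSource. This works because ActiveThreatModelSource is defined in terms of SourceNode (see the cast above), and RemoteFlowSource is a SourceNode.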
TODO supplement codeql: Add to models-as-data
- schema in codeql: ../ql/java/ql/lib/semmle/code/java/dataflow/internal/ExternalFlowExtensions.qll
- data sample: ../.github/codeql/extensions/jedis-db-local-java/models/redis.clients.jedis.model.yml
In the model editor, we see an entry matching java.io.*Console.*readline (using the "show already modeled" option). A search of the library models finds it:
$ rg -i 'java.io.*Console.*readline' ql/java
ql/java/ql/lib/ext/generated/java.io.model.yml
16: - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
17: - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
18: - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
19: - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
Note: this file is in the generated/ tree.
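The rg hits show the matching rows but not which extensible they sit under. The following is a minimal sketch for finding that out, assuming each "extensible:" key appears on its own line above its data rows. The sample file and the helper name section_of are hypothetical stand-ins; point the helper at ql/java/ql/lib/ext/generated/java.io.model.yml for the real check.

```shell
# Hypothetical sample standing in for the generated java.io.model.yml.
sample=$(mktemp)
cat > "$sample" <<'EOF'
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: summaryModel
    data:
      - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
EOF

# Remember the most recent "extensible:" value and print it for every
# Console.readLine row; sort -u collapses repeated section names.
section_of() {
  awk '
    $1 == "extensible:" { ext = $2 }
    /"Console"/ && /"readLine"/ { print ext }
  ' "$1" | sort -u
}

section_of "$sample"
```

If the only section printed is summaryModel, the method is modeled as a flow step but not as a source.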
The current readLine modeling is in the summaryModel section; we need it in a sourceModel.
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: summaryModel
    data:
      ...
      - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[0]", "Argument[this]", "taint", "df-generated"]
      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[1].ArrayElement", "Argument[this]", "taint", "df-generated"]
      - ["java.io", "Console", False, "readLine", "(String,Object[])", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
The model editor will not show this because it is already modeled. To illustrate text-based additions, we'll edit the YAML by hand, starting from the summaryModel rows shown above
and the field information
extensible predicate sourceModel(
string package, string type, boolean subtypes, string name, string signature, string ext,
string output, string kind, string provenance, QlBuiltins::ExtensionId madId
);
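Compared with summaryModel, the sourceModel signature has no input column, so a source row can be derived from a summary row by dropping that column and replacing kind and provenance. This is a rough sketch of that mapping, assuming fields are separated by ", " and no field contains that sequence. The helper name to_source_row is hypothetical.

```shell
# One summaryModel row, copied from the generated model file.
summary_row='["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]'

# $1: summaryModel row, $2: source kind (for example "remote").
to_source_row() {
  printf '%s\n' "$1" | awk -F', ' -v kind="$2" '{
    # Columns 1-6: package, type, subtypes, name, signature, ext.
    # Column 7 (input) is dropped; column 8 (output) is kept.
    # Kind and provenance are replaced; provenance becomes "manual".
    printf "%s, %s, %s, %s, %s, %s, %s, \"%s\", \"manual\"]\n", $1, $2, $3, $4, $5, $6, $8, kind
  }'
}

source_row=$(to_source_row "$summary_row" remote)
echo "$source_row"
```

The naive split is enough for these rows because signatures such as (String,Object[]) contain commas without a following space. A YAML parser would be the safer choice for anything larger.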
Starting from summaryModel
# summaryModel
# string package, string type, boolean subtypes, string name, string signature, string ext, string input, string output, string kind, string provenance, QlBuiltins::ExtensionId madId
- ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
we can construct the sourceModel
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      # sourceModel
      # string package, string type, boolean subtypes, string name, string signature, string ext, string output, string kind, string provenance, QlBuiltins::ExtensionId madId
      - ["java.io", "Console", False, "readLine", "()", "", "ReturnValue", "remote", "manual"]
      # # from original
      # # summaryModel
      # # string package, string type, boolean subtypes, string name, string signature, string ext, string input, string output, string kind, string provenance, QlBuiltins::ExtensionId madId
      # - ["java.io", "Console", False, "readLine", "()", "", "Argument[this]", "ReturnValue", "taint", "df-generated"]
and move this into ../.github/codeql/extensions/sqlite-db/models/sqlite.model.yml
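The model file is normally picked up as part of a model pack, so the directory also needs a pack manifest. The following sketch lays out both files, assuming the usual model-pack layout: a codeql-pack.yml with extensionTargets and dataExtensions next to a models/ directory. The pack name workshop/sqlite-db is a hypothetical placeholder, and EXT_ROOT defaults to a scratch directory; in this workshop it would be ../.github/codeql/extensions/sqlite-db.

```shell
# Scratch location by default; override EXT_ROOT to write into the real tree.
EXT_ROOT=${EXT_ROOT:-$(mktemp -d)/sqlite-db}
mkdir -p "$EXT_ROOT/models"

# Pack manifest (assumed layout; the pack name is a placeholder).
cat > "$EXT_ROOT/codeql-pack.yml" <<'EOF'
name: workshop/sqlite-db
version: 0.0.0
library: true
extensionTargets:
  codeql/java-all: '*'
dataExtensions:
  - models/**/*.yml
EOF

# The sourceModel row constructed above.
cat > "$EXT_ROOT/models/sqlite.model.yml" <<'EOF'
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      - ["java.io", "Console", False, "readLine", "()", "", "ReturnValue", "remote", "manual"]
EOF

# List what was written.
find "$EXT_ROOT" -type f | sort
```

If a codeql-pack.yml already exists in that directory, for example one generated by the model editor, keep it and add only the model file.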
To ensure that these model extensions are applied during query runs, include this setting
{
...,
"settings": {
...,
"codeQL.runningQueries.useExtensionPacks": "all"
}
}
in the workspace configuration file ../qllab.code-workspace
In some environments (e.g., older VS Code versions), you may also need to replicate this setting in ../.vscode/settings.json; there it simplifies to
"codeQL.runningQueries.useExtensionPacks": "all"
Now we can run ../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql again.