codeql-lab/codeql-jedis/README.org
2025-07-11 11:13:09 -07:00

* Jedis CodeQL Setup
- fork at https://github.com/hohn/jedis
- github db build: enable code scanning, advanced config
- only java-kotlin, build-mode: none.
- creates https://github.com/hohn/jedis/blob/master/.github/workflows/codeql.yml
- action run at https://github.com/hohn/jedis/actions/workflows/codeql.yml
- db download
#+BEGIN_SRC sh
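# List code scanning analyses (confirms that the workflow ran)
curl -H "Authorization: token $GITHUB_TOKEN" \
     https://api.github.com/repos/hohn/jedis/code-scanning/analyses
# List the CodeQL databases available for download
curl -H "Authorization: token $GITHUB_TOKEN" \
     https://api.github.com/repos/hohn/jedis/code-scanning/codeql/databases
# Download the Java database as a zip via curl
cd ~/work-gh/codeql-lab/assets
curl -H "Authorization: token $GITHUB_TOKEN" \
     -H "Accept: application/zip" \
     -L \
     https://api.github.com/repos/hohn/jedis/code-scanning/codeql/databases/java \
     -o jedis-database-gh.zip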
#+END_SRC
- db at ~/work-gh/codeql-lab/assets/jedis-database-gh.zip
- local db build:
#+BEGIN_SRC sh
cd ~/work-gh/codeql-lab/
# Add the submodule
git submodule add https://github.com/hohn/jedis extern/jedis
# Initialize and clone the submodule
git submodule update --init --recursive
# Build directly once to resolve any errors
cd ~/work-gh/codeql-lab/extern/jedis
mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V
# Build under codeql
# Step 1: Clean any prior Maven builds
cd ~/work-gh/codeql-lab/extern/jedis
mvn clean
# Step 2: Run CodeQL DB creation with mvn install
cd ~/work-gh/codeql-lab
codeql database create assets/jedis-db-local \
       --overwrite \
       --language=java \
       --command="mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V" \
       --source-root=extern/jedis
#+END_SRC
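If the token is missing or lacks access, the GitHub download above can save a JSON error body under the =.zip= name instead of a database. The following is a minimal sanity check, not part of the original workflow: =looks_like_zip= is a hypothetical helper that only tests for the ZIP magic bytes, and =DB_ZIP= is an assumed override variable.

```shell
# Hypothetical helper (not from the original notes): succeed only if the
# file exists and starts with the ZIP magic bytes "PK".
looks_like_zip() {
  [ "$(head -c 2 "$1" 2>/dev/null)" = "PK" ]
}

# Default to the download location used above; DB_ZIP is an assumed override.
db_zip="${DB_ZIP:-$HOME/work-gh/codeql-lab/assets/jedis-database-gh.zip}"

if looks_like_zip "$db_zip"; then
  echo "ok: $db_zip looks like a zip archive"
else
  echo "warning: $db_zip is missing or not a zip; check it for an API error message" >&2
fi
```

This does not validate the archive contents; it only catches the common case of an error response saved in place of the database.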
* Jedis CodeQL Modeling
** Setup and Start
#+BEGIN_SRC sh
# Step 1: Go to your CodeQL lab directory
cd ~/work-gh/codeql-lab
# Step 2: Extract the prebuilt CodeQL database for the Jedis project
unzip -q assets/jedis-db-local.zip
# Step 3: Extract the CodeQL command-line tools (platform-specific)
unzip -q assets/codeql-osx64.zip
# Step 4: Change directory to the unpacked CodeQL CLI tools
cd ~/work-gh/codeql-lab/codeql
# Step 5: Add the CodeQL CLI directory to your shell's PATH
# This allows you to run `codeql` from any location
export PATH="$(pwd):$PATH"
# Step 6: Launch Visual Studio Code with the lab workspace
code qllab.code-workspace
# In VS Code, perform the following setup manually:
# - Set the current database to: jedis-db-local
#   (usually done from the CodeQL extension pane; this connects the UI to your analysis DB)
# - Set the CodeQL CLI executable to: ~/work-gh/codeql-lab/codeql/codeql
#   (this tells the extension where to find the CLI you just extracted)
# - In the CodeQL extension tab, scroll to the bottom and select
#   'CodeQL: Method modeling' to begin a guided modeling tutorial
#+END_SRC
** Using the Editor
Note that just by starting =CodeQL: Method modeling=, the new file
: .github/codeql/extensions/jedis-db-local-java/codeql-pack.yml
is created.
** Relevant Queries
A quick =grep= over the file list shows the model-generator queries and libraries:
#+BEGIN_SRC text
grep 'java.*modelgen' files |grep -v test/
ql/java/ql/src/utils/modelgenerator
ql/java/ql/src/utils/modelgenerator/CaptureNeutralModels.ql
ql/java/ql/src/utils/modelgenerator/CaptureTypeBasedSummaryModels.ql
ql/java/ql/src/utils/modelgenerator/CaptureSinkModels.ql
ql/java/ql/src/utils/modelgenerator/CaptureContentSummaryModels.ql
ql/java/ql/src/utils/modelgenerator/internal
ql/java/ql/src/utils/modelgenerator/internal/CaptureModels.qll
ql/java/ql/src/utils/modelgenerator/internal/CaptureTypeBasedSummaryModels.qll
ql/java/ql/src/utils/modelgenerator/internal/CaptureModelsPrinting.qll
ql/java/ql/src/utils/modelgenerator/CaptureSummaryModels.ql
ql/java/ql/src/utils/modelgenerator/RegenerateModels.py
ql/java/ql/src/utils/modelgenerator/CaptureSourceModels.ql
ql/java/ql/src/utils/modelgenerator/debug
ql/java/ql/src/utils/modelgenerator/debug/CaptureSummaryModelsPartialPath.ql
ql/java/ql/src/utils/modelgenerator/debug/CaptureSummaryModelsPath.ql
ql/java/ql/src/utils/modelgenerator/debug/README.md
#+END_SRC
** Primary Query File
The primary query file is
: ../ql/java/ql/src/utils/modelgenerator/internal/CaptureModels.qll
This file acts as the backbone, exposing modules and predicates such as:
- SummaryModelGeneratorInput
- ModelGeneratorCommonInput
- isPrimitiveTypeUsedForBulkData(...)
- Likely common predicates such as:
  + hasNoSideEffects(...)
  + isNeutralReturn(...)
  + isBulkGetterLike(...)
These are imported by:
- CaptureSinkModels.ql
- CaptureSummaryModels.ql
- CaptureContentSummaryModels.ql
- CaptureHeuristicSummaryModels.ql
- Design: Three Modeling Targets
| Module                       | Implements                      | Purpose                                          |
|------------------------------+---------------------------------+--------------------------------------------------|
| =SummaryModelGeneratorInput= | =SummaryModelGeneratorInputSig= | Models pass-through or computed summaries        |
| =SourceModelGeneratorInput=  | =SourceModelGeneratorInputSig=  | Models user-controlled or origin taint sources   |
| =SinkModelGeneratorInput=    | =SinkModelGeneratorInputSig=    | Models taint sinks (e.g., logging, SQL, network) |
- Shared Input System
ModelGeneratorCommonInput provides:
- Name formatting
- Type filtering (isRelevantType)
- Signature stringification
- “Approximate output” helpers like Argument[pos].Element
This gives a stable data interface to the rest of the system.
- Filtering logic
#+BEGIN_SRC java
private predicate relevant(Callable api) {
api.isPublic() and
api.getDeclaringType().isPublic() and
api.fromSource() and
not isUninterestingForModels(api) and
not isInfrequentlyUsed(api.getCompilationUnit())
}
#+END_SRC
** Experiment with test clone
The needed imports are private, so clone
: ql/java/ql/test/utils/modelgenerator/dataflow/CaptureSourceModels.ql
and experiment there.
#+BEGIN_SRC java
import java
import utils.modelgenerator.internal.CaptureModels
import SourceModels
import utils.test.InlineMadTest
module InlineMadTestConfig implements InlineMadTestConfigSig {
string getCapturedModel(Callable c) { result = Heuristic::captureSource(c) }
string getKind() { result = "source" }
}
import InlineMadTest<InlineMadTestConfig>
#+END_SRC
* Modeling sqlite as dependency
The tree
: src-sqlite
contains a trivial sample taken from a workshop. It uses
=sqlite-jdbc-3.36.0.1.jar=, so we can use it to illustrate modeling on a smaller
example.
* Modeling Jedis as a Dependency in Model Editor
** Set up and run Editor
To model =jedis= for taint analysis using the /model editor/, select the /"model
as dependency"/ option.
When this mode is active, the following CodeQL query is used:
: /Users/hohn/work-gh/codeql-lab/ql/java/ql/src/utils/modeleditor/FrameworkModeEndpoints.ql
The body of this query is:
#+BEGIN_SRC java
from PublicEndpointFromSource endpoint, boolean supported, string type
where
supported = isSupported(endpoint) and
type = supportedType(endpoint)
select endpoint, endpoint.getPackageName(), endpoint.getTypeName(), endpoint.getName(),
endpoint.getParameterTypes(), supported,
endpoint.getCompilationUnit().getParentContainer().getBaseName(), type
#+END_SRC
The =supported= output column maps directly to the status shown in the model editor:
- =supported = true= → shown in the UI as /"Method already modeled"/
- =supported = false= → shown in the UI as /"Unmodeled"/
** Files Created or Modified by the Modeling Workflow
- Upon launching =CodeQL: Method modeling=, a new pack manifest is created:
[[../.github/codeql/extensions/jedis-db-local-java/codeql-pack.yml]]
- After selecting methods and saving, modeling results are written to:
[[../.github/codeql/extensions/jedis-db-local-java/models/redis.clients.jedis.model.yml]]
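For orientation, the two files above might look roughly like the sketch below. It follows the documented CodeQL data-extension layout, but the specifics are assumptions: the pack name =example/jedis-db-local-java= is a placeholder, and the single sink row for =Jedis.eval(String)= is illustrative rather than copied from the editor's output. The files are written to a scratch directory so the editor-generated ones are not overwritten.

```shell
# Scratch copy only: the real files are generated by the model editor.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models"

# Pack manifest. The pack name is a placeholder, not the generated one.
cat > "$demo_dir/codeql-pack.yml" <<'EOF'
name: example/jedis-db-local-java
version: 0.0.0
library: true
extensionTargets:
  codeql/java-all: '*'
dataExtensions:
  - models/**/*.yml
EOF

# One illustrative sink row. Columns: package, type, subtypes, name,
# signature, ext, input, kind, provenance.
cat > "$demo_dir/models/redis.clients.jedis.model.yml" <<'EOF'
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      - ["redis.clients.jedis", "Jedis", true, "eval", "(String)", "", "Argument[0]", "code-injection", "manual"]
EOF

echo "wrote example extension pack to $demo_dir"
```

Comparing this sketch with the generated files is a quick way to confirm which methods and sink kinds the editor actually saved.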
** Workspace Configuration Required
The exact setting name still needs to be recorded here.
To ensure that these model extensions are applied during query runs, include
the setting in the workspace configuration file [[../qllab.code-workspace]].
In some environments (e.g., older VS Code versions), you may also need to
replicate this setting in [[../.vscode/settings.json]].
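One likely candidate, based on the CodeQL extension's documented settings rather than on these notes, is =codeQL.runningQueries.useExtensionPacks=. The sketch below only prints a fragment to merge by hand; =print_codeql_settings= is a hypothetical helper, and whether this is the setting the lab actually used is an assumption.

```shell
# Hypothetical helper: print a settings fragment for manual merging.
# In qllab.code-workspace it belongs inside the top-level "settings" object;
# in .vscode/settings.json it goes at the top level.
print_codeql_settings() {
  cat <<'EOF'
{
  "codeQL.runningQueries.useExtensionPacks": "all"
}
EOF
}

print_codeql_settings
```

If the modeled sinks still do not appear after adding it, check that the extension pack directory is inside the opened workspace.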
* Verifying the Modeled Sink
Once the modeling is in place, a dataflow query like the following can be used
to confirm the modeled sinks:
#+BEGIN_SRC java
import java
private import semmle.code.java.dataflow.ExternalFlow
private import semmle.code.java.dataflow.DataFlow
from DataFlow::Node n, string type
where sinkNode(n, type) and type = "code-injection"
select n, type
#+END_SRC
Sample query result (run on the =jedis-db-local= database):
- example.ql on jedis-db-local - finished in 2 seconds (14 results)
| # | n | type |
|---+---+------|
| 1 | script | code-injection |
| 2 | getBytes(...) | code-injection |
| 3 | script | code-injection |
| 4 | script | code-injection |
| 5 | script | code-injection |
| 6 | script | code-injection |
| 7 | "return redis.call('get','foo')" | code-injection |
| 8 | "return redis.call('get','foo')" | code-injection |
| 9 | encode(...) | code-injection |
| 10 | encode(...) | code-injection |
| 11 | "return redis.call('get','foo')" | code-injection |
| 12 | "return redis.call('get','foo')" | code-injection |
| 13 | script | code-injection |
| 14 | "return {}" | code-injection |
* Identify usage of injection-related models in existing queries
To verify whether existing CodeQL queries make use of the injection-related
models, we can search for files in the =ql/java= and =ql/cpp= directories that
contain the string =-injection=. This string often appears in taint-tracking
configuration or query metadata.
** Java Queries
The following command locates =.ql= and =.qll= files in the Java query suite that reference =-injection=:
#+BEGIN_SRC sh
rg -l -- '-injection' ql/java | grep -E '\.qll?$'
#+END_SRC
Example output:
#+BEGIN_SRC text
ql/java/ql/src/Security/CWE/CWE-643/XPathInjection.ql
ql/java/ql/src/Security/CWE/CWE-078/ExecTainted.ql
ql/java/ql/src/Security/CWE/CWE-022/TaintedPath.ql
ql/java/ql/src/Security/CWE/CWE-117/LogInjection.ql
ql/java/ql/src/Security/CWE/CWE-470/FragmentInjection.ql
ql/java/ql/src/Security/CWE/CWE-470/FragmentInjectionInPreferenceActivity.ql
ql/java/ql/src/Security/CWE/CWE-730/RegexInjection.ql
ql/java/ql/lib/semmle/code/java/security/XsltInjection.qll
ql/java/ql/src/Security/CWE/CWE-090/LdapInjection.ql
ql/java/ql/lib/semmle/code/java/security/GroovyInjection.qll
ql/java/ql/lib/semmle/code/java/security/XPath.qll
ql/java/ql/lib/semmle/code/java/security/TaintedEnvironmentVariableQuery.qll
ql/java/ql/src/Security/CWE/CWE-074/XsltInjection.ql
ql/java/ql/src/Security/CWE/CWE-074/JndiInjection.ql
...
ql/java/ql/src/utils/modelgenerator/internal/CaptureModels.qll
#+END_SRC
These files include both top-level queries (under =src/Security/...=) and reusable model libraries (under =lib/semmle/...=). Experimental and framework-specific queries are also included.
** C++ Queries
Likewise, to check for C++ queries that reference =-injection=, use:
#+BEGIN_SRC sh
rg -l -- '-injection' ql/cpp | grep -E '\.qll?$'
#+END_SRC
Example output:
#+BEGIN_SRC text
ql/cpp/ql/src/Security/CWE/CWE-078/ExecTainted.ql
ql/cpp/ql/src/Security/CWE/CWE-022/TaintedPath.ql
ql/cpp/ql/src/experimental/Security/CWE/CWE-078/WordexpTainted.ql
ql/cpp/ql/src/Security/CWE/CWE-089/SqlTainted.ql
#+END_SRC
These files indicate active use of injection-related taint tracking in the C++ suite as well.
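The file-name filter in these pipelines can be checked in isolation. In the sketch below the first two paths come from the example output above, while the =.qlref= and =.qhelp= paths are hypothetical. It contrasts an unanchored =\.qll*= pattern, which also lets =.qlref= files through, with an anchored =\.qll?$= pattern.

```shell
# Sample paths; the last two are made up to exercise the filter.
sample_paths() {
  printf '%s\n' \
    'ql/java/ql/src/Security/CWE/CWE-643/XPathInjection.ql' \
    'ql/java/ql/lib/semmle/code/java/security/XPath.qll' \
    'ql/java/ql/test/example/Example.qlref' \
    'ql/java/ql/src/example/Example.qhelp'
}

# Unanchored: matches ".ql" anywhere in the path, so the .qlref entry passes.
sample_paths | grep '\.qll*'

# Anchored: keeps only names that end in .ql or .qll.
sample_paths | grep -E '\.qll?$'
```

The anchored form keeps test references and help files out of the list when the search covers a whole language directory.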
* TODO For Java, the SqlTainted query finds the sink but not yet the source
[[../ql/java/ql/src/Security/CWE/CWE-089/SqlTainted.ql]]
* TODO vulnerable sample, jedis
Running the model editor on a jedis db models the dependencies of jedis; we
need jedis /as/ a dependency to model jedis itself.
* TODO vulnerable sample, sqlite
For =.eval()= to show up in a query, it has to be called from an application,
so we modify =src-sqlite/AddUser.java= to use jedis.