* Jedis Codeql Setup
  - fork at https://github.com/hohn/jedis
  - github db build: enable code scanning, advanced config
    - only java-kotlin, build-mode: none.
    - creates https://github.com/hohn/jedis/blob/master/.github/workflows/codeql.yml
    - action run at https://github.com/hohn/jedis/actions/workflows/codeql.yml
  - db download
    #+BEGIN_SRC sh
      # List the code scanning analyses for the fork
      curl -H "Authorization: token $GITHUB_TOKEN" \
           https://api.github.com/repos/hohn/jedis/code-scanning/analyses

      # Get DB via curl
      cd ~/work-gh/codeql-lab/assets
      curl -H "Authorization: token $GITHUB_TOKEN" \
           -H "Accept: application/zip" \
           -L \
           https://api.github.com/repos/hohn/jedis/code-scanning/codeql/databases/java \
           -o jedis-database-gh.zip
    #+END_SRC
  - db at ~/work-gh/codeql-lab/assets/jedis-database-gh.zip
  - local db build:
    #+BEGIN_SRC sh
      cd ~/work-gh/codeql-lab/

      # Add the submodule
      git submodule add https://github.com/hohn/jedis extern/jedis

      # Initialize and clone the submodule
      git submodule update --init --recursive

      # Build directly once to resolve any errors
      cd ~/work-gh/codeql-lab/extern/jedis
      mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V

      # Build under codeql
      # Step 1: Clean any prior Maven builds
      cd ~/work-gh/codeql-lab/extern/jedis
      mvn clean

      # Step 2: Run CodeQL DB creation with mvn install
      cd ~/work-gh/codeql-lab
      codeql database create assets/jedis-db-local \
             --overwrite \
             --language=java \
             --command="mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V" \
             --source-root=extern/jedis
    #+END_SRC

* Jedis Codeql Modeling
** Setup and Start
   #+BEGIN_SRC sh
     # Step 1: Go to your CodeQL lab directory
     cd ~/work-gh/codeql-lab

     # Step 2: Extract the prebuilt CodeQL database for the Jedis project
     unzip -q assets/jedis-db-local.zip

     # Step 3: Extract the CodeQL command-line tools (platform-specific)
     unzip -q assets/codeql-osx64.zip

     # Step 4: Change directory to the unpacked CodeQL CLI tools
     cd ~/work-gh/codeql-lab/codeql

     # Step 5: Add the CodeQL CLI directory to your shell's PATH
     # This allows you to run
     # `codeql` from any location
     export PATH="$(pwd):$PATH"

     # Step 6: Launch Visual Studio Code with the lab workspace
     # (the workspace file lives in the lab root, so return there first)
     cd ~/work-gh/codeql-lab
     code qllab.code-workspace

     # In VS Code, perform the following setup manually:
     # - Set the current database to: jedis-db-local
     #   (Usually from the CodeQL extension pane; this connects the UI to your analysis DB)
     # - Set the CodeQL CLI executable to: ~/work-gh/codeql-lab/codeql/codeql
     #   (Tell the extension where to find the CLI you just extracted)
     # - In the CodeQL extension tab, scroll to the bottom and select:
     #   'CodeQL: Method modeling' to begin a guided modeling tutorial
   #+END_SRC

** Using the Editor
   Note that just by starting =CodeQL: Method modeling=, the new file
   : .github/codeql/extensions/jedis-db-local-java/codeql-pack.yml
   is created.

** Relevant Queries
   A quick =grep= shows
   #+BEGIN_SRC text
     grep 'java.*modelgen' files |grep -v test/

     ql/java/ql/src/utils/modelgenerator
     ql/java/ql/src/utils/modelgenerator/CaptureNeutralModels.ql
     ql/java/ql/src/utils/modelgenerator/CaptureTypeBasedSummaryModels.ql
     ql/java/ql/src/utils/modelgenerator/CaptureSinkModels.ql
     ql/java/ql/src/utils/modelgenerator/CaptureContentSummaryModels.ql
     ql/java/ql/src/utils/modelgenerator/internal
     ql/java/ql/src/utils/modelgenerator/internal/CaptureModels.qll
     ql/java/ql/src/utils/modelgenerator/internal/CaptureTypeBasedSummaryModels.qll
     ql/java/ql/src/utils/modelgenerator/internal/CaptureModelsPrinting.qll
     ql/java/ql/src/utils/modelgenerator/CaptureSummaryModels.ql
     ql/java/ql/src/utils/modelgenerator/RegenerateModels.py
     ql/java/ql/src/utils/modelgenerator/CaptureSourceModels.ql
     ql/java/ql/src/utils/modelgenerator/debug
     ql/java/ql/src/utils/modelgenerator/debug/CaptureSummaryModelsPartialPath.ql
     ql/java/ql/src/utils/modelgenerator/debug/CaptureSummaryModelsPath.ql
     ql/java/ql/src/utils/modelgenerator/debug/README.md
   #+END_SRC

** Primary Query File
   The primary query file is
   : ../ql/java/ql/src/utils/modelgenerator/internal/CaptureModels.qll

   This acts as the backbone, exposing modules and predicates
   like:
   - SummaryModelGeneratorInput
   - ModelGeneratorCommonInput
   - isPrimitiveTypeUsedForBulkData(...)
   - Likely common predicates such as:
     + hasNoSideEffects(...)
     + isNeutralReturn(...)
     + isBulkGetterLike(...)

   These are imported by:
   - CaptureSinkModels.ql
   - CaptureSummaryModels.ql
   - CaptureContentSummaryModels.ql
   - CaptureHeuristicSummaryModels.ql

   - Design: Three Modeling Targets
     | Module                       | Implements                      | Purpose                                          |
     |------------------------------+---------------------------------+--------------------------------------------------|
     | =SummaryModelGeneratorInput= | =SummaryModelGeneratorInputSig= | Models pass-through or computed summaries        |
     | =SourceModelGeneratorInput=  | =SourceModelGeneratorInputSig=  | Models user-controlled or origin taint sources   |
     | =SinkModelGeneratorInput=    | =SinkModelGeneratorInputSig=    | Models taint sinks (e.g., logging, SQL, network) |

   - Shared Input System

     ModelGeneratorCommonInput provides:
     - Name formatting
     - Type filtering (isRelevantType)
     - Signature stringification
     - “Approximate output” helpers like Argument[pos].Element

     This gives a stable data interface to the rest of the system.

   - Filtering logic
     #+BEGIN_SRC java
       private predicate relevant(Callable api) {
         api.isPublic() and
         api.getDeclaringType().isPublic() and
         api.fromSource() and
         not isUninterestingForModels(api) and
         not isInfrequentlyUsed(api.getCompilationUnit())
       }
     #+END_SRC

** Experiment with test clone
   The needed imports are private, so clone
   : ql/java/ql/test/utils/modelgenerator/dataflow/CaptureSourceModels.ql
   and experiment there.
   #+BEGIN_SRC java
     import java
     import utils.modelgenerator.internal.CaptureModels
     import SourceModels
     import utils.test.InlineMadTest

     module InlineMadTestConfig implements InlineMadTestConfigSig {
       string getCapturedModel(Callable c) { result = Heuristic::captureSource(c) }

       string getKind() { result = "source" }
     }

     import InlineMadTest<InlineMadTestConfig>
   #+END_SRC

* Modeling sqlite as dependency
  The tree
  : src-sqlite
  contains a trivial sample taken from a workshop.
  It uses =sqlite-jdbc-3.36.0.1.jar=, so we can use it to illustrate modeling
  on a smaller example.

* Modeling Jedis as a Dependency in Model Editor
** Set up and run Editor
   To model =jedis= for taint analysis using the /model editor/, select the
   /"model as dependency"/ option. When this mode is active, the following
   CodeQL query is used:
   : /Users/hohn/work-gh/codeql-lab/ql/java/ql/src/utils/modeleditor/FrameworkModeEndpoints.ql

   This query defines:
   #+BEGIN_SRC java
     from PublicEndpointFromSource endpoint, boolean supported, string type
     where
       supported = isSupported(endpoint) and
       type = supportedType(endpoint)
     select endpoint, endpoint.getPackageName(), endpoint.getTypeName(),
       endpoint.getName(), endpoint.getParameterTypes(), supported,
       endpoint.getCompilationUnit().getParentContainer().getBaseName(), type
   #+END_SRC

   The =supported= output column maps directly to what the model editor
   displays:
   - =supported = true= → shown in the UI as /"Method already modeled"/
   - =supported = false= → shown in the UI as /"Unmodeled"/

** Files Created or Modified by the Modeling Workflow
   - Upon launching =CodeQL: Method modeling=, a new pack manifest is created:

     [[../.github/codeql/extensions/jedis-db-local-java/codeql-pack.yml]]

   - After selecting methods and saving, modeling results are written to:

     [[../.github/codeql/extensions/jedis-db-local-java/models/redis.clients.jedis.model.yml]]

** Workspace Configuration Required
   To ensure that these model extensions are applied during query runs,
   include this setting in the workspace configuration file
   [[../qllab.code-workspace]]

   In some environments (e.g., older VS Code versions), you may also need to
   replicate this setting in [[../.vscode/settings.json]]

* Verifying the Modeled Sink
  Once the modeling is in place, a dataflow query like the following can be
  used to confirm the modeled sinks:

  #+BEGIN_SRC java
    import java
    private import semmle.code.java.dataflow.ExternalFlow
    private import semmle.code.java.dataflow.DataFlow

    from DataFlow::Node n, string type
    where
      sinkNode(n, type) and
      type = "code-injection"
    select n, type
  #+END_SRC

  Sample query result (run on the =jedis-db-local= database):
  - example.ql on jedis-db-local - finished in 2 seconds (14 results)

    |  # | n                                | type           |
    |----+----------------------------------+----------------|
    |  1 | script                           | code-injection |
    |  2 | getBytes(...)                    | code-injection |
    |  3 | script                           | code-injection |
    |  4 | script                           | code-injection |
    |  5 | script                           | code-injection |
    |  6 | script                           | code-injection |
    |  7 | "return redis.call('get','foo')" | code-injection |
    |  8 | "return redis.call('get','foo')" | code-injection |
    |  9 | encode(...)                      | code-injection |
    | 10 | encode(...)                      | code-injection |
    | 11 | "return redis.call('get','foo')" | code-injection |
    | 12 | "return redis.call('get','foo')" | code-injection |
    | 13 | script                           | code-injection |
    | 14 | "return {}"                      | code-injection |

* Identify usage of injection-related models in existing queries
  To verify whether existing CodeQL queries make use of the injection-related
  models, we can search for files in the =ql/java= and =ql/cpp= directories
  that contain the string =-injection=. This string often appears in
  taint-tracking configuration or query metadata.
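
  The search described above can be wrapped in a small helper. This is a
  minimal sketch under stated assumptions, not part of the lab: the function
  name =list_injection_refs= is hypothetical, plain =grep -r= stands in for
  =rg= so the sketch runs without extra tools, and only paths ending in
  =.ql= or =.qll= are kept.

  #+BEGIN_SRC sh
    # list_injection_refs DIR (hypothetical helper name)
    # Print the .ql/.qll files under DIR that mention "-injection", sorted.
    list_injection_refs() {
        # -r recurses, -l prints file names only, and -e is needed because
        # the pattern itself starts with a dash.
        grep -rl -e '-injection' "$1" | grep -E '\.qll?$' | sort
    }
  #+END_SRC

  With this in place, =list_injection_refs ql/java= and
  =list_injection_refs ql/cpp= cover both language directories.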
** Java Queries
   The following command locates =.ql= and =.qll= files in the Java query
   suite that reference =-injection=:

   #+BEGIN_SRC sh
     # Keep only paths that end in .ql or .qll
     rg -l -- '-injection' ql/java | grep -E '\.qll?$'
   #+END_SRC

   Example output:
   #+BEGIN_SRC text
     ql/java/ql/src/Security/CWE/CWE-643/XPathInjection.ql
     ql/java/ql/src/Security/CWE/CWE-078/ExecTainted.ql
     ql/java/ql/src/Security/CWE/CWE-022/TaintedPath.ql
     ql/java/ql/src/Security/CWE/CWE-117/LogInjection.ql
     ql/java/ql/src/Security/CWE/CWE-470/FragmentInjection.ql
     ql/java/ql/src/Security/CWE/CWE-470/FragmentInjectionInPreferenceActivity.ql
     ql/java/ql/src/Security/CWE/CWE-730/RegexInjection.ql
     ql/java/ql/lib/semmle/code/java/security/XsltInjection.qll
     ql/java/ql/src/Security/CWE/CWE-090/LdapInjection.ql
     ql/java/ql/lib/semmle/code/java/security/GroovyInjection.qll
     ql/java/ql/lib/semmle/code/java/security/XPath.qll
     ql/java/ql/lib/semmle/code/java/security/TaintedEnvironmentVariableQuery.qll
     ql/java/ql/src/Security/CWE/CWE-074/XsltInjection.ql
     ql/java/ql/src/Security/CWE/CWE-074/JndiInjection.ql
     ...
     ql/java/ql/src/utils/modelgenerator/internal/CaptureModels.qll
   #+END_SRC

   These files include both top-level queries (under =src/Security/...=) and
   reusable model libraries (under =lib/semmle/...=). Experimental and
   framework-specific queries are also included.

** C++ Queries
   Likewise, to check for C++ queries that reference =-injection=, use:

   #+BEGIN_SRC sh
     # Keep only paths that end in .ql or .qll
     rg -l -- '-injection' ql/cpp | grep -E '\.qll?$'
   #+END_SRC

   Example output:
   #+BEGIN_SRC text
     ql/cpp/ql/src/Security/CWE/CWE-078/ExecTainted.ql
     ql/cpp/ql/src/Security/CWE/CWE-022/TaintedPath.ql
     ql/cpp/ql/src/experimental/Security/CWE/CWE-078/WordexpTainted.ql
     ql/cpp/ql/src/Security/CWE/CWE-089/SqlTainted.ql
   #+END_SRC

   These files indicate active use of injection-related taint tracking in the
   C++ suite as well.

* TODO For Java, the SqlTainted query finds the sink but not yet the source.
* TODO vulnerable sample, jedis
  Running the model editor on a jedis db models the dependencies of jedis; to
  model jedis itself, we need a db in which jedis is used /as/ a dependency.
* TODO vulnerable sample, sqlite
  For =.eval()= to show in a query, it has to be used in an application. So
  we modify src-sqlite/AddUser.java to use jedis.