*** (PARTIAL) Extending Queries with Customizations.qll for C

This commit is contained in:
Michael Hohn
2025-07-30 21:45:48 -07:00
committed by =Michael Hohn
parent fa875f4ea0
commit 9ba32c29cd

@@ -169,9 +169,12 @@
and repetitive pattern, making it ideal for large-scale modeling. The CodeQL and repetitive pattern, making it ideal for large-scale modeling. The CodeQL
model editor can be used to efficiently define sources and sinks for such model editor can be used to efficiently define sources and sinks for such
cases. A detailed explanation is provided cases. A detailed explanation is provided
OK
in [[file:~/work-gh/codeql-lab/codeql-jedis-java/README.org::*Modeling Jedis as a Dependency in Model Editor][Modeling Jedis as a Dependency in Model Editor]], while validation of in [[file:~/work-gh/codeql-lab/codeql-jedis-java/README.org::*Modeling Jedis as a Dependency in Model Editor][Modeling Jedis as a Dependency in Model Editor]], while validation of
OK
the modeled sink is discussed in [[file:~/work-gh/codeql-lab/codeql-jedis-java/README.org::*Verifying the Modeled Sink][Verifying the Modeled Sink]]. the modeled sink is discussed in [[file:~/work-gh/codeql-lab/codeql-jedis-java/README.org::*Verifying the Modeled Sink][Verifying the Modeled Sink]].
Finally, the query-level usage of these models can be seen Finally, the query-level usage of these models can be seen
OK
in [[file:~/work-gh/codeql-lab/codeql-jedis-java/README.org::*Identify usage of injection-related models in existing queries][Identify usage of injection-related models in existing queries]]. in [[file:~/work-gh/codeql-lab/codeql-jedis-java/README.org::*Identify usage of injection-related models in existing queries][Identify usage of injection-related models in existing queries]].
*** Customizations via Model Editor: Single-function case (Java SQLite sample) *** Customizations via Model Editor: Single-function case (Java SQLite sample)
@@ -184,6 +187,7 @@
[[./.github/codeql/extensions/sqlite-db/codeql-pack.yml]], and the extension data [[./.github/codeql/extensions/sqlite-db/codeql-pack.yml]], and the extension data
is provided in is provided in
[[./.github/codeql/extensions/sqlite-db/models/sqlite.model.yml]]. A detailed [[./.github/codeql/extensions/sqlite-db/models/sqlite.model.yml]]. A detailed
*OK*
explanation is available in [[file:~/work-gh/codeql-lab/codeql-sqlite-java/README.org::*Using sqlite to illustrate models-as-data][Using sqlite to illustrate models-as-data]]. explanation is available in [[file:~/work-gh/codeql-lab/codeql-sqlite-java/README.org::*Using sqlite to illustrate models-as-data][Using sqlite to illustrate models-as-data]].
To support this, we explain how the "models-as-data" system works To support this, we explain how the "models-as-data" system works
@@ -195,63 +199,72 @@
*** Review: SQLite Injection Workshop (C) *** Review: SQLite Injection Workshop (C)
This is the C version of the injection workshop, based on This is the C version of the injection workshop, based on
[[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/add-user.c]]. It [[./codeql-dataflow-sql-injection-c/add-user.c]]. It
serves as the basis for both the "models-as-data" manual modeling and the serves as the basis for both the "models-as-data" manual modeling and the
extension via Customizations.qll. extension via =Customizations.qll=.
*** Use models-as-data QL code directly (no graphical editor) *** (PARTIAL) Use models-as-data QL code directly (no graphical editor)
This section focuses on applying the models-as-data system without using the This section focuses on using the models-as-data system *without* the
graphical model editor. While model definition files and supporting data graphical model editor. While model definition files and supporting data
already exist, we manually author YAML files for new models. This approach is already exist, we manually write YAML files to add or override flow
especially relevant for C, where graphical tooling is limited or nonexistent. behavior. This approach is especially relevant for C, where graphical tooling
is limited or nonexistent.
As reinforcement, we use the C version of the SQLite injection workshop: As reinforcement, we reuse the C version of the SQLite injection workshop:
- The code sample is at [[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/add-user.c]]. - The code sample is at
- The accompanying query is [[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/SqlInjection.ql]]. [[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/add-user.c]].
- The accompanying query is
[[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/SqlInjection.ql]].
We extend this example by modeling key functions manually: For structural reference, see the Java version's documentation (not the editor
- Add a source model for =count = read(STDIN_FILENO, buf, BUFSIZE);= interface): [[file:~/work-gh/codeql-lab/codeql-sqlite-java/README.org::*Using sqlite to illustrate models-as-data][Using sqlite to illustrate models-as-data]]. There is no separate
- Add a sink model for =rc = sqlite3_exec(db, query, NULL, 0, &zErrMsg);= C-specific walkthrough because the YAML structure and logic are nearly
identical.
For reference, see the Java version's structure (but not the graphical For workshop use, we extend the example by modeling key functions manually:
editor): [[file:~/work-gh/codeql-lab/codeql-sqlite-java/README.org::*Using sqlite to illustrate models-as-data][Using sqlite to illustrate models-as-data]], and the corresponding - Add a source model for: =count = read(STDIN_FILENO, buf, BUFSIZE);=
C-specific walkthrough: [[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/README.org::*Using sqlite to illustrate models-as-data][Using sqlite to illustrate models-as-data]]. - Add a sink model for: =rc = sqlite3_exec(db, query, NULL, 0, &zErrMsg);=
We demonstrate how to define YAML-based models for standard functions like We demonstrate how to define YAML-based models for standard functions like
=read()= and verify their effect using the out-of-the-box query =read()= and verify their effect using the out-of-the-box query:
[[./ql/cpp/ql/src/Security/CWE/CWE-089/SqlTainted.ql][SqlTainted.ql]]. As an additional example, we introduce the higher-level, [[./ql/cpp/ql/src/Security/CWE/CWE-089/SqlTainted.ql][SqlTainted.ql]].
redundant =char* get_user_info()= as a custom source—even though it internally
As an additional teaching case, we introduce the higher-level, redundant
function =char* get_user_info()= as a custom source—even though it internally
calls a function already modeled as a source—to illustrate how user-defined calls a function already modeled as a source—to illustrate how user-defined
extensions propagate through the query logic. extensions affect propagation logic.
*** Extending Queries with Customizations.qll for C *** (PARTIAL) Extending Queries with Customizations.qll for C
The manual YAML modeling approach from the previous section works well for The manual YAML modeling approach described earlier works well for small or
isolated cases. However, to integrate seamlessly with idiomatic CodeQL isolated cases. However, to fully integrate with idiomatic CodeQL
queries, we show how to extend the standard QL libraries via queries—especially for large-scale or reusable analysis—you will want to
=Customizations.qll= extend the language's internal dataflow configuration using
=Customizations.qll=.
While most CodeQL-supported languages provide out-of-the-box support for Most CodeQL-supported languages (e.g., Java, Python) include out-of-the-box
=Customizations.qll=, C and C++ do not include this by default. However, it is support for =Customizations.qll=. In these cases, the primary language module
possible to enable such support by building a custom CodeQL bundle. This can (e.g., [[./ql/java/ql/lib/java.qll][java.qll]]) automatically imports [[./ql/java/ql/lib/Customizations.qll][Customizations.qll]], which defines
be done using the CLI tool at extension points for user-defined sources, sinks, and flow steps.
https://github.com/advanced-security/codeql-bundle. Since the tool functions
largely as a black box, we provide a more detailed illustration of the
underlying steps.
A working demonstration is available in Unfortunately, C and C++ do not include this mechanism by default. Enabling it
[[./codeql-dataflow-sql-injection-c/README.org]]. In languages like Java, requires modifying the language pack and rebuilding the CodeQL bundle.
=Customizations.qll= is included automatically via imports from
=<language>.qll=, such as [[./ql/java/ql/lib/java.qll][java.qll]] importing [[./ql/java/ql/lib/Customizations.qll][Customizations.qll]], which defines
user-extensible predicates for flow modeling.
For C/C++, the process requires explicit modification: This section is *partially complete*: we illustrate the required QL changes,
1. Modify =ql/cpp/ql/lib/cpp.qll= to import =Customizations.qll=. but do *not yet include* the full bundling process.
2. Create and populate =ql/cpp/ql/lib/Customizations.qll= with custom sources/sinks or extensions.
3. Rebuild the CodeQL bundle to include these changes.
This customization enables consistent user-defined flow modeling across To add Customizations support for C/C++, make the following changes:
languages, making it possible to reuse modeling patterns from Java or Python
in C/C++ contexts. 1. Modify =ql/cpp/ql/lib/cpp.qll= to import your =Customizations.qll= module.
2. Create and populate =ql/cpp/ql/lib/Customizations.qll= with definitions for
new sources, sinks, or flow steps.
3. For full deployment: Rebuild the CodeQL bundle to reflect these changes.
The rebuilt bundle can then be used in VS Code or the CLI, enabling you to
model C/C++ flows in a way that mirrors Java and other languages. Once this
bundling step is automated, custom C/C++ modeling will follow the same
developer workflow as any other language.
4. For workshops: The modifications take effect immediately.
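
As a rough sketch of step 2, a =Customizations.qll= for C/C++ might define the workshop's =get_user_info()= function as a flow source. This assumes the =LocalFlowSource= class from =semmle.code.cpp.security.FlowSources= in the standard C/C++ library, and that step 1 has added the line =import Customizations= to =cpp.qll=. The class name =GetUserInfoSource= and the source-type string are hypothetical. Depending on the library version, an indirect node (=asIndirectExpr()=) rather than =asExpr()= may be needed to track the buffer that the returned pointer refers to.

#+BEGIN_SRC ql
/**
 * Customizations.qll (sketch): user-defined additions to the C/C++ library.
 * Assumes cpp.qll has been modified to contain `import Customizations`.
 */

import cpp
import semmle.code.cpp.security.FlowSources

/**
 * Hypothetical source: treat the value returned by a call to
 * `get_user_info()` as locally supplied, untrusted data.
 */
class GetUserInfoSource extends LocalFlowSource {
  GetUserInfoSource() {
    // Match calls to the global function `get_user_info`.
    this.asExpr().(FunctionCall).getTarget().hasGlobalName("get_user_info")
  }

  override string getSourceType() { result = "value returned by get_user_info()" }
}
#+END_SRC

Queries whose sources are defined in terms of the =FlowSource= hierarchy should then pick up the new source without being edited. Whether a given out-of-the-box query does so depends on the library version in use.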
** TODO CodeQL Bundling ** TODO CodeQL Bundling
This section will provide a detailed walkthrough of the CodeQL bundling process This section will provide a detailed walkthrough of the CodeQL bundling process