major revisions

Michael Hohn
2025-07-30 16:34:54 -07:00
committed by =Michael Hohn
parent c46f2260ca
commit 3869a61388

@@ -74,9 +74,27 @@
- [[./codeql-sqlite-java/TaintFlowDebugging.ql]]
- [[./codeql-sqlite-java/TaintFlowDebugging.md]]
*** Debugging data flow config (instead of taint flow), C
*** TODO Debugging data flow config (instead of taint flow), C
A corresponding example for C is planned, using a simplified query to trace
value propagation in [[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/add-user.c]].
Unlike Java, C may require manual modeling even to visualize basic flows.
** Modeling
There are two primary approaches to modeling: direct use of CodeQL predicates
and the models-as-data system. The models-as-data system is implemented in QL
but relies on external YAML files that are interpreted at query evaluation
time.
The model editor provides a GUI for managing YAML-based models, but the
underlying format is identical to that used by the models-as-data system. In C
and other cases where GUI support is limited or unavailable, we write these
YAML models manually and invoke them directly from queries.
When YAML models are written by hand, GPT-based tooling fits naturally into
the workflow. A GPT can extract function signatures, parameter semantics, and
flow annotations from documentation or code examples, then draft the
corresponding YAML model entries for review.
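As a rough sketch of what a hand-written model file can look like, the
fragment below declares one source and one sink. It assumes the tuple layout
documented for the C/C++ extensible predicates =sourceModel= and =sinkModel=.
The function =get_user_input= is hypothetical and only illustrates the shape
of a source entry. =sqlite3_exec= is the real SQLite API, but whether
add-user.c calls it is an assumption here.

#+BEGIN_SRC yaml
# Sketch only: names and kinds below are illustrative assumptions.
extensions:
  - addsTo:
      pack: codeql/cpp-all
      extensible: sourceModel
    data:
      # namespace, type, subtypes, name, signature, ext, output, kind, provenance
      # get_user_input is a hypothetical function whose returned buffer is untrusted.
      - ["", "", False, "get_user_input", "", "", "ReturnValue[*]", "remote", "manual"]
  - addsTo:
      pack: codeql/cpp-all
      extensible: sinkModel
    data:
      # namespace, type, subtypes, name, signature, ext, input, kind, provenance
      # The SQL text is the second parameter of sqlite3_exec, so the sink is
      # the string pointed to by argument index 1.
      - ["", "", False, "sqlite3_exec", "", "", "Argument[*1]", "sql-injection", "manual"]
#+END_SRC

Each row is a plain tuple, which is why entries like these are easy to write
by hand or to generate and then check against the library's documentation.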
*** Review: SQLite Injection Workshop, Java
We begin with a recap of the Java-based injection example, focusing on the
vulnerable code in [[./codeql-sqlite-java/AddUser.java][AddUser.java]]. Following that, we examine a fully manual
@@ -128,31 +146,10 @@
*** Review: SQLite Injection Workshop (C)
This is the C version of the workshop.
This is the C version of the injection workshop, based on
[[file:~/work-gh/codeql-lab/codeql-dataflow-sql-injection-c/add-user.c]]. It
serves as the basis for both the "models-as-data" manual modeling and the
extension via Customizations.qll.
*** Extending Queries with Customizations.qll for C
While most CodeQL-supported languages provide out-of-the-box support for
=Customizations.qll=, C and C++ do not include this by default. However, it is
possible to enable such support by building a custom CodeQL bundle. This can
be done using the CLI tool at
https://github.com/advanced-security/codeql-bundle. Since the tool functions
largely as a black box, we provide a more detailed illustration of the
underlying steps.
A working demonstration is available in
[[./codeql-dataflow-sql-injection-c/README.org]]. In languages like Java,
=Customizations.qll= is included automatically via imports from
=<language>.qll=, such as [[./ql/java/ql/lib/java.qll][java.qll]] importing [[./ql/java/ql/lib/Customizations.qll][Customizations.qll]], which defines
user-extensible predicates for flow modeling.
For C/C++, the process requires explicit modification:
1. Modify =ql/cpp/ql/lib/cpp.qll= to import =Customizations.qll=.
2. Create and populate =ql/cpp/ql/lib/Customizations.qll= with custom sources/sinks or extensions.
3. Rebuild the CodeQL bundle to include these changes.
This customization enables consistent user-defined flow modeling across
languages, making it possible to reuse modeling patterns from Java or Python
in C/C++ contexts.
*** Use models-as-data QL code directly (no graphical editor)
This section focuses on applying the models-as-data system without using the
@@ -179,6 +176,35 @@
calls a function already modeled as a source—to illustrate how user-defined
extensions propagate through the query logic.
*** Extending Queries with Customizations.qll for C
The manual YAML modeling approach from the previous section works well for
isolated cases. However, to integrate seamlessly with idiomatic CodeQL
queries, we show how to extend the standard QL libraries via
=Customizations.qll=.
While most CodeQL-supported languages provide out-of-the-box support for
=Customizations.qll=, C and C++ do not include this by default. However, it is
possible to enable such support by building a custom CodeQL bundle. This can
be done using the CLI tool at
https://github.com/advanced-security/codeql-bundle. Since the tool functions
largely as a black box, we provide a more detailed illustration of the
underlying steps.
A working demonstration is available in
[[./codeql-dataflow-sql-injection-c/README.org]]. In languages like Java,
=Customizations.qll= is included automatically via imports from
=<language>.qll=, such as [[./ql/java/ql/lib/java.qll][java.qll]] importing [[./ql/java/ql/lib/Customizations.qll][Customizations.qll]], which defines
user-extensible predicates for flow modeling.
For C/C++, the process requires explicit modification:
1. Modify =ql/cpp/ql/lib/cpp.qll= to import =Customizations.qll=.
2. Create and populate =ql/cpp/ql/lib/Customizations.qll= with custom sources/sinks or extensions.
3. Rebuild the CodeQL bundle to include these changes.
This customization enables consistent user-defined flow modeling across
languages, making it possible to reuse modeling patterns from Java or Python
in C/C++ contexts.
** TODO codeql-bundling
TBD: detailed description of
https://github.com/advanced-security/codeql-bundle, in