Mirror of https://github.com/hohn/codeql-lab.git, synced 2025-12-16 18:03:08 +01:00
Add prompt support files generated from rst doc
19
codeql-docs/README.org
Normal file
@@ -0,0 +1,19 @@
* TODO Direct Conversion RST -> Prompt by GPT
** For Go
+ [[../ql/docs/codeql/codeql-language-guides/abstract-syntax-tree-classes-for-working-with-go-programs.rst]]
  - ./abstract-syntax-tree-classes-for-working-with-go-programs.gpt
+ [[../ql/docs/codeql/codeql-language-guides/analyzing-data-flow-in-go.rst]]
  - ./analyzing-data-flow-in-go.gpt
+ [[../ql/docs/codeql/codeql-language-guides/basic-query-for-go-code.rst]]
  - ./basic-query-for-go-code.gpt
+ [[../ql/docs/codeql/codeql-language-guides/codeql-for-go.rst]]
  - ./codeql-for-go.gpt
+ [[../ql/docs/codeql/codeql-language-guides/codeql-library-for-go.rst]]
  - ./codeql-library-for-go.gpt
+ [[../ql/docs/codeql/codeql-language-guides/customizing-library-models-for-go.rst]]
  - ./customizing-library-models-for-go.gpt

** For Python

** For C/C++

@@ -0,0 +1,90 @@
Purpose
- Write CodeQL queries over Go by navigating the Go AST classes.
- Model: Go syntax maps onto a CodeQL class hierarchy; member predicates access the parts (condition, body, operands).
- Pattern: get<Part>(), getA<Part>(), get<Left/Right>Operand(), getAnArgument(), getCalleeExpr().

Core Namespaces
- Statements: subclasses of Stmt.
- Expressions: subclasses of Expr (literals, unary, binary, calls, selectors, etc.).
- Declarations: FuncDecl, GenDecl (+ ImportSpec, TypeSpec, ValueSpec).
- Types: type expression classes (ArrayTypeExpr, StructTypeExpr, FuncTypeExpr, InterfaceTypeExpr, MapTypeExpr, ChanTypeExpr variants).
- Names/Selectors: SimpleName, SelectorExpr; Name hierarchy: PackageName, TypeName, ValueName, LabelName.

Statements (Stmt)
- EmptyStmt “;”; ExprStmt expression-as-stmt; BlockStmt “{…}”.
- IfStmt: “if [init;] cond { … } [else …]”; the then-branch is a block, the else-branch is a block or another if statement.
- ForStmt: classic init/cond/post; LoopStmt superclass. RangeStmt: “for k, v := range expr { … }”.
- SwitchStmt/ExpressionSwitchStmt; TypeSwitchStmt; CaseClause inside switch.
- SelectStmt with CommClause; SendStmt “ch <- x”; RecvStmt “x = <-ch”.
- DeclStmt; Assignment family: SimpleAssignStmt (=), DefineStmt (:=), CompoundAssignStmt (+=, -=, *=, /=, %=, &=, |=, ^=, <<=, >>=, &^=).
- IncStmt x++, DecStmt x--. GoStmt “go f()”; DeferStmt “defer f()”. LabeledStmt, BreakStmt, ContinueStmt, GotoStmt, FallthroughStmt, BadStmt.
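
To make the statement list concrete, here is a small Go sketch whose comments name the class that the notes above associate with each piece of syntax. The function names (countParity, firstSign, roundTrip) are hypothetical and exist only as query targets; the class annotations follow the list above rather than a verified extractor run.

```go
package main

import "fmt"

// countParity is a hypothetical helper written only to exercise statement forms.
func countParity(xs []int) (evens, odds int) {
	limit := len(xs)             // DefineStmt (:=)
	for i := 0; i < limit; i++ { // ForStmt; the post statement i++ is an IncStmt
		if x := xs[i]; x%2 == 0 { // IfStmt with an init statement
			evens += 1 // CompoundAssignStmt (+=)
		} else {
			odds++ // IncStmt
		}
	}
	return evens, odds
}

// firstSign inspects only the first element of xs.
func firstSign(xs []int) string {
	sign := "empty"
	for _, x := range xs { // RangeStmt
		switch { // ExpressionSwitchStmt
		case x < 0: // CaseClause
			sign = "negative" // SimpleAssignStmt (=)
		default:
			sign = "non-negative"
		}
		break // BreakStmt: leaves the loop after the first element
	}
	return sign
}

// roundTrip sends v through a buffered channel and reads it back.
func roundTrip(v int) (got int) {
	ch := make(chan int, 1)
	done := make(chan struct{})
	go func() { // GoStmt
		defer close(done) // DeferStmt
		ch <- v           // SendStmt
	}()
	<-done   // ExprStmt wrapping a RecvExpr
	select { // SelectStmt
	case got = <-ch: // CommClause holding a RecvStmt
	}
	return got
}

func main() {
	evens, odds := countParity([]int{1, 2, 3, 4, 6})
	fmt.Println(evens, odds, firstSign([]int{-1, 5}), roundTrip(7))
}
```

A file like this is a convenient fixture: build a database from it and each one-line query of the form "from IfStmt s select s" should have at least one hit to inspect.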

Expressions (Expr)
Literals
- BasicLit subclasses: IntLit, FloatLit, ImagLit, CharLit/RuneLit, StringLit.
- CompositeLit: StructLit (T{…}), MapLit (map[K]V{…}).
- FuncLit: function literal (FuncDef).

Unary expressions (UnaryExpr)
- PlusExpr “+x”, MinusExpr “-x”, NotExpr “!x”, ComplementExpr “^x”, AddressExpr “&x”, RecvExpr “<-x”.

Binary expressions (BinaryExpr)
- Arithmetic: MulExpr, QuoExpr, RemExpr, AddExpr, SubExpr.
- Shift: ShlExpr “<<”, ShrExpr “>>”.
- Logical: LandExpr “&&”, LorExpr “||”.
- Relational: LssExpr “<”, GtrExpr “>”, LeqExpr “<=”, GeqExpr “>=”.
- Equality: EqlExpr “==”, NeqExpr “!=”.
- Bitwise: AndExpr “&”, OrExpr “|”, XorExpr “^”, AndNotExpr “&^”.

Type expressions (no common superclass)
- ArrayTypeExpr “[N]T”/“[]T”; StructTypeExpr “struct{…}”; FuncTypeExpr “func(…) …”.
- InterfaceTypeExpr; MapTypeExpr; ChanTypeExpr variants: SendChanTypeExpr, RecvChanTypeExpr, SendRecvChanTypeExpr.

Name/Selector/Call
- Name subclasses: SimpleName, QualifiedName; ValueName → ConstantName, VariableName, FunctionName.
- SelectorExpr “X.Y” for pkg qualifiers and field/method access.
- CallExpr: getCalleeExpr(), getAnArgument(); method calls usually have a SelectorExpr as callee.
- IndexExpr “a[i]”; SliceExpr “a[i:j:k]”; KeyValueExpr in CompositeLit.
- ParenExpr; StarExpr pointer deref/type; TypeAssertExpr “x.(T)”; ConversionExpr “T(x)”.
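
The expression forms in this last group are easy to confuse at a glance, so the following Go sketch places them side by side. The type point and the functions shout and pieces are hypothetical; the comments repeat the class names from the list above and are an annotation aid, not output from the extractor.

```go
package main

import (
	"fmt"
	"strings"
)

// point is a hypothetical struct used only to produce selector expressions.
type point struct{ x, y int }

// shout upper-cases v when it holds a string.
func shout(v interface{}) string {
	s, ok := v.(string) // TypeAssertExpr "x.(T)"
	if !ok {
		return "not a string"
	}
	return strings.ToUpper(s) // CallExpr; the callee is a SelectorExpr with a package qualifier
}

// pieces returns values built from several expression forms.
func pieces() (int, float64, []int, int) {
	p := &point{x: 1, y: 2} // StructLit whose elements are KeyValueExprs
	sum := (*p).x + p.y     // ParenExpr, StarExpr (dereference), SelectorExpr (field access)
	f := float64(sum)       // conversion "T(x)": looks like a call but is not a function call
	a := []int{10, 20, 30, 40}
	mid := a[1:3:4]          // SliceExpr "a[i:j:k]"
	return sum, f, mid, a[0] // IndexExpr "a[i]"
}

func main() {
	sum, f, mid, first := pieces()
	fmt.Println(sum, f, mid, first, shout("go"), shout(42))
}
```

The conversion on the float64 line is the case worth remembering: a query that selects calls by callee name should not expect to see it among the results.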

Declarations
- FuncDecl/FuncLit via FuncDef: getBody(), getName(), getParameter(i), getResultVar(i), getACall().
- GenDecl with ImportSpec/TypeSpec/ValueSpec; Field/FieldList for params, results, struct/interface fields.

Concurrency
- SelectStmt with CommClause; SendStmt; RecvExpr/RecvStmt; GoStmt; DeferStmt.

Navigation Idioms
- If: getCond(), getThen(), getElse(); For/Range: inspect init/cond/post or range expr.
- Calls: exists(CallExpr c, SelectorExpr s | c.getCalleeExpr() = s and s.getSelector().getName() = "Foo").
- Method vs function: SelectorExpr callee (method call or package-qualified function) vs SimpleName callee (unqualified function).
- Switch/TypeSwitch: use CaseClause, getExpr(i)/getStmt(i); Select: CommClause.
- Assign: match AssignStmt subclasses; short var define is DefineStmt.
- Binary/Unary: use specific subclasses or operator accessors.
- Literals: filter BasicLit subclasses; CompositeLit elements via keys/values.

Selection Patterns (QL sketches)
- Method calls by name:
    from CallExpr call, SelectorExpr sel
    where call.getCalleeExpr() = sel and sel.getSelector().getName() = "Close"
    select call
- Range over map/slice:
    from RangeStmt r select r
- Short var with channel receive:
    from RecvStmt rs select rs
- Struct literal of type Point:
    from StructLit lit where lit.getType().getName() = "Point" select lit
- Defer call:
    from DeferStmt d, CallExpr c where d.getExpr() = c select d, c
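
The sketches above need something to match. Below is a minimal Go target file, assuming a hypothetical Point type and a hypothetical resource type with a Close method; each comment says which sketch the line is meant to satisfy. Whether a given sketch matches exactly these lines depends on the library version, so treat the comments as expectations to verify.

```go
package main

import "fmt"

// Point and resource are hypothetical types that exist only as match targets.
type Point struct{ X, Y int }

type resource struct{ closed bool }

func (r *resource) Close() { r.closed = true }

// useResource defers Close inside a nested function so the effect is visible on return.
func useResource() *resource {
	r := &resource{}
	func() {
		defer r.Close() // "Defer call" sketch; also a method call named Close
	}()
	return r
}

// sumPoint builds a Point and ranges over its fields.
func sumPoint() int {
	p := Point{X: 1, Y: 2} // "Struct literal of type Point" sketch
	total := 0
	for _, v := range []int{p.X, p.Y} { // "Range over map/slice" sketch
		total += v
	}
	return total
}

// receive reads one value from a buffered channel inside a select.
func receive() bool {
	ch := make(chan bool, 1)
	ch <- true
	got := false
	select {
	case v := <-ch: // "Short var with channel receive" sketch
		got = v
	}
	return got
}

func main() {
	fmt.Println(useResource().closed, sumPoint(), receive())
}
```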

Tips
- Prefer class tests over string parsing. Distinguish type conversions from calls: “T(x)” is a ConversionExpr (its callee expression denotes a type), not a CallExpr.
- Inc/Dec are statements, not expressions. Handle ":=" vs "=" separately. Exclude BadStmt/BadExpr.

Cheatsheet (syntax → class)
- If: IfStmt; For: ForStmt; Range: RangeStmt; Switch: SwitchStmt/ExpressionSwitchStmt; Type switch: TypeSwitchStmt; Select: SelectStmt; Case: CaseClause; Select case: CommClause.
- Assign: SimpleAssignStmt (=), DefineStmt (:=), CompoundAssignStmt; Inc/Dec: IncStmt, DecStmt.
- Call: CallExpr; Selector: SelectorExpr; Index/Slice: IndexExpr/SliceExpr; Type assert: TypeAssertExpr; Unary/Binary: UnaryExpr/BinaryExpr subtypes.
- Literals: IntLit, FloatLit, ImagLit, CharLit/RuneLit, StringLit, StructLit, MapLit, FuncLit.
- Types: ArrayTypeExpr, StructTypeExpr, FuncTypeExpr, InterfaceTypeExpr, MapTypeExpr, ChanTypeExpr.
- Names/Entities: Name, ValueName, FunctionName; FuncDef, FuncDecl, FuncLit.
50
codeql-docs/analyzing-data-flow-in-go.gpt
Normal file
@@ -0,0 +1,50 @@
Purpose
- Use CodeQL’s Go data-flow libraries to find how values and taint propagate.
- Cover local flow/taint (intra-procedural) and global flow/taint (inter-procedural), with configurable sources/sinks/barriers.

Local Data Flow (DataFlow)
- Node hierarchy: Node (ExprNode, ParameterNode, InstructionNode). Map to/from AST/IR via asExpr/asParameter/asInstruction and exprNode/parameterNode/instructionNode.
- localFlowStep(a, b): immediate edge; localFlow(a, b) is its reflexive transitive closure (localFlowStep*).
- Example: find all expressions that flow to argument 0 of a call to os.Open:
    import go

    from Function osOpen, CallExpr call, Expr src
    where
      osOpen.hasQualifiedName("os", "Open") and
      call.getTarget() = osOpen and
      DataFlow::localFlow(DataFlow::exprNode(src), DataFlow::exprNode(call.getArgument(0)))
    select src
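
For a sense of what that query is looking at, here is a Go sketch with two hypothetical functions. In openViaLocals the parameter reaches os.Open through plain assignments, which is the shape local value flow is meant to follow. In openJoined the argument is the result of a filepath.Join call, a newly computed value, so it is reasonable to expect the query to report the Join call expression rather than dir or name. The demo helper and the temporary file name pattern are assumptions made only so the file runs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// openViaLocals passes name to os.Open through value-preserving assignments.
func openViaLocals(name string) error {
	path := name              // step: name -> path
	target := path            // step: path -> target
	f, err := os.Open(target) // argument 0 of os.Open
	if err != nil {
		return err
	}
	return f.Close()
}

// openJoined passes a computed value, not dir or name themselves.
func openJoined(dir, name string) error {
	f, err := os.Open(filepath.Join(dir, name)) // argument 0 is the Join result
	if err != nil {
		return err
	}
	return f.Close()
}

// demo creates a temporary file and opens it through both functions.
func demo() (viaLocals, joined bool) {
	tmp, err := os.CreateTemp("", "flow-demo-*.txt")
	if err != nil {
		return false, false
	}
	defer os.Remove(tmp.Name())
	tmp.Close()
	viaLocals = openViaLocals(tmp.Name()) == nil
	joined = openJoined(filepath.Dir(tmp.Name()), filepath.Base(tmp.Name())) == nil
	return viaLocals, joined
}

func main() {
	viaLocals, joined := demo()
	fmt.Println(viaLocals, joined)
}
```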

Local Taint (TaintTracking)
- localTaintStep/localTaint are analogous to their DataFlow counterparts but also include non-value-preserving steps (e.g., string concatenation).
- Example: check whether a parameter taints a sink with TaintTracking::localTaint.
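
The difference between the two local relations shows up in a pair of one-line functions. This is a minimal sketch with hypothetical names: copyThrough returns the very value it received, while buildQuery returns a new string that merely contains it. Under the description above, the first is a value-preserving path and the second is the kind of step that only taint tracking follows.

```go
package main

import "fmt"

// copyThrough returns its parameter unchanged: a value-preserving path.
func copyThrough(user string) string {
	tmp := user
	return tmp
}

// buildQuery returns a new string derived from its parameter by concatenation:
// the result is influenced by user but is not the same value.
func buildQuery(user string) string {
	return "SELECT * FROM t WHERE name = '" + user + "'"
}

func main() {
	fmt.Println(copyThrough("bob"))
	fmt.Println(buildQuery("bob"))
}
```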

Global Data Flow (DataFlow::Global)
- Implement DataFlow::ConfigSig:
  - isSource(Node): where flow originates.
  - isSink(Node): where flow ends.
  - isBarrier(Node) [optional]: blocks flow.
  - isAdditionalFlowStep(a, b) [optional]: adds extra edges.
- Apply the module: module MyFlow = DataFlow::Global<MyConfig>;
- Query via MyFlow::flow(source, sink).
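
Global flow matters once the source and the sink sit in different functions. The Go sketch below, with hypothetical function names, spreads one value across three calls: a hard-coded string is produced in hardCodedEndpoint, handed through relay, and finally passed to url.Parse in parseHost. A local analysis of parseHost alone sees only a parameter; a configuration whose isSource picks string literals and whose isSink picks argument 0 of url.Parse would be expected to connect the two ends.

```go
package main

import (
	"fmt"
	"net/url"
)

// hardCodedEndpoint returns a string literal: a candidate source.
func hardCodedEndpoint() string { return "https://example.com/api?v=1" }

// relay passes its argument through unchanged, across a call boundary.
func relay(s string) string { return s }

// parseHost hands raw to url.Parse: argument 0 is a candidate sink.
func parseHost(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.Host, nil
}

// pipeline chains source, relay, and sink across three functions.
func pipeline() (string, error) {
	return parseHost(relay(hardCodedEndpoint()))
}

func main() {
	host, err := pipeline()
	fmt.Println(host, err)
}
```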

Global Taint (TaintTracking::Global)
- Same signature as global data flow; also includes taint-style, non-value-preserving steps.
- Good for security queries (untrusted → sink).

Predefined Sources
- RemoteFlowSource: user-controllable inputs; use as source for security findings.

Idioms
- Targeted call/arg sink: define isSink by matching call.getTarget() and sink.asExpr() = call.getArgument(i).
- Literal-only filter: require source.asExpr() instanceof StringLit (or other BasicLit subclass).
- Env source example: class GetenvSource extends CallExpr, restricted to calls where getTarget().hasQualifiedName("os", "Getenv").
- Compose flows: define MyFlow for literal → url.Parse, or taint from getenv → url.Parse, using ConfigSig and Global.

Exercises (patterns to emulate)
- Hard-coded strings → url.Parse (local/global).
- Sources from os.Getenv.
- Full path query from getenv to url.Parse.
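
The last exercise needs a program in which an environment value actually reaches url.Parse. Here is a minimal sketch; the function names and the variable name FLOW_DEMO_HOST are hypothetical. The value is concatenated into a longer string before parsing, so by the description above this is a case for TaintTracking::Global rather than plain value flow.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// endpointFromEnv builds a URL from an environment variable and parses it.
func endpointFromEnv(key string) (*url.URL, error) {
	host := os.Getenv(key)           // source: result of os.Getenv
	raw := "https://" + host + "/v1" // concatenation: a taint step, not value flow
	return url.Parse(raw)            // sink: argument 0 of url.Parse
}

// demoHost sets a hypothetical variable, runs the flow, and reports the parsed host.
func demoHost() (string, error) {
	const key = "FLOW_DEMO_HOST" // hypothetical name, chosen for this sketch
	if err := os.Setenv(key, "example.org"); err != nil {
		return "", err
	}
	defer os.Unsetenv(key)
	u, err := endpointFromEnv(key)
	if err != nil {
		return "", err
	}
	return u.Host, nil
}

func main() {
	host, err := demoHost()
	fmt.Println(host, err)
}
```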

Tips
- Prefer DataFlow/TaintTracking APIs over string matching; use .asExpr() to recover the expression when the node has one.
- Be explicit about package-qualified targets with hasQualifiedName.
- For better performance and precision, start with localFlow/localTaint and expand to Global only when needed.
- Use select source, "... $@", sink, "sink" so results link both endpoints (each $@ placeholder takes an element followed by its link text); path explanations need path queries (out of scope here).
36
codeql-docs/basic-query-for-go-code.gpt
Normal file
@@ -0,0 +1,36 @@
Purpose
- Minimal Go query run from VS Code; shows variables, constraints, and results for a concrete bug pattern.

Target Pattern
- A method with a value receiver that writes to a field of the receiver has no effect on the caller's value (the receiver is a copy).
- Safer alternative: the method should use a pointer receiver.
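
The bug pattern is easiest to see by running it. In this sketch the counter type and its two methods are hypothetical: incByValue has a value receiver, so its increment lands on a copy, while incByPointer updates the caller's value.

```go
package main

import "fmt"

// counter is a hypothetical type used to show the pattern.
type counter struct{ n int }

// incByValue has a value receiver: c is a copy, so the write is lost on return.
func (c counter) incByValue() { c.n++ }

// incByPointer has a pointer receiver: the write reaches the caller's value.
func (c *counter) incByPointer() { c.n++ }

// afterCalls applies each method once to a fresh counter and returns both fields.
func afterCalls() (byValue, byPointer int) {
	a := counter{}
	a.incByValue()
	b := counter{}
	b.incByPointer()
	return a.n, b.n
}

func main() {
	byValue, byPointer := afterCalls()
	fmt.Println(byValue, byPointer) // prints "0 1"
}
```

The write inside incByValue is the kind of location a query for this pattern should report; the one inside incByPointer should be left alone.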

Query
    import go

    from Method m, Variable recv, Write w, Field f
    where
      recv = m.getReceiver() and
      w.writesField(recv.getARead(), f, _) and
      not recv.getType() instanceof PointerType
    select w, "This update to " + f + " has no effect, because " + recv + " is not a pointer."

Structure (analogy to SQL)
- import: include the standard CodeQL library for Go (import go).
- from: declare typed variables to range over (Method, Variable, Write, Field).
- where: constrain relationships among variables with predicates.
- select: emit results; message can concatenate strings and AST entities.

Key Predicates/Classes
- Method.getReceiver(): receiver variable of a method.
- Write.writesField(base, field, rhs): a write that stores rhs into field of base; the query leaves rhs unconstrained with "_".
- Variable.getARead(): a read of the variable (used here as the base of the field write).
- PointerType: type test to exclude pointer receivers.

Usage Hints
- Use hasQualifiedName(pkg, name) to narrow functions/methods by package.
- Start with a quick query in the VS Code CodeQL extension; paste the query body under "import go".
- Click results to jump to the write site; refine constraints if needed.

Extensions
- Add a guard to exclude writes to fields of temporary copies (e.g., values returned from functions).
- Restrict to exported methods/types, or to specific packages.
- Convert to a path query to show flows leading to the write (optional).
21
codeql-docs/codeql-for-go.gpt
Normal file
@@ -0,0 +1,21 @@
Purpose
- Orientation page for Go query authors; links and core concepts.

What to Learn (roadmap)
- Basic query for Go code: variables, predicates, select formatting.
- CodeQL library for Go: AST, entities/names, types, DFG/CFG, calls.
- AST classes for Go: concrete syntax → CodeQL classes mapping and accessors.
- Analyzing data flow in Go: local/global flow and taint.
- Customizing library models for Go: data extensions (sources/sinks/summaries) and model packs.

Core Import
- Use "import go" to bring in the standard CodeQL library for Go (go.qll and friends).

Best Practices
- Start syntactic (AST) for structure; switch to the DFG for semantic flow.
- Use hasQualifiedName for stable matching of stdlib/framework APIs.
- Prefer library predicates over string parsing; rely on classes and accessors.
- Keep queries specific and cheap first; generalize after validation.

Next Steps
- Follow each linked topic for details and examples. Combine AST selections with DataFlow/TaintTracking when moving from structure to behavior.
37
codeql-docs/codeql-library-for-go.gpt
Normal file
@@ -0,0 +1,37 @@
Purpose
- Quick reference to the standard CodeQL library for Go.

Views
- AST (syntactic): statements/expressions, names, declarations.
- CFG/IR: control flow, instructions (rarely used directly by queries).
- DFG (data-flow): value and taint propagation, call/callee mapping.

AST Essentials
- AstNode: getChild(i), getAChild(), getParent() for generic traversal (avoid index reliance).
- Statements: IfStmt, ForStmt, RangeStmt, SwitchStmt/ExpressionSwitchStmt, TypeSwitchStmt, SelectStmt, CaseClause, CommClause, BlockStmt, DeclStmt, Assign variants, Inc/Dec, GoStmt, DeferStmt, Labeled/Break/Continue/Goto/Fallthrough.
- Expressions: Ident, SelectorExpr (base/selector), BasicLit (IntLit/FloatLit/ImagLit/RuneLit/StringLit), FuncLit, CompositeLit (getKey/getValue), ParenExpr, IndexExpr, SliceExpr, ConversionExpr, TypeAssertExpr, CallExpr (getCalleeExpr/getArgument), StarExpr, TypeExpr, OperatorExpr → UnaryExpr/BinaryExpr (ComparisonExpr with EqualityTestExpr/RelationalComparisonExpr).
- Statement accessors: per-class getters (getCond, getThen, getElse, getInit, getPost, getExpr(i), getStmt(i), getComm(), etc.).

Names/Entities/Types
- Name hierarchy: SimpleName vs QualifiedName; namespaces: PackageName, TypeName, ValueName, LabelName; ValueName → ConstantName, VariableName, FunctionName.
- ReferenceExpr: lvalue/rvalue; ValueExpr: expressions with values.
- Entity: PackageEntity, TypeEntity, ValueEntity (Constant/Variable/Function), Label; hasQualifiedName, getDeclaration, getAReference.
- Variable subclasses: LocalVariable, ReceiverVariable, Parameter, ResultVariable; Field with hasQualifiedName(pkg, type, field).
- Function/Method: FuncDef unifies FuncDecl/FuncLit; getBody, getName, getParameter(i), getResultVar(i), getACall. Method.hasQualifiedName(pkg, type, method); implements(m2).

Data Flow Graph (DFG)
- DataFlow::Node ↔ optional AST via asExpr (use cautiously). getType(), getNumericValue/getStringValue/getExactValue for constants.
- Nodes: CallNode (getArgument(i), getResult(i), getTarget(), getACallee()), ParameterNode (asParameter), BinaryOperationNode (covers x+1, x+=1, x++), UnaryOperationNode; PointerDereferenceNode, AddressOperationNode, RelationalComparisonNode, EqualityTestNode.
- Read/Write: readsVariable/Field/Element, writesVariable/Field/Element.

Call Graph
- getTarget(): the declared target (may be an interface method). getACallee(): every function the call may dispatch to at run time.
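
The distinction between the declared target and the possible callees only appears with interface dispatch. In the sketch below, all names hypothetical, the call s.area() is declared against the interface method shape.area, while at run time it may execute square.area or rect.area; per the description above, the first is what getTarget() reflects and the second is what getACallee() enumerates.

```go
package main

import "fmt"

// shape, square, and rect are hypothetical types used to show interface dispatch.
type shape interface{ area() float64 }

type square struct{ side float64 }

type rect struct{ w, h float64 }

func (s square) area() float64 { return s.side * s.side }

func (r rect) area() float64 { return r.w * r.h }

// totalArea calls area through the interface.
func totalArea(shapes []shape) float64 {
	total := 0.0
	for _, s := range shapes {
		// Declared target: shape.area. Possible callees: square.area, rect.area.
		total += s.area()
	}
	return total
}

func main() {
	fmt.Println(totalArea([]shape{square{side: 2}, rect{w: 3, h: 4}}))
}
```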

Global Flow/Taint (overview)
- Define ConfigSig with isSource/isSink/[isBarrier]; apply DataFlow::Global<..> or TaintTracking::Global<..>.

Advanced
- Basic blocks/dominance for CFG-based reasoning (rare for standard queries).

Guidance
- Prefer AST for structure, DFG for semantics. Use qualified names. Rely on library types/predicates over string parsing. Start local, move to global only as needed.
44
codeql-docs/customizing-library-models-for-go.gpt
Normal file
@@ -0,0 +1,44 @@
Purpose
- Customize data-flow/taint analysis for Go by modeling frameworks/libraries via data extensions (YAML) and model packs.

Data Extensions (YAML)
- Structure:
    extensions:
      - addsTo:
          pack: codeql/go-all
          extensible: <extensible-predicate>
        data:
          - <tuple1>
          - <tuple2>
- Union semantics across files: rows are combined; duplicates removed.

Extensible Predicates (Go)
- sourceModel(package, type, subtypes, name, signature, ext, output, kind, provenance)
  - Define sources (e.g., user input). kind maps to threat model; provenance tags origin (manual/ai-manual/etc.).
- sinkModel(package, type, subtypes, name, signature, ext, input, kind, provenance)
  - Define sinks (dangerous use).
- summaryModel(package, type, subtypes, name, signature, ext, input, output, kind, provenance)
  - Define through-flow when dependency code isn’t in repo.
- neutralModel(package, type, name, signature, kind, provenance)
  - Low-impact flows (weaker than summaries) to reduce over-taint/noise.

Access Paths (examples)
- Argument[i], Argument[i].ArrayElement, ReturnValue, ReturnValue.ArrayElement, Receiver, Qualifier, Field["name"], Field[<index>].
- kind: "value" moves whole values; "taint" propagates taint only.

Examples
- slices.Max: elements of Argument[0] → ReturnValue.
  - ["slices","",False,"Max","","","Argument[0].ArrayElement","ReturnValue","value","manual"]
- slices.Concat: elements of every slice argument → elements of ReturnValue; the variadic slices are addressed together as Argument[0], hence the doubled ArrayElement.
  - ["slices","",False,"Concat","","","Argument[0].ArrayElement.ArrayElement","ReturnValue.ArrayElement","value","manual"]
- Threat models: use kind/provenance and pack wiring to include/exclude sets of sources.
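
The two summary rows describe how values move through library calls whose bodies the analysis may not see. The Go sketch below shows the run-time behavior those rows stand for; the function names largest and merged are hypothetical, and the sketch assumes a toolchain that ships slices.Max and slices.Concat (Go 1.22 or later).

```go
package main

import (
	"fmt"
	"slices"
)

// largest returns an element of xs: Argument[0].ArrayElement -> ReturnValue.
// slices.Max panics on an empty slice, so callers must pass at least one element.
func largest(xs []string) string {
	return slices.Max(xs)
}

// merged returns a slice whose elements come from the elements of a and b:
// Argument[0].ArrayElement.ArrayElement -> ReturnValue.ArrayElement.
func merged(a, b []string) []string {
	return slices.Concat(a, b)
}

func main() {
	fmt.Println(largest([]string{"a", "c", "b"}))
	fmt.Println(merged([]string{"x"}, []string{"y", "z"}))
}
```

If an element of the input is tainted, the summaries let the analysis carry that fact to the result without inspecting the body of either function.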

Model Packs
- Group YAML files into a CodeQL model pack; publish to the GitHub Container Registry (GHCR).
- Consumers add the pack to their query suite or CLI invocation to apply models.

Workflow Tips
- Start with summaries for common library flows that otherwise break paths (builders, containers, helpers).
- Add specific sources/sinks tied to your threat model.
- Keep models narrow to avoid false positives (match by hasQualifiedName).
- Validate with path queries and unit tests; iterate on precision vs recall.