Phase 2 progress — shared CFG library fully instantiated with IR support:
ControlFlowGraphShared.qll (1290 lines) now contains:
additionalNode entries for all 30+ synthetic IR operations:
Assignment writes (assign:i), compound RHS (compound-rhs), tuple extraction (extract:i)
Zero initialization (zero-init:i), increment/decrement (implicit-one, incdec-rhs)
Return (return), result write/read (result-write:i, result-read:i)
Function prologue (param-init:i, arg:i, result-init:i, result-zero-init:i)
Range next (next), send (send), implicit deref (implicit-deref)
Slice bounds (implicit-low/high/max), case checks (case-check:i)
Type switch variables (type-switch-var), literal init (lit-init, lit-index)
Implicit field selection (implicit-field:i)
step routing for all constructs needing custom flow:
Assignments (with epilogue node sequencing for extract/assign chains)
Increment/decrement (operand → implicit-one → incdec-rhs → In)
Return statements (expressions → extract → result-write → return)
Index/slice/selector expressions (with implicit deref insertion)
Composite literals (pre-order: allocation first, then element init)
Send statements (channel → value → send)
Plus the existing: if-with-init, for/range loops, switch/select, defer, go
What remains for full migration (making shared types primary):
Rewrite ControlFlowGraph.qll: change ControlFlow::Node from TControlFlowNode to GoCfg::ControlFlowNode
Rewrite IR.qll (1598 lines): map all 30+ IR instruction classes to shared CFG nodes via getAstNode()/getIdTag() instead of Mk* types
Update BasicBlocks.qll to use shared library's built-in BasicBlock
Update ~30 downstream files (SSA, DataFlow, security queries) that reference IR::Instruction/ControlFlow::Node
Add function prologue/epilogue step routing (param init, result init, defer LIFO at exit)
Remove ControlFlowGraphImpl.qll
This remaining work is a substantial rewrite (3000+ lines across 35+ files) that would benefit from being done incrementally with test validation at each step. Shall I continue with the IR.qll rewrite next?
Created ControlFlowGraphShared.qll — contains the full GoCfg module with:
Ast module implementing AstSig<Location> — maps Go's AST types to the shared interface, with empty types for constructs Go doesn't have (while/do/try/catch/throw/ternary/null-coalescing)
Input module implementing InputSig1 + InputSig2 — labels, conditional contexts, custom step() predicates for if-with-init, for/range loops, expression/type switches, case clauses, select, defer, and go statements; abrupt completions for panics, fallthrough, and goto
Make0/Make1/Make2 wiring and import Public
Modified ControlFlowGraph.qll — added import of the shared library and re-exports it as module SharedCfg = GoCfg, making shared types accessible as SharedCfg::ControlFlowNode, SharedCfg::ControlFlow::EntryNode, SharedCfg::BasicBlock, etc.
No breakage — all 90 src queries and library files compile cleanly.
Read ControlFlowGraphShared.qll, lines 1 to 50
The file compiles successfully. Here's a summary of what was implemented:
ControlFlowGraphShared.qll — Shared CFG library instantiation for Go, wrapped in the GoCfg module to coexist with the existing CFG during transition. Key design decisions:
Ast module (implements AstSig):
Callable = FuncDef with a body
IfStmt — only For if-statements without init (those with init use custom steps)
LoopStmt — uses characteristic predicate pattern (not direct extension) to avoid return type conflicts with BlockStmt
Empty classes for: WhileStmt, DoStmt, ForStmt, ForeachStmt, ThrowStmt, TryStmt, CatchClause, Switch, Case, ConditionalExpr, NullCoalescingExpr (Go doesn't have these constructs)
LogicalAndExpr = LandExpr, LogicalOrExpr = LorExpr, LogicalNotExpr = NotExpr
BooleanLiteral maps to true/false identifiers
skipCfg filters out type expressions, non-rvalue references, parenthesized expressions, etc.
Input module (implements InputSig1 + InputSig2):
Labels: Go statement labels + fallthrough
Custom steps for: if-with-init, for loops, range loops, expression/type switch, case clauses, select statements, defer, go statements
Abrupt completions: panic via calls/div/deref/type-assert/index/conversion, fallthrough, goto
End abrupt completions: labeled statements catch breaks, goto targets
Remaining work (todos 4-6): The GoCfg module is self-contained and doesn't yet replace the existing CFG. Integration with BasicBlocks.qll, IR.qll, and the public ControlFlowGraph.qll API is the next phase — which is a larger effort requiring updates to DataFlow and all downstream consumers.
Changes based on code review:
1. Remove redundant strings.Contains check in isExactTestPackage
The equality check on the next line handles both cases, making
the early return unnecessary.
2. Extract package selection logic into selectBestPackages function
This reduces code duplication and allows the test to call the
actual implementation rather than copying the logic.
3. Add TestSelectBestPackages to test the new function
Comprehensive test covering single packages, test vs production,
exact vs nested tests, and multiple packages.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Generated by manually applying the output from CI's Gazelle check.
This adds the go_test target for the new extractor_test.go file.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This test verifies that root internal test files (package foo, not
foo_test) are correctly extracted when the repository has both:
1. Root-level internal tests (main_test.go with package main)
2. Nested packages with tests (nested/nested_test.go)
This scenario reproduces the bug that was fixed: the old extractor
would select the wrong package variant and miss root internal test
files.
The test ensures:
- main_test.go (root internal test) is extracted
- nested/nested_test.go (nested test) is extracted
- All test functions from both files are present in the database
This prevents regression of the bug fix.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When CODEQL_EXTRACTOR_GO_OPTION_EXTRACT_TESTS=true is set, the Go
extractor was incorrectly skipping internal test files (package foo)
at repository roots when the project contains nested test packages.
Root Cause:
The extractor selected package variants by longest ID string, but this
heuristic fails when nested packages have tests. For a package like
"github.com/go-git/go-git/v6", packages.Load returns multiple variants:
1. "github.com/go-git/go-git/v6" (19 files, production only)
2. "github.com/go-git/go-git/v6 [github.com/go-git/go-git/v6.test]"
(39 files, production + 20 root tests) ← Should select this
3. "github.com/go-git/go-git/v6 [github.com/go-git/go-git/v6/plumbing/format/packfile.test]"
(19 files, test dependency) ← Was incorrectly selected (longest string)
The old logic selected variant #3 (76 chars) over #2 (68 chars),
causing 20 root test files to be missing from the database.
Fix:
Replace string length comparison with a better heuristic that prefers:
1. Exact test packages (e.g., "pkg [pkg.test]") over nested dependencies
2. Packages with more Syntax nodes (more files to extract)
3. String length as a tiebreaker
This ensures the extractor selects the variant with the most complete
test coverage, particularly for root-level internal tests.
Testing:
- Added comprehensive unit tests covering the selection logic
- Tests simulate the real-world go-git scenario
- All tests pass
Impact:
Root-level external tests (package foo_test) were already extracted
correctly. This fix ensures internal tests (package foo) at the root
are now also extracted when they exist alongside nested test packages.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>