TS7 binary AST uses bit 6 for ExportContext (set on all nodes inside
`declare module` contexts), not GlobalAugmentation as previously assumed.
GlobalAugmentation is not a flag in the TS7 binary format at all.
Fix by using a synthetic flag bit (1<<30) for GlobalAugmentation that the
converter sets on `declare global {}` nodes based on the name identifier
being "global". This lets the Java extractor correctly distinguish
`declare global {}` from regular namespace declarations.
Also corrects the flag shift: ExportContext=64 (bit 6), ContainsThis=128
(bit 7), etc., matching the actual TS7 binary layout.
TRAP test results: 494/495 passing (99.8%)
Remaining: badimport.ts (TS7 binary API doesn't report parse diagnostics)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wire the Go-based TypeScript parser wrapper as an alternative to the
Node.js wrapper. Enabled via SEMMLE_TYPESCRIPT_USE_GO_PARSER=true.
When enabled:
- Skips Node.js installation verification
- Launches the Go binary directly (no Node.js required)
- Uses the same newline-delimited JSON protocol over stdin/stdout
- Go binary path configurable via SEMMLE_TYPESCRIPT_GO_PARSER_WRAPPER
- tsgo binary path passed through via SEMMLE_TYPESCRIPT_TSGO_BINARY
The Go wrapper implements all protocol commands: get-metadata, parse,
prepare-files, reset, and quit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The shell validation script now uses a structural comparison that
ignores expected numeric differences in kind/flags/token/operator
values between TS5 and TS7. Only truly structural diffs cause failure.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement the core components for translating tsgo's binary AST format
into the JSON format expected by the Java extractor:
- decoder.go: Binary AST format parser with random-access node accessors
(kind, pos, end, flags, children, strings, extended data)
- converter.go: Walks decoded AST and produces JSON matching Node.js
wrapper output (augmented , , , ,
isTypeOnly, HeritageClause token, TypeOperator operator)
- childprops.go: Maps ~100 SyntaxKind names to ordered child property
name lists for correct bitmask-to-property assignment
- scanner.go: TypeScript tokenizer producing array with rescan
support for regex, template, and greater-than disambiguation
Update metadata.go with correct TS7 SyntaxKind iota values and export
metadata functions. Wire decoder+converter through TsgoParser.Parse().
Validation test passes: all 421 diffs are expected TS5-vs-TS7 numeric
kind/flags/token/operator value differences. Zero structural diffs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The script was calling wrappers in single-file CLI mode, but neither
wrapper supports that (they read commands from stdin). Now sends
parse + quit commands via stdin and uses `timeout` to avoid hangs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add initial scaffolding for a Go process that will replace the Node.js
TypeScript parser wrapper, preparing for TypeScript 7's Go-based compiler.
The Go wrapper implements the same stdin/stdout line-delimited JSON
protocol as the existing Node.js wrapper (lib/typescript/src/main.ts),
making it a drop-in replacement from the Java extractor's perspective.
Key components:
- Protocol handler matching the Node.js wrapper's command set
(get-metadata, prepare-files, parse, reset, quit)
- Parser backend interface with tsgo subprocess implementation
using the tsgo --api --async JSON-RPC mode (LSP Content-Length framing)
- AST property whitelist matching the ~90 properties from the Node.js wrapper
- Static TS7 SyntaxKind and NodeFlags metadata mappings
- Validation framework for comparing JSON output between wrappers
- Integration tests demonstrating successful tsgo API communication:
initialize, updateSnapshot (project opening), getSourceFile
Key finding: the tsgo API returns binary-encoded ASTs (not JSON),
requiring a decoder for the custom flat-node-array format. See
microsoft/typescript-go/internal/api/encoder/ for the format spec.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Fix misplaced semicolons in test files (was inside comment, moved before it)
- Update QLdoc comments to reference new browser source kind names
- Update docs to list browser source kinds and fix outdated 'only remote' note
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add test files with #!/usr/bin/env bun, #!/usr/bin/env tsx, and
#!/usr/bin/env node shebangs. The query lists extracted .ts files,
verifying that all three shebangs are recognized and the files are
not skipped by the extractor.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Update the shebang regexp (renamed NODE_INVOCATION -> JS_INVOCATION) to
also match 'bun' and 'tsx' so that scripts using these runtimes are
correctly identified as JavaScript files.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add explicit load statements for java_library and java_test from
@rules_java//java:defs.bzl in:
- javascript/extractor/BUILD.bazel
- javascript/extractor/test/com/semmle/js/extractor/test/BUILD.bazel