Unnamed tokens (keywords, operators, punctuation) in the unnamed
children bucket are no longer shown in dump output. They still appear
if they are inside a named field.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add yeast::dump::dump_ast() which produces indented text output:
    program
      assignment
        left:
          left_assignment_list
            identifier "x"
            identifier "y"
        right:
          call
            method: identifier "foo"
Named fields are shown with "field:" labels, unnamed children are
indented under their parent. Leaf nodes show their text content.
Locations are optional via DumpOptions.
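A minimal sketch of how such an indented dump could be produced. The Node type, its constructors, and dump() are hypothetical stand-ins for illustration, not yeast's actual types or the real dump_ast() signature. The sketch always puts field values on their own lines and omits locations; it also filters unnamed tokens out of the bare-children bucket while keeping them inside named fields.

```rust
/// Toy AST node for illustration only; yeast's real node type is not shown here.
pub struct Node {
    pub kind: String,
    /// False for keywords, operators, and punctuation.
    pub named: bool,
    /// Source text, set for leaf nodes.
    pub text: Option<String>,
    /// Named fields in declaration order.
    pub fields: Vec<(String, Vec<Node>)>,
    /// The unnamed-children bucket.
    pub children: Vec<Node>,
}

impl Node {
    /// A named leaf such as `identifier "x"`.
    pub fn leaf(kind: &str, text: &str) -> Node {
        Node { kind: kind.to_string(), named: true, text: Some(text.to_string()), fields: vec![], children: vec![] }
    }

    /// An unnamed token such as `,` or `end`.
    pub fn token(text: &str) -> Node {
        Node { kind: text.to_string(), named: false, text: None, fields: vec![], children: vec![] }
    }

    /// A named interior node with fields and bare children.
    pub fn inner(kind: &str, fields: Vec<(&str, Vec<Node>)>, children: Vec<Node>) -> Node {
        Node {
            kind: kind.to_string(),
            named: true,
            text: None,
            fields: fields.into_iter().map(|(n, v)| (n.to_string(), v)).collect(),
            children,
        }
    }
}

/// Appends an indented rendering of `node` to `out`, two spaces per level.
pub fn dump(node: &Node, depth: usize, out: &mut String) {
    let pad = "  ".repeat(depth);
    out.push_str(&pad);
    out.push_str(&node.kind);
    if let Some(text) = &node.text {
        // Leaf nodes show their text content in quotes.
        out.push_str(&format!(" {:?}", text));
    }
    out.push('\n');
    for (name, values) in &node.fields {
        // Field label one level in, field values one level further.
        out.push_str(&format!("{}  {}:\n", pad, name));
        for value in values {
            dump(value, depth + 2, out);
        }
    }
    // Bare children: unnamed tokens are skipped here.
    for child in node.children.iter().filter(|c| c.named) {
        dump(child, depth + 1, out);
    }
}
```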
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rule!(query => kind_name) is a shorthand for rules that simply gather
query captures into fields on a new node type. Each capture name
becomes a field: single captures produce single-valued fields, repeated
captures produce multi-valued fields.
rule!((foo f: (boo (_) @blah) (_)* @blop) => bar)
is equivalent to:
rule!((foo ...) => (bar blah: {blah} blop: {..blop}))
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rule! declares a desugaring rule with query pattern and transform
template in a single expression:
    rule!(
        (for pattern: (_) @pat value: (in (_) @val) body: (do (_)* @body))
        =>
        (call receiver: {val} method: (identifier "each") ...)
    )
Captures become Rust variables automatically: @name binds as Id
(single capture) or Vec<Id> (after * or +). The BuildCtx is created
implicitly. tree! and trees! can also be used without an explicit
context inside rule! transforms.
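The single-versus-repeated capture binding can be pictured with a small sketch. Captures, push(), single(), and repeated() are hypothetical names used only for illustration, and Id is assumed to be a plain index; the real macro expansion may bind variables quite differently.

```rust
use std::collections::HashMap;

/// Assumed stand-in for yeast's node identifier.
pub type Id = usize;

/// Hypothetical capture table: every capture name maps to the nodes it matched.
#[derive(Default)]
pub struct Captures {
    map: HashMap<String, Vec<Id>>,
}

impl Captures {
    /// Records one matched node under `name`.
    pub fn push(&mut self, name: &str, id: Id) {
        self.map.entry(name.to_string()).or_default().push(id);
    }

    /// Binding for a single capture such as `(_) @pat`: exactly one node.
    pub fn single(&self, name: &str) -> Option<Id> {
        match self.map.get(name).map(Vec::as_slice) {
            Some([id]) => Some(*id),
            _ => None,
        }
    }

    /// Binding for a repeated capture such as `(_)* @body`: zero or more nodes.
    pub fn repeated(&self, name: &str) -> Vec<Id> {
        self.map.get(name).cloned().unwrap_or_default()
    }
}
```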
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rewrite the builder language section to document the current API:
tree!(ctx, ...) / trees!(ctx, ...) with BuildCtx, {expr} for embedded
Rust, {..expr} for splicing, #{expr} for computed literals, and $name
for fresh identifiers. Remove all references to the old TreeBuilder.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Delete TreeBuilder, TreeChildBuilder, TreesBuilder enums and all their
methods, along with the tree_builder! and trees_builder! proc macros.
All building is now done through tree!/trees! with BuildCtx.
The tree_builder module is kept for FreshScope only.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tree! now consistently returns Vec<Id>, even for a single element.
This matches the Rule transform signature and removes the need to
wrap single-node results in vec![].
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tree! now returns Id for a single element or Vec<Id> for multiple,
determined by how many top-level elements appear in the template.
trees! is kept as an alias for backward compatibility.
- tree!(ctx, (single_node ...)) → Id
- tree!(ctx, (node1 ...) (node2 ...)) → Vec<Id>
- tree!(ctx, (node) {..splice}) → Vec<Id>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
{..expr} in tree!/trees! splices a Vec<Id> (extend), while {expr}
inserts a single Id (push). The .. echoes Rust's own use of .. in
struct update syntax and rest patterns.
The assignment rule is now a single trees! expression with the loop
inlined via {..iter.map(|...| tree!(...)).collect()}.
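The push-versus-extend distinction, sketched outside the macro. Part and assemble() are hypothetical names for illustration only, and Id is assumed to be a plain index.

```rust
/// Assumed stand-in for yeast's node identifier.
pub type Id = usize;

/// One element of a template's child list (hypothetical representation).
pub enum Part {
    /// `{expr}`: a single node.
    One(Id),
    /// `{..expr}`: a sequence of nodes spliced in place.
    Splice(Vec<Id>),
}

/// Flattens template parts into the final child list.
pub fn assemble(parts: Vec<Part>) -> Vec<Id> {
    let mut out = Vec::new();
    for part in parts {
        match part {
            Part::One(id) => out.push(id),
            Part::Splice(ids) => out.extend(ids),
        }
    }
    out
}
```

A spliced part can come straight from an iterator, for example `Part::Splice((0..3).map(|i| i * 10).collect())`, which mirrors the inlined loop in the assignment rule.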
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(kind {expr}) creates a leaf node whose content is expr.to_string().
This eliminates ctx.literal() calls for computed values like loop
counters: (integer {i}) instead of {ctx.literal("integer", &i.to_string())}.
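Roughly what a computed leaf amounts to, as a sketch. Leaf and computed_leaf() are hypothetical names; the only behavior taken from the description above is that the content is expr.to_string().

```rust
/// Hypothetical leaf node: a kind plus its text content.
#[derive(Debug, PartialEq)]
pub struct Leaf {
    pub kind: String,
    pub content: String,
}

/// Builds a leaf whose content is the string form of any `ToString` value.
pub fn computed_leaf<T: ToString>(kind: &str, value: T) -> Leaf {
    Leaf {
        kind: kind.to_string(),
        content: value.to_string(),
    }
}
```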
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
New tree!(ctx, ...) and trees!(ctx, ...) macros that directly build AST
nodes through a BuildCtx, replacing the intermediate TreeBuilder data
structures for new code.
Key features:
- {expr} embeds Rust expressions inline in templates
- @name references captures from the query match
- $fresh generates unique identifiers (shared across the template)
- (kind "literal") creates literal leaf nodes
- (@name)* splices repeated captures
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Covers architecture, the query and builder languages, capture semantics,
fresh identifiers, and extractor integration, with a complete for-loop
desugaring example.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The end token rule was deleting orphaned "end" tokens after
desugaring. This is no longer needed since the for-loop rule replaces
the entire for node (including its end token), and unnamed tokens are
skipped automatically during matching.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Since (_) and (_)* skip unnamed tokens automatically, explicit patterns
for discarding tokens are no longer needed.
Before: (left_assignment_list ((identifier) @left (",")?)* )
After: (left_assignment_list (identifier)* @left)
Before: (do "do"? (_)* @body)
After: (do (_)* @body)
Also fix proc macro parsing of (node_kind)* to correctly treat the
group as a repeated single node pattern rather than a repeated list.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align capture syntax with tree-sitter queries: @name must always follow
a pattern, never appear standalone. This eliminates the ambiguity where
`@name` could be confused with a positional wildcard.
Before: right: @right, (@body)*
After: right: (_) @right, (_)* @body
For repeated patterns, `(_)* @body` captures each matched node into
the repeated capture variable, matching tree-sitter semantics.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In tree-sitter queries, (_) matches any named node but skips unnamed
tokens. Implement the same semantics: QueryNode::Any and captures
wrapping Any now skip unnamed children in positional matching.
This allows writing `(in (_) @val)` to match the first named child of
an `in` node, skipping the "in" keyword token. Previously this
required the awkward `(in "in" @val)`.
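A sketch of the positional-matching behavior, assuming children carry a named flag. Child, first_named(), and all_named() are illustrative names, not the actual QueryNode implementation.

```rust
/// Hypothetical child entry: its kind and whether tree-sitter considers it named.
#[derive(Debug, Clone, PartialEq)]
pub struct Child {
    pub kind: &'static str,
    pub named: bool,
}

/// What a positional `(_)` would bind to: the first named child.
pub fn first_named(children: &[Child]) -> Option<&Child> {
    children.iter().find(|c| c.named)
}

/// What `(_)*` would bind to: every named child, in order.
pub fn all_named(children: &[Child]) -> Vec<&Child> {
    children.iter().filter(|c| c.named).collect()
}
```

For an `in` node whose children are the "in" keyword followed by an identifier, first_named() returns the identifier, which is the case `(in (_) @val)` relies on.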
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Unnamed tokens are now always wrapped in double quotes in the YAML
output, making them visually distinct from named node references.
YAML treats both forms as equivalent strings, so this is purely
a readability improvement.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add convert_from_json() to convert tree-sitter node-types.json back to
the YAML format. The CLI gains a --from-json flag.
The round-trip is tested: YAML → JSON → YAML → JSON produces identical
JSON output.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a human-friendly YAML format for specifying node types, with a
converter to tree-sitter's node-types.json format.
YAML format has three top-level sections:
- supertypes: union types mapping to lists of member types
- named: concrete AST nodes with fields (?, *, + suffixes for
  multiplicity) and $children for unnamed children
- unnamed: list of token strings
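One way the multiplicity suffixes could map onto the required/multiple flags that node-types.json uses. FieldSpec and parse_field_key() are hypothetical, and the suffix meanings are an assumption based on the usual regex-style reading (? optional, * zero or more, + one or more), not confirmed by the converter's source.

```rust
/// Hypothetical parsed form of a YAML field key such as `body*`.
#[derive(Debug, PartialEq)]
pub struct FieldSpec {
    pub name: String,
    pub required: bool,
    pub multiple: bool,
}

/// Splits a trailing multiplicity suffix off a field key.
pub fn parse_field_key(key: &str) -> FieldSpec {
    let (name, required, multiple) = if let Some(n) = key.strip_suffix('?') {
        (n, false, false) // optional, at most one
    } else if let Some(n) = key.strip_suffix('*') {
        (n, false, true) // zero or more
    } else if let Some(n) = key.strip_suffix('+') {
        (n, true, true) // one or more
    } else {
        (key, true, false) // exactly one
    };
    FieldSpec {
        name: name.to_string(),
        required,
        multiple,
    }
}
```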
Type references are resolved automatically: if a name appears only as
unnamed, it's treated as unnamed; otherwise named. Use {unnamed: name}
for explicit disambiguation.
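The resolution rule can be sketched as follows. TypeRef, Resolved, and resolve() are hypothetical names, and the named set is assumed to include supertypes; the real converter may structure this differently.

```rust
use std::collections::HashSet;

/// Outcome of resolving a type reference.
#[derive(Debug, PartialEq)]
pub enum Resolved {
    Named(String),
    Unnamed(String),
}

/// A type reference as written in the YAML (hypothetical representation).
pub enum TypeRef<'a> {
    /// A bare name such as `identifier`.
    Plain(&'a str),
    /// The explicit form `{unnamed: name}`.
    ExplicitUnnamed(&'a str),
}

/// A plain name is unnamed only if it is declared solely in the unnamed
/// section; the explicit form always wins.
pub fn resolve(r: TypeRef<'_>, named: &HashSet<&str>, unnamed: &HashSet<&str>) -> Resolved {
    match r {
        TypeRef::ExplicitUnnamed(n) => Resolved::Unnamed(n.to_string()),
        TypeRef::Plain(n) if unnamed.contains(n) && !named.contains(n) => {
            Resolved::Unnamed(n.to_string())
        }
        TypeRef::Plain(n) => Resolved::Named(n.to_string()),
    }
}
```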
Includes a library API (yeast::node_types_yaml::convert) and a CLI
binary (node_types_yaml) that reads YAML and outputs JSON.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add an optional output_node_types field to Language (generator) and
LanguageSpec (extractor). When set, the generator produces dbscheme/QL
from the output types, and the extractor validates TRAP against them.
This enables desugaring transforms that produce AST shapes different
from the tree-sitter grammar. When unset (None), behavior is unchanged
— the tree-sitter node_types are used for both input and output.
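The fallback reduces to a single Option lookup. In this sketch the struct mirrors LanguageSpec in name only: the field types and the effective_output_types() helper are assumptions made for illustration.

```rust
/// Simplified stand-in for the extractor's LanguageSpec (field types assumed).
pub struct LanguageSpec {
    /// node-types.json content for the tree-sitter grammar.
    pub node_types: &'static str,
    /// Optional node types describing the desugared output AST.
    pub output_node_types: Option<&'static str>,
}

impl LanguageSpec {
    /// Types used for schema generation and TRAP validation:
    /// the output types when set, otherwise the grammar's own types.
    pub fn effective_output_types(&self) -> &'static str {
        self.output_node_types.unwrap_or(self.node_types)
    }
}
```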
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
build_tree() and build_trees() now create their own FreshScope
internally. For the rare case where a shared scope is needed across
multiple build calls (e.g. the assignment rule), _with_fresh variants
are available.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add two new TreeBuilder variants for declarative identifier creation:
- Literal: (kind "value") creates a leaf node with fixed content.
  Example: (identifier "each") creates an identifier node with text 'each'.
- Fresh: (kind $name) creates a leaf node with an auto-generated
  unique name. All occurrences of the same $name within one rule
  application share the same generated value.
  Example: (identifier $tmp) creates 'tmp-0', 'tmp-1', etc.
FreshScope tracks generated names per rule application. This eliminates
the need for manual Rc<Cell> counters and create_named_token calls.
The for-loop rule is now fully declarative (no imperative code in the
transform closure). The assignment rule still needs a closure for the
index counter in repeated captures.
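A sketch of the fresh-name bookkeeping described above. This is not the actual FreshScope: the constructor argument, the next_index() accessor, and the way the counter carries over between rule applications are all assumptions made so the example is self-contained.

```rust
use std::collections::HashMap;

/// Illustrative fresh-name scope for one rule application.
pub struct FreshScope {
    /// Next numeric suffix to hand out.
    next: usize,
    /// Names already generated in this scope, keyed by `$name`.
    names: HashMap<String, String>,
}

impl FreshScope {
    /// Starts a scope whose first generated suffix is `start`.
    pub fn new(start: usize) -> Self {
        FreshScope { next: start, names: HashMap::new() }
    }

    /// Returns the generated name for `$name`, creating it on first use so
    /// that every occurrence within the scope shares one value.
    pub fn fresh(&mut self, name: &str) -> String {
        if let Some(existing) = self.names.get(name) {
            return existing.clone();
        }
        let generated = format!("{}-{}", name, self.next);
        self.next += 1;
        self.names.insert(name.to_string(), generated.clone());
        generated
    }

    /// Suffix the next scope should start from (assumed hand-off mechanism).
    pub fn next_index(&self) -> usize {
        self.next
    }
}
```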
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add extract_and_desugar() which accepts yeast desugaring rules.
The original extract() function is unchanged and delegates to
extract_and_desugar with empty rules.
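The delegation pattern in miniature. The signatures here are deliberately simplified and hypothetical (strings in place of parsed trees and TRAP output, a function pointer in place of a yeast rule); only the shape of extract() forwarding to extract_and_desugar() with no rules is taken from the description above.

```rust
/// Placeholder for a desugaring rule: rewrites the input text.
pub struct Rule {
    pub apply: fn(&str) -> String,
}

/// Applies every rule in order, then returns the result.
pub fn extract_and_desugar(source: &str, rules: &[Rule]) -> String {
    let mut current = source.to_string();
    for rule in rules {
        current = (rule.apply)(&current);
    }
    current
}

/// Unchanged entry point: same behavior as before, expressed as the
/// zero-rule case of the new function.
pub fn extract(source: &str) -> String {
    extract_and_desugar(source, &[])
}

/// Example rule used for demonstration only.
pub fn rename_for(source: &str) -> String {
    source.replacen("for", "each", 1)
}
```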
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bare patterns inside a node (not preceded by 'field:') are now
automatically assigned to the synthetic 'child' field. This removes the
need for explicit 'child*: (...)' syntax.
Before: (do child*: (("do")? (@body)*))
After: (do ("do")? (@body)*)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the macro_rules!-based macros with procedural macros in a new
yeast-macros crate. The proc macros parse
a tree-sitter-inspired syntax and generate the same runtime data structures.
Key improvements:
- Better error messages with source spans
- Cleaner syntax closer to tree-sitter query notation
- Captures use @name after patterns (tree-sitter style)
- Fields with bare @capture no longer need wrapping parens
- Removed ~10 interdependent macro_rules! definitions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add the yeast crate (Yet another Elaborator for Abstract Syntax Trees),
a framework for tree-sitter AST transformations/desugaring. Integrate it
into the shared tree-sitter extractor.
Key components:
- shared/yeast/: New crate with query/match/transform pipeline for
tree-sitter ASTs, with Ruby desugaring rules as an example
- shared/tree-sitter-extractor: Pass parsed trees through yeast before
TRAP extraction, applying language-specific desugaring rules
Updated from the original hackathon branch to work with tree-sitter 0.24
and current main dependencies.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address review feedback by moving the shared method-name-based
encryption/hash/digest check into Sanitizers.qll and referencing it from both
CleartextStorageQuery.qll and SensitiveLoggingQuery.qll instead of duplicating
the definition.
Address review feedback by introducing dedicated subclasses of
TrustBoundaryValidationSanitizer for SimpleTypeSanitizer, RegexpCheckBarrier,
and the HttpServletSession type check, so isBarrier only references the
abstract class.