Adds `manual_rule!` which provides a more low-level interface for
defining rewrites. (I'm not entirely sold on the name, so any
suggestions would be welcome.)
Notably, the captures bound in the body of such rules have _not_ been
translated yet -- they still come from the _input_ tree. It is the
user's duty to call ctx.translate on these (which has the effect of
recursively invoking the translation) before substituting them into the
output.
For _truly_ low-level access, the user can still construct a Rule
directly, but this is now somewhat cumbersome as the closure contained
therein takes quite a few parameters. Still, the possibility remains.
This enables users to specify how and when these captures get
translated. In conjunction with the context mechanism, this can be used
to e.g. translate some piece of information (e.g. the type of
something), record it in the context, and then recursively translate
some other capture that relies on this information. This allows
information to be cleanly passed into descendants (which can be written
using context accesses in the `rule!` macro form).
As a consequence of this change, we now need to pass around a
TranslatorHandle to perform the manual translation. For Repeating rules,
it doesn't really make sense to translate things, so in this case we
simply signal an error.
Also, the implementation of the `rule!` macro changes slightly (without
changing semantics): it now essentially delegates to `Rule::new`,
receiving raw captures, but then immediately applies the translation to
those captures (which, for the majority of cases, is likely the desired
behaviour).
Renames what was previously called `__yeast_ctx` into just `ctx`, and
adds a new field `user_ctx` to this context. Said field can contain a
struct of any user type (necessitating making various parts of the
implementation generic in said type).
Through some Deref magic, field accesses are delegated to the inner
struct (assuming they are not already defined on `ctx`), which should
hopefully make the interface a bit more ergonomic.
Previously, when a node was synthesized it would always take the
location from the node that matched the current rule. This resulted
in overly broad locations however.
For (foo #{bar}) we now take the location of the 'bar' node.
For non-leaf nodes we merge all its child node locations.
When a field pattern has a bare capture with no preceding pattern
atom (i.e. `foo: @bar`), implicitly use a true wildcard (`_`,
match_unnamed: true) as the node pattern, making it equivalent to
`foo: _ @bar`.
This is a convenience shorthand: in practice every `field: _ @cap`
in the Swift rules can now be written more concisely as `field: @cap`.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
A left fold over an iterable where the first element seeds the accumulator:
- first -> init : converts the first element to the initial accumulator
- acc, elem -> fold : fold step; acc = current accumulator, elem = next element
- Empty iterable produces nothing (0-element splice)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After a {expr} or {..expr} placeholder, an optional chain of
.<builtin>() calls may follow. Currently the only builtin is:
.map(param -> template)
which applies the template to each element of the iterable and
collects the resulting node IDs. A chain auto-splices into the
enclosing field/child position.
Example:
path: {parts}.map(p -> (identifier #{p}))
The framework is extensible: additional builtins can be added by
matching on the method name in parse_chain_suffix.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a {..expr} splice in an output template is empty (e.g. from an
optional capture that did not match), drop the field entirely rather
than emitting an empty named field. This lets a single rule with
optional captures replace what used to be two near-identical rules.
Also re-renders the corpus to drop the now-suppressed empty fields.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two changes to parse_query_fields:
- Allow `field: (kind)* @cap` (repetition + optional capture) in field
position, mirroring how it works for bare children.
- When the same field name is declared multiple times in a query (e.g.
`condition: (foo) condition: (bar)`), merge them into a single
ordered list of children rather than emitting duplicate field
entries (which at runtime restart the iterator for the field and
cause the second declaration to re-match from the first child).
Introduce NodeRef as a typed wrapper around node arena IDs. Captures in
desugaring rules are now bound as NodeRef instead of raw usize, which
prevents accidental misuse and enables source-text-aware rendering.
Add the YeastDisplay trait as an alternative to Display: its
yeast_to_string method receives the Ast, allowing NodeRef to resolve to
the captured node's source text instead of printing a numeric ID.
Store the original source bytes in the Ast so that NodeContent::Range
values (from synthesized literal nodes) can be resolved back to text.
Update yeast-macros to emit NodeRef-typed capture bindings and use
Into::<usize>::into where raw IDs are needed. The #{expr} template
syntax now uses YeastDisplay instead of Display.
The effect is visible in the corpus tests: operator nodes now correctly
render as e.g. operator "+" instead of operator "3".
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three improvements to the query parser, all aimed at allowing query
patterns to refer to unnamed tokens:
1. Bare-literal capture: `"=" @op` now captures the unnamed `=` token,
matching the parenthesized form `("=") @op`. Previously the literal
branch in parse_query_list skipped the maybe_wrap_capture call, so
the `@op` was a leftover token and would error.
2. Bare `_` matches any node, named or unnamed. Previously bare `_` and
`(_)` both produced QueryNode::Any with the same matches_named_only
behaviour, so bare `_` would skip unnamed children. Now Any carries a
match_unnamed flag: false for `(_)` (named-only, tree-sitter default)
and true for bare `_` (any node).
3. Named fields and bare child patterns may be intermixed in any order.
Previously, once parse_query_fields saw a bare pattern it would stop
accepting named fields. The fix accumulates bare patterns into the
implicit `child` field and keeps parsing.
Each named field independently selects its target field for matching, so
the source-order of fields in the query is purely cosmetic and intermixing
is safe.
Add tests covering parenthesized capture, bare-literal capture, and the
named-vs-any distinction between `(_)` and bare `_`. Update query-syntax
docs to reflect all three.