Commit Graph

18 Commits

Author SHA1 Message Date
Taus
c39bfa555d yeast: Add macro for fine-grained rules
Adds `manual_rule!` which provides a more low-level interface for
defining rewrites. (I'm not entirely sold on the name, so any
suggestions would be welcome.)

Notably, the captures bound in the body of such rules have _not_ been
translated yet -- they still come from the _input_ tree. It is the
user's duty to call ctx.translate on these (which has the effect of
recursively invoking the translation) before substituting them into the
output.

For _truly_ low-level access, the user can still construct a Rule
directly, but this is now somewhat cumbersome as the closure contained
therein takes quite a few parameters. Still, the possibility remains.
2026-06-25 12:02:39 +00:00
Taus
03350bf8d7 yeast: Pass raw captures to Rule::new rules
This enables users to specify how and when these captures get
translated. In conjunction with the context mechanism, this can be used
to e.g. translate some piece of information (e.g. the type of
something), record it in the context, and then recursively translate
some other capture that relies on this information. This allows
information to be cleanly passed into descendants (which can be written
using context accesses in the `rule!` macro form).

As a consequence of this change, we now need to pass around a
TranslatorHandle to perform the manual translation. For Repeating rules,
it doesn't really make sense to translate things, so in this case we
simply signal an error.

Also, the implementation of the `rule!` macro changes slightly (without
changing semantics): it now essentially delegates to `Rule::new`,
receiving raw captures, but then immediately applies the translation to
those captures (which, for the majority of cases, is likely the desired
behaviour).
2026-06-25 12:02:39 +00:00
Taus
d38ffe0ad5 yeast: Make transforms return Result
This will enable us to actually capture and log errors in complicated
rules (e.g. ones written in Rust) rather than just panicking.
2026-06-25 12:02:38 +00:00
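
The effect of d38ffe0ad5 can be illustrated with a minimal sketch. The names `transform` and `run_all`, and the use of `String` as the error type, are assumptions made for illustration and are not taken from the yeast sources; the point is only that a failing rule yields an `Err` that the driver can record, rather than a panic.

```rust
/// Hypothetical transform: returns an error instead of panicking
/// when it cannot rewrite the given node kind.
pub fn transform(kind: &str) -> Result<String, String> {
    match kind {
        "for" => Ok("call".to_string()),
        other => Err(format!("no rewrite for node kind `{}`", other)),
    }
}

/// Hypothetical driver: applies the transform to every input,
/// collecting successful outputs and error messages separately.
pub fn run_all(kinds: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut outputs = Vec::new();
    let mut errors = Vec::new();
    for kind in kinds {
        match transform(kind) {
            Ok(out) => outputs.push(out),
            // A real driver would presumably log this; here it is just kept.
            Err(e) => errors.push(e),
        }
    }
    (outputs, errors)
}
```

With this shape, one failing rule does not abort the whole run, and the collected messages can be reported afterwards.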
Taus
d6373eaef7 yeast: Reify the context and allow user-defined data in it
Renames what was previously called `__yeast_ctx` into just `ctx`, and
adds a new field `user_ctx` to this context. Said field can contain a
struct of any user type (necessitating making various parts of the
implementation generic in said type).

Through some Deref magic, field accesses are delegated to the inner
struct (assuming they are not already defined on `ctx`), which should
hopefully make the interface a bit more ergonomic.
2026-06-25 12:02:38 +00:00
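
The "Deref magic" mentioned in d6373eaef7 can be sketched as follows. Only the field name `user_ctx` comes from the commit message; the `Ctx` struct, its `depth` field, and the `SwiftState` user type are hypothetical stand-ins, and the real context is likely to look different.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for the rule context.
pub struct Ctx<U> {
    /// A field defined on the context itself (illustrative only).
    pub depth: usize,
    /// User-defined data of an arbitrary type.
    pub user_ctx: U,
}

impl<U> Deref for Ctx<U> {
    type Target = U;

    fn deref(&self) -> &U {
        &self.user_ctx
    }
}

/// Example user-defined type (an assumption for this sketch).
pub struct SwiftState {
    pub current_type: String,
}

pub fn describe(ctx: &Ctx<SwiftState>) -> String {
    // `depth` is found on `Ctx` directly. `current_type` is not a field
    // of `Ctx`, so auto-deref resolves it on the inner `SwiftState`.
    format!("{}@{}", ctx.current_type, ctx.depth)
}
```

Because fields declared on the context take precedence, a user field with the same name as a context field would still have to be reached through `ctx.user_ctx` explicitly.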
Asger F
6c74cd31e4 Yeast: use child locations instead of rule target
Previously, when a node was synthesized it would always take the
location from the node that matched the current rule. This resulted
in overly broad locations, however.

For (foo #{bar}) we now take the location of the 'bar' node.

For non-leaf nodes we merge the locations of all their child nodes.
2026-06-18 14:26:30 +02:00
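
One plausible reading of "merge all child node locations" in 6c74cd31e4 is to take the smallest span that covers every child. The sketch below assumes that reading; the `Span` type and `merge_spans` function are hypothetical and use byte offsets rather than whatever location type yeast actually stores.

```rust
/// Hypothetical location type: a half-open byte range.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Returns the smallest span covering all child spans,
/// or `None` when there are no children to take a location from.
pub fn merge_spans(children: &[Span]) -> Option<Span> {
    let start = children.iter().map(|s| s.start).min()?;
    let end = children.iter().map(|s| s.end).max()?;
    Some(Span { start, end })
}
```

The `None` case is where a fallback would be needed, for example the location of the node that matched the rule.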
Asger F
d11b428292 yeast-macros: desugar 'field: @cap' to 'field: _ @cap'
When a field pattern has a bare capture with no preceding pattern
atom (i.e. `foo: @bar`), implicitly use a true wildcard (`_`,
match_unnamed: true) as the node pattern, making it equivalent to
`foo: _ @bar`.

This is a convenience shorthand: in practice every `field: _ @cap`
in the Swift rules can now be written more concisely as `field: @cap`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-15 10:49:33 +02:00
Asger F
ddc9516e92 Yeast: better support for rewriting unnamed nodes
- Ensure the full wildcard _ supports quantifiers
- Also rewrite unnamed nodes in one-shot phases
2026-06-15 10:49:31 +02:00
Asger F
00068948c1 yeast-macros: add .reduce_left(first -> init, acc, elem -> fold) chain
A left fold over an iterable where the first element seeds the accumulator:
- first -> init  : converts the first element to the initial accumulator
- acc, elem -> fold : fold step; acc = current accumulator, elem = next element
- Empty iterable produces nothing (0-element splice)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-15 10:49:29 +02:00
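
The semantics described for `.reduce_left` in 00068948c1 correspond to an ordinary left fold seeded by the first element. The following is a sketch in plain Rust, not the macro's generated code; the free function `reduce_left` and its signature are assumptions, and `None` stands in for the 0-element splice.

```rust
/// Left fold where the first element seeds the accumulator.
/// `init` converts the first element; `fold` combines the accumulator
/// with each following element. An empty input yields `None`.
pub fn reduce_left<T, A>(
    items: impl IntoIterator<Item = T>,
    init: impl FnOnce(T) -> A,
    mut fold: impl FnMut(A, T) -> A,
) -> Option<A> {
    let mut iter = items.into_iter();
    let first = iter.next()?;
    let mut acc = init(first);
    for elem in iter {
        acc = fold(acc, elem);
    }
    Some(acc)
}
```

For instance, folding the parts `a`, `b`, `c` with `init = |p| p.to_string()` and `fold = |acc, p| format!("({}.{})", acc, p)` nests to the left, giving `((a.b).c)`.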
Asger F
28c879f58c yeast-macros: add .map(p -> tpl) chain syntax for tree templates
After a {expr} or {..expr} placeholder, an optional chain of
.<builtin>() calls may follow. Currently the only builtin is:

  .map(param -> template)

which applies the template to each element of the iterable and
collects the resulting node IDs. A chain auto-splices into the
enclosing field/child position.

Example:
  path: {parts}.map(p -> (identifier #{p}))

The framework is extensible: additional builtins can be added by
matching on the method name in parse_chain_suffix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-15 10:49:27 +02:00
Asger F
21f216af8c yeast-macros: omit empty fields produced by .. splice
When a {..expr} splice in an output template is empty (e.g. from an
optional capture that did not match), drop the field entirely rather
than emitting an empty named field. This lets a single rule with
optional captures replace what used to be two near-identical rules.

Also re-renders the corpus to drop the now-suppressed empty fields.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-01 14:18:37 +02:00
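
The behaviour introduced in 21f216af8c amounts to filtering out fields whose child list ended up empty. A minimal sketch, assuming fields are represented as (name, child IDs) pairs; `build_fields` is a hypothetical helper, not part of yeast-macros.

```rust
/// Drops every field whose spliced child list is empty, so that an
/// unmatched optional capture leaves no trace in the output node.
pub fn build_fields<'a>(
    fields: Vec<(&'a str, Vec<usize>)>,
) -> Vec<(&'a str, Vec<usize>)> {
    fields
        .into_iter()
        .filter(|(_, children)| !children.is_empty())
        .collect()
}
```

An optional capture maps naturally onto this: `Option<usize>` converted with `into_iter().collect()` gives either a one-element list or an empty one, and the empty one is omitted.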
Asger F
ac8eb50c26 Yeast: Allow 'r#type' to escape the 'type' keyword in macro
2026-06-01 14:18:37 +02:00
Asger F
ef9306d82c Yeast: Allow rules that return an empty sequence
2026-06-01 14:18:36 +02:00
Asger F
554bdf14b2 Yeast: fix warning about unnecessary mutability
2026-05-13 11:19:51 +02:00
Asger F
3b7a53f678 yeast-macros: merge repeated field declarations and support repetition in field patterns
Two changes to parse_query_fields:

- Allow `field: (kind)* @cap` (repetition + optional capture) in field
  position, mirroring how it works for bare children.
- When the same field name is declared multiple times in a query (e.g.
  `condition: (foo) condition: (bar)`), merge them into a single
  ordered list of children rather than emitting duplicate field
  entries (which at runtime restart the iterator for the field and
  cause the second declaration to re-match from the first child).
2026-05-13 10:35:27 +02:00
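
The merging step described in 3b7a53f678 can be sketched as grouping declarations by field name while preserving order. The representation below (string pairs in, name plus ordered pattern list out) and the name `merge_fields` are assumptions for illustration; the real `parse_query_fields` operates on token streams.

```rust
/// Merges repeated declarations of the same field into one entry.
/// Fields keep the order of their first occurrence, and the patterns
/// within a field keep their declaration order.
pub fn merge_fields<'a>(decls: &[(&'a str, &'a str)]) -> Vec<(&'a str, Vec<&'a str>)> {
    let mut merged: Vec<(&'a str, Vec<&'a str>)> = Vec::new();
    for &(field, pattern) in decls {
        // Look up an existing entry for this field name, if any.
        let existing = merged.iter().position(|(name, _)| *name == field);
        match existing {
            Some(i) => merged[i].1.push(pattern),
            None => merged.push((field, vec![pattern])),
        }
    }
    merged
}
```

With a single entry per field, the matcher walks that field's children once, so `(bar)` is matched against the second `condition` child instead of restarting from the first.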
Asger F
2307839050 Yeast: Change how patterns with repetition are parsed
2026-05-13 10:35:21 +02:00
Asger F
5772ee4d9b YEAST: add NodeRef type, YeastDisplay trait, and source text storage
Introduce NodeRef as a typed wrapper around node arena IDs. Captures in
desugaring rules are now bound as NodeRef instead of raw usize, which
prevents accidental misuse and enables source-text-aware rendering.

Add the YeastDisplay trait as an alternative to Display: its
yeast_to_string method receives the Ast, allowing NodeRef to resolve to
the captured node's source text instead of printing a numeric ID.

Store the original source bytes in the Ast so that NodeContent::Range
values (from synthesized literal nodes) can be resolved back to text.

Update yeast-macros to emit NodeRef-typed capture bindings and use
Into::<usize>::into where raw IDs are needed. The #{expr} template
syntax now uses YeastDisplay instead of Display.

The effect is visible in the corpus tests: operator nodes now correctly
render as e.g. operator "+" instead of operator "3".

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 10:35:17 +02:00
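
The idea behind `YeastDisplay` in 5772ee4d9b can be sketched as below. The names `NodeRef`, `YeastDisplay`, `yeast_to_string` and `Ast` come from the commit message; everything else, including the exact method signature and the layout of `Ast`, is a simplified assumption.

```rust
/// Heavily simplified, hypothetical stand-in for the real Ast.
pub struct Ast {
    /// Original source bytes.
    pub source: Vec<u8>,
    /// Byte range (start, end) of each node, indexed by arena ID.
    pub ranges: Vec<(usize, usize)>,
}

/// Typed wrapper around a node arena ID.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeRef(pub usize);

/// Like `Display`, but with access to the Ast while rendering.
pub trait YeastDisplay {
    fn yeast_to_string(&self, ast: &Ast) -> String;
}

impl YeastDisplay for NodeRef {
    fn yeast_to_string(&self, ast: &Ast) -> String {
        // Resolve the node to the source text it covers.
        let (start, end) = ast.ranges[self.0];
        String::from_utf8_lossy(&ast.source[start..end]).into_owned()
    }
}

impl YeastDisplay for usize {
    fn yeast_to_string(&self, _ast: &Ast) -> String {
        // A raw number has no node to resolve; print it as-is.
        self.to_string()
    }
}
```

This is the difference the corpus tests show: a bare `usize` ID can only print as a number, whereas a `NodeRef` renders as the text of the node it refers to.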
Taus
a4df96aad6 yeast: Support capturing unnamed nodes in queries
Three improvements to the query parser, all aimed at allowing query
patterns to refer to unnamed tokens:

1. Bare-literal capture: `"=" @op` now captures the unnamed `=` token,
   matching the parenthesized form `("=") @op`. Previously the literal
   branch in parse_query_list skipped the maybe_wrap_capture call, so
   the `@op` was a leftover token and would error.

2. Bare `_` matches any node, named or unnamed. Previously bare `_` and
   `(_)` both produced QueryNode::Any with the same matches_named_only
   behaviour, so bare `_` would skip unnamed children. Now Any carries a
   match_unnamed flag: false for `(_)` (named-only, tree-sitter default)
   and true for bare `_` (any node).

3. Named fields and bare child patterns may be intermixed in any order.
   Previously, once parse_query_fields saw a bare pattern it would stop
   accepting named fields. The fix accumulates bare patterns into the
   implicit `child` field and keeps parsing.

Each named field independently selects its target field for matching, so
the source-order of fields in the query is purely cosmetic and intermixing
is safe.

Add tests covering parenthesized capture, bare-literal capture, and the
named-vs-any distinction between `(_)` and bare `_`. Update query-syntax
docs to reflect all three.
2026-05-07 15:08:21 +00:00
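
The distinction between `(_)` and bare `_` from point 2 of a4df96aad6 can be sketched as a single flag check. `QueryNode::Any` and `match_unnamed` are named in the commit message; the `Node` struct and the `matches` function are hypothetical simplifications of the real matching engine.

```rust
/// Simplified query node: only the wildcard variant is modelled.
pub enum QueryNode {
    /// `(_)` sets `match_unnamed` to false; bare `_` sets it to true.
    Any { match_unnamed: bool },
}

/// Hypothetical tree node carrying only the named/unnamed bit.
pub struct Node {
    pub is_named: bool,
}

/// A named node always matches a wildcard; an unnamed node
/// (for example an `=` token) matches only when `match_unnamed` is set.
pub fn matches(query: &QueryNode, node: &Node) -> bool {
    match query {
        QueryNode::Any { match_unnamed } => node.is_named || *match_unnamed,
    }
}
```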
Taus
04f587190e yeast: AST desugaring framework with proc-macro DSL
YEAST (YEAST Elaborates Abstract Syntax Trees) is a framework for
transforming tree-sitter parse trees before CodeQL extraction.

Core components:
- shared/yeast/ — Ast, Node, Schema, query matching engine, captures,
  FreshScope, BuildCtx
- shared/yeast-macros/ — proc macros: query!, tree!, trees!, rule!

The query language is inspired by tree-sitter queries:
  (assignment left: (_) @lhs right: (_) @rhs)

Templates support embedded Rust ({expr}), splicing ({..expr}),
computed literals (#{expr}), and fresh identifiers ($name).

The rule! macro combines query and transform:
  rule!((for pattern: (_) @pat ...) => (call receiver: {val} ...))

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-06 11:34:09 +00:00