ast_types.yml additions:
- tuple_pattern { element*: pattern } in the pattern supertype.
- sequence_condition { stmt*: stmt, condition: condition } in the
condition supertype.
swift.rs:
- Map Swift tuple destructuring (e.g. `let (a, b) = pair`) to the new
tuple_pattern instead of synthesizing an apply_pattern.
- if-let / guard-let: explicitly match the value_binding_pattern
(the `let` keyword) and bind the source expression as the next
condition child, so `let` no longer leaks into the output.
Two changes to parse_query_fields:
- Allow `field: (kind)* @cap` (repetition + optional capture) in field
position, mirroring how it works for bare children.
- When the same field name is declared multiple times in a query (e.g.
`condition: (foo) condition: (bar)`), merge them into a single
ordered list of children rather than emitting duplicate field
entries (which at runtime restart the iterator for the field and
cause the second declaration to re-match from the first child).
Introduce NodeRef as a typed wrapper around node arena IDs. Captures in
desugaring rules are now bound as NodeRef instead of raw usize, which
prevents accidental misuse and enables source-text-aware rendering.
Add the YeastDisplay trait as an alternative to Display: its
yeast_to_string method receives the Ast, allowing NodeRef to resolve to
the captured node's source text instead of printing a numeric ID.
Store the original source bytes in the Ast so that NodeContent::Range
values (from synthesized literal nodes) can be resolved back to text.
Update yeast-macros to emit NodeRef-typed capture bindings and use
Into::<usize>::into where raw IDs are needed. The #{expr} template
syntax now uses YeastDisplay instead of Display.
The effect is visible in the corpus tests: operator nodes now correctly
render as e.g. operator "+" instead of operator "3".
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add corpus test cases for Swift covering closures, collections, control
flow, functions, literals, loops, operators, optionals/errors, types,
and variables. Update existing desugar.txt with raw parse sections.
Note: operator nodes currently render their node ID instead of the actual
operator text (e.g. operator "3" instead of operator "+"). This will be
fixed in the next commit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add ast_types.yml defining the unified output AST schema with supertypes
(expr, stmt, condition, pattern) and named nodes (top_level, binary_expr,
name_expr, etc.).
Rewrite swift translation rules to map from tree-sitter Swift grammar to
the unified AST, using one-shot phase rules.
Update the generator to use the output AST schema for dbscheme/QL
generation, and normalize the extraction table prefix to 'unified'.
Improve the corpus test framework to include raw tree-sitter parse output,
type-error checking against the output schema, and better failure
reporting.
Regenerate Ast.qll, unified.dbscheme, and update BasicTest accordingly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
One-shot desugaring rules now skip unnamed nodes (punctuation, keywords,
etc.) since rules are intended to target named nodes only.
Also prevent infinite recursion when a capture refers to the root node of
the matched tree (e.g. an @_ capture on the pattern root).
Additionally fix the swift.rs add_phase call to match the updated 3-arg
signature introduced by the one-shot phase kind commit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Same pattern we've seen many times before: a field on an anonymous node
gets attached to the parent node instead.
I'm not 100% sure this is the right solution, but it seemed wrong to
just make `_parenthesized_type` named instead (we don't usually name
parentheticals). At the very least, this cleans up the spurious
navigation_expression.element and tuple_type_item.element fields.
Because `_type` was anonymous, its body was inlined in all of the places
it appeared. Because this body contained a `name` field, this field was
_also_ inlined. This caused a bunch of nodes to have spurious `name`
fields, and for some of them (that already had such a field) it caused
that field have multiplicity greater than one.
To fix this, we make the `_type` node named, which prevents the errant
field from escaping.
Adds a new type `nested_type_identifier`, which contains the
choice-branch that previously allowed those tokens to bleed through into
the closest parent field.
You know the drill. We just make an anonymous node named instead. In
this case, however, we have to be a bit more clever about how to rewrite
it. We turn the sequence of a type followed by an optional ! into a
_choice_ between mere type or type followed by bang (the latter being
our new named node).
Supertypes are a honking great idea. We should use more of them.
This massively cleans up the node types, without polluting the AST with
`expression` nodes.
Before, the `condition` field of an if statement supposedly could
contain things like parentheses and commas, due to bleeding from
referenced anonymous nodes. Making the node named makes this issue go
away.
We make _referenceable_operator a named node. This prevents it from
bleeding through to the _expression definition. It likely also makes the
output easier to deal with, as bare operators used as arguments now have
a named node wrapping them in the AST.
Also removes a duplicated inclusion of _comparison_operator that served
no purpose.
This caused any field containing an _expression to appear as if it could
countain any number of such nodes. It also threw away the information
that there was a `?` marker there.
To fix it, we simply move the definition into its own named node.
The astute reader will note that we seem to _lose_ some node types in
the process. Apparently, these were unreachable in the grammar, and the
newer version of tree-sitter removes such "dead code".