Same as in the preceding commit, these items do not make sense as
separate fields on the parent node, so we materialise (or create new)
intermediate nodes to group them together.
The field representation would have made it difficult to figure out
which parameters correspond to which default values and attributes, so
instead we now encapsulate these in a new `function_parameter` node.
Introduces (by making it named) a `block` node, and conversely makes
`statements` anonymous. This enables us to sensibly distinguish between
the "then" and "else" branch of an `if_statement`, which we were not
able to previously.
The output is not so interesting as the mapping removes most nodes from the current test file.
I added a name_expr.swift test so at least one NameExpr makes it through.
ast_types.yml additions:
- tuple_pattern { element*: pattern } in the pattern supertype.
- sequence_condition { stmt*: stmt, condition: condition } in the
condition supertype.
swift.rs:
- Map Swift tuple destructuring (e.g. `let (a, b) = pair`) to the new
tuple_pattern instead of synthesizing an apply_pattern.
- if-let / guard-let: explicitly match the value_binding_pattern
(the `let` keyword) and bind the source expression as the next
condition child, so `let` no longer leaks into the output.
Introduce NodeRef as a typed wrapper around node arena IDs. Captures in
desugaring rules are now bound as NodeRef instead of raw usize, which
prevents accidental misuse and enables source-text-aware rendering.
Add the YeastDisplay trait as an alternative to Display: its
yeast_to_string method receives the Ast, allowing NodeRef to resolve to
the captured node's source text instead of printing a numeric ID.
Store the original source bytes in the Ast so that NodeContent::Range
values (from synthesized literal nodes) can be resolved back to text.
Update yeast-macros to emit NodeRef-typed capture bindings and use
Into::<usize>::into where raw IDs are needed. The #{expr} template
syntax now uses YeastDisplay instead of Display.
The effect is visible in the corpus tests: operator nodes now correctly
render as e.g. operator "+" instead of operator "3".
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add corpus test cases for Swift covering closures, collections, control
flow, functions, literals, loops, operators, optionals/errors, types,
and variables. Update existing desugar.txt with raw parse sections.
Note: operator nodes currently render their node ID instead of the actual
operator text (e.g. operator "3" instead of operator "+"). This will be
fixed in the next commit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add ast_types.yml defining the unified output AST schema with supertypes
(expr, stmt, condition, pattern) and named nodes (top_level, binary_expr,
name_expr, etc.).
Rewrite swift translation rules to map from tree-sitter Swift grammar to
the unified AST, using one-shot phase rules.
Update the generator to use the output AST schema for dbscheme/QL
generation, and normalize the extraction table prefix to 'unified'.
Improve the corpus test framework to include raw tree-sitter parse output,
type-error checking against the output schema, and better failure
reporting.
Regenerate Ast.qll, unified.dbscheme, and update BasicTest accordingly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
One-shot desugaring rules now skip unnamed nodes (punctuation, keywords,
etc.) since rules are intended to target named nodes only.
Also prevent infinite recursion when a capture refers to the root node of
the matched tree (e.g. an @_ capture on the pattern root).
Additionally fix the swift.rs add_phase call to match the updated 3-arg
signature introduced by the one-shot phase kind commit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This named node (which is in fact emitted by the scanner as an
`external`) was appearing as a child of `class_body` because of inlining
via `_class_member_separator`. This, in itself, appears to be somewhat
of a hack, to handle cases where a multiline comment signals the end of
a class member.
To fix this, we make the external node _unnamed_, but keep the `extras`
node _named_ (so we can still extract it from the parse tree), and we
add a new rule `multiline_comment` that mediates between the two. That
way, the use inside `_class_member_separator` can use the unnamed
variant, and no node is pushed into $children.
Doing this involved materialising a lot of previously anonymous nodes,
and I'm not entirely sure it's the best solution, but the node types
look decent enough.
Not entirely happy about the mixed nature of the `kind` filed (having
both tokens and the named node `throw_keyword` in there), but that's a
problem for a different time.
Some nodes with a single child (arguably redundant to do, but I think
it's nice to have the types be consistent), and also an instance of
ensuring that all branches of a `choice` expose consistent field names.
A lot of changes, but for the most part these are just adding named
fields in places where they make sense.
After this, there are still ~20 instances of unnamed children appearing.
I ended up also aliasing `_async_keyword` to a named node to make it
more consistent with the other node kinds that can be in this field (as
it would be awkward to have two named types and a token here).
Elsewhere in the node types, we'll still have `async?: "async"`, and I
think that's okay.