Apply rustfmt

Format the touched Rust crates (shared/tree-sitter-extractor, shared/yeast, shared/yeast-macros, unified/extractor) so the tree-sitter-extractor CI fmt check passes. No functional changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
unified/swift: Use tree! instead of ctx.node
2026-06-25 14:47:04 +02:00 · 2026-06-25 12:26:52 +00:00 · 2026-06-25 12:02:39 +00:00 · 2026-06-25 12:02:39 +00:00 · 2026-06-25 12:02:39 +00:00 · 2026-06-25 12:02:39 +00:00
49 changed files with 7067 additions and 1201 deletions
--- a/Features/CWE-295/AcceptAnyCertificate.cs
+++ b/Features/CWE-295/AcceptAnyCertificate.cs
@@ -1,22 +0,0 @@
-using System.Net.Http;
-using System.Net.Security;
-using System.Security.Cryptography.X509Certificates;
-
-public class CertificateValidation
-{
-    public void Bad()
-    {
-        var handler = new HttpClientHandler();
-        // BAD: the callback always returns true, so every certificate is trusted.
-        handler.ServerCertificateCustomValidationCallback =
-            (request, certificate, chain, errors) => true;
-    }
-
-    public void Good()
-    {
-        var handler = new HttpClientHandler();
-        // GOOD: the certificate is only trusted when there are no validation errors.
-        handler.ServerCertificateCustomValidationCallback =
-            (request, certificate, chain, errors) => errors == SslPolicyErrors.None;
-    }
-}
--- a/Features/CWE-295/AcceptAnyCertificate.qhelp
+++ b/Features/CWE-295/AcceptAnyCertificate.qhelp
@@ -1,52 +0,0 @@
-<!DOCTYPE qhelp PUBLIC
-  "-//Semmle//qhelp//EN"
-  "qhelp.dtd">
-<qhelp>
-<overview>
-<p>
-A TLS/SSL certificate validation callback that always returns <code>true</code> trusts every certificate,
-regardless of any validation errors that were detected. This allows an attacker to perform a machine-in-the-middle
-attack against the application, therefore breaking any security that Transport Layer Security (TLS) provides.
-</p>
-
-<p>
-An attack might look like this:
-</p>
-
-<ol>
-  <li>The vulnerable program connects to <code>https://example.com</code>.</li>
-  <li>The attacker intercepts this connection and presents a valid, self-signed certificate for <code>https://example.com</code>.</li>
-  <li>The vulnerable program calls the certificate validation callback to check whether it should trust the certificate.</li>
-  <li>The callback ignores the <code>SslPolicyErrors</code> argument and returns <code>true</code>.</li>
-  <li>The vulnerable program accepts the certificate and proceeds with the connection, since the callback indicated that the certificate is trusted.</li>
-  <li>The attacker can now read the data the program sends to <code>https://example.com</code> and/or alter its replies while the program thinks the connection is secure.</li>
-</ol>
-</overview>
-
-<recommendation>
-<p>
-Do not use a certificate validation callback that unconditionally returns <code>true</code>.
-Either rely on the default certificate validation, or implement a callback that inspects the
-<code>SslPolicyErrors</code> argument and only trusts a specific, known certificate (for example, when
-using a self-signed certificate that has been explicitly pinned).
-</p>
-</recommendation>
-
-<example>
-<p>
-In the first (bad) example, the callback always returns <code>true</code> and therefore trusts any certificate,
-which allows an attacker to perform a machine-in-the-middle attack. In the second (good) example, the callback
-returns <code>true</code> only when there are no validation errors.
-</p>
-<sample src="AcceptAnyCertificate.cs" />
-</example>
-
-<references>
-<li>Microsoft Learn:
-  <a href="https://learn.microsoft.com/en-us/dotnet/api/system.net.security.remotecertificatevalidationcallback">RemoteCertificateValidationCallback Delegate</a>.</li>
-<li>Microsoft Learn:
-  <a href="https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca5359">CA5359: Do not disable certificate validation</a>.</li>
-<li>OWASP:
-  <a href="https://owasp.org/www-community/attacks/Manipulator-in-the-middle_attack">Manipulator-in-the-middle attack</a>.</li>
-</references>
-</qhelp>
--- a/Features/CWE-295/AcceptAnyCertificate.ql
+++ b/Features/CWE-295/AcceptAnyCertificate.ql
@@ -1,101 +0,0 @@
-/**
- * @name Accepting any TLS certificate during validation
- * @description A certificate validation callback that always accepts any certificate
- *              allows an attacker to perform a machine-in-the-middle attack.
- * @kind path-problem
- * @problem.severity error
- * @security-severity 7.5
- * @precision high
- * @id cs/accept-any-certificate
- * @tags security
- *       external/cwe/cwe-295
- */
-
-import csharp
-import semmle.code.csharp.dataflow.DataFlow::DataFlow
-import AcceptAnyCertificate::PathGraph
-
-/**
- * Holds if `c` always returns `true` and never returns `false`, i.e. it accepts
- * every input it is given.
- */
-predicate alwaysReturnsTrue(Callable c) {
-  c.getReturnType() instanceof BoolType and
-  // There is at least one returned value, and every returned value is the
-  // constant `true`.
-  forex(Expr ret | c.canReturn(ret) | ret.getValue() = "true")
-}
-
-/**
- * A delegate type used as a TLS/SSL certificate validation callback. Such a
- * delegate returns a `bool` (whether the certificate is trusted) and takes a
- * `System.Net.Security.SslPolicyErrors` parameter describing any validation
- * errors that were found. This covers `RemoteCertificateValidationCallback` as
- * well as the `Func<..., SslPolicyErrors, bool>` callbacks used by, for example,
- * `HttpClientHandler.ServerCertificateCustomValidationCallback`.
- */
-class CertificateValidationCallbackType extends DelegateType {
-  CertificateValidationCallbackType() {
-    this.getReturnType() instanceof BoolType and
-    this.getAParameter().getType().hasFullyQualifiedName("System.Net.Security", "SslPolicyErrors")
-  }
-}
-
-/**
- * Gets a callable that always accepts any certificate, referenced by the
- * delegate-producing expression `e`.
- */
-Callable getAcceptingCallable(Expr e) {
-  // A lambda or anonymous method, e.g. `(sender, cert, chain, errors) => true`.
-  result = e and
-  alwaysReturnsTrue(e)
-  or
-  // A method group, e.g. `AcceptAllCertificates`, possibly wrapped in an
-  // (implicit or explicit) delegate creation.
-  result = e.(DelegateCreation).getArgument().(CallableAccess).getTarget() and
-  alwaysReturnsTrue(result)
-  or
-  result = e.(CallableAccess).getTarget() and
-  alwaysReturnsTrue(result)
-}
-
-module AcceptAnyCertificateConfig implements DataFlow::ConfigSig {
-  predicate isSource(DataFlow::Node source) {
-    exists(getAcceptingCallable(source.asExpr()))
-    or
-    // `HttpClientHandler.DangerousAcceptAnyServerCertificateValidator` is a
-    // built-in callback that accepts every certificate.
-    source
-        .asExpr()
-        .(PropertyAccess)
-        .getTarget()
-        .hasName("DangerousAcceptAnyServerCertificateValidator")
-  }
-
-  predicate isSink(DataFlow::Node sink) {
-    // The value assigned to a property, field or local of certificate
-    // validation callback type.
-    exists(Assignable a |
-      a.getType() instanceof CertificateValidationCallbackType and
-      sink.asExpr() = a.getAnAssignedValue()
-    )
-    or
-    // The value passed as a certificate validation callback argument, e.g. to
-    // the `SslStream` constructor.
-    exists(Call call, Parameter p |
-      p = call.getTarget().getAParameter() and
-      p.getType() instanceof CertificateValidationCallbackType and
-      sink.asExpr() = call.getArgumentForParameter(p)
-    )
-  }
-
-  predicate observeDiffInformedIncrementalMode() { any() }
-}
-
-module AcceptAnyCertificate = DataFlow::Global<AcceptAnyCertificateConfig>;
-
-from AcceptAnyCertificate::PathNode source, AcceptAnyCertificate::PathNode sink
-where AcceptAnyCertificate::flowPath(source, sink)
-select sink.getNode(), source, sink,
-  "This TLS certificate validation $@, which trusts any certificate.", source.getNode(),
-  "uses a callback"
--- a/csharp/ql/src/change-notes/2026-06-10-accept-any-certificate.md
+++ b/csharp/ql/src/change-notes/2026-06-10-accept-any-certificate.md
@@ -1,4 +0,0 @@
---
-category: newQuery
---
-* Added a new query, `cs/accept-any-certificate`, to detect TLS/SSL certificate validation callbacks that always accept any certificate (CWE-295).
--- a/Features/CWE-295/AcceptAnyCertificate/AcceptAnyCertificate.expected
+++ b/Features/CWE-295/AcceptAnyCertificate/AcceptAnyCertificate.expected
@@ -1,24 +0,0 @@
-edges
-| Test.cs:64:45:64:52 | access to local variable callback : (...) => ... | Test.cs:67:48:67:55 | access to local variable callback | provenance |  |
-| Test.cs:65:13:65:56 | (...) => ... : (...) => ... | Test.cs:64:45:64:52 | access to local variable callback : (...) => ... | provenance |  |
-nodes
-| Test.cs:14:13:14:57 | (...) => ... | semmle.label | (...) => ... |
-| Test.cs:22:13:25:13 | (...) => ... | semmle.label | (...) => ... |
-| Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | semmle.label | access to property DangerousAcceptAnyServerCertificateValidator |
-| Test.cs:40:13:40:56 | (...) => ... | semmle.label | (...) => ... |
-| Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | semmle.label | delegate creation of type RemoteCertificateValidationCallback |
-| Test.cs:59:13:59:56 | (...) => ... | semmle.label | (...) => ... |
-| Test.cs:64:45:64:52 | access to local variable callback : (...) => ... | semmle.label | access to local variable callback : (...) => ... |
-| Test.cs:65:13:65:56 | (...) => ... | semmle.label | (...) => ... |
-| Test.cs:65:13:65:56 | (...) => ... : (...) => ... | semmle.label | (...) => ... : (...) => ... |
-| Test.cs:67:48:67:55 | access to local variable callback | semmle.label | access to local variable callback |
-subpaths
-#select
-| Test.cs:14:13:14:57 | (...) => ... | Test.cs:14:13:14:57 | (...) => ... | Test.cs:14:13:14:57 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:14:13:14:57 | (...) => ... | uses a callback |
-| Test.cs:22:13:25:13 | (...) => ... | Test.cs:22:13:25:13 | (...) => ... | Test.cs:22:13:25:13 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:22:13:25:13 | (...) => ... | uses a callback |
-| Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | This TLS certificate validation $@, which trusts any certificate. | Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | uses a callback |
-| Test.cs:40:13:40:56 | (...) => ... | Test.cs:40:13:40:56 | (...) => ... | Test.cs:40:13:40:56 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:40:13:40:56 | (...) => ... | uses a callback |
-| Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | This TLS certificate validation $@, which trusts any certificate. | Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | uses a callback |
-| Test.cs:59:13:59:56 | (...) => ... | Test.cs:59:13:59:56 | (...) => ... | Test.cs:59:13:59:56 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:59:13:59:56 | (...) => ... | uses a callback |
-| Test.cs:65:13:65:56 | (...) => ... | Test.cs:65:13:65:56 | (...) => ... | Test.cs:65:13:65:56 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:65:13:65:56 | (...) => ... | uses a callback |
-| Test.cs:67:48:67:55 | access to local variable callback | Test.cs:65:13:65:56 | (...) => ... : (...) => ... | Test.cs:67:48:67:55 | access to local variable callback | This TLS certificate validation $@, which trusts any certificate. | Test.cs:65:13:65:56 | (...) => ... | uses a callback |
--- a/Features/CWE-295/AcceptAnyCertificate/AcceptAnyCertificate.qlref
+++ b/Features/CWE-295/AcceptAnyCertificate/AcceptAnyCertificate.qlref
@@ -1 +0,0 @@
-Security Features/CWE-295/AcceptAnyCertificate.ql
--- a/Features/CWE-295/AcceptAnyCertificate/Test.cs
+++ b/Features/CWE-295/AcceptAnyCertificate/Test.cs
@@ -1,89 +0,0 @@
-using System.IO;
-using System.Net;
-using System.Net.Http;
-using System.Net.Security;
-using System.Security.Cryptography.X509Certificates;
-
-public class CertificateValidationTests
-{
-    public void HttpClientHandlerBad()
-    {
-        var handler = new HttpClientHandler();
-        // BAD: always trusts any certificate.
-        handler.ServerCertificateCustomValidationCallback =
-            (request, certificate, chain, errors) => true;
-    }
-
-    public void HttpClientHandlerBlockBodyBad()
-    {
-        var handler = new HttpClientHandler();
-        // BAD: always trusts any certificate.
-        handler.ServerCertificateCustomValidationCallback =
-            (request, certificate, chain, errors) =>
-            {
-                return true;
-            };
-    }
-
-    public void HttpClientHandlerDangerousBad()
-    {
-        var handler = new HttpClientHandler();
-        // BAD: built-in callback that accepts any certificate.
-        handler.ServerCertificateCustomValidationCallback =
-            HttpClientHandler.DangerousAcceptAnyServerCertificateValidator;
-    }
-
-    public void ServicePointManagerBad()
-    {
-        // BAD: always trusts any certificate.
-        ServicePointManager.ServerCertificateValidationCallback =
-            (sender, certificate, chain, errors) => true;
-    }
-
-    private static bool AcceptAll(object sender, X509Certificate certificate, X509Chain chain,
-        SslPolicyErrors errors)
-    {
-        return true;
-    }
-
-    public void MethodGroupBad()
-    {
-        // BAD: the referenced method always returns true.
-        ServicePointManager.ServerCertificateValidationCallback = AcceptAll;
-    }
-
-    public void SslStreamBad(Stream stream)
-    {
-        // BAD: the validation callback always returns true.
-        var ssl = new SslStream(stream, false,
-            (sender, certificate, chain, errors) => true);
-    }
-
-    public void IndirectBad(Stream stream)
-    {
-        RemoteCertificateValidationCallback callback =
-            (sender, certificate, chain, errors) => true;
-        // BAD: the callback flowing here always returns true.
-        var ssl = new SslStream(stream, false, callback);
-    }
-
-    public void HttpClientHandlerGood()
-    {
-        var handler = new HttpClientHandler();
-        // GOOD: the certificate is only trusted when there are no validation errors.
-        handler.ServerCertificateCustomValidationCallback =
-            (request, certificate, chain, errors) => errors == SslPolicyErrors.None;
-    }
-
-    private static bool Validate(object sender, X509Certificate certificate, X509Chain chain,
-        SslPolicyErrors errors)
-    {
-        return errors == SslPolicyErrors.None;
-    }
-
-    public void MethodGroupGood()
-    {
-        // GOOD: the referenced method performs real validation.
-        ServicePointManager.ServerCertificateValidationCallback = Validate;
-    }
-}
--- a/Features/CWE-295/AcceptAnyCertificate/options
+++ b/Features/CWE-295/AcceptAnyCertificate/options
@@ -1,2 +0,0 @@
-semmle-extractor-options: /nostdlib /noconfig
-semmle-extractor-options: --load-sources-from-project:${testdir}/../../../../resources/stubs/_frameworks/Microsoft.NETCore.App/Microsoft.NETCore.App.csproj
--- a/shared/tree-sitter-extractor/src/extractor/mod.rs
+++ b/shared/tree-sitter-extractor/src/extractor/mod.rs
@@ -280,10 +280,11 @@ pub fn location_label(writer: &mut trap::Writer, location: trap::Location) -> tr
 }

 /// Extracts the source file at `path`, which is assumed to be canonicalized.
-/// When `yeast_runner` is `Some`, the parsed tree is first transformed
-/// through the supplied yeast `Runner` before TRAP extraction. Building the
-/// `Runner` (which parses YAML and constructs the schema) is the caller's
-/// responsibility, allowing it to be done once and shared across files.
+/// When `desugarer` is `Some`, the parsed tree is first transformed
+/// through the supplied yeast desugarer before TRAP extraction. Building
+/// the desugarer (which parses YAML and constructs the schema) is the
+/// caller's responsibility, allowing it to be done once and shared across
+/// files.
 #[allow(clippy::too_many_arguments)]
 pub fn extract(
    language: &Language,
@@ -295,7 +296,7 @@ pub fn extract(
    path: &Path,
    source: &[u8],
    ranges: &[Range],
-    yeast_runner: Option<&yeast::Runner<'_>>,
+    desugarer: Option<&dyn yeast::Desugarer>,
 ) {
    let path_str = file_paths::normalize_and_transform_path(path, transformer);
    let source_root = std::env::current_dir()
@@ -328,11 +329,14 @@ pub fn extract(
        schema,
    );

-    if let Some(yeast_runner) = yeast_runner {
-        let ast = yeast_runner
+    if let Some(desugarer) = desugarer {
+        let ast = desugarer
            .run_from_tree(&tree, source)
            .unwrap_or_else(|e| panic!("Desugaring failed for {path_str}: {e}"));
        traverse_yeast(&ast, &mut visitor);
+        // Comments and other `extra` nodes are not represented in the desugared
+        // AST, so recover them directly from the original parse tree.
+        traverse_extras(&tree, &mut visitor);
    } else {
        traverse(&tree, &mut visitor);
    }
@@ -365,6 +369,8 @@ struct Visitor<'a> {
    ast_node_parent_table_name: String,
    /// Language-specific name of the tokeninfo table
    tokeninfo_table_name: String,
+    /// Language-specific name of the trivia tokeninfo table
+    trivia_tokeninfo_table_name: String,
    /// A lookup table from type name to node types
    schema: &'a NodeTypeMap,
    /// A stack for gathering information from child nodes. Whenever a node is
@@ -395,11 +401,33 @@ impl<'a> Visitor<'a> {
            ast_node_location_table_name: format!("{language_prefix}_ast_node_location"),
            ast_node_parent_table_name: format!("{language_prefix}_ast_node_parent"),
            tokeninfo_table_name: format!("{language_prefix}_tokeninfo"),
+            trivia_tokeninfo_table_name: format!("{language_prefix}_trivia_tokeninfo"),
            schema,
            stack: Vec::new(),
        }
    }

+    /// Emits a `TriviaToken` for the given `extra` node (e.g. a comment) from
+    /// the original parse tree. Trivia tokens carry a location and their source
+    /// text, but are not attached to a parent in the (possibly desugared) AST.
+    fn emit_trivia_token(&mut self, node: &Node) {
+        let id = self.trap_writer.fresh_id();
+        let loc = location_for(self, self.file_label, node);
+        let loc_label = location_label(self.trap_writer, loc);
+        self.trap_writer.add_tuple(
+            &self.ast_node_location_table_name,
+            vec![trap::Arg::Label(id), trap::Arg::Label(loc_label)],
+        );
+        self.trap_writer.add_tuple(
+            &self.trivia_tokeninfo_table_name,
+            vec![
+                trap::Arg::Label(id),
+                trap::Arg::Int(node.kind_id() as usize),
+                sliced_source_arg(self.source, node),
+            ],
+        );
+    }
+
    fn record_parse_error(&mut self, loc: trap::Label, mesg: &diagnostics::DiagnosticMessage) {
        self.diagnostics_writer.write(mesg);
        let id = self.trap_writer.fresh_id();
@@ -835,6 +863,24 @@ fn traverse(tree: &Tree, visitor: &mut Visitor) {
    }
 }

+/// Walks the original tree-sitter tree and emits a `TriviaToken` for every
+/// `extra` node (e.g. a comment). Used to preserve comments that would
+/// otherwise be lost after a desugaring pass rewrites the tree.
+fn traverse_extras(tree: &Tree, visitor: &mut Visitor) {
+    emit_extras_in(visitor, tree.root_node());
+}
+
+fn emit_extras_in(visitor: &mut Visitor, node: Node<'_>) {
+    let mut cursor = node.walk();
+    for child in node.children(&mut cursor) {
+        if child.is_extra() {
+            visitor.emit_trivia_token(&child);
+        } else {
+            emit_extras_in(visitor, child);
+        }
+    }
+}
+
 fn traverse_yeast(tree: &yeast::Ast, visitor: &mut Visitor) {
    use yeast::Cursor;
    let mut cursor = tree.walk();
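
Note on `traverse_extras` / `emit_extras_in`: the recursion in this hunk records every `extra` child it meets and descends only into non-extra children. The sketch below reproduces that shape on a hypothetical `SketchNode` type (a stand-in for `tree_sitter::Node`, which is not used here) and collects source text instead of writing TRAP tuples. All names in it are illustrative assumptions, not part of the extractor.

```rust
/// Hypothetical stand-in for a parse-tree node; the real code walks
/// `tree_sitter::Node` values through a cursor.
struct SketchNode {
    text: &'static str,
    is_extra: bool,
    children: Vec<SketchNode>,
}

/// Collects the text of every `extra` node below `node`, mirroring the
/// recursion in `emit_extras_in`: an extra child is recorded and not
/// descended into, any other child is searched recursively. As in the
/// diff, `node` itself is never recorded, only its descendants.
fn collect_extras(node: &SketchNode, out: &mut Vec<&'static str>) {
    for child in &node.children {
        if child.is_extra {
            out.push(child.text);
        } else {
            collect_extras(child, out);
        }
    }
}

/// Convenience constructor for a childless node.
fn leaf(text: &'static str, is_extra: bool) -> SketchNode {
    SketchNode { text, is_extra, children: Vec::new() }
}

/// Builds a small tree with one top-level comment and one comment nested
/// inside a function body.
fn sample_tree() -> SketchNode {
    let body = SketchNode {
        text: "{ ... }",
        is_extra: false,
        children: vec![leaf("// inner", true), leaf("return", false)],
    };
    SketchNode {
        text: "source_file",
        is_extra: false,
        children: vec![leaf("// top", true), body],
    }
}
```

Calling `collect_extras(&sample_tree(), &mut out)` visits comments in document order, which is the order in which the real traversal would emit trivia tokens.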
--- a/shared/tree-sitter-extractor/src/extractor/simple.rs
+++ b/shared/tree-sitter-extractor/src/extractor/simple.rs
@@ -13,11 +13,14 @@ pub struct LanguageSpec {
    pub prefix: &'static str,
    pub ts_language: tree_sitter::Language,
    pub node_types: &'static str,
-    /// Optional yeast desugaring configuration. When set, the parsed
-    /// tree is rewritten through yeast before TRAP extraction. The
-    /// config's `output_node_types_yaml` (if set) provides the schema
-    /// used both at runtime (for the rewriter) and for TRAP validation.
-    pub desugar: Option<yeast::DesugaringConfig>,
+    /// Optional desugarer. When set, the parsed tree is rewritten through
+    /// the desugarer before TRAP extraction. The desugarer's
+    /// `output_node_types_yaml()` (when it returns `Some`) provides the schema
+    /// used both at runtime (for the rewriter) and for TRAP validation.
+    ///
+    /// `Box<dyn yeast::Desugarer>` so the shared extractor is agnostic to
+    /// the user-defined context type the desugarer uses internally.
+    pub desugar: Option<Box<dyn yeast::Desugarer>>,
    pub file_globs: Vec<String>,
 }

@@ -91,35 +94,22 @@ impl Extractor {
            .collect();

        let mut schemas = vec![];
-        let mut yeast_runners = Vec::new();
        for lang in &self.languages {
-            let effective_node_types: String =
-                match lang.desugar.as_ref().and_then(|c| c.output_node_types_yaml) {
-                    Some(yaml) => yeast::node_types_yaml::convert(yaml).map_err(|e| {
-                        std::io::Error::other(format!(
-                            "Failed to convert YAML node-types to JSON for {}: {e}",
-                            lang.prefix
-                        ))
-                    })?,
-                    None => lang.node_types.to_string(),
-                };
-            let schema = node_types::read_node_types_str(lang.prefix, &effective_node_types)?;
-            schemas.push(schema);
-
-            // Build the yeast runner once per language so the YAML schema
-            // isn't re-parsed for every file.
-            let yeast_runner = lang
+            let effective_node_types: String = match lang
                .desugar
                .as_ref()
-                .map(|config| yeast::Runner::from_config(lang.ts_language.clone(), config))
-                .transpose()
-                .map_err(|e| {
+                .and_then(|d| d.output_node_types_yaml())
+            {
+                Some(yaml) => yeast::node_types_yaml::convert(yaml).map_err(|e| {
                    std::io::Error::other(format!(
-                        "Failed to build desugaring runner for {}: {e}",
+                        "Failed to convert YAML node-types to JSON for {}: {e}",
                        lang.prefix
                    ))
-                })?;
-            yeast_runners.push(yeast_runner);
+                })?,
+                None => lang.node_types.to_string(),
+            };
+            let schema = node_types::read_node_types_str(lang.prefix, &effective_node_types)?;
+            schemas.push(schema);
        }

        // Construct a single globset containing all language globs,
@@ -194,7 +184,7 @@ impl Extractor {
                                    &path,
                                    &source,
                                    &[],
-                                    yeast_runners[i].as_ref(),
+                                    lang.desugar.as_deref(),
                                );
                                std::fs::create_dir_all(src_archive_file.parent().unwrap())?;
                                std::fs::copy(&path, &src_archive_file)?;
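
Note on `Option<Box<dyn yeast::Desugarer>>`: the doc comment says the trait object keeps the shared extractor agnostic to the desugarer's context type. The sketch below shows one way that can work, under assumptions: `SketchDesugarer`, `ContextDesugarer`, and `SketchLanguageSpec` are hypothetical stand-ins, and the real `yeast::Desugarer` trait certainly has a different surface. The schema fallback follows the `match` in this diff but skips the YAML-to-JSON conversion.

```rust
/// Hypothetical trait standing in for `yeast::Desugarer`.
trait SketchDesugarer {
    fn output_node_types_yaml(&self) -> Option<&'static str>;
    fn run(&self, source: &str) -> Result<String, String>;
}

/// A desugarer generic over a user-defined context type `C`.
struct ContextDesugarer<C> {
    yaml: Option<&'static str>,
    make_ctx: fn() -> C,
    apply: fn(&mut C, &str) -> String,
}

impl<C> SketchDesugarer for ContextDesugarer<C> {
    fn output_node_types_yaml(&self) -> Option<&'static str> {
        self.yaml
    }

    fn run(&self, source: &str) -> Result<String, String> {
        // A fresh context per run; `C` never leaks into the trait.
        let mut ctx = (self.make_ctx)();
        Ok((self.apply)(&mut ctx, source))
    }
}

/// Mirrors the shape of `LanguageSpec::desugar`: no type parameter needed.
struct SketchLanguageSpec {
    node_types: &'static str,
    desugar: Option<Box<dyn SketchDesugarer>>,
}

/// Prefers the desugarer's schema, else falls back to `node_types`.
fn effective_node_types(spec: &SketchLanguageSpec) -> String {
    match spec.desugar.as_ref().and_then(|d| d.output_node_types_yaml()) {
        Some(yaml) => yaml.to_string(),
        None => spec.node_types.to_string(),
    }
}

fn new_counter() -> usize {
    0
}

fn tag_with_length(ctx: &mut usize, source: &str) -> String {
    *ctx += source.len();
    format!("{}#{}", source, *ctx)
}

/// Builds a spec whose desugarer uses `usize` as its context type.
fn demo_spec() -> SketchLanguageSpec {
    SketchLanguageSpec {
        node_types: "fallback-json",
        desugar: Some(Box::new(ContextDesugarer::<usize> {
            yaml: Some("desugared-yaml"),
            make_ctx: new_counter,
            apply: tag_with_length,
        })),
    }
}
```

With this shape, `spec.desugar.as_deref()` yields an `Option<&dyn SketchDesugarer>`, which matches how the call site in this file passes `lang.desugar.as_deref()` to `extract`.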
--- a/shared/tree-sitter-extractor/src/generator/mod.rs
+++ b/shared/tree-sitter-extractor/src/generator/mod.rs
@@ -68,7 +68,12 @@ pub fn generate(
        let node_parent_table_name = format!("{}_ast_node_parent", &prefix);
        let token_name = format!("{}_token", &prefix);
        let tokeninfo_name = format!("{}_tokeninfo", &prefix);
+        let trivia_token_name = format!("{}_trivia_token", &prefix);
+        let trivia_tokeninfo_name = format!("{}_trivia_tokeninfo", &prefix);
        let reserved_word_name = format!("{}_reserved_word", &prefix);
+        // When a desugaring is configured, comments and other `extra` nodes are
+        // preserved from the original parse tree as `TriviaToken`s.
+        let has_trivia_tokens = language.desugar.is_some();
        let effective_node_types: String = match language
            .desugar
            .as_ref()
@@ -85,28 +90,35 @@ pub fn generate(
        let nodes = node_types::read_node_types_str(&prefix, &effective_node_types)?;
        let (dbscheme_entries, mut ast_node_members, token_kinds) = convert_nodes(&nodes);
        ast_node_members.insert(&token_name);
+        if has_trivia_tokens {
+            ast_node_members.insert(&trivia_token_name);
+        }
        writeln!(&mut dbscheme_writer, "/*- {} dbscheme -*/", language.name)?;
        dbscheme::write(&mut dbscheme_writer, &dbscheme_entries)?;
        let token_case = create_token_case(&token_name, token_kinds);
-        dbscheme::write(
-            &mut dbscheme_writer,
-            &[
-                dbscheme::Entry::Table(create_tokeninfo(&tokeninfo_name, &token_name)),
-                dbscheme::Entry::Case(token_case),
-                dbscheme::Entry::Union(dbscheme::Union {
-                    name: &ast_node_name,
-                    members: ast_node_members,
-                }),
-                dbscheme::Entry::Table(create_ast_node_location_table(
-                    &node_location_table_name,
-                    &ast_node_name,
-                )),
-                dbscheme::Entry::Table(create_ast_node_parent_table(
-                    &node_parent_table_name,
-                    &ast_node_name,
-                )),
-            ],
-        )?;
+        let mut dbscheme_tail = vec![
+            dbscheme::Entry::Table(create_tokeninfo(&tokeninfo_name, &token_name)),
+            dbscheme::Entry::Case(token_case),
+        ];
+        if has_trivia_tokens {
+            dbscheme_tail.push(dbscheme::Entry::Table(create_tokeninfo(
+                &trivia_tokeninfo_name,
+                &trivia_token_name,
+            )));
+        }
+        dbscheme_tail.push(dbscheme::Entry::Union(dbscheme::Union {
+            name: &ast_node_name,
+            members: ast_node_members,
+        }));
+        dbscheme_tail.push(dbscheme::Entry::Table(create_ast_node_location_table(
+            &node_location_table_name,
+            &ast_node_name,
+        )));
+        dbscheme_tail.push(dbscheme::Entry::Table(create_ast_node_parent_table(
+            &node_parent_table_name,
+            &ast_node_name,
+        )));
+        dbscheme::write(&mut dbscheme_writer, &dbscheme_tail)?;

        let mut body = vec![
            ql::TopLevel::Class(ql_gen::create_ast_node_class(
@@ -116,6 +128,12 @@ pub fn generate(
            )),
            ql::TopLevel::Class(ql_gen::create_token_class(&token_name, &tokeninfo_name)),
        ];
+        if has_trivia_tokens {
+            body.push(ql::TopLevel::Class(ql_gen::create_trivia_token_class(
+                &trivia_token_name,
+                &trivia_tokeninfo_name,
+            )));
+        }
        // Only emit the ReservedWord class when there are actually unnamed token
        // types in the schema (i.e., @{prefix}_reserved_word exists in the dbscheme).
        // When converting from a YEAST YAML schema that has no unnamed tokens, this
--- a/shared/tree-sitter-extractor/src/generator/ql_gen.rs
+++ b/shared/tree-sitter-extractor/src/generator/ql_gen.rs
@@ -199,6 +199,70 @@ pub fn create_token_class<'a>(token_type: &'a str, tokeninfo: &'a str) -> ql::Cl
    }
 }

+/// Creates the `TriviaToken` class. Trivia tokens (e.g. comments) are
+/// `extra` nodes preserved from the original parse tree even when the tree has
+/// been rewritten by a desugaring pass. They are not part of the regular
+/// `Token` hierarchy because they do not appear in the (possibly desugared)
+/// output schema.
+pub fn create_trivia_token_class<'a>(
+    trivia_token_type: &'a str,
+    trivia_tokeninfo: &'a str,
+) -> ql::Class<'a> {
+    let trivia_tokeninfo_arity = 3; // id, kind, value
+    let get_value = ql::Predicate {
+        qldoc: Some(String::from("Gets the source text of this trivia token.")),
+        name: "getValue",
+        overridden: false,
+        is_private: false,
+        is_final: true,
+        return_type: Some(ql::Type::String),
+        formal_parameters: vec![],
+        body: create_get_field_expr_for_column_storage(
+            "result",
+            trivia_tokeninfo,
+            1,
+            trivia_tokeninfo_arity,
+        ),
+        overlay: None,
+    };
+    let to_string = ql::Predicate {
+        qldoc: Some(String::from(
+            "Gets a string representation of this element.",
+        )),
+        name: "toString",
+        overridden: true,
+        is_private: false,
+        is_final: true,
+        return_type: Some(ql::Type::String),
+        formal_parameters: vec![],
+        body: ql::Expression::Equals(
+            Box::new(ql::Expression::Var("result")),
+            Box::new(ql::Expression::Dot(
+                Box::new(ql::Expression::Var("this")),
+                "getValue",
+                vec![],
+            )),
+        ),
+        overlay: None,
+    };
+    ql::Class {
+        qldoc: Some(String::from(
+            "A trivia token, such as a comment, preserved from the original parse tree.",
+        )),
+        name: "TriviaToken",
+        is_abstract: false,
+        supertypes: vec![ql::Type::At(trivia_token_type), ql::Type::Normal("AstNode")]
+            .into_iter()
+            .collect(),
+        characteristic_predicate: None,
+        predicates: vec![
+            get_value,
+            to_string,
+            create_get_a_primary_ql_class("TriviaToken", false),
+        ],
+    }
+}
+
 // Creates the `ReservedWord` class.
 pub fn create_reserved_word_class(db_name: &str) -> ql::Class<'_> {
    let class_name = "ReservedWord";
--- a/shared/yeast-macros/src/lib.rs
+++ b/shared/yeast-macros/src/lib.rs
@@ -44,8 +44,19 @@ pub fn query(input: TokenStream) -> TokenStream {
 /// {expr}                       - embed a Rust expression returning Id
 /// {..expr}                     - splice an iterable of Id (in child/field position)
 /// field: {..expr}              - splice into a named field
+/// {expr}.map(p -> tpl)         - apply tpl to each element; splice result
+/// {expr}.reduce_left(f -> init, acc, e -> fold)
+///                              - fold with per-element init; splice 0 or 1 result
 /// ```
 ///
+/// Chain syntax after `{expr}` or `{..expr}`:
+/// - `.map(param -> template)` — one output node per input element.
+/// - `.reduce_left(first -> init, acc, elem -> fold)` — fold left; the first
+///   element is converted by `init`, subsequent elements are folded by `fold`
+///   with the accumulator bound to `acc`. An empty iterable yields nothing.
+/// - Chains always splice (the result is iterable).
+/// - Multiple chains can be chained, e.g. `.map(...).reduce_left(...)`.
+///
 /// Can be called with an explicit context or using the implicit context
 /// from an enclosing `rule!`:
 ///
@@ -110,3 +121,37 @@ pub fn rule(input: TokenStream) -> TokenStream {
        Err(err) => err.to_compile_error().into(),
    }
 }
+
+/// Define a desugaring rule whose transform is a hand-written Rust block.
+///
+/// Use `manual_rule!` when the transform needs control over capture
+/// translation timing — for example, when an outer rule needs to set
+/// state in `ctx` (the `BuildCtx`'s user context) before recursive
+/// translation reaches inner rules that read that state.
+///
+/// ```text
+/// manual_rule!(
+///     (query_pattern field: (_) @name)
+///     {
+///         // `ctx` is a `&mut BuildCtx<'_, C>`; capture variables
+///         // (`name: NodeRef`, etc.) are bound from the query.
+///         let translated = ctx.translate(name)?;
+///         Ok(translated)
+///     }
+/// )
+/// ```
+///
+/// Differences from [`rule!`]:
+/// - Captures are **not** auto-translated before the body runs; they
+///   refer to raw input-schema nodes. Use [`BuildCtx::translate`] (or
+///   [`BuildCtx::translate_opt`]) to translate them when you choose.
+/// - The body is plain Rust with no tree template. It must itself evaluate
+///   to `Result<Vec<Id>, String>`; the macro does not add an `Ok(...)` wrap.
+#[proc_macro]
+pub fn manual_rule(input: TokenStream) -> TokenStream {
+    let input2: TokenStream2 = input.into();
+    match parse::parse_manual_rule_top(input2) {
+        Ok(output) => output.into(),
+        Err(err) => err.to_compile_error().into(),
+    }
+}
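The `.reduce_left(first -> init, acc, elem -> fold)` semantics documented above can be sketched as an ordinary generic function. This is only an illustration of the fold shape, not the code the macro emits: the function name `reduce_left` and its closure parameters are hypothetical stand-ins for the `init` and `fold` templates.

```rust
/// Hypothetical plain-Rust analogue of the `.reduce_left(...)` chain.
/// The first element seeds the accumulator through `init`; every later
/// element is combined with the accumulator through `fold`.
/// The returned iterator yields nothing for empty input and exactly one
/// item otherwise, which is why the result can always be spliced.
pub fn reduce_left<I, T, A>(
    iter: I,
    init: impl FnOnce(T) -> A,
    mut fold: impl FnMut(A, T) -> A,
) -> impl Iterator<Item = A>
where
    I: IntoIterator<Item = T>,
{
    let mut iter = iter.into_iter();
    let result = match iter.next() {
        Some(first) => {
            let mut acc = init(first);
            for elem in iter {
                acc = fold(acc, elem);
            }
            Some(acc)
        }
        None => None,
    };
    result.into_iter()
}
```

Folding `["a", "b", "c"]` with an `init` that copies the element and a `fold` that parenthesises the pair nests to the left, as in `((a b) c)`.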
--- a/shared/yeast-macros/src/parse.rs
+++ b/shared/yeast-macros/src/parse.rs
@@ -121,9 +121,9 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
        std::collections::HashMap::new();
    let mut bare_children: Vec<TokenStream> = Vec::new();
    let push_field_elem = |order: &mut Vec<String>,
-                               map: &mut std::collections::HashMap<String, Vec<TokenStream>>,
-                               name: String,
-                               elem: TokenStream| {
+                           map: &mut std::collections::HashMap<String, Vec<TokenStream>>,
+                           name: String,
+                           elem: TokenStream| {
        if !map.contains_key(&name) {
            order.push(name.clone());
            map.insert(name, vec![elem]);
@@ -141,7 +141,12 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
            // Parse the field's pattern. To support repetition like
            // `field: (kind)* @cap`, parse the atom first, then check for
            // a quantifier, and lastly handle a trailing `@capture`.
-            let atom = parse_query_atom(tokens)?;
+            // `field: @cap` is sugar for `field: _ @cap`.
+            let atom = if peek_is_at(tokens) {
+                quote! { yeast::query::QueryNode::Any { match_unnamed: true } }
+            } else {
+                parse_query_atom(tokens)?
+            };
            if peek_is_repetition(tokens) {
                let rep = expect_repetition(tokens)?;
                let elem = quote! {
@@ -155,8 +160,7 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
            } else {
                let child = if peek_is_at(tokens) {
                    tokens.next();
-                    let capture_name =
-                        expect_ident(tokens, "expected capture name after @")?;
+                    let capture_name = expect_ident(tokens, "expected capture name after @")?;
                    let name_str = capture_name.to_string();
                    quote! {
                        yeast::query::QueryNode::Capture {
@@ -259,6 +263,7 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
                    yeast::query::QueryListElem::SingleNode(#node)
                },
            )?;
+            let elem = maybe_wrap_list_capture(tokens, elem)?;
            elems.push(elem);
            continue;
        }
@@ -276,6 +281,7 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
                    yeast::query::QueryListElem::SingleNode(#node)
                },
            )?;
+            let elem = maybe_wrap_list_capture(tokens, elem)?;
            elems.push(elem);
            continue;
        }
@@ -289,10 +295,10 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
 // tree! / trees! parsing — direct code generation against BuildCtx
 // ---------------------------------------------------------------------------

-const IMPLICIT_CTX: &str = "__yeast_ctx";
+const IMPLICIT_CTX: &str = "ctx";

 /// Determine the context identifier: either explicit `ctx,` or the implicit
-/// `__yeast_ctx` from an enclosing `rule!`.
+/// `ctx` from an enclosing `rule!`.
 fn parse_ctx_or_implicit(tokens: &mut Tokens) -> Ident {
    // Check if first token is an ident followed by a comma
    let mut lookahead = tokens.clone();
@@ -352,7 +358,7 @@ fn parse_direct_node(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStream> {
        Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => {
            let group = expect_group(tokens, Delimiter::Brace)?;
            let expr = group.stream();
-            Ok(quote! { ::std::convert::Into::<usize>::into(#expr) })
+            Ok(quote! { ::std::convert::Into::<usize>::into({ #expr }) })
        }
        Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Parenthesis => {
            let group = expect_group(tokens, Delimiter::Parenthesis)?;
@@ -389,8 +395,10 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
        let expr = group.stream();
        return Ok(quote! {
            {
-                let __value = yeast::YeastDisplay::yeast_to_string(&(#expr), &*#ctx.ast);
-                #ctx.literal(#kind_str, &__value)
+                let __expr = { #expr };
+                let __value = yeast::YeastDisplay::yeast_to_string(&__expr, &*#ctx.ast);
+                let __source_range = yeast::YeastSourceRange::yeast_source_range(&__expr, &*#ctx.ast);
+                #ctx.literal_with_source_range(#kind_str, &__value, __source_range)
            }
        });
    }
@@ -411,7 +419,11 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
    // Named fields — compute each value into a temp, then reference it
    while peek_is_field(tokens) {
        let field_name = expect_ident(tokens, "expected field name")?;
-        let field_str = field_name.to_string().strip_prefix("r#").unwrap_or(&field_name.to_string()).to_string();
+        let field_str = field_name
+            .to_string()
+            .strip_prefix("r#")
+            .unwrap_or(&field_name.to_string())
+            .to_string();
        expect_punct(tokens, ':', "expected `:` after field name")?;
        let temp = Ident::new(
            &format!("__field_{field_str}_{field_counter}"),
@@ -419,23 +431,36 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
        );
        field_counter += 1;

-        // Check for field: {..expr} — splice a Vec<Id> into the field
+        // Check for field: {..expr}.chain or field: {expr}.chain — splice a Vec<Id> into the field
        if peek_is_group(tokens, Delimiter::Brace) {
            let group_clone = tokens.clone().next().unwrap();
            if let TokenTree::Group(g) = &group_clone {
                let mut inner_check = g.stream().into_iter();
                let is_splice = matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.')
                    && matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
-                if is_splice {
+                // Determine if a chain (`.map(..)` or `.reduce_left(..)`) follows the `{}` group.
+                let mut after = tokens.clone();
+                after.next(); // skip the brace group
+                let has_chain =
+                    matches!(after.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
+
+                if is_splice || has_chain {
                    let group = expect_group(tokens, Delimiter::Brace)?;
-                    let mut inner = group.stream().into_iter().peekable();
-                    inner.next(); // consume first .
-                    inner.next(); // consume second .
-                    let expr: proc_macro2::TokenStream = inner.collect();
+                    let base: TokenStream = if is_splice {
+                        let mut inner = group.stream().into_iter().peekable();
+                        inner.next(); // consume first .
+                        inner.next(); // consume second .
+                        let expr: TokenStream = inner.collect();
+                        quote! {
+                            { #expr }.into_iter().map(::std::convert::Into::<usize>::into)
+                        }
+                    } else {
+                        let expr = group.stream();
+                        quote! { { #expr }.into_iter() }
+                    };
+                    let chained = parse_chain_suffix(tokens, ctx, base)?;
                    stmts.push(quote! {
-                        let #temp: Vec<usize> = (#expr).into_iter()
-                            .map(::std::convert::Into::<usize>::into)
-                            .collect();
+                        let #temp: Vec<usize> = #chained.collect();
                    });
                    // An empty splice means the field is absent — skip it
                    // entirely rather than emitting an empty named field.
@@ -472,6 +497,94 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
    })
 }

+/// Parse a chain of `.method(args)` suffixes after a `{expr}` or `{..expr}`
+/// placeholder in tree templates. Currently supports:
+///
+/// ```text
+/// .map(param -> template)                         -- one output id per input element
+/// .reduce_left(first -> init, acc, elem -> fold)  -- left fold; yields 0 or 1 id
+/// ```
+///
+/// The chain may be empty (returns `base` unchanged). Multiple chained calls
+/// are supported, e.g. `.map(p -> ...).reduce_left(...)`.
+///
+/// Each call expects the receiver to be an iterator, so `base` should already
+/// be one (use `.into_iter()` on it before calling this function).
+fn parse_chain_suffix(tokens: &mut Tokens, ctx: &Ident, base: TokenStream) -> Result<TokenStream> {
+    let mut current = base;
+    while matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.') {
+        tokens.next(); // consume .
+        let method = expect_ident(tokens, "expected method name after `.`")?;
+        let method_str = method.to_string();
+        let args_group = expect_group(tokens, Delimiter::Parenthesis)?;
+        match method_str.as_str() {
+            "map" => {
+                let mut inner = args_group.stream().into_iter().peekable();
+                let param = expect_ident(&mut inner, "expected lambda parameter name")?;
+                expect_punct(&mut inner, '-', "expected `->` after lambda parameter")?;
+                expect_punct(&mut inner, '>', "expected `->` after lambda parameter")?;
+                let body = parse_direct_node(&mut inner, ctx)?;
+                if let Some(tok) = inner.next() {
+                    return Err(syn::Error::new_spanned(
+                        tok,
+                        "unexpected token after lambda body",
+                    ));
+                }
+                current = quote! {
+                    #current.map(|#param| #body)
+                };
+            }
+            "reduce_left" => {
+                // Syntax: reduce_left(first -> init_tpl, acc, elem -> fold_tpl)
+                // - first -> init_tpl : converts the first element to the initial accumulator
+                // - acc, elem -> fold_tpl : fold step (acc = current accumulator, elem = next element)
+                // Empty iterator produces an empty iterator; non-empty produces a single-element iterator.
+                let mut inner = args_group.stream().into_iter().peekable();
+                let init_param = expect_ident(&mut inner, "expected initial lambda parameter")?;
+                expect_punct(&mut inner, '-', "expected `->` after init parameter")?;
+                expect_punct(&mut inner, '>', "expected `->` after init parameter")?;
+                let init_body = parse_direct_node(&mut inner, ctx)?;
+                expect_punct(&mut inner, ',', "expected `,` after init template")?;
+                let acc_param = expect_ident(&mut inner, "expected accumulator parameter")?;
+                expect_punct(&mut inner, ',', "expected `,` after accumulator parameter")?;
+                let elem_param = expect_ident(&mut inner, "expected element parameter")?;
+                expect_punct(&mut inner, '-', "expected `->` after element parameter")?;
+                expect_punct(&mut inner, '>', "expected `->` after element parameter")?;
+                let fold_body = parse_direct_node(&mut inner, ctx)?;
+                if let Some(tok) = inner.next() {
+                    return Err(syn::Error::new_spanned(
+                        tok,
+                        "unexpected token after fold template",
+                    ));
+                }
+                current = quote! {
+                    {
+                        let mut __iter = #current;
+                        let __result: Option<usize> = if let Some(#init_param) = __iter.next() {
+                            let mut __acc: usize = #init_body;
+                            for #elem_param in __iter {
+                                let #acc_param: usize = __acc;
+                                __acc = #fold_body;
+                            }
+                            Some(__acc)
+                        } else {
+                            None
+                        };
+                        __result.into_iter()
+                    }
+                };
+            }
+            _ => {
+                return Err(syn::Error::new_spanned(
+                    method,
+                    format!("unknown builtin method `.{method_str}()`"),
+                ));
+            }
+        }
+    }
+    Ok(current)
+}
+
 /// Parse the top-level list of a `trees!` template.
 /// Each item is a node template or `{expr}` splice.
 fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream>> {
@@ -492,23 +605,33 @@ fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream
            continue;
        }

-        // {expr} or {..expr} — single node or splice
+        // {expr} or {..expr} (with optional .chain) — single node or splice
        if peek_is_group(tokens, Delimiter::Brace) {
            let group = expect_group(tokens, Delimiter::Brace)?;
+            let has_chain =
+                matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
            let mut inner = group.stream().into_iter().peekable();
-            if peek_is_dotdot(&inner) {
-                inner.next(); // consume first .
-                inner.next(); // consume second .
-                let expr: TokenStream = inner.collect();
+            let is_splice = peek_is_dotdot(&inner);
+            if is_splice || has_chain {
+                let base: TokenStream = if is_splice {
+                    inner.next(); // consume first .
+                    inner.next(); // consume second .
+                    let expr: TokenStream = inner.collect();
+                    quote! {
+                        { #expr }.into_iter().map(::std::convert::Into::<usize>::into)
+                    }
+                } else {
+                    let expr = group.stream();
+                    quote! { { #expr }.into_iter() }
+                };
+                let chained = parse_chain_suffix(tokens, ctx, base)?;
                items.push(quote! {
-                    __nodes.extend(
-                        (#expr).into_iter().map(::std::convert::Into::<usize>::into)
-                    );
+                    __nodes.extend(#chained);
                });
            } else {
                let expr = group.stream();
                items.push(quote! {
-                    __nodes.push(::std::convert::Into::<usize>::into(#expr));
+                    __nodes.push(::std::convert::Into::<usize>::into({ #expr }));
                });
            }
            continue;
@@ -604,8 +727,11 @@ fn extract_captures_inner(
                }
                last_mult = CaptureMultiplicity::Single;
            }
-            TokenTree::Punct(p) if matches!(p.as_char(), '*' | '+' | '?') => {
-                // Keep last_mult — the @capture follows
+            TokenTree::Punct(p) if p.as_char() == '*' || p.as_char() == '+' => {
+                last_mult = CaptureMultiplicity::Repeated;
+            }
+            TokenTree::Punct(p) if p.as_char() == '?' => {
+                last_mult = CaptureMultiplicity::Optional;
            }
            _ => {
                last_mult = CaptureMultiplicity::Single;
@@ -763,10 +889,117 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
    Ok(quote! {
        {
            let __query = #query_code;
-            yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>| {
+            yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, mut __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>, __user_ctx: &mut _, __translator: yeast::TranslatorHandle<'_, _>| {
+                // Auto-translation prefix: recursively translate every
+                // captured node before invoking the user's transform body.
+                // For OneShot rules this preserves the legacy behaviour
+                // (input-schema captures translated to output-schema
+                // nodes); for Repeating rules it is a no-op.
+                __translator.auto_translate_captures(&mut __captures, __ast, __user_ctx)?;
                #(#bindings)*
-                let mut #ctx_ident = yeast::build::BuildCtx::with_source_range(__ast, &__captures, __fresh, __source_range);
-                #transform_body
+                let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
+                let __result: Vec<usize> = { #transform_body };
+                Ok(__result)
+            }))
+        }
+    })
+}
+
+/// Parse `manual_rule!( query { body } )`.
+///
+/// Like [`parse_rule_top`] but:
+/// - Expects a Rust block `{ ... }` after the query (no `=>` arrow).
+/// - Generates code that does NOT auto-translate captures before
+///   running the body. Capture variables refer to raw (input-schema)
+///   nodes; the body is responsible for explicit translation via
+///   `ctx.translate(...)`.
+/// - The body is included verbatim and must evaluate to
+///   `Result<Vec<usize>, String>`.
+pub fn parse_manual_rule_top(input: TokenStream) -> Result<TokenStream> {
+    let mut tokens = input.into_iter().peekable();
+
+    // Collect query tokens up to the body block `{ ... }`.
+    let mut query_tokens = Vec::new();
+    loop {
+        match tokens.peek() {
+            None => {
+                return Err(syn::Error::new(
+                    Span::call_site(),
+                    "expected a Rust block `{ ... }` after the query in manual_rule!",
+                ))
+            }
+            Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => break,
+            _ => {
+                query_tokens.push(tokens.next().unwrap());
+            }
+        }
+    }
+
+    let query_stream: TokenStream = query_tokens.into_iter().collect();
+
+    // Extract captures from the query (same as in `rule!`).
+    let captures = extract_captures(&query_stream);
+
+    // Parse the query into the QueryNode-building expression.
+    let query_code = parse_query_top(query_stream)?;
+
+    // Generate capture bindings (same as in `rule!`).
+    let ctx_ident = Ident::new(IMPLICIT_CTX, Span::call_site());
+    let bindings: Vec<TokenStream> = captures
+        .iter()
+        .map(|cap| {
+            let name = Ident::new(&cap.name, Span::call_site());
+            let name_str = &cap.name;
+            match cap.multiplicity {
+                CaptureMultiplicity::Repeated => quote! {
+                    let #name: Vec<yeast::NodeRef> = __captures.get_all(#name_str)
+                        .into_iter()
+                        .map(yeast::NodeRef)
+                        .collect();
+                },
+                CaptureMultiplicity::Optional => quote! {
+                    let #name: Option<yeast::NodeRef> =
+                        __captures.get_opt(#name_str).map(yeast::NodeRef);
+                },
+                CaptureMultiplicity::Single => quote! {
+                    let #name: yeast::NodeRef =
+                        yeast::NodeRef(__captures.get_var(#name_str).unwrap());
+                },
+            }
+        })
+        .collect();
+
+    // Consume the body block.
+    let body_group = match tokens.next() {
+        Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => g,
+        other => {
+            return Err(syn::Error::new(
+                Span::call_site(),
+                format!(
+                    "expected a Rust block `{{ ... }}` after the query in manual_rule!, found: {other:?}"
+                ),
+            ))
+        }
+    };
+    let body_stream = body_group.stream();
+
+    // No tokens should follow the body.
+    if let Some(tok) = tokens.next() {
+        return Err(syn::Error::new_spanned(
+            tok,
+            "unexpected token after manual_rule! body",
+        ));
+    }
+
+    Ok(quote! {
+        {
+            let __query = #query_code;
+            yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>, __user_ctx: &mut _, __translator: yeast::TranslatorHandle<'_, _>| {
+                // No auto-translate prefix for manual rules — the body
+                // is responsible for translating captures explicitly.
+                #(#bindings)*
+                let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
+                #body_stream
            }))
        }
    })
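The quantifier handling in `extract_captures_inner` and the bindings generated by `parse_manual_rule_top` amount to a small mapping from quantifier to binding type. A minimal sketch of that mapping follows; the names `Multiplicity`, `multiplicity_for` and `binding_type` are hypothetical and chosen for illustration only.

```rust
/// Hypothetical mirror of the capture multiplicities used in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Multiplicity {
    Single,
    Optional,
    Repeated,
}

/// Map the quantifier that precedes `@capture` (if any) to a multiplicity:
/// `*` and `+` repeat, `?` is optional, anything else is a single capture.
pub fn multiplicity_for(quantifier: Option<char>) -> Multiplicity {
    match quantifier {
        Some('*') | Some('+') => Multiplicity::Repeated,
        Some('?') => Multiplicity::Optional,
        _ => Multiplicity::Single,
    }
}

/// The Rust type the capture variable is bound to for each multiplicity,
/// following the `quote!` bindings in the generated rule closure.
pub fn binding_type(m: Multiplicity) -> &'static str {
    match m {
        Multiplicity::Repeated => "Vec<yeast::NodeRef>",
        Multiplicity::Optional => "Option<yeast::NodeRef>",
        Multiplicity::Single => "yeast::NodeRef",
    }
}
```

Under this reading, a pattern such as `field: (kind)? @cap` would bind `cap` as an `Option`, which is the shape `translate_opt` accepts.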
--- a/shared/yeast/doc/yeast.md
+++ b/shared/yeast/doc/yeast.md
@@ -265,7 +265,21 @@ occurrences of the same `$name` within one `BuildCtx` share the same value:
 )
 ```

-`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`):
+The contents of `{…}` are treated as a Rust block, so multi-statement
+expressions (with `let` bindings) work too:
+
+```rust
+(assignment
+    left: {tmp}
+    right: {
+        let lit = ctx.literal("integer", "0");
+        tree!((binary_expr op: (operator "+") left: {tmp} right: {lit}))
+    })
+```
+
+`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`); the contents
+are likewise a Rust block, so the splice can be the result of arbitrary
+computation:

 ```rust
 yeast::trees!(ctx,
--- a/shared/yeast/src/bin/main.rs
+++ b/shared/yeast/src/bin/main.rs
@@ -20,7 +20,7 @@ fn main() {
    let args = Cli::parse();
    let language = get_language(&args.language);
    let source = std::fs::read_to_string(&args.file).unwrap();
-    let runner = yeast::Runner::new(language, &[]);
+    let runner: yeast::Runner = yeast::Runner::new(language, &[]);
    let ast = runner.run(&source).unwrap();
    println!("{}", ast.print(&source, ast.get_root()));
 }
--- a/shared/yeast/src/build.rs
+++ b/shared/yeast/src/build.rs
@@ -2,28 +2,60 @@ use std::collections::BTreeMap;

 use crate::captures::Captures;
 use crate::tree_builder::FreshScope;
-use crate::{Ast, FieldId, Id, NodeContent};
+use crate::{Ast, FieldId, Id, NodeContent, TranslatorHandle};

 /// Context for building new AST nodes during a transformation.
 ///
 /// Used by the `tree!` and `trees!` macros. Holds a mutable reference to the
-/// AST, a reference to the captures from a query match, and a `FreshScope` for
-/// generating unique identifiers.
-pub struct BuildCtx<'a> {
+/// AST, a reference to the captures from a query match, a `FreshScope` for
+/// generating unique identifiers, and a mutable reference to a user-defined
+/// context of type `C`.
+///
+/// The user context `C` is shared across rules via the framework's driver:
+/// outer rules can write to it before recursive translation, and inner rules
+/// can read (or further mutate) it during their transforms. The framework
+/// snapshots and restores the user context around each rule application, so
+/// mutations made by a rule are visible to its descendants (via recursive
+/// translation) but not to its siblings or ancestors.
+///
+/// `BuildCtx` implements [`Deref`] and [`DerefMut`] targeting `C`, so user
+/// context fields are accessible as `ctx.my_field` directly (provided they
+/// don't collide with `BuildCtx`'s own fields like `ast`, `captures`, etc.).
+///
+/// The default `C = ()` means rules that don't need any user context don't
+/// pay any cost.
+///
+/// When constructed by the framework (via the `rule!` or `manual_rule!`
+/// macro), `BuildCtx` also carries a [`TranslatorHandle`] that
+/// [`BuildCtx::translate`] delegates to. When constructed by hand (e.g. in
+/// tests), the translator is `None` and [`BuildCtx::translate`] returns an error.
+pub struct BuildCtx<'a, C: 'a = ()> {
    pub ast: &'a mut Ast,
    pub captures: &'a Captures,
    pub fresh: &'a FreshScope,
    /// Source range of the matched node, inherited by synthetic nodes.
    pub source_range: Option<tree_sitter::Range>,
+    /// User-supplied context; its fields are accessible as `ctx.field` through `Deref`.
+    pub user_ctx: &'a mut C,
+    /// Optional translator handle, populated when the context is built by
+    /// the framework's rule driver. None when the context is built by hand.
+    pub(crate) translator: Option<TranslatorHandle<'a, C>>,
 }

-impl<'a> BuildCtx<'a> {
-    pub fn new(ast: &'a mut Ast, captures: &'a Captures, fresh: &'a FreshScope) -> Self {
+impl<'a, C> BuildCtx<'a, C> {
+    pub fn new(
+        ast: &'a mut Ast,
+        captures: &'a Captures,
+        fresh: &'a FreshScope,
+        user_ctx: &'a mut C,
+    ) -> Self {
        Self {
            ast,
            captures,
            fresh,
            source_range: None,
+            user_ctx,
+            translator: None,
        }
    }

@@ -32,12 +64,35 @@ impl<'a> BuildCtx<'a> {
        captures: &'a Captures,
        fresh: &'a FreshScope,
        source_range: Option<tree_sitter::Range>,
+        user_ctx: &'a mut C,
    ) -> Self {
        Self {
            ast,
            captures,
            fresh,
            source_range,
+            user_ctx,
+            translator: None,
+        }
+    }
+
+    /// Construct a `BuildCtx` carrying a translator handle. Used by the `rule!` and
+    /// `manual_rule!` macros to enable [`BuildCtx::translate`] inside rule transforms.
+    pub fn with_translator(
+        ast: &'a mut Ast,
+        captures: &'a Captures,
+        fresh: &'a FreshScope,
+        source_range: Option<tree_sitter::Range>,
+        user_ctx: &'a mut C,
+        translator: TranslatorHandle<'a, C>,
+    ) -> Self {
+        Self {
+            ast,
+            captures,
+            fresh,
+            source_range,
+            user_ctx,
+            translator: Some(translator),
        }
    }

@@ -82,10 +137,83 @@ impl<'a> BuildCtx<'a> {
            .create_named_token_with_range(kind, value.to_string(), self.source_range)
    }

+    /// Create a leaf node with fixed content and an optional preferred source range.
+    /// If `source_range` is `None`, falls back to this context's inherited range.
+    pub fn literal_with_source_range(
+        &mut self,
+        kind: &'static str,
+        value: &str,
+        source_range: Option<tree_sitter::Range>,
+    ) -> Id {
+        self.ast.create_named_token_with_range(
+            kind,
+            value.to_string(),
+            source_range.or(self.source_range),
+        )
+    }
+
    /// Create a leaf node with an auto-generated unique name.
    pub fn fresh(&mut self, kind: &'static str, name: &str) -> Id {
        let generated = self.fresh.resolve(name);
        self.ast
            .create_named_token_with_range(kind, generated, self.source_range)
    }
+
+    /// Prepend a value to a field of an existing node.
+    pub fn prepend_field(&mut self, node_id: Id, field_name: &str, value_id: Id) {
+        let field_id = self
+            .ast
+            .field_id_for_name(field_name)
+            .unwrap_or_else(|| panic!("build: field '{field_name}' not found"));
+        self.ast.prepend_field_child(node_id, field_id, value_id);
+    }
+}
+
+impl<C: Clone> BuildCtx<'_, C> {
+    /// Recursively translate a node via the framework's rule machinery.
+    /// In a OneShot phase, applies OneShot rules to the given node and
+    /// returns the resulting node ids. In a Repeating phase, errors
+    /// (translation is not meaningful when input and output share a
+    /// schema).
+    ///
+    /// Accepts any value convertible to [`Id`] (including [`crate::NodeRef`]),
+    /// so manual rules can pass capture bindings directly without unwrapping.
+    ///
+    /// Errors if this `BuildCtx` was constructed by hand (without a
+    /// translator handle) — for example, in unit tests that don't go
+    /// through the rule driver.
+    pub fn translate<I: Into<Id>>(&mut self, id: I) -> Result<Vec<Id>, String> {
+        let id = id.into();
+        match &self.translator {
+            Some(t) => t.translate(self.ast, self.user_ctx, id),
+            None => Err("translate() called on a BuildCtx without a translator handle".into()),
+        }
+    }
+
+    /// Translate an optional capture, returning the first translated id or
+    /// `None`. Convenience for `?`-quantifier captures (`Option<NodeRef>`).
+    ///
+    /// If the underlying translation produces multiple ids for a single
+    /// input, only the first is returned. For most use cases (e.g.
+    /// translating a single type annotation) this is what you want; if
+    /// you need all ids, use [`BuildCtx::translate`] directly.
+    pub fn translate_opt<I: Into<Id>>(&mut self, id: Option<I>) -> Result<Option<Id>, String> {
+        match id {
+            Some(id) => Ok(self.translate(id)?.into_iter().next()),
+            None => Ok(None),
+        }
+    }
+}
+
+impl<C> std::ops::Deref for BuildCtx<'_, C> {
+    type Target = C;
+    fn deref(&self) -> &C {
+        &*self.user_ctx
+    }
+}
+
+impl<C> std::ops::DerefMut for BuildCtx<'_, C> {
+    fn deref_mut(&mut self) -> &mut C {
+        &mut *self.user_ctx
+    }
 }
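The `Deref`/`DerefMut` forwarding to the user context, combined with the snapshot-and-restore behaviour described in the `BuildCtx` docs, can be sketched in isolation. Everything here is a simplified assumption: `Ctx`, `Scope` and `with_restored` are hypothetical names, and the real driver may snapshot the context differently.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical user context carried through rule application.
#[derive(Clone, Debug, PartialEq)]
pub struct Scope {
    pub depth: usize,
}

/// Stand-in for `BuildCtx`: one field of its own plus the user context.
pub struct Ctx<'a, C> {
    pub label: &'static str,
    pub user_ctx: &'a mut C,
}

impl<C> Deref for Ctx<'_, C> {
    type Target = C;
    fn deref(&self) -> &C {
        &*self.user_ctx
    }
}

impl<C> DerefMut for Ctx<'_, C> {
    fn deref_mut(&mut self) -> &mut C {
        &mut *self.user_ctx
    }
}

/// Run `body` against a context, then restore the user context from a
/// snapshot so the mutations do not leak to whatever runs afterwards.
pub fn with_restored<C: Clone, R>(
    user: &mut C,
    body: impl FnOnce(&mut Ctx<'_, C>) -> R,
) -> R {
    let snapshot = user.clone();
    let result = {
        let mut ctx = Ctx { label: "rule", user_ctx: &mut *user };
        body(&mut ctx)
    };
    *user = snapshot;
    result
}
```

Inside `body`, `ctx.depth` resolves through `Deref` because `Ctx` has no `depth` field of its own, while `ctx.label` still names the wrapper's field. This is the collision caveat the struct docs mention.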
--- a/shared/yeast/src/dump.rs
+++ b/shared/yeast/src/dump.rs
@@ -53,12 +53,7 @@ pub fn dump_ast_with_options(
 ///
 /// Any node that does not match the expected type set for its parent field is
 /// rendered with a trailing `" <-- ERROR: ..."` annotation on the same line.
-pub fn dump_ast_with_type_errors(
-    ast: &Ast,
-    root: usize,
-    source: &str,
-    schema: &Schema,
-) -> String {
+pub fn dump_ast_with_type_errors(ast: &Ast, root: usize, source: &str, schema: &Schema) -> String {
    dump_ast_with_type_errors_and_options(ast, root, source, schema, &DumpOptions::default())
 }

@@ -74,7 +69,15 @@ pub fn dump_ast_with_type_errors_and_options(
    options: &DumpOptions,
 ) -> String {
    let mut out = String::new();
-    dump_node(ast, root, source, options, 0, Some((schema, None, None)), &mut out);
+    dump_node(
+        ast,
+        root,
+        source,
+        options,
+        0,
+        Some((schema, None, None)),
+        &mut out,
+    );
    out
 }

@@ -232,8 +235,8 @@ fn dump_node(
        }
        let field_name = ast.field_name_for_id(field_id).unwrap_or("?");
        let child_type_check = type_check.map(|(schema, _, _)| {
-            let expected = expected_for_field(schema, node.kind_name(), field_id)
-                .or(Some(EMPTY_NODE_TYPES));
+            let expected =
+                expected_for_field(schema, node.kind_name(), field_id).or(Some(EMPTY_NODE_TYPES));
            let parent_field = Some((node.kind_name(), field_name));
            (schema, expected, parent_field)
        });
--- a/shared/yeast/src/lib.rs
+++ b/shared/yeast/src/lib.rs
@@ -16,7 +16,7 @@ pub mod schema;
 pub mod tree_builder;
 mod visitor;

-pub use yeast_macros::{query, rule, tree, trees};
+pub use yeast_macros::{manual_rule, query, rule, tree, trees};

 use captures::Captures;
 pub use cursor::Cursor;
@@ -58,12 +58,30 @@ pub trait YeastDisplay {
    fn yeast_to_string(&self, ast: &Ast) -> String;
 }

+/// Optional source range for values used in `#{expr}` interpolations.
+///
+/// By default this returns `None`, so synthesized leaves inherit the matched
+/// rule's source range. `NodeRef` returns the referenced node's range, letting
+/// `(kind #{capture})` carry the captured node's location.
+pub trait YeastSourceRange {
+    fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range>;
+}
+
 impl YeastDisplay for NodeRef {
    fn yeast_to_string(&self, ast: &Ast) -> String {
        ast.source_text(self.0)
    }
 }

+impl YeastSourceRange for NodeRef {
+    fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
+        ast.get_node(self.0).and_then(|n| match &n.content {
+            NodeContent::Range(r) => Some(*r),
+            _ => n.source_range,
+        })
+    }
+}
+
 macro_rules! impl_yeast_display_via_display {
    ($($t:ty),* $(,)?) => {
        $(
@@ -72,6 +90,12 @@ macro_rules! impl_yeast_display_via_display {
                    ::std::string::ToString::to_string(self)
                }
            }
+
+            impl YeastSourceRange for $t {
+                fn yeast_source_range(&self, _ast: &Ast) -> Option<tree_sitter::Range> {
+                    None
+                }
+            }
        )*
    };
 }
@@ -90,6 +114,12 @@ impl<T: YeastDisplay + ?Sized> YeastDisplay for &T {
    }
 }

+impl<T: YeastSourceRange + ?Sized> YeastSourceRange for &T {
+    fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
+        (**self).yeast_source_range(ast)
+    }
+}
+
 pub const CHILD_FIELD: u16 = u16::MAX;

 #[derive(Debug)]
@@ -267,7 +297,9 @@ impl Ast {
    /// Returns the source text for `id`, resolving `NodeContent::Range`
    /// against the stored source bytes when available.
    pub fn source_text(&self, id: Id) -> String {
-        let Some(node) = self.get_node(id) else { return String::new(); };
+        let Some(node) = self.get_node(id) else {
+            return String::new();
+        };
        let read_range = |range: &tree_sitter::Range| {
            let start = range.start_byte;
            let end = range.end_byte;
@@ -368,6 +400,15 @@ impl Ast {
        is_named: bool,
        source_range: Option<tree_sitter::Range>,
    ) -> Id {
+        let source_range = match &content {
+            // Parsed nodes already carry an exact source range in their content.
+            NodeContent::Range(_) => source_range,
+            // Synthesized nodes derive location from children when possible,
+            // and fall back to the inherited rule-match range otherwise.
+            _ => self
+                .union_source_range_of_children(&fields)
+                .or(source_range),
+        };
        let id = self.nodes.len();
        self.nodes.push(Node {
            kind,
@@ -383,10 +424,79 @@ impl Ast {
        id
    }

+    fn union_source_range_of_children(
+        &self,
+        fields: &BTreeMap<FieldId, Vec<Id>>,
+    ) -> Option<tree_sitter::Range> {
+        let mut start_byte: Option<usize> = None;
+        let mut end_byte: Option<usize> = None;
+        let mut start_point = tree_sitter::Point { row: 0, column: 0 };
+        let mut end_point = tree_sitter::Point { row: 0, column: 0 };
+
+        for child_ids in fields.values() {
+            for &child_id in child_ids {
+                let Some(child) = self.get_node(child_id) else {
+                    continue;
+                };
+
+                let child_start_byte = child.start_byte();
+                let child_end_byte = child.end_byte();
+
+                // Skip children that carry no usable location.
+                if child_start_byte == 0 && child_end_byte == 0 {
+                    continue;
+                }
+
+                match start_byte {
+                    None => {
+                        start_byte = Some(child_start_byte);
+                        start_point = child.start_position();
+                    }
+                    Some(current_start) if child_start_byte < current_start => {
+                        start_byte = Some(child_start_byte);
+                        start_point = child.start_position();
+                    }
+                    _ => {}
+                }
+
+                match end_byte {
+                    None => {
+                        end_byte = Some(child_end_byte);
+                        end_point = child.end_position();
+                    }
+                    Some(current_end) if child_end_byte > current_end => {
+                        end_byte = Some(child_end_byte);
+                        end_point = child.end_position();
+                    }
+                    _ => {}
+                }
+            }
+        }
+
+        match (start_byte, end_byte) {
+            (Some(start_byte), Some(end_byte)) => Some(tree_sitter::Range {
+                start_byte,
+                end_byte,
+                start_point,
+                end_point,
+            }),
+            _ => None,
+        }
+    }
+
    pub fn create_named_token(&mut self, kind: &'static str, content: String) -> Id {
        self.create_named_token_with_range(kind, content, None)
    }

+    /// Prepend a child id to the given field of the given node.
+    pub fn prepend_field_child(&mut self, node_id: Id, field_id: FieldId, value_id: Id) {
+        let node = self
+            .nodes
+            .get_mut(node_id)
+            .expect("prepend_field_child: invalid node id");
+        node.fields.entry(field_id).or_default().insert(0, value_id);
+    }
+
    pub fn create_named_token_with_range(
        &mut self,
        kind: &'static str,
@@ -595,18 +705,118 @@ impl From<tree_sitter::Range> for NodeContent {
    }
 }

-/// The transform function for a rule: takes the AST, captured variables, a
-/// fresh-name scope, and the source range of the matched node, and returns
-/// the IDs of the replacement nodes.
-pub type Transform = Box<
-    dyn Fn(&mut Ast, Captures, &tree_builder::FreshScope, Option<tree_sitter::Range>) -> Vec<Id>
+/// A handle that lets a rule transform recursively translate AST nodes via
+/// the framework's rule machinery. Constructed by the driver and passed as
+/// the last argument of every [`Transform`] invocation.
+///
+/// The `rule!` macro uses [`TranslatorHandle::auto_translate_captures`] in
+/// its generated prefix to translate captures before running the user's
+/// transform body. Manually-written transforms (using [`Rule::new`]
+/// directly) can call [`TranslatorHandle::translate`] selectively on
+/// specific node ids to control when translation happens.
+pub struct TranslatorHandle<'a, C> {
+    inner: TranslatorImpl<'a, C>,
+}
+
+/// Internal phase-specific translation state. Kept private — callers
+/// interact with [`TranslatorHandle`] only.
+enum TranslatorImpl<'a, C> {
+    /// OneShot phase translator: recursively applies OneShot rules.
+    OneShot {
+        index: &'a RuleIndex<'a, C>,
+        fresh: &'a tree_builder::FreshScope,
+        rewrite_depth: usize,
+        /// The id of the node the current rule is matching. Used by
+        /// [`auto_translate_captures`] to avoid infinite recursion when a
+        /// rule captures its own match root (e.g. via `(_) @_`).
+        matched_root: Id,
+    },
+    /// Repeating phase translator: translation is not meaningful here
+    /// (input and output schemas are the same). [`translate`] errors;
+    /// [`auto_translate_captures`] is a no-op so the macro's auto-prefix
+    /// works unchanged for Repeating rules.
+    Repeating,
+}
+
+impl<'a, C: Clone> TranslatorHandle<'a, C> {
+    /// Recursively apply OneShot rules to `id` and return the resulting
+    /// node ids. Errors in a Repeating phase (where translation is not
+    /// meaningful).
+    pub fn translate(&self, ast: &mut Ast, user_ctx: &mut C, id: Id) -> Result<Vec<Id>, String> {
+        match &self.inner {
+            TranslatorImpl::OneShot {
+                index,
+                fresh,
+                rewrite_depth,
+                ..
+            } => apply_one_shot_rules_inner(index, ast, user_ctx, id, fresh, rewrite_depth + 1),
+            TranslatorImpl::Repeating => {
+                Err("translate() is not available in a Repeating phase".into())
+            }
+        }
+    }
+
+    /// Translate every captured node in `captures` in place (OneShot phase
+    /// only). In a Repeating phase this is a no-op — Repeating rules
+    /// receive raw captures.
+    ///
+    /// Used by the `rule!` macro's generated prefix to preserve the
+    /// pre-existing "auto-translate captures before running the transform
+    /// body" behavior. Manually-written transforms typically translate
+    /// captures selectively via [`translate`](Self::translate) instead.
+    ///
+    /// To avoid infinite recursion, a capture whose id matches the rule's
+    /// matched root (e.g. from a `(_) @_` pattern) is left unchanged.
+    pub fn auto_translate_captures(
+        &self,
+        captures: &mut Captures,
+        ast: &mut Ast,
+        user_ctx: &mut C,
+    ) -> Result<(), String> {
+        match &self.inner {
+            TranslatorImpl::OneShot { matched_root, .. } => {
+                let root = *matched_root;
+                captures.try_map_all_captures(|cid| {
+                    if cid == root {
+                        Ok(vec![cid])
+                    } else {
+                        self.translate(ast, user_ctx, cid)
+                    }
+                })
+            }
+            TranslatorImpl::Repeating => Ok(()),
+        }
+    }
+}
+
+/// The transform function for a rule.
+///
+/// Takes the AST, the (raw, untranslated) captured variables, a fresh-name
+/// scope, the source range of the matched node, a mutable reference to the
+/// user context of type `C`, and a [`TranslatorHandle`] for recursively
+/// translating nodes. Returns the IDs of the replacement nodes, or an
+/// error message if the transform could not be completed.
+///
+/// Transforms produced by [`Rule::new`] receive **raw** captures and must
+/// translate them themselves (via the handle). Transforms produced by the
+/// `rule!` macro have an auto-translation prefix injected for backward
+/// compatibility.
+pub type Transform<C = ()> = Box<
+    dyn Fn(
+            &mut Ast,
+            Captures,
+            &tree_builder::FreshScope,
+            Option<tree_sitter::Range>,
+            &mut C,
+            TranslatorHandle<'_, C>,
+        ) -> Result<Vec<Id>, String>
        + Send
        + Sync,
 >;

-pub struct Rule {
+pub struct Rule<C = ()> {
    query: QueryNode,
-    transform: Transform,
+    transform: Transform<C>,
    /// If true, after this rule fires on a node the engine will try to
    /// re-apply this same rule on the result root. Defaults to false:
    /// each rule fires at most once on a given node, which prevents
@@ -614,8 +824,8 @@ pub struct Rule {
    repeated: bool,
 }

-impl Rule {
-    pub fn new(query: QueryNode, transform: Transform) -> Self {
+impl<C> Rule<C> {
+    pub fn new(query: QueryNode, transform: Transform<C>) -> Self {
        Self {
            query,
            transform,
@@ -637,9 +847,13 @@ impl Rule {
        ast: &mut Ast,
        node: Id,
        fresh: &tree_builder::FreshScope,
+        user_ctx: &mut C,
+        translator: TranslatorHandle<'_, C>,
    ) -> Result<Option<Vec<Id>>, String> {
        match self.try_match(ast, node)? {
-            Some(captures) => Ok(Some(self.run_transform(ast, captures, node, fresh))),
+            Some(captures) => Ok(Some(
+                self.run_transform(ast, captures, node, fresh, user_ctx, translator)?,
+            )),
            None => Ok(None),
        }
    }
@@ -663,29 +877,31 @@ impl Rule {
        captures: Captures,
        node: Id,
        fresh: &tree_builder::FreshScope,
-    ) -> Vec<Id> {
+        user_ctx: &mut C,
+        translator: TranslatorHandle<'_, C>,
+    ) -> Result<Vec<Id>, String> {
        fresh.next_scope();
        let source_range = ast.get_node(node).and_then(|n| match n.content {
            NodeContent::Range(r) => Some(r),
            _ => n.source_range,
        });
-        (self.transform)(ast, captures, fresh, source_range)
+        (self.transform)(ast, captures, fresh, source_range, user_ctx, translator)
    }
 }

 const MAX_REWRITE_DEPTH: usize = 100;

 /// Index of rules by their root query kind for fast lookup.
-struct RuleIndex<'a> {
+struct RuleIndex<'a, C> {
    /// Rules indexed by root node kind name.
-    by_kind: BTreeMap<&'static str, Vec<&'a Rule>>,
+    by_kind: BTreeMap<&'static str, Vec<&'a Rule<C>>>,
    /// Rules with wildcard queries (Any) that apply to all nodes.
-    wildcard: Vec<&'a Rule>,
+    wildcard: Vec<&'a Rule<C>>,
 }

-impl<'a> RuleIndex<'a> {
-    fn new(rules: &'a [Rule]) -> Self {
-        let mut by_kind: BTreeMap<&'static str, Vec<&'a Rule>> = BTreeMap::new();
+impl<'a, C> RuleIndex<'a, C> {
+    fn new(rules: &'a [Rule<C>]) -> Self {
+        let mut by_kind: BTreeMap<&'static str, Vec<&'a Rule<C>>> = BTreeMap::new();
        let mut wildcard = Vec::new();
        for rule in rules {
            match rule.query.root_kind() {
@@ -696,7 +912,7 @@ impl<'a> RuleIndex<'a> {
        Self { by_kind, wildcard }
    }

-    fn rules_for_kind(&self, kind: &str) -> impl Iterator<Item = &&'a Rule> {
+    fn rules_for_kind(&self, kind: &str) -> impl Iterator<Item = &&'a Rule<C>> {
        self.by_kind
            .get(kind)
            .into_iter()
@@ -705,23 +921,25 @@ impl<'a> RuleIndex<'a> {
    }
 }

-fn apply_repeating_rules(
-    rules: &[Rule],
+fn apply_repeating_rules<C: Clone>(
+    rules: &[Rule<C>],
    ast: &mut Ast,
+    user_ctx: &mut C,
    id: Id,
    fresh: &tree_builder::FreshScope,
 ) -> Result<Vec<Id>, String> {
    let index = RuleIndex::new(rules);
-    apply_repeating_rules_inner(&index, ast, id, fresh, 0, None)
+    apply_repeating_rules_inner(&index, ast, user_ctx, id, fresh, 0, None)
 }

-fn apply_repeating_rules_inner(
-    index: &RuleIndex,
+fn apply_repeating_rules_inner<C: Clone>(
+    index: &RuleIndex<C>,
    ast: &mut Ast,
+    user_ctx: &mut C,
    id: Id,
    fresh: &tree_builder::FreshScope,
    rewrite_depth: usize,
-    skip_rule: Option<*const Rule>,
+    skip_rule: Option<*const Rule<C>>,
 ) -> Result<Vec<Id>, String> {
    if rewrite_depth > MAX_REWRITE_DEPTH {
        return Err(format!(
@@ -732,11 +950,23 @@ fn apply_repeating_rules_inner(

    let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
    for rule in index.rules_for_kind(node_kind) {
-        let rule_ptr = *rule as *const Rule;
+        let rule_ptr = *rule as *const Rule<C>;
        if Some(rule_ptr) == skip_rule {
            continue;
        }
-        if let Some(result_node) = rule.try_rule(ast, id, fresh)? {
+        // Snapshot the user context before invoking the rule so that any
+        // mutations the rule makes are visible during recursive translation
+        // of its result, but not leaked to the parent's siblings.
+        let snapshot = user_ctx.clone();
+        // Repeating rules don't need a real translator: their captures
+        // aren't auto-translated (Repeating preserves the input schema),
+        // and `ctx.translate(id)` errors if invoked from a Repeating
+        // transform.
+        let translator = TranslatorHandle {
+            inner: TranslatorImpl::Repeating,
+        };
+        let try_result = rule.try_rule(ast, id, fresh, user_ctx, translator)?;
+        if let Some(result_node) = try_result {
            // For non-repeated rules, suppress further application of *this*
            // rule on the result root, so a rule whose output matches its own
            // query doesn't loop. Other rules and child traversal are
@@ -747,14 +977,19 @@ fn apply_repeating_rules_inner(
                results.extend(apply_repeating_rules_inner(
                    index,
                    ast,
+                    user_ctx,
                    node,
                    fresh,
                    rewrite_depth + 1,
                    next_skip,
                )?);
            }
+            *user_ctx = snapshot;
            return Ok(results);
        }
+        // Rule didn't match; restore any speculative changes (none expected
+        // since try_rule only mutates on match, but be defensive).
+        *user_ctx = snapshot;
    }

    // Take the parent's fields by ownership: the recursion will rewrite
@@ -769,7 +1004,15 @@ fn apply_repeating_rules_inner(
    for children in fields.values_mut() {
        let mut new_children: Option<Vec<Id>> = None;
        for (i, &child_id) in children.iter().enumerate() {
-            let result = apply_repeating_rules_inner(index, ast, child_id, fresh, rewrite_depth, None)?;
+            let result = apply_repeating_rules_inner(
+                index,
+                ast,
+                user_ctx,
+                child_id,
+                fresh,
+                rewrite_depth,
+                None,
+            )?;
            let unchanged = result.len() == 1 && result[0] == child_id;
            match (&mut new_children, unchanged) {
                (None, true) => {} // unchanged so far, no allocation needed
@@ -798,24 +1041,25 @@ fn apply_repeating_rules_inner(
 /// each visited node, recursion proceeds only through captured nodes (not
 /// through the input node's children directly), and an error is returned if
 /// no rule matches a visited node.
-fn apply_one_shot_rules(
-    rules: &[Rule],
+fn apply_one_shot_rules<C: Clone>(
+    rules: &[Rule<C>],
    ast: &mut Ast,
+    user_ctx: &mut C,
    id: Id,
    fresh: &tree_builder::FreshScope,
 ) -> Result<Vec<Id>, String> {
    let index = RuleIndex::new(rules);
-    apply_one_shot_rules_inner(&index, ast, id, fresh, 0)
+    apply_one_shot_rules_inner(&index, ast, user_ctx, id, fresh, 0)
 }

-fn apply_one_shot_rules_inner(
-    index: &RuleIndex,
+fn apply_one_shot_rules_inner<C: Clone>(
+    index: &RuleIndex<C>,
    ast: &mut Ast,
+    user_ctx: &mut C,
    id: Id,
    fresh: &tree_builder::FreshScope,
    rewrite_depth: usize,
 ) -> Result<Vec<Id>, String> {
-
    if rewrite_depth > MAX_REWRITE_DEPTH {
        return Err(format!(
            "Desugaring exceeded maximum rewrite depth ({MAX_REWRITE_DEPTH}). \
@@ -825,31 +1069,28 @@ fn apply_one_shot_rules_inner(

    let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");

-    // Don't rewrite unnamed nodes (punctuation, keywords, etc.); leave them
-    // as-is. Rules target named nodes only.
-    if let Some(node) = ast.get_node(id) {
-        if !node.is_named() {
-            return Ok(vec![id]);
-        }
-    }
-
    for rule in index.rules_for_kind(node_kind) {
-        if let Some(mut captures) = rule.try_match(ast, id)? {
-            // Recursively translate every captured node before invoking the
-            // transform. The transform's output uses output-schema kinds, so
-            // we must translate captured input-schema nodes to their
-            // output-schema equivalents first.
-            captures.try_map_all_captures(|captured_id| {
-                // Avoid infinite recursion when a capture refers to the root
-                // node of the matched tree (e.g. an `@_` capture on the
-                // pattern root): re-analyzing it would match the same rule
-                // again indefinitely.
-                if captured_id == id {
-                    return Ok(vec![captured_id]);
-                }
-                apply_one_shot_rules_inner(index, ast, captured_id, fresh, rewrite_depth + 1)
-            })?;
-            return Ok(rule.run_transform(ast, captures, id, fresh));
+        if let Some(captures) = rule.try_match(ast, id)? {
+            // Snapshot the user context before invoking the rule so that any
+            // mutations the rule (or its transitively-translated captures)
+            // make are visible during this rule's transform, but not leaked
+            // to the parent's siblings.
+            let snapshot = user_ctx.clone();
+            // Build the translator handle the transform will use to
+            // recursively translate captures (or, for macro-generated
+            // rules, the auto-translate prefix uses it to translate every
+            // capture up front, preserving the legacy behavior).
+            let translator = TranslatorHandle {
+                inner: TranslatorImpl::OneShot {
+                    index,
+                    fresh,
+                    rewrite_depth,
+                    matched_root: id,
+                },
+            };
+            let result = rule.run_transform(ast, captures, id, fresh, user_ctx, translator)?;
+            *user_ctx = snapshot;
+            return Ok(result);
        }
    }

@@ -877,15 +1118,15 @@ pub enum PhaseKind {
 /// starts. Rules within a phase compete for matches as usual; rules in
 /// different phases never compete because each traversal only considers the
 /// current phase's rules.
-pub struct Phase {
+pub struct Phase<C = ()> {
    /// Name used in error messages.
    pub name: String,
-    pub rules: Vec<Rule>,
+    pub rules: Vec<Rule<C>>,
    pub kind: PhaseKind,
 }

-impl Phase {
-    pub fn new(name: impl Into<String>, kind: PhaseKind, rules: Vec<Rule>) -> Self {
+impl<C> Phase<C> {
+    pub fn new(name: impl Into<String>, kind: PhaseKind, rules: Vec<Rule<C>>) -> Self {
        Self {
            name: name.into(),
            rules,
@@ -911,17 +1152,30 @@ impl Phase {
 ///     .add_phase("desugar", PhaseKind::Repeating, desugar_rules)
 ///     .with_output_node_types_yaml(yaml);
 /// ```
-#[derive(Default)]
-pub struct DesugaringConfig {
+///
+/// The optional type parameter `C` is the user context type threaded through
+/// rule transforms. Defaults to `()` (no user context).
+pub struct DesugaringConfig<C = ()> {
    /// Phases of rule application, applied in order.
-    pub phases: Vec<Phase>,
+    pub phases: Vec<Phase<C>>,
    /// Output node-types in YAML format. If `None`, the input grammar's
    /// node types are used (i.e. the desugared AST has the same node types
    /// as the tree-sitter grammar).
    pub output_node_types_yaml: Option<&'static str>,
 }

-impl DesugaringConfig {
+// Manual `Default` impl so users with a custom `C` that doesn't implement
+// `Default` can still construct an empty config.
+impl<C> Default for DesugaringConfig<C> {
+    fn default() -> Self {
+        Self {
+            phases: Vec::new(),
+            output_node_types_yaml: None,
+        }
+    }
+}
+
+impl<C> DesugaringConfig<C> {
    /// Create an empty configuration. Add phases via [`add_phase`] and an
    /// optional output schema via [`with_output_node_types_yaml`].
    pub fn new() -> Self {
@@ -933,7 +1187,7 @@ impl DesugaringConfig {
        mut self,
        name: impl Into<String>,
        kind: PhaseKind,
-        rules: Vec<Rule>,
+        rules: Vec<Rule<C>>,
    ) -> Self {
        self.phases.push(Phase::new(name, kind, rules));
        self
@@ -955,15 +1209,15 @@ impl DesugaringConfig {
    }
 }

-pub struct Runner<'a> {
+pub struct Runner<'a, C = ()> {
    language: tree_sitter::Language,
    schema: schema::Schema,
-    phases: &'a [Phase],
+    phases: &'a [Phase<C>],
 }

-impl<'a> Runner<'a> {
+impl<'a, C> Runner<'a, C> {
    /// Create a runner using the input grammar's schema for output.
-    pub fn new(language: tree_sitter::Language, phases: &'a [Phase]) -> Self {
+    pub fn new(language: tree_sitter::Language, phases: &'a [Phase<C>]) -> Self {
        let schema = schema::Schema::from_language(&language);
        Self {
            language,
@@ -976,7 +1230,7 @@ impl<'a> Runner<'a> {
    pub fn with_schema(
        language: tree_sitter::Language,
        schema: &schema::Schema,
-        phases: &'a [Phase],
+        phases: &'a [Phase<C>],
    ) -> Self {
        Self {
            language,
@@ -988,7 +1242,7 @@ impl<'a> Runner<'a> {
    /// Create a runner from a [`DesugaringConfig`].
    pub fn from_config(
        language: tree_sitter::Language,
-        config: &'a DesugaringConfig,
+        config: &'a DesugaringConfig<C>,
    ) -> Result<Self, String> {
        let schema = config.build_schema(&language)?;
        Ok(Self {
@@ -997,11 +1251,17 @@ impl<'a> Runner<'a> {
            phases: &config.phases,
        })
    }
+}

-    pub fn run_from_tree(
+impl<'a, C: Clone> Runner<'a, C> {
+    /// Build an AST from the already-parsed `tree` and its `source`, then
+    /// run all phases, threading `user_ctx` through every rule transform.
+    /// The caller owns the initial context state.
+    pub fn run_from_tree_with_ctx(
        &self,
        tree: &tree_sitter::Tree,
        source: &[u8],
+        user_ctx: &mut C,
    ) -> Result<Ast, String> {
        let mut ast = Ast::from_tree_with_schema_and_source(
            self.schema.clone(),
@@ -1009,11 +1269,13 @@ impl<'a> Runner<'a> {
            &self.language,
            source.to_vec(),
        );
-        self.run_phases(&mut ast)?;
+        self.run_phases(&mut ast, user_ctx)?;
        Ok(ast)
    }

-    pub fn run(&self, input: &str) -> Result<Ast, String> {
+    /// Parse `input` and run all phases, threading `user_ctx` through
+    /// every rule transform. The caller owns the initial context state.
+    pub fn run_with_ctx(&self, input: &str, user_ctx: &mut C) -> Result<Ast, String> {
        let mut parser = tree_sitter::Parser::new();
        parser
            .set_language(&self.language)
@@ -1027,20 +1289,24 @@ impl<'a> Runner<'a> {
            &self.language,
            input.as_bytes().to_vec(),
        );
-        self.run_phases(&mut ast)?;
+        self.run_phases(&mut ast, user_ctx)?;
        Ok(ast)
    }

    /// Apply each phase in turn to the AST, threading the root through.
    /// A single `FreshScope` is shared across phases so that fresh
    /// identifiers generated in different phases don't collide.
-    fn run_phases(&self, ast: &mut Ast) -> Result<(), String> {
+    fn run_phases(&self, ast: &mut Ast, user_ctx: &mut C) -> Result<(), String> {
        let fresh = tree_builder::FreshScope::new();
        let mut root = ast.get_root();
        for phase in self.phases {
            let res = match phase.kind {
-                PhaseKind::Repeating => apply_repeating_rules(&phase.rules, ast, root, &fresh),
-                PhaseKind::OneShot => apply_one_shot_rules(&phase.rules, ast, root, &fresh),
+                PhaseKind::Repeating => {
+                    apply_repeating_rules(&phase.rules, ast, user_ctx, root, &fresh)
+                }
+                PhaseKind::OneShot => {
+                    apply_one_shot_rules(&phase.rules, ast, user_ctx, root, &fresh)
+                }
            }
            .map_err(|e| format!("Phase `{}`: {e}", phase.name))?;
            if res.len() != 1 {
@@ -1056,3 +1322,78 @@ impl<'a> Runner<'a> {
        Ok(())
    }
 }
+
+impl<'a, C: Clone + Default> Runner<'a, C> {
+    /// Build an AST from the already-parsed `tree` and its `source`, then
+    /// run all phases with `C::default()` as the initial context state.
+    pub fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String> {
+        let mut user_ctx = C::default();
+        self.run_from_tree_with_ctx(tree, source, &mut user_ctx)
+    }
+
+    /// Parse `input` and run all phases, using the default context
+    /// (`C::default()`) as the initial context state.
+    pub fn run(&self, input: &str) -> Result<Ast, String> {
+        let mut user_ctx = C::default();
+        self.run_with_ctx(input, &mut user_ctx)
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Desugarer: type-erased view of a DesugaringConfig + Runner
+// ---------------------------------------------------------------------------
+
+/// Type-erased interface to a desugaring pipeline for a single language.
+///
+/// Consumers (e.g. a generic tree-sitter extractor) hold
+/// `Box<dyn Desugarer>` so they can dispatch through the trait without
+/// knowing the user context type `C` that's internal to yeast.
+///
+/// Construct one via [`ConcreteDesugarer::new`] from a
+/// [`DesugaringConfig<C>`] and a [`tree_sitter::Language`].
+pub trait Desugarer: Send + Sync {
+    /// The output AST schema (in YAML format), or `None` if the input
+    /// grammar's schema should be used.
+    fn output_node_types_yaml(&self) -> Option<&'static str>;
+
+    /// Run the desugaring pipeline on the already-parsed `tree` and its
+    /// `source`. Each call constructs a fresh default user context internally.
+    fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String>;
+}
+
+/// A concrete [`Desugarer`] backed by a [`DesugaringConfig<C>`] for a
+/// specific user context type `C`. Stores the language and a pre-built
+/// schema so that the per-call cost is limited to constructing a transient
+/// [`Runner`] and cloning the schema (no YAML re-parsing).
+pub struct ConcreteDesugarer<C: Default + Clone + Send + Sync + 'static> {
+    language: tree_sitter::Language,
+    schema: schema::Schema,
+    config: DesugaringConfig<C>,
+}
+
+impl<C: Default + Clone + Send + Sync + 'static> ConcreteDesugarer<C> {
+    /// Build a desugarer for `language` from `config`. Parses the output
+    /// schema YAML once (if set) and stores it for reuse across files.
+    pub fn new(
+        language: tree_sitter::Language,
+        config: DesugaringConfig<C>,
+    ) -> Result<Self, String> {
+        let schema = config.build_schema(&language)?;
+        Ok(Self {
+            language,
+            schema,
+            config,
+        })
+    }
+}
+
+impl<C: Default + Clone + Send + Sync + 'static> Desugarer for ConcreteDesugarer<C> {
+    fn output_node_types_yaml(&self) -> Option<&'static str> {
+        self.config.output_node_types_yaml
+    }
+
+    fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String> {
+        let runner = Runner::with_schema(self.language.clone(), &self.schema, &self.config.phases);
+        runner.run_from_tree(tree, source)
+    }
+}
--- a/shared/yeast/src/node_types_yaml.rs
+++ b/shared/yeast/src/node_types_yaml.rs
@@ -242,10 +242,7 @@ pub fn convert(yaml_input: &str) -> Result<String, String> {

 /// Apply YAML node-type definitions to a mutable Schema.
 /// Registers all types, fields, and allowed types from the YAML into the schema.
-fn apply_yaml_to_schema(
-    yaml: &YamlNodeTypes,
-    schema: &mut crate::schema::Schema,
-) {
+fn apply_yaml_to_schema(yaml: &YamlNodeTypes, schema: &mut crate::schema::Schema) {
    // Register all supertypes as node kinds
    for name in yaml.supertypes.keys() {
        schema.register_kind(name);
@@ -307,7 +304,8 @@ fn apply_yaml_to_schema(
                .into_vec()
                .into_iter()
                .map(|type_ref| {
-                    let (kind, named) = resolve_type_ref_pair(&type_ref, &named_types, &unnamed_types);
+                    let (kind, named) =
+                        resolve_type_ref_pair(&type_ref, &named_types, &unnamed_types);
                    crate::schema::NodeType { kind, named }
                })
                .collect::<Vec<_>>();
--- a/shared/yeast/src/schema.rs
+++ b/shared/yeast/src/schema.rs
@@ -198,13 +198,8 @@ impl Schema {
            .insert((parent_kind.to_string(), field_id), node_types);
    }

-    pub fn field_types(
-        &self,
-        parent_kind: &str,
-        field_id: FieldId,
-    ) -> Option<&Vec<NodeType>> {
-        self.field_types
-            .get(&(parent_kind.to_string(), field_id))
+    pub fn field_types(&self, parent_kind: &str, field_id: FieldId) -> Option<&Vec<NodeType>> {
+        self.field_types.get(&(parent_kind.to_string(), field_id))
    }

    pub fn set_field_cardinality(
--- a/shared/yeast/tests/test.rs
+++ b/shared/yeast/tests/test.rs
@@ -7,7 +7,7 @@ const OUTPUT_SCHEMA_YAML: &str = include_str!("node-types.yml");

 /// Helper: parse Ruby source with no rules, return dump.
 fn parse_and_dump(input: &str) -> String {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run(input).unwrap();
    dump_ast(&ast, ast.get_root(), input)
 }
@@ -18,13 +18,23 @@ fn run_and_dump(input: &str, rules: Vec<Rule>) -> String {
    run_phased_and_dump(input, vec![Phase::new("test", PhaseKind::Repeating, rules)])
 }

+/// Helper: parse Ruby source with custom rules and return the transformed AST.
+fn run_and_ast(input: &str, rules: Vec<Rule>) -> Ast {
+    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
+    let schema =
+        yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
+    let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);
+    runner.run(input).unwrap()
+}
+
 /// Helper: parse Ruby source with a custom output schema and multiple
 /// rule phases, return dump.
 fn run_phased_and_dump(input: &str, phases: Vec<Phase>) -> String {
    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
    let schema =
        yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);
    let ast = runner.run(input).unwrap();
    dump_ast(&ast, ast.get_root(), input)
 }
@@ -36,7 +46,7 @@ fn run_and_get_error(input: &str, rules: Vec<Rule>) -> String {
    let schema =
        yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
    let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);
    runner
        .run(input)
        .expect_err("expected runner to return an error")
@@ -44,7 +54,7 @@ fn run_and_get_error(input: &str, rules: Vec<Rule>) -> String {

 /// Helper: parse Ruby source with no rules and dump with schema type errors.
 fn parse_and_dump_typed(input: &str, schema_yaml: &str) -> String {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run(input).unwrap();
    let schema = yeast::node_types_yaml::schema_from_yaml(schema_yaml).unwrap();
    dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
@@ -54,10 +64,10 @@ fn parse_and_dump_typed(input: &str, schema_yaml: &str) -> String {
 /// building schema with language IDs so field checks align with parser fields.
 fn parse_and_dump_typed_with_language(input: &str, schema_yaml: &str) -> String {
    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
-    let runner = Runner::new(lang.clone(), &[]);
+    let runner: Runner = Runner::new(lang.clone(), &[]);
    let ast = runner.run(input).unwrap();
-    let schema = yeast::node_types_yaml::schema_from_yaml_with_language(schema_yaml, &lang)
-        .unwrap();
+    let schema =
+        yeast::node_types_yaml::schema_from_yaml_with_language(schema_yaml, &lang).unwrap();
    dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
 }

@@ -66,7 +76,7 @@ fn run_and_dump_typed(input: &str, rules: Vec<Rule>, schema_yaml: &str) -> Strin
    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
    let schema = yeast::node_types_yaml::schema_from_yaml(schema_yaml).unwrap();
    let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);
    let ast = runner.run(input).unwrap();
    dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
 }
@@ -156,7 +166,7 @@ fn test_parse_for_loop() {

 #[test]
 fn test_dump_highlights_type_errors_inline() {
-        let schema_yaml = r#"
+    let schema_yaml = r#"
 named:
    program:
        $children*: assignment
@@ -166,13 +176,13 @@ named:
    identifier:
 "#;

-        let dump = parse_and_dump_typed("x = 1", schema_yaml);
-        assert!(dump.contains("integer \"1\" <-- ERROR:"));
+    let dump = parse_and_dump_typed("x = 1", schema_yaml);
+    assert!(dump.contains("integer \"1\" <-- ERROR:"));
 }

 #[test]
 fn test_dump_reports_preserved_unknown_kind_after_transformation() {
-        let schema_yaml = r#"
+    let schema_yaml = r#"
 named:
    program:
        $children*: assignment
@@ -182,25 +192,25 @@ named:
    identifier:
 "#;

-        // This rewrite runs and preserves the RHS node kind via capture.
-        // With schema above, preserving `integer` should be reported inline.
-        let rules = vec![yeast::rule!(
-                (assignment left: (_) @left right: (_) @right)
-                =>
-                (assignment
-                        left: {left}
-                        right: {right}
-                )
-        )];
+    // This rewrite runs and preserves the RHS node kind via capture.
+    // With schema above, preserving `integer` should be reported inline.
+    let rules: Vec<Rule> = vec![yeast::rule!(
+            (assignment left: (_) @left right: (_) @right)
+            =>
+            (assignment
+                    left: {left}
+                    right: {right}
+            )
+    )];

-        let dump = run_and_dump_typed("x = 1", rules, schema_yaml);
-        assert!(dump.contains("integer \"1\" <-- ERROR:"));
-        assert!(dump.contains("node kind 'integer' not in schema"));
+    let dump = run_and_dump_typed("x = 1", rules, schema_yaml);
+    assert!(dump.contains("integer \"1\" <-- ERROR:"));
+    assert!(dump.contains("node kind 'integer' not in schema"));
 }

 #[test]
 fn test_dump_reports_undeclared_field_on_node() {
-        let schema_yaml = r#"
+    let schema_yaml = r#"
 named:
    program:
        $children*: assignment
@@ -209,14 +219,14 @@ named:
    identifier:
 "#;

-        let dump = parse_and_dump_typed_with_language("x = y", schema_yaml);
-        assert!(dump.contains("right: identifier \"y\" <-- ERROR:"));
-        assert!(dump.contains("the node 'assignment' has no field 'right'"));
+    let dump = parse_and_dump_typed_with_language("x = y", schema_yaml);
+    assert!(dump.contains("right: identifier \"y\" <-- ERROR:"));
+    assert!(dump.contains("the node 'assignment' has no field 'right'"));
 }

 #[test]
 fn test_dump_reports_disallowed_kind_in_field_type() {
-        let schema_yaml = r#"
+    let schema_yaml = r#"
 named:
    program:
        $children*: assignment
@@ -227,17 +237,17 @@ named:
    integer:
 "#;

-        let dump = parse_and_dump_typed_with_language("x = 1", schema_yaml);
-        assert!(dump.contains("right: integer \"1\" <-- ERROR:"));
-        assert!(dump.contains("should contain"));
-        assert!(dump.contains("but got integer"));
+    let dump = parse_and_dump_typed_with_language("x = 1", schema_yaml);
+    assert!(dump.contains("right: integer \"1\" <-- ERROR:"));
+    assert!(dump.contains("should contain"));
+    assert!(dump.contains("but got integer"));
 }

 // ---- Query tests ----

 #[test]
 fn test_query_match() {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();

    let query = yeast::query!(
@@ -258,7 +268,7 @@ fn test_query_match() {

 #[test]
 fn test_query_no_match() {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();

    let query = yeast::query!(
@@ -283,7 +293,7 @@ fn test_query_skips_extras_in_positional_match() {
    // captured comment to nothing (a common idiom, e.g.
    // `(comment) => ()` in Swift) leaves the capture's match-list empty
    // and causes the transform to fail with "Variable X has 0 matches".
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("[1, # comment\n2]").unwrap();

    // Navigate to the `array` node: program -> array.
@@ -299,15 +309,11 @@ fn test_query_skips_extras_in_positional_match() {
    let matched = query.do_match(&ast, array_id, &mut captures).unwrap();
    assert!(matched);
    assert_eq!(
-        ast.get_node(captures.get_var("a").unwrap())
-            .unwrap()
-            .kind(),
+        ast.get_node(captures.get_var("a").unwrap()).unwrap().kind(),
        "integer"
    );
    assert_eq!(
-        ast.get_node(captures.get_var("b").unwrap())
-            .unwrap()
-            .kind(),
+        ast.get_node(captures.get_var("b").unwrap()).unwrap().kind(),
        "integer"
    );
 }
@@ -315,14 +321,14 @@ fn test_query_skips_extras_in_positional_match() {
 #[test]
 fn test_reachable_nodes_excludes_orphaned_rewrite_nodes() {
    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
-    let schema = yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang)
-        .unwrap();
-    let phases = vec![Phase::new(
+    let schema =
+        yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
+    let phases: Vec<Phase> = vec![Phase::new(
        "test",
        PhaseKind::Repeating,
        vec![yeast::rule!((integer) => (identifier "replaced"))],
    )];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);

    let input = "x = 1";
    let ast = runner.run(input).unwrap();
@@ -340,7 +346,7 @@ fn test_reachable_nodes_excludes_orphaned_rewrite_nodes() {

 #[test]
 fn test_query_repeated_capture() {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x, y, z = 1").unwrap();

    let query = yeast::query!(
@@ -365,7 +371,7 @@ fn test_query_repeated_capture() {
 #[test]
 fn test_capture_unnamed_node_parenthesized() {
    // `("=") @op` captures the unnamed `=` token between left and right.
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();

    let query = yeast::query!(
@@ -389,10 +395,33 @@ fn test_capture_unnamed_node_parenthesized() {
    assert!(!op_node.is_named());
 }

+#[test]
+fn test_capture_bare_underscore_repeated() {
+    // `_` matches named and unnamed nodes in bare-child position. On this
+    // assignment shape, bare children correspond to unnamed tokens (the `=`).
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let ast = runner.run("x = 1").unwrap();
+
+    let query = yeast::query!((assignment _* @all));
+
+    let mut cursor = AstCursor::new(&ast);
+    cursor.goto_first_child();
+    let assignment_id = cursor.node_id();
+
+    let mut captures = yeast::captures::Captures::new();
+    let matched = query.do_match(&ast, assignment_id, &mut captures).unwrap();
+    assert!(matched);
+
+    let all = captures.get_all("all");
+    assert_eq!(all.len(), 1);
+    assert_eq!(ast.get_node(all[0]).unwrap().kind(), "=");
+    assert!(!ast.get_node(all[0]).unwrap().is_named());
+}
+
 #[test]
 fn test_capture_unnamed_node_bare_literal() {
    // `"=" @op` (without surrounding parens) is the same as `("=") @op`.
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();

    let query = yeast::query!(
@@ -421,7 +450,7 @@ fn test_bare_underscore_matches_unnamed() {
    // Bare `_` matches any node, including unnamed tokens, while `(_)`
    // matches only named nodes. Demonstrate by matching the unnamed `=`
    // token in the implicit `child` field of an `assignment`.
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();

    let mut cursor = AstCursor::new(&ast);
@@ -460,7 +489,7 @@ fn test_bare_forms_in_field_position() {
    // field's value, not just in the bare-children position. This is
    // syntactic sugar for `(_)` / `("…")` and goes through the same
    // code paths.
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();

    let mut cursor = AstCursor::new(&ast);
@@ -499,7 +528,7 @@ fn test_forward_scan_finds_unnamed_token_late() {
    // query for `("end")` skip past the first two and match the third.
    // Without forward-scan, the matcher took the first child unconditionally
    // and failed.
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("for x in list do\n  y\nend").unwrap();

    // Navigate: program > for > do (the body wrapper).
@@ -526,7 +555,7 @@ fn test_forward_scan_preserves_order() {
    // order. A query for ("end") then ("do") should fail because `do`
    // appears before `end` in the source order; once forward-scan has
    // consumed `end`, the iterator is exhausted.
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("for x in list do\n  y\nend").unwrap();

    let mut cursor = AstCursor::new(&ast);
@@ -547,7 +576,7 @@ fn test_forward_scan_preserves_order() {

 #[test]
 fn test_tree_builder() {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let mut ast = runner.run("x = 1").unwrap();
    let input = "x = 1";

@@ -565,7 +594,8 @@ fn test_tree_builder() {

    // Swap left and right
    let fresh = yeast::tree_builder::FreshScope::new();
-    let mut ctx = yeast::build::BuildCtx::new(&mut ast, &captures, &fresh);
+    let mut user_ctx = ();
+    let mut ctx = yeast::build::BuildCtx::new(&mut ast, &captures, &fresh, &mut user_ctx);
    let new_id = yeast::tree!(ctx,
        (program
            child: (assignment
@@ -593,7 +623,7 @@ fn test_tree_builder() {
 // tree-sitter-ruby grammar with named fields for nodes that only have
 // unnamed children in tree-sitter (e.g. block_body.stmt, block_parameters.parameter).
 fn ruby_rules() -> Vec<Rule> {
-    let assign_rule = yeast::rule!(
+    let assign_rule: Rule = yeast::rule!(
        (assignment
            left: (left_assignment_list
                (identifier)* @left
@@ -618,7 +648,7 @@ fn ruby_rules() -> Vec<Rule> {
        )}
    );

-    let for_rule = yeast::rule!(
+    let for_rule: Rule = yeast::rule!(
        (for
            pattern: (_) @pat
            value: (in (_) @val)
@@ -700,7 +730,7 @@ fn test_desugar_for_loop() {

 #[test]
 fn test_shorthand_rule() {
-    let rule = yeast::rule!(
+    let rule: Rule = yeast::rule!(
        (assignment
            left: (_) @method
            right: (_) @receiver
@@ -852,7 +882,7 @@ fn test_phase_error_includes_phase_name() {
        PhaseKind::Repeating,
        vec![swap_assignment_rule().repeated()],
    )];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);
    let err = runner
        .run("x = 1")
        .expect_err("expected runner to return an error");
@@ -895,7 +925,7 @@ fn test_one_shot_phase() {
        PhaseKind::OneShot,
        one_shot_xeq1_rules(),
    )];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);

    let input = "x = 1";
    let ast = runner.run(input).unwrap();
@@ -921,7 +951,7 @@ fn test_one_shot_phase_errors_when_no_rule_matches() {
    let mut rules = one_shot_xeq1_rules();
    rules.pop();
    let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);

    let err = runner
        .run("x = 1")
@@ -945,7 +975,7 @@ fn test_one_shot_recurses_into_returned_capture() {
    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
    let schema =
        yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
-    let rules = vec![
+    let rules: Vec<Rule> = vec![
        yeast::rule!(
            (program (_)* @stmts)
            =>
@@ -961,7 +991,7 @@ fn test_one_shot_recurses_into_returned_capture() {
        yeast::rule!((integer) => (integer "INT")),
    ];
    let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);

    let input = "x = 1";
    let ast = runner.run(input).unwrap();
@@ -987,7 +1017,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
    let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
    let schema =
        yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
-    let rules = vec![
+    let rules: Vec<Rule> = vec![
        yeast::rule!(
            (program (_)* @stmts)
            =>
@@ -1008,7 +1038,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
        yeast::rule!((integer) => (integer "INT")),
    ];
    let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
-    let runner = Runner::with_schema(lang, &schema, &phases);
+    let runner: Runner = Runner::with_schema(lang, &schema, &phases);

    let input = "x = 1";
    let ast = runner.run(input).unwrap();
@@ -1032,7 +1062,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {

 #[test]
 fn test_cursor_navigation() {
-    let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
+    let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
    let ast = runner.run("x = 1").unwrap();
    let mut cursor = AstCursor::new(&ast);

@@ -1106,7 +1136,7 @@ fn test_desugar_for_with_multiple_assignment() {
 /// resolves to the captured node's source text via `YeastDisplay`.
 #[test]
 fn test_hash_brace_renders_capture_source_text() {
-    let rule = rule!(
+    let rule: Rule = rule!(
        (call
            method: (identifier) @name
            receiver: (identifier) @recv
@@ -1135,7 +1165,7 @@ fn test_hash_brace_renders_capture_source_text() {
 /// `Display` impl (covered by `YeastDisplay`'s blanket impls for primitives).
 #[test]
 fn test_hash_brace_renders_integer_expression() {
-    let rule = rule!(
+    let rule: Rule = rule!(
        (identifier) @_
        =>
        (identifier #{1 + 2})
@@ -1149,3 +1179,39 @@ fn test_hash_brace_renders_integer_expression() {
    "#,
    );
 }
+
+/// Regression test: `(kind #{capture})` should inherit the captured node's
+/// source location, not the full source range of the matched rule root.
+#[test]
+fn test_hash_brace_uses_capture_location_for_leaf() {
+    let rule: Rule = rule!(
+        (call
+            method: (identifier) @name
+            receiver: (identifier) @recv
+        )
+        =>
+        (call
+            method: (identifier #{name})
+            receiver: (identifier #{recv})
+            arguments: (argument_list)
+        )
+    );
+
+    let ast = run_and_ast("foo.bar()", vec![rule]);
+
+    let mut bar_ids: Vec<usize> = Vec::new();
+    for id in ast.reachable_node_ids() {
+        let Some(node) = ast.get_node(id) else {
+            continue;
+        };
+        if node.kind() == "identifier" && ast.source_text(id) == "bar" {
+            bar_ids.push(id);
+        }
+    }
+
+    assert_eq!(bar_ids.len(), 1, "expected exactly one identifier 'bar'");
+    let bar = ast.get_node(bar_ids[0]).unwrap();
+
+    assert_eq!(bar.start_byte(), 4);
+    assert_eq!(bar.end_byte(), 7);
+}
--- a/unified/AGENTS.md
+++ b/unified/AGENTS.md
@@ -3,25 +3,21 @@
 This is a CodeQL extractor based on tree-sitter.

 ## Building
-To build the extractor, run `scripts/create-extractor-pack.sh`
+- To build the extractor, run `scripts/create-extractor-pack.sh`.

-## Editing the Swift grammar
-The vendored tree-sitter-swift grammar lives at
-`extractor/tree-sitter-swift/`. After editing `grammar.js` (or any other
-grammar source), run `scripts/regenerate-grammar.sh` to:
- regenerate `extractor/tree-sitter-swift/src/{parser.c, grammar.json,
-  node-types.json}` (and the `src/tree_sitter/*.h` headers) via
-  `tree-sitter generate`; and
- refresh `extractor/tree-sitter-swift/node-types.yml`, the
-  human-readable companion to `src/node-types.json` produced by yeast's
-  `node_types_yaml` binary.
+## Swift Parser
+- The Swift parser is defined by `extractor/tree-sitter-swift/grammar.js` and can be edited if needed.

-`node-types.yml` is the recommended review surface for grammar changes —
-it shows the impact of a grammar tweak on the named node kinds, fields,
-and child types in a form much easier to read than the raw JSON.
+- After editing the grammar, always run `scripts/regenerate-grammar.sh`.

-## Extractor Testing
- To run extractor tests, run `cargo test` in the `extractor` directory.
+- The raw parse tree is described by `extractor/tree-sitter-swift/node-types.yml` and should be reviewed after grammar changes.
+
+## AST Mapping
+- The target AST shape is described by `extractor/ast_types.yml`.
+
+- The mapping from the parse tree to the target AST is found in `extractor/src/languages/swift/swift.rs`.
+
+- To run tests for the parser and mapping, run `cargo test` in the `extractor` directory.

 - Do not edit the printed ASTs in `extractor/test/corpus` directly. To regenerate the ASTs, run `scripts/update-corpus.sh`.

--- a/unified/extractor/ast_types.yml
+++ b/unified/extractor/ast_types.yml
@@ -2,36 +2,103 @@ supertypes:
  expr:
    - name_expr
    - int_literal
+    - float_literal
+    - boolean_literal
    - string_literal
+    - regex_literal
+    - builtin_expr
    - binary_expr
    - unary_expr
    - call_expr
    - member_access_expr
-    - lambda_expr
-    - unsupported_node
-  stmt:
-    - empty_stmt
-    - block_stmt
-    - expr_stmt
-    - if_stmt
-    - variable_declaration_stmt
-    - guard_if_stmt
-    - unsupported_node
-  condition:
-    - expr_condition
-    - let_pattern_condition
-    - sequence_condition
+    - super_expr
+    - function_expr
+    - array_literal
+    - map_literal
+    - key_value_pair
+    - tuple_expr
+    - type_cast_expr
+    - type_test_expr
+    - if_expr
+    - assign_expr
+    - compound_assign_expr
+    - pattern_guard_expr
+    - empty_expr
+    - block
+    - break_expr
+    - continue_expr
+    - return_expr
+    - throw_expr
+    - try_expr
+    - switch_expr
    - unsupported_node
+  expr_or_pattern:
+    - expr
+    - pattern
+  expr_or_type:
+    - expr
+    - type_expr
  pattern:
-    - var_pattern
-    - apply_pattern
+    - name_pattern
    - tuple_pattern
+    - constructor_pattern
    - ignore_pattern
+    - expr_equality_pattern
+    - bulk_importing_pattern
    - unsupported_node
+  # A statement is anything that can appear in a block.
+  # This type contains all of 'expr' and has partial overlap with 'member'.
+  # For example, type_alias_declaration can appear either as a stmt or member.
+  # constructor_declaration and destructor_declaration appear here because
+  # tree-sitter-swift's error recovery for #if/#endif in class bodies can place
+  # init/deinit declarations at the wrong (statement) level.
+  stmt:
+    - expr
+    - variable_declaration
+    - type_alias_declaration
+    - function_declaration
+    - import_declaration
+    - operator_syntax_declaration
+    - class_like_declaration
+    - accessor_declaration
+    - constructor_declaration
+    - destructor_declaration
+    - guard_if_stmt
+    - for_each_stmt
+    - while_stmt
+    - do_while_stmt
+    - labeled_stmt
+  # A member is anything that can appear in the body of a class-like declaration
+  member:
+    - constructor_declaration
+    - destructor_declaration
+    - function_declaration
+    - variable_declaration
+    - accessor_declaration
+    - initializer_declaration
+    - class_like_declaration
+    - type_alias_declaration
+    - associated_type_declaration
+    - unsupported_node
+  type_expr:
+    - named_type_expr
+    - generic_type_expr
+    - tuple_type_expr
+    - function_type_expr
+    - inferred_type_expr
+    - unsupported_node
+  type_constraint:
+    - equality_type_constraint
+    - bound_type_constraint
+  operator:
+    - infix_operator
+    - prefix_operator
+    - postfix_operator
 named:
-  # Top-level is the root node, currently containing a list of expressions
+  # Top-level is the root node, containing a single block of statements
+  # (which are themselves expressions or declarations).
  top_level:
-    body*: [expr, stmt]
+    body: block

  # An identifier used in the context of an expression
  name_expr:
@@ -40,13 +107,28 @@ named:
  # An integer literal
  int_literal:

+  # A floating-point literal
+  float_literal:
+
+  # A boolean literal
+  boolean_literal:
+
+  # A literal backed by a keyword such as `nil`, `null`, or `nullptr`.
+  #
+  # Although nil/null are keyword literals in many languages, there should be
+  # no attempt to normalize "null-like" named entities, like Python's `None`.
+  builtin_expr:
+
  # A string literal
  string_literal:

+  # A regex literal
+  regex_literal:
+
  # Application of a binary operator, such as `a + b`
  binary_expr:
    left: expr
-    operator: operator
+    operator: infix_operator
    right: expr

  # Application of a unary operator, such as `!x`
@@ -54,86 +136,310 @@ named:
    operand: expr
    operator: operator

-  # A function or method call, such as `f(x)` or `obj.m(x)`. Method calls
-  # are represented as a call whose `function` is a `member_access_expr`.
+  # Plain assignment
+  assign_expr:
+    target: expr_or_pattern
+    value: expr
+
+  # Compound assignment
+  compound_assign_expr:
+    target: expr
+    operator: infix_operator
+    value: expr
+
+  # A function or method call, such as `f(x)` or `obj.m(x)`.
+  #
+  # Method calls are represented as a call whose `function` is a `member_access_expr`.
+  #
+  # Constructor calls are marked by a language-specific modifier, and the callee may be
+  # a `type_expr` if the parser can deduce that the callee is a type.
  call_expr:
-    function: expr
-    argument*: expr
+    modifier*: modifier
+    callee: expr_or_type
+    argument*: argument
+
+  argument:
+    modifier*: modifier
+    name?: identifier
+    value: expr

  # Member access, such as `obj.member`.
+  #
+  # The base may be a type expression when it is a static member access like `Array<Int>.method`.
+  # In ambiguous cases where the parser cannot distinguish static and instance member access, the base
+  # will typically be an expression.
+  #
+  # For `super.x` the base will be an instance of `super_expr`.
  member_access_expr:
-    target: expr
+    base: expr_or_type
    member: identifier

-  lambda_expr:
+  # A type expression that refers to a type inferred from the contextual type.
+  # This is used to translate Swift's leading-dot syntax, `.foo`, which means `T.foo` where
+  # `T` is the contextual type of some enclosing expression. This is translated to a `member_access_expr`
+  # with an inferred_type_expr as the base.
+  inferred_type_expr:
+
+  # A `super` token, which can usually only appear as the base of member access.
+  super_expr:
+
+  function_expr:
+    modifier*: modifier
+    capture_declaration*: variable_declaration
    parameter*: parameter
-    body: [expr, stmt]
+    return_type?: type_expr
+    body: block

-  # A parameter
+  array_literal:
+    element*: expr
+
+  map_literal:
+    element*: expr
+
+  # A key-value pair, usually appearing as a named argument or as part of a map literal.
+  #
+  # For some languages, the key-value pair is a first class value and this type of expression
+  # may thus appear anywhere in the general case.
+  key_value_pair:
+    key: expr
+    value: expr
+
+  # A tuple expression, such as `(a, b, c)`.
+  tuple_expr:
+    element*: expr
+
+  # A parameter.
+  #
+  # `type` is its declared type annotation (if any).
+  #
+  # `pattern` binds the parameter's internal name(s). For a simple parameter this is a
+  # `name_pattern`, but may be an arbitrary pattern for languages where patterns may appear
+  # in the parameter list.
+  #
+  # `external_name` is the name by which call sites refer to the parameter, if the parameter
+  # can be passed as a named parameter. For example, the Swift function `func greet(person id: String)`
+  # would have `person` as the external name and a `name_pattern` wrapping `id` as the parameter's pattern.
  parameter:
+    modifier*: modifier
+    external_name?: identifier
+    type?: type_expr
+    pattern?: pattern
+    default?: expr
+
+  # An expression that does nothing. Used where the grammar permits an
+  # empty statement (e.g. a stray `;`).
+  empty_expr:
+
+  # A brace-delimited sequence of statements (`{ ... }`). Blocks are the
+  # only nodes that can directly contain statements; every other body-like
+  # field holds a single `block`.
+  block:
+    stmt*: stmt
+
+  if_expr:
+    condition: expr
+    then?: expr
+    else?: expr
+
+  # A variable declaration or destructuring assignment that introduces new variables.
+  #
+  # Each `name_pattern` occurring in 'pattern' introduces a fresh binding that is
+  # in scope for the rest of the enclosing block.
+  #
+  # The initializer is optional (but typically cannot be omitted if combined with a non-trivial pattern).
+  #
+  # Modifiers should include 'var', 'let', 'const', etc., if they are significant.
+  # A grouped declaration like `let x = 1, y = 2` is emitted as a sequence of
+  # `variable_declaration`s directly into the enclosing stmt/member slot; every
+  # declaration after the first in such a group is tagged with a synthetic
+  # `chained_declaration` modifier so the grouping can be recovered downstream.
+  variable_declaration:
+    modifier*: modifier
    pattern: pattern
-
-  empty_stmt:
-
-  block_stmt:
-    body*: stmt
-
-  expr_stmt:
-    expr: expr
-
-  if_stmt:
-    condition: condition
-    then?: stmt
-    else?: stmt
-
-  variable_declaration_stmt:
-    variable_declarator+: variable_declarator
-
-  # A variable declaration, or assignment to a pattern.
-  # The initializer is optional (but typically only possible in combination with a simple variable pattern).
-  variable_declarator:
-    pattern: pattern
+    type?: type_expr
    value?: expr

  # Evaluate 'condition', and if false, execute 'else' which must break from the enclosing block scope (return, break, etc).
  # Any variables bound by 'condition' will be in scope for the remainder of the enclosing block scope
-  # (which differs from how if_stmt works).
+  # (which differs from how if_expr works).
  guard_if_stmt:
-    condition: condition
-    else: stmt
+    condition: expr
+    else: block

-  # Evaluates the given condition and interprets it as a boolean (by language conventions)
-  expr_condition:
-    expr: expr
+  # `break` (with optional label)
+  break_expr:
+    label?: identifier

-  # A series of statements that are executed before evaluating the trailing condition.
-  # Useful for languages where a conditional clause may be preceded by side-effecting
-  # syntactic elements (e.g. binding clauses) that don't themselves form a condition.
-  sequence_condition:
-    stmt*: stmt
-    condition: condition
+  # `continue` (with optional label)
+  continue_expr:
+    label?: identifier
+
+  # A labeled statement, such as `outer: for ... { ... }`. The labeled
+  # statement appears as the `stmt` field; `break`/`continue` may target
+  # the label.
+  labeled_stmt:
+    label: identifier
+    stmt: stmt
+
+  # `return value` or bare `return`
+  return_expr:
+    value?: expr
+
+  # `throw value`
+  throw_expr:
+    value?: expr
+
+  # An import declaration.
+  #
+  # The semantics of an import are generally:
+  # - Evaluate the 'imported_expr' to a value (possibly a compile-time value, such as namespace)
+  # - Filter away possible values based on modifiers (e.g. type-only imports only accept types)
+  # - Assign the value to the pattern, binding variables and/or type names in scope
+  #
+  import_declaration:
+    modifier*: modifier
+    imported_expr: expr # Qualified names are encoded as a chain of member_access_expr ending with a name_expr
+    pattern?: pattern # Binds local names in scope (possibly via bulk_importing_pattern)
+
+  # `typealias Name = Type`
+  type_alias_declaration:
+    modifier*: modifier
+    name: identifier
+    type_parameter*: type_parameter
+    type_constraint*: type_constraint
+    type: type_expr
+
+  # A top-level function declaration.
+  function_declaration:
+    modifier*: modifier
+    name: identifier
+    type_parameter*: type_parameter
+    type_constraint*: type_constraint
+    parameter*: parameter
+    return_type?: type_expr
+    body?: block
+
+  # `for pattern in iterable [where guard] { body }`.
+  for_each_stmt:
+    modifier*: modifier
+    pattern: pattern
+    iterable: expr
+    guard?: expr
+    body?: block
+
+  # `while condition { body }`.
+  while_stmt:
+    modifier*: modifier
+    condition: expr
+    body?: block
+
+  # `repeat { body } while condition`.
+  do_while_stmt:
+    modifier*: modifier
+    body?: block
+    condition: expr
+
+  # `do { body } catch pattern { ... } catch ...`. Swift uses `do`/`catch`
+  # for error handling; for languages with `try`/`catch`, this is the same shape.
+  try_expr:
+    modifier*: modifier
+    body: block
+    catch_clause*: catch_clause
+
+  catch_clause:
+    modifier*: modifier
+    pattern?: pattern
+    guard?: expr
+    body: block
+
+  # `switch value { case pattern: body case ...: default: body }`
+  switch_expr:
+    modifier*: modifier
+    value: expr
+    case*: switch_case
+
+  # A single `case ...:` (or `default:`) entry in a switch.
+  # An entry with multiple `case p1, p2:` patterns has multiple `pattern`s.
+  # A `default:` entry has no patterns.
+  # An optional `guard` corresponds to a `where`-clause on the case.
+  switch_case:
+    modifier*: modifier
+    pattern*: pattern
+    guard?: expr
+    body: block

  # Evaluate 'expr' and match its result against 'pattern', and return true if it matches.
-  # Variables bound by the pattern will be in scope within the 'true' branch controlled by this condition.
-  let_pattern_condition:
+  # Variables bound by the pattern will be in scope within the 'true' branch controlled by this expression.
+  #
+  # In Swift, `if case let PATTERN = EXPR` maps to this node.
+  #
+  # Java: 'if (x instanceof Foo y && w ...) { ... }'
+  pattern_guard_expr:
    pattern: pattern
    value: expr

-  # A pattern matching anything, binding its value to the given variable
-  var_pattern:
+  # A type cast expression, such as `x as T`, `x as? T`, or `x as! T`. The
+  # operator distinguishes between the variants.
+  type_cast_expr:
+    expr: expr
+    operator: infix_operator
+    type: type_expr
+
+  # A type-test expression, such as `x is T`. Yields a boolean indicating
+  # whether `expr` is an instance of `type`.
+  type_test_expr:
+    expr: expr
+    operator: infix_operator
+    type: type_expr
+
+  # An identifier that introduces a variable.
+  #
+  # When used as a pattern, the pattern matches anything and binds its incoming value to the variable.
+  name_pattern:
+    modifier*: modifier
    identifier: identifier

  # A pattern matching anything, binding no variables, usually using the syntax "_"
  ignore_pattern:

-  # A pattern such as `Some(x)` where `Some` is the constructor and `x` is an argument
-  apply_pattern:
-    constructor: expr
-    argument*: pattern
+  # A pattern that matches if the incoming value is equal to the value of the given expression.
+  # Used for literal patterns in switch (e.g. `case 1:`).
+  expr_equality_pattern:
+    expr: expr

  # A tuple pattern such as `(a, b)` in `let (a, b) = pair`.
+  #
+  # Elements of the tuple pattern can have names, such as Swift's `let (foo: x, bar: y) = tuple`.
  tuple_pattern:
-    element*: pattern
+    modifier*: modifier
+    element*: pattern_element
+
+  # A pattern such as `Some(x)` where `Some` is the constructor and `x` is an element.
+  # The element names are interpreted as argument labels and/or field names.
+  constructor_pattern:
+    modifier*: modifier
+    constructor: expr_or_type
+    element*: pattern_element
+
+  # A pattern with an optional associated name.
+  pattern_element:
+    modifier*: modifier
+    key?: identifier
+    pattern: pattern
+
+  # A pattern that checks if the incoming value has the given type, and if so, the
+  # value is matched against the given nested pattern (and succeeds iff the nested match succeeds).
+  #
+  # In Swift: `if let y = x as? Foo` is a pattern_guard_expr containing a type_test_pattern
+  # In Java: `x instanceof Foo y` is a type_test_pattern wrapping a name_pattern
+  type_test_pattern:
+    pattern: pattern
+    type: type_expr
+
+  # A '*' pattern that imports all members of the incoming value into the local scope.
+  # Currently this can only appear in import declarations.
+  bulk_importing_pattern:
+    modifier*: modifier

  # An simple unqualified identifier token
  identifier:
@@ -141,4 +447,129 @@ named:
  # A node that we don't yet translate
  unsupported_node:

-  operator:
+  infix_operator:
+
+  prefix_operator:
+
+  postfix_operator:
+
+  # The fixity of a custom operator declaration (e.g. "prefix", "infix",
+  # "postfix"). The value is the keyword string.
+  fixity:
+
+  type_parameter:
+    modifier*: modifier
+    name: identifier
+    bound?: type_expr
+
+  # A generic constraint of the form `T == U`, requiring two types to be
+  # equal. Appears in `where` clauses on generic declarations
+  # (e.g. Swift `func foo<T, U>() where T == U`).
+  equality_type_constraint:
+    left: type_expr
+    right: type_expr
+
+  # A generic constraint of the form `T: Bound`, requiring a type parameter
+  # to conform to (or inherit from) some other type. Appears in `where`
+  # clauses on generic declarations (e.g. Swift `where T: Equatable`).
+  bound_type_constraint:
+    type: type_expr
+    bound: type_expr
+
+  # `infix operator +++` (and the like) — a declaration of a custom operator.
+  operator_syntax_declaration:
+    modifier*: modifier
+    name: identifier
+    # The fixity specifier (`prefix`, `infix`, `postfix`), when applicable.
+    fixity?: fixity
+    # The declared precedence level, when present (e.g. Swift's
+    # `infix operator +++ : AdditionPrecedence`).
+    precedence?: expr
+
+  # A class-like declaration: class, struct, interface (protocol), enum (or actor).
+  # The syntactic kind is carried as a `modifier` (e.g. "class", "struct",
+  # "interface", "enum", "extension"). The `"enum_case"` modifier additionally
+  # marks a declaration as an enum case with associated values. Extensions are
+  # represented as a class-like declaration with the `"extension"` modifier and
+  # no `name`; the extended type appears as a `base_type`.
+  class_like_declaration:
+    modifier*: modifier
+    name?: identifier
+    type_parameter*: type_parameter
+    type_constraint*: type_constraint
+    base_type*: base_type
+    member*: member
+
+  # One of the base types of a class declaration.
+  #
+  # If the language has multiple kinds of base classes (e.g. extends/implements) the
+  # kind should be included as a modifier on this node.
+  base_type:
+    modifier*: modifier
+    type: type_expr
+
+  constructor_declaration:
+    modifier*: modifier
+    name?: identifier
+    parameter*: parameter
+    body: block
+
+  # A destructor / finalizer (Swift `deinit`, C++ `~T()`, etc.).
+  destructor_declaration:
+    modifier*: modifier
+    body: block
+
+  # Declaration of a single accessor for a property (such as a getter, setter,
+  # or observer like Swift's `willSet`/`didSet`).
+  #
+  # Multiple accessors for the same property are emitted as a sequence of
+  # accessor_declaration nodes; every accessor after the first is tagged with
+  # a synthetic `chained_declaration` modifier so the grouping can be recovered
+  # downstream. Stored properties with observers are emitted as a
+  # variable_declaration followed by one accessor_declaration per observer
+  # (each observer also tagged with `chained_declaration`).
+  accessor_declaration:
+    modifier*: modifier
+    name: identifier
+    accessor_kind: accessor_kind
+    parameter*: parameter
+    type?: type_expr
+    body?: block
+
+  # "get", "set", or a language-specific kind like "didSet"
+  accessor_kind:
+
+  # Static or instance initializer block. That is, code that runs at initialization time of either the class or an instance.
+  initializer_declaration:
+    modifier*: modifier
+    body: block
+
+  associated_type_declaration:
+    modifier*: modifier
+    name: identifier
+    bound?: type_expr
+
+  named_type_expr:
+    qualifier?: type_expr
+    name: identifier
+
+  generic_type_expr:
+    base: type_expr
+    type_argument*: type_expr
+
+  # A tuple type such as `(Int, String)` or `(a: A, b: B)`.
+  tuple_type_expr:
+    element*: tuple_type_element
+
+  # An element of a `tuple_type_expr`, optionally carrying a label.
+  tuple_type_element:
+    name?: identifier
+    type: type_expr
+
+  # A function type such as `(Int, String) -> Bool` or `(x: Int) -> Bool`.
+  function_type_expr:
+    parameter*: parameter
+    return_type: type_expr
+
+  # A modifier such as 'static', 'public', or 'async'. For now this is just a leaf node with a string value.
+  modifier:
--- a/unified/extractor/src/extractor.rs
+++ b/unified/extractor/src/extractor.rs
@@ -1,9 +1,9 @@
 use clap::Args;
 use std::path::PathBuf;

+use crate::languages;
 use codeql_extractor::extractor::simple;
 use codeql_extractor::trap;
-use crate::languages;

 #[derive(Args)]
 pub struct Options {
@@ -35,7 +35,9 @@ pub fn run(options: Options) -> std::io::Result<()> {
        prefix: "unified".to_string(),
        languages,
        trap_dir: options.output_dir,
-        trap_compression: trap::Compression::from_env("CODEQL_EXTRACTOR_UNIFIED_OPTION_TRAP_COMPRESSION"),
+        trap_compression: trap::Compression::from_env(
+            "CODEQL_EXTRACTOR_UNIFIED_OPTION_TRAP_COMPRESSION",
+        ),
        source_archive_dir: options.source_archive_dir,
        file_lists: vec![options.file_list],
    };
--- a/unified/extractor/src/generator.rs
+++ b/unified/extractor/src/generator.rs
@@ -22,14 +22,19 @@ pub fn run(options: Options) -> std::io::Result<()> {
    // The QL-visible schema is the unified output AST, not the per-language
    // input grammars. Pass it via `desugar.output_node_types_yaml` so the
    // generator converts the YAML to JSON node-types.
-    let desugar = yeast::DesugaringConfig::new()
-        .with_output_node_types_yaml(languages::OUTPUT_AST_SCHEMA);
+    let desugar =
+        yeast::DesugaringConfig::new().with_output_node_types_yaml(languages::OUTPUT_AST_SCHEMA);

    let languages = vec![Language {
        name: "Unified".to_owned(),
-        node_types: "",   // unused: generator picks up output_node_types_yaml above
+        node_types: "", // unused: generator picks up output_node_types_yaml above
        desugar: Some(desugar),
    }];

-    generate(languages, options.dbscheme, options.library, "run unified/scripts/create-extractor-pack.sh")
+    generate(
+        languages,
+        options.dbscheme,
+        options.library,
+        "run unified/scripts/create-extractor-pack.sh",
+    )
 }
--- a/unified/extractor/src/languages/swift/swift.rs
+++ b/unified/extractor/src/languages/swift/swift.rs
--- a/unified/extractor/tests/corpus/swift/closures.txt
+++ b/unified/extractor/tests/corpus/swift/closures.txt
@@ -50,6 +50,35 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "f"
+          value:
+            function_expr
+              body:
+                block
+                  stmt:
+                    binary_expr
+                      operator: infix_operator "*"
+                      left:
+                        name_expr
+                          identifier: identifier "x"
+                      right: int_literal "2"
+              parameter:
+                parameter
+                  pattern:
+                    name_pattern
+                      identifier: identifier "x"
+                  type:
+                    named_type_expr
+                      name: identifier "Int"
+              return_type:
+                named_type_expr
+                  name: identifier "Int"

 ===
 Closure with shorthand parameters
@@ -82,6 +111,26 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "f"
+          value:
+            function_expr
+              body:
+                block
+                  stmt:
+                    binary_expr
+                      operator: infix_operator "+"
+                      left:
+                        name_expr
+                          identifier: identifier "$0"
+                      right:
+                        name_expr
+                          identifier: identifier "$1"

 ===
 Trailing closure
@@ -114,6 +163,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        call_expr
+          argument:
+            argument
+              value:
+                function_expr
+                  body:
+                    block
+                      stmt:
+                        binary_expr
+                          operator: infix_operator "*"
+                          left:
+                            name_expr
+                              identifier: identifier "$0"
+                          right: int_literal "2"
+          callee:
+            member_access_expr
+              base:
+                name_expr
+                  identifier: identifier "xs"
+              member: identifier "map"

 ===
 Closure with capture list
@@ -163,6 +234,31 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "f"
+          value:
+            function_expr
+              body:
+                block
+                  stmt:
+                    call_expr
+                      callee:
+                        member_access_expr
+                          base:
+                            name_expr
+                              identifier: identifier "self"
+                          member: identifier "doThing"
+              capture_declaration:
+                variable_declaration
+                  modifier: modifier "weak"
+                  pattern:
+                    name_pattern
+                      identifier: identifier "self"

 ===
 Multi-statement closure
@@ -236,3 +332,46 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "f"
+          value:
+            function_expr
+              body:
+                block
+                  stmt:
+                    variable_declaration
+                      modifier: modifier "let"
+                      pattern:
+                        name_pattern
+                          identifier: identifier "y"
+                      value:
+                        binary_expr
+                          operator: infix_operator "+"
+                          left:
+                            name_expr
+                              identifier: identifier "x"
+                          right: int_literal "1"
+                    return_expr
+                      value:
+                        binary_expr
+                          operator: infix_operator "*"
+                          left:
+                            name_expr
+                              identifier: identifier "y"
+                          right: int_literal "2"
+              parameter:
+                parameter
+                  pattern:
+                    name_pattern
+                      identifier: identifier "x"
+                  type:
+                    named_type_expr
+                      name: identifier "Int"
+              return_type:
+                named_type_expr
+                  name: identifier "Int"
--- a/unified/extractor/tests/corpus/swift/collections.txt
+++ b/unified/extractor/tests/corpus/swift/collections.txt
@@ -28,6 +28,19 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "xs"
+          value:
+            array_literal
+              element:
+                int_literal "1"
+                int_literal "2"
+                int_literal "3"

 ===
 Empty array literal with type
@@ -68,6 +81,22 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "xs"
+          type:
+            generic_type_expr
+              base:
+                named_type_expr
+                  name: identifier "Array"
+              type_argument:
+                named_type_expr
+                  name: identifier "Int"
+          value: array_literal "[]"

 ===
 Dictionary literal
@@ -106,6 +135,14 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "d"
+          value: map_literal "[\"a\": 1, \"b\": 2]"

 ===
 Set literal
@@ -155,6 +192,22 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "s"
+          type:
+            named_type_expr
+              name: identifier "Set<Int>"
+          value:
+            array_literal
+              element:
+                int_literal "1"
+                int_literal "2"
+                int_literal "3"

 ===
 Tuple literal
@@ -191,6 +244,14 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "t"
+          value: tuple_expr "(1, \"two\", 3.0)"

 ===
 Subscript access
@@ -232,9 +293,21 @@ source_file

 top_level
  body:
-    unsupported_node "// TODO: tree-sitter-swift parses `xs[0]` as a call_expression (same shape"
-    unsupported_node "// as `xs(0)`), so the mapping currently produces a call_expr. Update the"
-    unsupported_node "// parser / add a separate subscript_expr node and remap when fixed."
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "first"
+          value:
+            call_expr
+              argument:
+                argument
+                  value: int_literal "0"
+              callee:
+                name_expr
+                  identifier: identifier "xs"

 ===
 Dictionary subscript
@@ -276,8 +349,21 @@ source_file

 top_level
  body:
-    unsupported_node "// TODO: same parser issue as the array subscript case above —"
-    unsupported_node "// `d[\"key\"]` is parsed as `call_expression(d, (\"key\"))`."
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "v"
+          value:
+            call_expr
+              argument:
+                argument
+                  value: string_literal "\"key\""
+              callee:
+                name_expr
+                  identifier: identifier "d"

 ===
 Tuple member access
@@ -309,3 +395,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "n"
+          value:
+            member_access_expr
+              base:
+                name_expr
+                  identifier: identifier "t"
+              member: identifier "0"
--- a/unified/extractor/tests/corpus/swift/control-flow.txt
+++ b/unified/extractor/tests/corpus/swift/control-flow.txt
@@ -35,6 +35,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        if_expr
+          condition:
+            binary_expr
+              operator: infix_operator ">"
+              left:
+                name_expr
+                  identifier: identifier "x"
+              right: int_literal "0"
+          then:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "x"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"

 ===
 If-else
@@ -90,6 +112,43 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        if_expr
+          condition:
+            binary_expr
+              operator: infix_operator ">"
+              left:
+                name_expr
+                  identifier: identifier "x"
+              right: int_literal "0"
+          else:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        unary_expr
+                          operand:
+                            name_expr
+                              identifier: identifier "x"
+                          operator: prefix_operator "-"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          then:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "x"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"

 ===
 If-else-if chain
@@ -165,6 +224,55 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        if_expr
+          condition:
+            binary_expr
+              operator: infix_operator ">"
+              left:
+                name_expr
+                  identifier: identifier "x"
+              right: int_literal "0"
+          else:
+            if_expr
+              condition:
+                binary_expr
+                  operator: infix_operator "<"
+                  left:
+                    name_expr
+                      identifier: identifier "x"
+                  right: int_literal "0"
+              else:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value: int_literal "3"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              then:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value: int_literal "2"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+          then:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value: int_literal "1"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"

 ===
 If-let optional binding
@@ -207,6 +315,39 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        if_expr
+          condition:
+            pattern_guard_expr
+              pattern:
+                constructor_pattern
+                  element:
+                    pattern_element
+                      pattern:
+                        name_pattern
+                          identifier: identifier "value"
+                  constructor:
+                    member_access_expr
+                      base:
+                        named_type_expr
+                          name: identifier "Optional"
+                      member: identifier "some"
+              value:
+                name_expr
+                  identifier: identifier "optional"
+          then:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "value"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"

 ===
 Guard let
@@ -240,6 +381,30 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        guard_if_stmt
+          condition:
+            pattern_guard_expr
+              pattern:
+                constructor_pattern
+                  element:
+                    pattern_element
+                      pattern:
+                        name_pattern
+                          identifier: identifier "value"
+                  constructor:
+                    member_access_expr
+                      base:
+                        named_type_expr
+                          name: identifier "Optional"
+                      member: identifier "some"
+              value:
+                name_expr
+                  identifier: identifier "optional"
+          else:
+            block
+              stmt: return_expr "return"

 ===
 Ternary expression
@@ -277,6 +442,27 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "y"
+          value:
+            if_expr
+              condition:
+                binary_expr
+                  operator: infix_operator ">"
+                  left:
+                    name_expr
+                      identifier: identifier "x"
+                  right: int_literal "0"
+              else:
+                unary_expr
+                  operand: int_literal "1"
+                  operator: prefix_operator "-"
+              then: int_literal "1"

 ===
 Switch statement
@@ -357,6 +543,54 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        switch_expr
+          case:
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value: string_literal "\"one\""
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              pattern:
+                expr_equality_pattern
+                  expr: int_literal "1"
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value: string_literal "\"two or three\""
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              pattern:
+                expr_equality_pattern
+                  expr: int_literal "2"
+                expr_equality_pattern
+                  expr: int_literal "3"
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value: string_literal "\"other\""
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+          value:
+            name_expr
+              identifier: identifier "x"

 ===
 Switch with binding pattern
@@ -396,6 +630,7 @@ source_file
                                      pattern:
                                        pattern
                                          bound_identifier: simple_identifier "r"
+                      dot: .
                      name: simple_identifier "circle"
          statement:
            call_expression
@@ -428,6 +663,7 @@ source_file
                                      pattern:
                                        pattern
                                          bound_identifier: simple_identifier "s"
+                      dot: .
                      name: simple_identifier "square"
          statement:
            call_expression
@@ -445,3 +681,207 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        switch_expr
+          case:
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "r"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              pattern:
+                constructor_pattern
+                  element:
+                    pattern_element
+                      pattern:
+                        name_pattern
+                          identifier: identifier "r"
+                  constructor:
+                    member_access_expr
+                      base: inferred_type_expr "."
+                      member: identifier "circle"
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "s"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              pattern:
+                constructor_pattern
+                  element:
+                    pattern_element
+                      pattern:
+                        name_pattern
+                          identifier: identifier "s"
+                  constructor:
+                    member_access_expr
+                      base: inferred_type_expr "."
+                      member: identifier "square"
+          value:
+            name_expr
+              identifier: identifier "shape"
+
+===
+Switch with labeled case pattern arguments
+===
+
+switch x {
+case .implicit(isAcknowledged: false):
+  print("yes")
+case .thread(threadRowId: _, let rowId):
+  print(rowId)
+}
+
+---
+
+source_file
+  statement:
+    switch_statement
+      entry:
+        switch_entry
+          pattern:
+            switch_pattern
+              pattern:
+                pattern
+                  kind:
+                    case_pattern
+                      arguments:
+                        tuple_pattern
+                          item:
+                            tuple_pattern_item
+                              name: simple_identifier "isAcknowledged"
+                              pattern:
+                                pattern
+                                  kind:
+                                    boolean_literal
+                      dot: .
+                      name: simple_identifier "implicit"
+          statement:
+            call_expression
+              function: simple_identifier "print"
+              suffix:
+                call_suffix
+                  arguments:
+                    value_arguments
+                      argument:
+                        value_argument
+                          value:
+                            line_string_literal
+                              text: line_str_text "yes"
+        switch_entry
+          pattern:
+            switch_pattern
+              pattern:
+                pattern
+                  kind:
+                    case_pattern
+                      arguments:
+                        tuple_pattern
+                          item:
+                            tuple_pattern_item
+                              name: simple_identifier "threadRowId"
+                              pattern:
+                                pattern
+                                  kind: wildcard_pattern "_"
+                            tuple_pattern_item
+                              pattern:
+                                pattern
+                                  kind:
+                                    binding_pattern
+                                      binding:
+                                        value_binding_pattern
+                                          mutability: let
+                                      pattern:
+                                        pattern
+                                          bound_identifier: simple_identifier "rowId"
+                      dot: .
+                      name: simple_identifier "thread"
+          statement:
+            call_expression
+              function: simple_identifier "print"
+              suffix:
+                call_suffix
+                  arguments:
+                    value_arguments
+                      argument:
+                        value_argument
+                          value: simple_identifier "rowId"
+      expr: simple_identifier "x"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        switch_expr
+          case:
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value: string_literal "\"yes\""
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              pattern:
+                constructor_pattern
+                  element:
+                    pattern_element
+                      key: identifier "isAcknowledged"
+                      pattern:
+                        expr_equality_pattern
+                          expr: boolean_literal "false"
+                  constructor:
+                    member_access_expr
+                      base: inferred_type_expr "."
+                      member: identifier "implicit"
+            switch_case
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "rowId"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              pattern:
+                constructor_pattern
+                  element:
+                    pattern_element
+                      key: identifier "threadRowId"
+                      pattern: ignore_pattern "_"
+                    pattern_element
+                      pattern:
+                        name_pattern
+                          identifier: identifier "rowId"
+                  constructor:
+                    member_access_expr
+                      base: inferred_type_expr "."
+                      member: identifier "thread"
+          value:
+            name_expr
+              identifier: identifier "x"
--- a/unified/extractor/tests/corpus/swift/desugar.txt
+++ b/unified/extractor/tests/corpus/swift/desugar.txt
@@ -17,6 +17,12 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "+"
+          left: int_literal "1"
+          right: int_literal "2"

 ===
 Another additive expression is desugared
@@ -37,3 +43,144 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "+"
+          left:
+            name_expr
+              identifier: identifier "foo"
+          right:
+            name_expr
+              identifier: identifier "bar"
+
+===
+Simple import with single name
+===
+
+import Foundation
+
+---
+
+source_file
+  statement:
+    import_declaration
+      name:
+        identifier
+          part: simple_identifier "Foundation"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        import_declaration
+          pattern: bulk_importing_pattern "import Foundation"
+          imported_expr:
+            name_expr
+              identifier: identifier "Foundation"
+
+===
+Import with dotted path (two parts)
+===
+
+import Foundation.Networking
+
+---
+
+source_file
+  statement:
+    import_declaration
+      name:
+        identifier
+          part:
+            simple_identifier "Foundation"
+            simple_identifier "Networking"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        import_declaration
+          pattern: bulk_importing_pattern "import Foundation.Networking"
+          imported_expr:
+            member_access_expr
+              base:
+                name_expr
+                  identifier: identifier "Foundation"
+              member: identifier "Networking"
+
+===
+Import with deeply nested path (three parts)
+===
+
+import Foundation.Networking.URLSession
+
+---
+
+source_file
+  statement:
+    import_declaration
+      name:
+        identifier
+          part:
+            simple_identifier "Foundation"
+            simple_identifier "Networking"
+            simple_identifier "URLSession"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        import_declaration
+          pattern: bulk_importing_pattern "import Foundation.Networking.URLSession"
+          imported_expr:
+            member_access_expr
+              base:
+                member_access_expr
+                  base:
+                    name_expr
+                      identifier: identifier "Foundation"
+                  member: identifier "Networking"
+              member: identifier "URLSession"
+
+===
+Scoped import uses name_pattern
+===
+
+import struct Foundation.Date
+
+---
+
+source_file
+  statement:
+    import_declaration
+      name:
+        identifier
+          part:
+            simple_identifier "Foundation"
+            simple_identifier "Date"
+      scoped_import_kind: struct
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        import_declaration
+          modifier: modifier "struct"
+          pattern:
+            name_pattern
+              identifier: identifier "Date"
+          imported_expr:
+            member_access_expr
+              base:
+                name_expr
+                  identifier: identifier "Foundation"
+              member: identifier "Date"
--- a/unified/extractor/tests/corpus/swift/functions.txt
+++ b/unified/extractor/tests/corpus/swift/functions.txt
@@ -31,6 +31,20 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value: string_literal "\"hello\""
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          name: identifier "greet"

 ===
 Function with parameters and return type
@@ -93,6 +107,37 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                return_expr
+                  value:
+                    binary_expr
+                      operator: infix_operator "+"
+                      left:
+                        name_expr
+                          identifier: identifier "a"
+                      right:
+                        name_expr
+                          identifier: identifier "b"
+          name: identifier "add"
+          parameter:
+            parameter
+              external_name: identifier "_"
+              pattern:
+                name_pattern
+                  identifier: identifier "a"
+            parameter
+              external_name: identifier "_"
+              pattern:
+                name_pattern
+                  identifier: identifier "b"
+          return_type:
+            named_type_expr
+              name: identifier "Int"

 ===
 Function with named parameters
@@ -138,6 +183,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "name"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          name: identifier "greet"
+          parameter:
+            parameter
+              external_name: identifier "person"
+              pattern:
+                name_pattern
+                  identifier: identifier "name"

 ===
 Function with default parameter value
@@ -185,6 +252,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "name"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          name: identifier "greet"
+          parameter:
+            parameter
+              default: string_literal "\"world\""
+              pattern:
+                name_pattern
+                  identifier: identifier "name"

 ===
 Variadic function
@@ -249,6 +338,38 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                return_expr
+                  value:
+                    call_expr
+                      argument:
+                        argument
+                          value: int_literal "0"
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "+"
+                      callee:
+                        member_access_expr
+                          base:
+                            name_expr
+                              identifier: identifier "values"
+                          member: identifier "reduce"
+          name: identifier "sum"
+          parameter:
+            parameter
+              external_name: identifier "_"
+              pattern:
+                name_pattern
+                  identifier: identifier "values"
+          return_type:
+            named_type_expr
+              name: identifier "Int"

 ===
 Function call
@@ -276,6 +397,17 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        call_expr
+          argument:
+            argument
+              value: int_literal "1"
+            argument
+              value: int_literal "2"
+          callee:
+            name_expr
+              identifier: identifier "foo"

 ===
 Function call with labelled arguments
@@ -306,6 +438,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        call_expr
+          argument:
+            argument
+              name: identifier "person"
+              value: string_literal "\"Bob\""
+          callee:
+            name_expr
+              identifier: identifier "greet"

 ===
 Method call
@@ -336,6 +478,18 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        call_expr
+          argument:
+            argument
+              value: int_literal "1"
+          callee:
+            member_access_expr
+              base:
+                name_expr
+                  identifier: identifier "list"
+              member: identifier "append"

 ===
 Generic function
@@ -387,3 +541,117 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                return_expr
+                  value:
+                    name_expr
+                      identifier: identifier "x"
+          name: identifier "identity"
+          parameter:
+            parameter
+              external_name: identifier "_"
+              pattern:
+                name_pattern
+                  identifier: identifier "x"
+          return_type:
+            named_type_expr
+              name: identifier "T"
+
+===
+Leading-dot expression value
+===
+
+let x = .foo
+
+---
+
+source_file
+  statement:
+    property_declaration
+      binding:
+        value_binding_pattern
+          mutability: let
+      declarator:
+        property_binding
+          name:
+            pattern
+              bound_identifier: simple_identifier "x"
+          value:
+            prefix_expression
+              operation: .
+              target: simple_identifier "foo"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          value:
+            member_access_expr
+              base: inferred_type_expr ".foo"
+              member: identifier "foo"
+
+===
+Leading-dot expression call
+===
+
+let y = .some(1)
+
+---
+
+source_file
+  statement:
+    property_declaration
+      binding:
+        value_binding_pattern
+          mutability: let
+      declarator:
+        property_binding
+          name:
+            pattern
+              bound_identifier: simple_identifier "y"
+          value:
+            call_expression
+              function:
+                prefix_expression
+                  operation: .
+                  target: simple_identifier "some"
+              suffix:
+                call_suffix
+                  arguments:
+                    value_arguments
+                      argument:
+                        value_argument
+                          value: integer_literal "1"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "y"
+          value:
+            call_expr
+              argument:
+                argument
+                  value: int_literal "1"
+              callee:
+                member_access_expr
+                  base: inferred_type_expr ".some"
+                  member: identifier "some"
--- a/unified/extractor/tests/corpus/swift/literals.txt
+++ b/unified/extractor/tests/corpus/swift/literals.txt
@@ -13,6 +13,8 @@ source_file

 top_level
  body:
+    block
+      stmt: int_literal "42"

 ===
 Negative integer literal
@@ -32,6 +34,11 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        unary_expr
+          operand: int_literal "7"
+          operator: prefix_operator "-"

 ===
 Floating-point literal
@@ -48,6 +55,8 @@ source_file

 top_level
  body:
+    block
+      stmt: float_literal "3.14"

 ===
 Boolean literals
@@ -67,6 +76,10 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        boolean_literal "true"
+        boolean_literal "false"

 ===
 Nil literal
@@ -83,6 +96,8 @@ source_file

 top_level
  body:
+    block
+      stmt: builtin_expr "nil"

 ===
 String literal
@@ -101,6 +116,8 @@ source_file

 top_level
  body:
+    block
+      stmt: string_literal "\"hello\""

 ===
 String with interpolation
@@ -122,3 +139,5 @@ source_file

 top_level
  body:
+    block
+      stmt: string_literal "\"hello \\(name)\""
--- a/unified/extractor/tests/corpus/swift/loops.txt
+++ b/unified/extractor/tests/corpus/swift/loops.txt
@@ -37,6 +37,30 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        for_each_stmt
+          body:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "x"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          iterable:
+            array_literal
+              element:
+                int_literal "1"
+                int_literal "2"
+                int_literal "3"

 ===
 For-in over range
@@ -76,6 +100,29 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        for_each_stmt
+          body:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "i"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          pattern:
+            name_pattern
+              identifier: identifier "i"
+          iterable:
+            binary_expr
+              operator: infix_operator "..<"
+              left: int_literal "0"
+              right: int_literal "10"

 ===
 For-in with where clause
@@ -119,6 +166,34 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        for_each_stmt
+          body:
+            block
+              stmt:
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "x"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          guard:
+            binary_expr
+              operator: infix_operator ">"
+              left:
+                name_expr
+                  identifier: identifier "x"
+              right: int_literal "0"
+          iterable:
+            name_expr
+              identifier: identifier "xs"

 ===
 While loop
@@ -154,6 +229,25 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        while_stmt
+          body:
+            block
+              stmt:
+                compound_assign_expr
+                  operator: infix_operator "-="
+                  target:
+                    name_expr
+                      identifier: identifier "x"
+                  value: int_literal "1"
+          condition:
+            binary_expr
+              operator: infix_operator ">"
+              left:
+                name_expr
+                  identifier: identifier "x"
+              right: int_literal "0"

 ===
 Repeat-while loop
@@ -189,6 +283,25 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        do_while_stmt
+          body:
+            block
+              stmt:
+                compound_assign_expr
+                  operator: infix_operator "-="
+                  target:
+                    name_expr
+                      identifier: identifier "x"
+                  value: int_literal "1"
+          condition:
+            binary_expr
+              operator: infix_operator ">"
+              left:
+                name_expr
+                  identifier: identifier "x"
+              right: int_literal "0"

 ===
 Break and continue
@@ -252,3 +365,46 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        for_each_stmt
+          body:
+            block
+              stmt:
+                if_expr
+                  condition:
+                    binary_expr
+                      operator: infix_operator "<"
+                      left:
+                        name_expr
+                          identifier: identifier "x"
+                      right: int_literal "0"
+                  then:
+                    block
+                      stmt: continue_expr "continue"
+                if_expr
+                  condition:
+                    binary_expr
+                      operator: infix_operator ">"
+                      left:
+                        name_expr
+                          identifier: identifier "x"
+                      right: int_literal "100"
+                  then:
+                    block
+                      stmt: break_expr "break"
+                call_expr
+                  argument:
+                    argument
+                      value:
+                        name_expr
+                          identifier: identifier "x"
+                  callee:
+                    name_expr
+                      identifier: identifier "print"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          iterable:
+            name_expr
+              identifier: identifier "xs"
--- a/unified/extractor/tests/corpus/swift/operators.txt
+++ b/unified/extractor/tests/corpus/swift/operators.txt
@@ -17,6 +17,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "+"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Subtraction
@@ -37,6 +47,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "-"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Multiplication
@@ -57,6 +77,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "*"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Division
@@ -77,6 +107,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "/"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Operator precedence: addition and multiplication
@@ -101,6 +141,22 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "+"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            binary_expr
+              operator: infix_operator "*"
+              left:
+                name_expr
+                  identifier: identifier "b"
+              right:
+                name_expr
+                  identifier: identifier "c"

 ===
 Parenthesised expression
@@ -129,6 +185,14 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "*"
+          left: tuple_expr "(a + b)"
+          right:
+            name_expr
+              identifier: identifier "c"

 ===
 Comparison
@@ -149,6 +213,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "<"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Equality
@@ -169,6 +243,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "=="
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Logical and
@@ -189,6 +273,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "&&"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Logical or
@@ -209,6 +303,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "||"
+          left:
+            name_expr
+              identifier: identifier "a"
+          right:
+            name_expr
+              identifier: identifier "b"

 ===
 Logical not
@@ -228,6 +332,13 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        unary_expr
+          operand:
+            name_expr
+              identifier: identifier "a"
+          operator: prefix_operator "!"

 ===
 Range operator
@@ -248,3 +359,9 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        binary_expr
+          operator: infix_operator "..."
+          left: int_literal "1"
+          right: int_literal "10"
--- a/unified/extractor/tests/corpus/swift/optionals-and-errors.txt
+++ b/unified/extractor/tests/corpus/swift/optionals-and-errors.txt
@@ -34,6 +34,22 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          type:
+            generic_type_expr
+              base:
+                named_type_expr
+                  name: identifier "Optional"
+              type_argument:
+                named_type_expr
+                  name: identifier "Int"
+          value: builtin_expr "nil"

 ===
 Optional chaining
@@ -74,6 +90,22 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "n"
+          value:
+            member_access_expr
+              base:
+                member_access_expr
+                  base:
+                    name_expr
+                      identifier: identifier "obj"
+                  member: identifier "foo"
+              member: identifier "bar"

 ===
 Force unwrap
@@ -103,6 +135,19 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "n"
+          value:
+            unary_expr
+              operand:
+                name_expr
+                  identifier: identifier "opt"
+              operator: postfix_operator "!"

 ===
 Nil-coalescing
@@ -132,6 +177,20 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "n"
+          value:
+            binary_expr
+              operator: infix_operator "??"
+              left:
+                name_expr
+                  identifier: identifier "opt"
+              right: int_literal "0"

 ===
 Throwing function
@@ -167,6 +226,18 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        function_declaration
+          body:
+            block
+              stmt:
+                return_expr
+                  value: string_literal "\"\""
+          name: identifier "read"
+          return_type:
+            named_type_expr
+              name: identifier "String"

 ===
 Do-catch
@@ -216,6 +287,33 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        try_expr
+          body:
+            block
+              stmt:
+                unary_expr
+                  operand:
+                    call_expr
+                      callee:
+                        name_expr
+                          identifier: identifier "foo"
+                  operator: prefix_operator "try"
+          catch_clause:
+            catch_clause
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "error"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"

 ===
 Try? expression
@@ -252,6 +350,21 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "result"
+          value:
+            unary_expr
+              operand:
+                call_expr
+                  callee:
+                    name_expr
+                      identifier: identifier "foo"
+              operator: prefix_operator "try?"

 ===
 Try! expression
@@ -288,3 +401,18 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "result"
+          value:
+            unary_expr
+              operand:
+                call_expr
+                  callee:
+                    name_expr
+                      identifier: identifier "foo"
+              operator: prefix_operator "try!"
--- a/unified/extractor/tests/corpus/swift/types.txt
+++ b/unified/extractor/tests/corpus/swift/types.txt
@@ -18,6 +18,11 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          modifier: modifier "class"
+          name: identifier "Foo"

 ===
 Class with stored properties
@@ -79,6 +84,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "x"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "y"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+          modifier: modifier "class"
+          name: identifier "Point"

 ===
 Class with initializer
@@ -152,6 +179,34 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "x"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+            constructor_declaration
+              body:
+                block
+                  stmt:
+                    assign_expr
+                      target:
+                        member_access_expr
+                          base:
+                            name_expr
+                              identifier: identifier "self"
+                          member: identifier "x"
+                      value:
+                        name_expr
+                          identifier: identifier "x"
+          modifier: modifier "class"
+          name: identifier "Point"

 ===
 Class with method
@@ -200,6 +255,29 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "n"
+              value: int_literal "0"
+            function_declaration
+              body:
+                block
+                  stmt:
+                    compound_assign_expr
+                      operator: infix_operator "+="
+                      target:
+                        name_expr
+                          identifier: identifier "n"
+                      value: int_literal "1"
+              name: identifier "bump"
+          modifier: modifier "class"
+          name: identifier "Counter"

 ===
 Class inheritance
@@ -228,6 +306,11 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          modifier: modifier "class"
+          name: identifier "Dog"

 ===
 Struct
@@ -289,6 +372,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "let"
+              pattern:
+                name_pattern
+                  identifier: identifier "x"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+            variable_declaration
+              modifier: modifier "let"
+              pattern:
+                name_pattern
+                  identifier: identifier "y"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+          modifier: modifier "struct"
+          name: identifier "Point"

 ===
 Enum with cases
@@ -332,6 +437,32 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "north"
+            variable_declaration
+              modifier: modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "south"
+            variable_declaration
+              modifier: modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "east"
+            variable_declaration
+              modifier: modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "west"
+          modifier: modifier "enum"
+          name: identifier "Direction"

 ===
 Enum with associated values
@@ -389,6 +520,40 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            class_like_declaration
+              member:
+                constructor_declaration
+                  body: block "circle(radius: Double)"
+                  parameter:
+                    parameter
+                      pattern:
+                        name_pattern
+                          identifier: identifier "radius"
+                      type:
+                        named_type_expr
+                          name: identifier "Double"
+              modifier: modifier "enum_case"
+              name: identifier "circle"
+            class_like_declaration
+              member:
+                constructor_declaration
+                  body: block "square(side: Double)"
+                  parameter:
+                    parameter
+                      pattern:
+                        name_pattern
+                          identifier: identifier "side"
+                      type:
+                        named_type_expr
+                          name: identifier "Double"
+              modifier: modifier "enum_case"
+              name: identifier "square"
+          modifier: modifier "enum"
+          name: identifier "Shape"

 ===
 Protocol declaration
@@ -414,6 +579,15 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            function_declaration
+              body: block "func draw()"
+              name: identifier "draw"
+          modifier: modifier "protocol"
+          name: identifier "Drawable"

 ===
 Extension
@@ -463,6 +637,30 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            function_declaration
+              body:
+                block
+                  stmt:
+                    return_expr
+                      value:
+                        binary_expr
+                          operator: infix_operator "*"
+                          left:
+                            name_expr
+                              identifier: identifier "self"
+                          right:
+                            name_expr
+                              identifier: identifier "self"
+              name: identifier "squared"
+              return_type:
+                named_type_expr
+                  name: identifier "Int"
+          modifier: modifier "extension"
+          name: identifier "Int"

 ===
 Computed property
@@ -555,6 +753,48 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "w"
+              type:
+                named_type_expr
+                  name: identifier "Double"
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "h"
+              type:
+                named_type_expr
+                  name: identifier "Double"
+            accessor_declaration
+              body:
+                block
+                  stmt:
+                    return_expr
+                      value:
+                        binary_expr
+                          operator: infix_operator "*"
+                          left:
+                            name_expr
+                              identifier: identifier "w"
+                          right:
+                            name_expr
+                              identifier: identifier "h"
+              modifier: modifier "var"
+              name: identifier "area"
+              type:
+                named_type_expr
+                  name: identifier "Double"
+              accessor_kind: accessor_kind "get"
+          modifier: modifier "class"
+          name: identifier "Rect"

 ===
 Property with getter and setter
@@ -639,3 +879,204 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "_v"
+              value: int_literal "0"
+            accessor_declaration
+              body:
+                block
+                  stmt:
+                    return_expr
+                      value:
+                        name_expr
+                          identifier: identifier "_v"
+              modifier: modifier "var"
+              name: identifier "v"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+              accessor_kind: accessor_kind "get"
+            accessor_declaration
+              body:
+                block
+                  stmt:
+                    assign_expr
+                      target:
+                        name_expr
+                          identifier: identifier "_v"
+                      value:
+                        name_expr
+                          identifier: identifier "newValue"
+              modifier:
+                modifier "var"
+                modifier "chained_declaration"
+              name: identifier "v"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+              accessor_kind: accessor_kind "set"
+          modifier: modifier "class"
+          name: identifier "Box"
+
+===
+Protocol with read-only and read-write property requirements
+===
+
+protocol P {
+  var foo: Int { get }
+  var bar: String { get set }
+}
+
+---
+
+source_file
+  statement:
+    protocol_declaration
+      body:
+        protocol_body
+          member:
+            protocol_property_declaration
+              name:
+                pattern
+                  binding:
+                    value_binding_pattern
+                      mutability: var
+                  bound_identifier: simple_identifier "foo"
+              requirements:
+                protocol_property_requirements
+                  accessor:
+                    getter_specifier
+              type:
+                type_annotation
+                  type:
+                    type
+                      name:
+                        user_type
+                          part:
+                            simple_user_type
+                              name: type_identifier "Int"
+            protocol_property_declaration
+              name:
+                pattern
+                  binding:
+                    value_binding_pattern
+                      mutability: var
+                  bound_identifier: simple_identifier "bar"
+              requirements:
+                protocol_property_requirements
+                  accessor:
+                    getter_specifier
+                    setter_specifier
+              type:
+                type_annotation
+                  type:
+                    type
+                      name:
+                        user_type
+                          part:
+                            simple_user_type
+                              name: type_identifier "String"
+      name: type_identifier "P"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            accessor_declaration
+              name: identifier "foo"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+              accessor_kind: accessor_kind "get"
+            accessor_declaration
+              name: identifier "bar"
+              type:
+                named_type_expr
+                  name: identifier "String"
+              accessor_kind: accessor_kind "get"
+            accessor_declaration
+              modifier: modifier "chained_declaration"
+              name: identifier "bar"
+              type:
+                named_type_expr
+                  name: identifier "String"
+              accessor_kind: accessor_kind "set"
+          modifier: modifier "protocol"
+          name: identifier "P"
+
+===
+Enum with comma-separated cases (chained_declaration)
+===
+
+enum Suit {
+  case clubs, diamonds, hearts, spades
+}
+
+---
+
+source_file
+  statement:
+    class_declaration
+      body:
+        enum_class_body
+          member:
+            enum_entry
+              case:
+                enum_case_entry
+                  name: simple_identifier "clubs"
+                enum_case_entry
+                  name: simple_identifier "diamonds"
+                enum_case_entry
+                  name: simple_identifier "hearts"
+                enum_case_entry
+                  name: simple_identifier "spades"
+      declaration_kind: enum
+      name: type_identifier "Suit"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "clubs"
+            variable_declaration
+              modifier:
+                modifier "chained_declaration"
+                modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "diamonds"
+            variable_declaration
+              modifier:
+                modifier "chained_declaration"
+                modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "hearts"
+            variable_declaration
+              modifier:
+                modifier "chained_declaration"
+                modifier "enum_case"
+              pattern:
+                name_pattern
+                  identifier: identifier "spades"
+          modifier: modifier "enum"
+          name: identifier "Suit"
--- a/unified/extractor/tests/corpus/swift/variables.txt
+++ b/unified/extractor/tests/corpus/swift/variables.txt
@@ -23,6 +23,14 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          value: int_literal "1"

 ===
 Var binding
@@ -49,6 +57,14 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "var"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          value: int_literal "1"

 ===
 Let with type annotation
@@ -84,6 +100,17 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          type:
+            named_type_expr
+              name: identifier "Int"
+          value: int_literal "1"

 ===
 Var without initialiser
@@ -118,6 +145,16 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "var"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          type:
+            named_type_expr
+              name: identifier "Int"

 ===
 Tuple destructuring binding
@@ -154,6 +191,28 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            tuple_pattern
+              element:
+                pattern_element
+                  pattern:
+                    expr_equality_pattern
+                      expr:
+                        name_expr
+                          identifier: identifier "a"
+                pattern_element
+                  pattern:
+                    expr_equality_pattern
+                      expr:
+                        name_expr
+                          identifier: identifier "b"
+          value:
+            name_expr
+              identifier: identifier "pair"

 ===
 Multiple bindings on one line
@@ -185,6 +244,22 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        variable_declaration
+          modifier: modifier "let"
+          pattern:
+            name_pattern
+              identifier: identifier "x"
+          value: int_literal "1"
+        variable_declaration
+          modifier:
+            modifier "let"
+            modifier "chained_declaration"
+          pattern:
+            name_pattern
+              identifier: identifier "y"
+          value: int_literal "2"

 ===
 Assignment
@@ -207,6 +282,13 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        assign_expr
+          target:
+            name_expr
+              identifier: identifier "x"
+          value: int_literal "1"

 ===
 Compound assignment
@@ -229,3 +311,138 @@ source_file

 top_level
  body:
+    block
+      stmt:
+        compound_assign_expr
+          operator: infix_operator "+="
+          target:
+            name_expr
+              identifier: identifier "x"
+          value: int_literal "1"
+
+===
+Property with willSet and didSet observers
+===
+
+class C {
+  var x: Int = 0 {
+    willSet { print(newValue) }
+    didSet { print(oldValue) }
+  }
+}
+
+---
+
+source_file
+  statement:
+    class_declaration
+      body:
+        class_body
+          member:
+            property_declaration
+              binding:
+                value_binding_pattern
+                  mutability: var
+              declarator:
+                property_binding
+                  name:
+                    pattern
+                      bound_identifier: simple_identifier "x"
+                  observers:
+                    willset_didset_block
+                      didset:
+                        didset_clause
+                          body:
+                            block
+                              statement:
+                                call_expression
+                                  function: simple_identifier "print"
+                                  suffix:
+                                    call_suffix
+                                      arguments:
+                                        value_arguments
+                                          argument:
+                                            value_argument
+                                              value: simple_identifier "oldValue"
+                      willset:
+                        willset_clause
+                          body:
+                            block
+                              statement:
+                                call_expression
+                                  function: simple_identifier "print"
+                                  suffix:
+                                    call_suffix
+                                      arguments:
+                                        value_arguments
+                                          argument:
+                                            value_argument
+                                              value: simple_identifier "newValue"
+                  type:
+                    type_annotation
+                      type:
+                        type
+                          name:
+                            user_type
+                              part:
+                                simple_user_type
+                                  name: type_identifier "Int"
+                  value: integer_literal "0"
+      declaration_kind: class
+      name: type_identifier "C"
+
+---
+
+top_level
+  body:
+    block
+      stmt:
+        class_like_declaration
+          member:
+            variable_declaration
+              modifier: modifier "var"
+              pattern:
+                name_pattern
+                  identifier: identifier "x"
+              type:
+                named_type_expr
+                  name: identifier "Int"
+              value: int_literal "0"
+            accessor_declaration
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "newValue"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              modifier:
+                modifier "var"
+                modifier "chained_declaration"
+              name: identifier "x"
+              accessor_kind: accessor_kind "willSet"
+            accessor_declaration
+              body:
+                block
+                  stmt:
+                    call_expr
+                      argument:
+                        argument
+                          value:
+                            name_expr
+                              identifier: identifier "oldValue"
+                      callee:
+                        name_expr
+                          identifier: identifier "print"
+              modifier:
+                modifier "var"
+                modifier "chained_declaration"
+              name: identifier "x"
+              accessor_kind: accessor_kind "didSet"
+          modifier: modifier "class"
+          name: identifier "C"
--- a/unified/extractor/tests/corpus_tests.rs
+++ b/unified/extractor/tests/corpus_tests.rs
@@ -2,7 +2,7 @@ use std::fs;
 use std::path::Path;

 use codeql_extractor::extractor::simple;
-use yeast::{dump::dump_ast, dump::dump_ast_with_type_errors, Runner};
+use yeast::{Runner, dump::dump_ast, dump::dump_ast_with_type_errors};

 #[path = "../src/languages/mod.rs"]
 mod languages;
@@ -146,29 +146,36 @@ fn render_corpus(cases: &[CorpusCase]) -> String {
    out
 }

-fn run_desugaring(
-    lang: &simple::LanguageSpec,
-    input: &str,
-) -> Result<yeast::Ast, String> {
-    let runner = match lang.desugar.as_ref() {
-        Some(config) => Runner::from_config(lang.ts_language.clone(), config)
-            .map_err(|e| format!("Failed to create yeast runner: {e}"))?,
-        None => Runner::new(lang.ts_language.clone(), &[]),
-    };
-
-    runner
-        .run(input)
-        .map_err(|e| format!("Failed to parse input: {e}"))
+fn run_desugaring(lang: &simple::LanguageSpec, input: &str) -> Result<yeast::Ast, String> {
+    match lang.desugar.as_deref() {
+        Some(desugarer) => {
+            // Parse the input ourselves so we don't depend on the desugarer
+            // knowing about the language.
+            let mut parser = tree_sitter::Parser::new();
+            parser
+                .set_language(&lang.ts_language)
+                .map_err(|e| format!("Failed to set language: {e}"))?;
+            let tree = parser
+                .parse(input, None)
+                .ok_or_else(|| "Failed to parse input".to_string())?;
+            desugarer
+                .run_from_tree(&tree, input.as_bytes())
+                .map_err(|e| format!("Desugaring failed: {e}"))
+        }
+        None => {
+            let runner: Runner = Runner::new(lang.ts_language.clone(), &[]);
+            runner
+                .run(input)
+                .map_err(|e| format!("Failed to parse input: {e}"))
+        }
+    }
 }

 /// Produce the raw tree-sitter parse tree dump for `input`, with no
 /// desugaring rules applied. Uses a `Runner` with an empty phase list and
 /// the input grammar's own schema.
-fn dump_raw_parse(
-    lang: &simple::LanguageSpec,
-    input: &str,
-) -> Result<String, String> {
-    let runner = Runner::new(lang.ts_language.clone(), &[]);
+fn dump_raw_parse(lang: &simple::LanguageSpec, input: &str) -> Result<String, String> {
+    let runner: Runner = Runner::new(lang.ts_language.clone(), &[]);
    let ast = runner
        .run(input)
        .map_err(|e| format!("Failed to parse input: {e}"))?;
@@ -272,11 +279,7 @@ fn test_corpus() {
                }
            }

-            assert!(
-                failures.is_empty(),
-                "{}",
-                failures.join("\n\n") + "\n\n"
-            );
+            assert!(failures.is_empty(), "{}", failures.join("\n\n") + "\n\n");

            if update_mode {
                let updated = render_corpus(&cases);
@@ -285,7 +288,9 @@ fn test_corpus() {
                    write_result.is_ok(),
                    "Failed to update corpus file {}: {}",
                    corpus_path.display(),
-                    write_result.err().map_or_else(String::new, |e| e.to_string())
+                    write_result
+                        .err()
+                        .map_or_else(String::new, |e| e.to_string())
                );
            }
        }
--- a/unified/extractor/tree-sitter-swift/grammar.js
+++ b/unified/extractor/tree-sitter-swift/grammar.js
@@ -1368,7 +1368,7 @@ module.exports = grammar({
      seq(
        field("modifiers", optional($.modifiers)),
        "import",
-        optional($._import_kind),
+        optional(field("scoped_import_kind", $._import_kind)),
        field("name", $.identifier)
      ),
    _import_kind: ($) =>
@@ -1930,7 +1930,7 @@ module.exports = grammar({
      seq(
        optional("case"),
        optional(field("type", $.user_type)), // XXX this should just be _type but that creates ambiguity
-        $._dot,
+        field("dot", $._dot),
        field("name", $.simple_identifier),
        optional(field("arguments", $.tuple_pattern))
      ),
--- a/unified/extractor/tree-sitter-swift/node-types.yml
+++ b/unified/extractor/tree-sitter-swift/node-types.yml
@@ -173,6 +173,7 @@ named:
    value?: expression
  case_pattern:
    arguments?: tuple_pattern
+    dot: "."
    name: simple_identifier
    type?: user_type
  catch_block:
@@ -351,6 +352,7 @@ named:
  import_declaration:
    modifiers?: modifiers
    name: identifier
+    scoped_import_kind?: ["class", "enum", "func", "let", "protocol", "struct", "typealias", "var"]
  infix_expression:
    lhs: expression
    op: custom_operator
--- a/unified/ql/lib/codeql/unified/Ast.qll
+++ b/unified/ql/lib/codeql/unified/Ast.qll
--- a/unified/ql/lib/codeql/unified/Comments.qll
+++ b/unified/ql/lib/codeql/unified/Comments.qll
@@ -0,0 +1,18 @@
+/** Provides classes for working with comments. */
+
+private import unified
+
+/**
+ * A comment appearing in the source code.
+ */
+class Comment extends TriviaToken {
+  // At the moment, comments are the only type of trivia token we extract
+  /**
+   * Gets the text inside this comment, not counting the delimiters.
+   */
+  string getCommentText() {
+    result = this.getValue().regexpCapture("//(.*)", 1)
+    or
+    result = this.getValue().regexpCapture("(?s)/\\*(.*)\\*/", 1)
+  }
+}
--- a/unified/ql/lib/unified.dbscheme
+++ b/unified/ql/lib/unified.dbscheme
--- a/unified/ql/lib/unified.qll
+++ b/unified/ql/lib/unified.qll
@@ -0,0 +1,8 @@
+/**
+ * Provides classes for working with the AST, as well as files and locations.
+ */
+
+import codeql.Locations
+import codeql.files.FileSystem
+import codeql.unified.Ast::Unified
+import codeql.unified.Comments
--- a/unified/ql/test/library-tests/BasicTest/test.expected
+++ b/unified/ql/test/library-tests/BasicTest/test.expected
@@ -1,9 +1,37 @@
 nameExpr
+| name_expr.swift:1:9:1:9 | NameExpr | y |
+| test.swift:1:8:1:17 | NameExpr | Foundation |
+| test.swift:8:9:8:13 | NameExpr | items |
+| test.swift:8:22:8:25 | NameExpr | item |
+| test.swift:12:16:12:20 | NameExpr | items |
+| test.swift:12:31:12:34 | NameExpr | item |
+| test.swift:25:18:25:22 | NameExpr | Array |
+| test.swift:25:24:25:28 | NameExpr | first |
+| test.swift:26:17:26:22 | NameExpr | second |
+| test.swift:27:13:27:18 | NameExpr | result |
+| test.swift:27:29:27:32 | NameExpr | item |
+| test.swift:28:13:28:18 | NameExpr | result |
+| test.swift:28:27:28:30 | NameExpr | item |
+| test.swift:31:12:31:17 | NameExpr | result |
+| test.swift:40:16:40:19 | NameExpr | data |
+| test.swift:44:9:44:12 | NameExpr | data |
+| test.swift:48:15:48:19 | NameExpr | index |
+| test.swift:48:29:48:33 | NameExpr | index |
+| test.swift:48:37:48:40 | NameExpr | data |
+| test.swift:49:16:49:19 | NameExpr | data |
+| test.swift:49:21:49:25 | NameExpr | index |
+| test.swift:53:9:53:12 | NameExpr | data |
+| test.swift:53:21:53:24 | NameExpr | item |
+| test.swift:63:16:63:19 | NameExpr | self |
+| test.swift:65:29:65:37 | NameExpr | transform |
+| test.swift:65:39:65:43 | NameExpr | value |
+| test.swift:67:29:67:33 | NameExpr | error |
+| test.swift:76:16:76:19 | NameExpr | self |
+| test.swift:76:21:76:21 | NameExpr | i |
+| test.swift:76:26:76:29 | NameExpr | self |
+| test.swift:76:31:76:31 | NameExpr | i |
+| test.swift:86:12:86:17 | NameExpr | values |
+| test.swift:87:12:87:17 | NameExpr | values |
+| test.swift:87:38:87:43 | NameExpr | values |
+| test.swift:87:49:87:57 | NameExpr | transform |
 unsupported
-| test.swift:3:1:3:38 |  |  |
-| test.swift:16:1:16:32 |  |  |
-| test.swift:23:1:23:37 |  |  |
-| test.swift:34:1:34:49 |  |  |
-| test.swift:57:1:57:30 |  |  |
-| test.swift:72:1:72:37 |  |  |
-| test.swift:84:1:84:24 |  |  |
--- a/unified/ql/test/library-tests/comments/comments.expected
+++ b/unified/ql/test/library-tests/comments/comments.expected
@@ -0,0 +1,3 @@
+| comments.swift:1:1:1:22 | // Hello this is swift |  Hello this is swift |
+| comments.swift:3:1:6:3 | /*\n * This is a multi-line comment\n * It should be ignored by the parser\n */ | \n * This is a multi-line comment\n * It should be ignored by the parser\n  |
+| comments.swift:9:5:9:36 | // This is a single-line comment |  This is a single-line comment |
--- a/unified/ql/test/library-tests/comments/comments.ql
+++ b/unified/ql/test/library-tests/comments/comments.ql
@@ -0,0 +1,3 @@
+import unified
+
+query predicate comments(Comment c, string text) { text = c.getCommentText() }
--- a/unified/ql/test/library-tests/comments/comments.swift
+++ b/unified/ql/test/library-tests/comments/comments.swift
@@ -0,0 +1,11 @@
+// Hello this is swift
+
+/*
+ * This is a multi-line comment
+ * It should be ignored by the parser
+ */
+
+func hello() {
+    // This is a single-line comment
+    print("Hello, world!")
+}
Author	SHA1	Message	Date
Taus	f75cf95db5	Apply rustfmt Format the touched Rust crates (shared/tree-sitter-extractor, shared/yeast, shared/yeast-macros, unified/extractor) so the tree-sitter-extractor CI fmt check passes. No functional changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-25 12:26:52 +00:00
Taus	8a3bb915e4	unified/swift: Use `tree!` instead of ctx.node Cleans up a few places where we were constructing trees piece by piece rather than using the `tree!` macro. In the process, Copilot noticed an issue that should probably be addressed: the labeled_statement rule can never fire, since there are no such nodes in the input. This is possibly as simple as making _labeled_statement (which _does_ exist) named, but I haven't attempted this. Finally, a small change to yeast makes it so that the contents of a {} interpolation can be a Rust block (previously it could only be a single expression). This avoids the need to double-wrap instances where you want to interpolate a single node produced as the final value of some block.	2026-06-25 12:02:39 +00:00
Taus	ded5cc2901	unified/swift: Replace reduce_left with Rust helpers (Both reduce_left and map are still supported, but we could remove them at this point.) I think this way of writing things makes the intent a lot clearer -- it avoids extending the yeast rule language with complicated constructs, pushing the complexity (such as it is) into Rust instead.	2026-06-25 12:02:39 +00:00
Taus	d9484e6196	unified/swift: Propagate property_declaration modifiers via context Gets rid of the final uses of mutation (via prepend_field). The approach is the same as in the preceding commits: we set the appropriate fields on the context when processing the outer node, and then access these fields on the inner nodes. The repeated use of `modifier` fields is a _bit_ clunky, but since we're likely moving to an out-of-band modifier mechanism at some point, I think it's good enough for now.	2026-06-25 12:02:39 +00:00
Taus	7793702e8a	unified/swift: Propagate enum_entry outer modifiers via context Same as in the preceding commit, we added a test beforehand for testing this syntax, and verified that it was unchanged by the cleanup in this commit.	2026-06-25 12:02:39 +00:00
Taus	4730898b2f	unified/swift: Translate protocol properties using context Avoids more "mutation after creation" via prepend_field. Also adds a test to the corpus for exercising this syntax. Although it's not evident, the test output was unchanged by this refactoring.	2026-06-25 12:02:39 +00:00
Taus	86feaeff4e	unified/swift: Propagate parameter default values via context Extends the context with a field for keeping track of the default value. In the process, we also rename the context to SwiftContext as it now doesn't only concern itself with properties.	2026-06-25 12:02:39 +00:00
Taus	0c85c31129	yeast: Simplify Swift rules using the new machinery Propagates in name and type information for various property declarations, using the context mechanism. This avoids mutating already-translated nodes in-place, and is generally much easier to read.	2026-06-25 12:02:39 +00:00
Taus	919c5b8c53	yeast: Hide desugaring behind Desugarer trait This was necessary since otherwise the generic type of the user-specified context (which should only be a concern for yeast) starts to bleed out into the shared extractor. Instead, we type-erase it by putting it inside the aforementioned trait.	2026-06-25 12:02:39 +00:00
Taus	c39bfa555d	yeast: Add macro for fine-grained rules Adds `manual_rule!` which provides a more low-level interface for defining rewrites. (I'm not entirely sold on the name, so any suggestions would be welcome.) Notably, the captures bound in the body of such rules have _not_ been translated yet -- they still come from the _input_ tree. It is the user's duty to call ctx.translate on these (which has the effect of recursively invoking the translation) before substituting them into the output. For _truly_ low-level access, the user can still construct a Rule directly, but this is now somewhat cumbersome as the closure contained therein takes quite a few parameters. Still, the possibility remains.	2026-06-25 12:02:39 +00:00
Taus	03350bf8d7	yeast: Pass raw captures to `Rule::new` rules This enables users to specify how and when these captures get translated. In conjunction with the context mechanism, this can be used to e.g. translate some piece of information (e.g. the type of something), record it in the context, and then recursively translate some other capture that relies on this information. This allows information to be cleanly passed into descendants (which can be written using context accesses in the `rule!` macro form). As a consequence of this change, we now need to pass around a TranslatorHandle to perform the manual translation. For Repeating rules, it doesn't really make sense to translate things, so in this case we simply signal an error. Also, the implementation of the `rule!` macro changes slightly (without changing semantics): it now essentially delegates to `Rule::new`, receiving raw captures, but then immediately applies the translation to those captures (which, for the majority of cases, is likely the desired behaviour).	2026-06-25 12:02:39 +00:00
Taus	d38ffe0ad5	yeast: Make transforms return `Result` This will enable us to actually capture and log errors in complicated rules (e.g. ones written in Rust) rather than just panicking.	2026-06-25 12:02:38 +00:00
Taus	d6373eaef7	yeast: Reify the context and allow user-defined data in it Renames what was previously called `__yeast_ctx` into just `ctx`, and adds a new field `user_ctx` to this context. Said field can contain a struct of any user type (necessitating making various parts of the implementation generic in said type). Through some Deref magic, field accesses are delegated to the inner struct (assuming they are not already defined on `ctx`), which should hopefully make the interface a bit more ergonomic.	2026-06-25 12:02:38 +00:00
Asger F	89cd6770ae	Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-06-25 13:18:27 +02:00
Asger F	66c1f037f5	Add TODO	2026-06-19 12:19:51 +02:00
Asger F	2675070291	unified/swift: Clean up translation of patterns Patterns have an unusual parse tree, but now the matching should at least be a bit easier to follow. The TODO regarding not being able to pass down context to handle var/let is still relevant, and can't be solved in the mapping alone.	2026-06-19 11:35:06 +02:00
Asger F	c01264d05c	Coerce pattern_element.key to be an identifier	2026-06-19 10:31:34 +02:00
Asger F	63e1cc90e9	Test: add corpus test for switch case patterns with labeled arguments Adds a test case 'Switch with labeled case pattern arguments' covering: - case .implicit(isAcknowledged: false) — labeled bool literal - case .thread(threadRowId: _, let rowId) — labeled wildcard + binding The current output contains type errors: pattern_element::key is being produced as name_expr instead of identifier. These will be fixed in the following commit.	2026-06-19 10:27:20 +02:00
Asger F	2182265120	unified/swift: Better source range for inferred_type_expr	2026-06-18 14:57:55 +02:00
Asger F	0b666d47db	Preserve the dot token in case patterns	2026-06-18 14:55:54 +02:00
Asger F	142ac47166	Refactor: map switch case patterns to constructor_pattern instead of tuple_pattern Changed the desugaring rules to properly map case patterns with binding (e.g., 'case .circle(let r):') to constructor_pattern nodes instead of tuple_pattern. New rules added: - tuple_pattern_item → pattern_element (preserves optional name/key) - pattern.kind: binding_pattern → name_pattern (extracts bound identifier) - pattern.kind: case_pattern → constructor_pattern (creates proper constructor with bound arguments as pattern_elements) This provides a more semantically correct AST representation: - Constructor name: name_expr identifier 'circle' - Elements: pattern_element containing name_pattern identifier 'r' Instead of the previous tuple_pattern string representation. Updated control-flow.txt corpus outputs.	2026-06-18 14:54:59 +02:00
Asger F	2470c1388a	Fix: preserve switch case patterns in desugared output The switch_entry rule was capturing switch_pattern wrapper nodes instead of drilling into them to extract the actual pattern nodes. This caused patterns from switch cases to be lost during desugaring. Changed the pattern match from: (switch_entry pattern: (switch_pattern)* @pats ...) to: (switch_entry pattern: (switch_pattern pattern: @pats)* ...) This now correctly extracts the pattern field from each switch_pattern node, ensuring that patterns from cases like 'case 1:' and 'case .circle(let r):' are preserved in the switch_case AST nodes. Updated control-flow.txt corpus outputs to reflect the new behavior.	2026-06-18 14:37:42 +02:00
Asger F	fa98557dd9	Update QL test output	2026-06-18 14:26:49 +02:00
Asger F	1e167dfa6b	unified/swift: add type and declaration-family mappings	2026-06-18 14:26:47 +02:00
Asger F	f362707493	unified/swift: Imports	2026-06-18 14:26:45 +02:00
Asger F	15208b70aa	Unified: Add import_declaration.scoped_import_kind	2026-06-18 14:26:43 +02:00
Asger F	3522f35ab2	unified/swift: add collections, optionals/errors	2026-06-18 14:26:42 +02:00
Asger F	938396a751	unified/swift: add control-flow and loop mappings	2026-06-18 14:26:40 +02:00
Asger F	790d4f11be	unified/swift: add closure and capture mappings	2026-06-18 14:26:38 +02:00
Asger F	8f747a355c	unified/swift: add function and parameter mappings	2026-06-18 14:26:37 +02:00
Asger F	d17fd2d964	unified/swift: add variable/property/accessor and enum mappings	2026-06-18 14:26:35 +02:00
Asger F	4e9c3fb436	unified/swift: add literals, names, and operator expression mappings	2026-06-18 14:26:33 +02:00
Asger F	0e9d17b59c	unified/swift: add top-level normalization and fallback scaffold	2026-06-18 14:26:31 +02:00
Asger F	6c74cd31e4	Yeast: use child locations instead of rule target Previously, when a node was synthesized it would always take the location from the node that matched the current rule. This resulted in overly broad locations however. For (foo #{bar}) we now take the location of the 'bar' node. For non-leaf nodes we merge all its child node locations.	2026-06-18 14:26:30 +02:00
Asger F	166406acbb	Unified: Elaborate a bit more on AGENTS.md	2026-06-18 14:26:28 +02:00
Asger F	b40cb5dedd	Regenerate QL	2026-06-18 14:26:26 +02:00
Asger F	6dd7dedc19	Rewrite AST	2026-06-18 14:26:22 +02:00
Asger F	1d8e682e5f	Reset mappings	2026-06-15 10:49:37 +02:00
Asger F	0baa126473	Add ability to prepend fields in Yeast	2026-06-15 10:49:35 +02:00
Asger F	d11b428292	yeast-macros: desugar 'field: @cap' to 'field: _ @cap' When a field pattern has a bare capture with no preceding pattern atom (i.e. `foo: @bar`), implicitly use a true wildcard (`_`, match_unnamed: true) as the node pattern, making it equivalent to `foo: _ @bar`. This is a convenience shorthand: in practice every `field: _ @cap` in the Swift rules can now be written more concisely as `field: @cap`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-15 10:49:33 +02:00
Asger F	ddc9516e92	Yeast: better support for rewriting unnamed nodes - Ensure the full wildcard _ supports quantifiers - Also rewrite unnamed nodes in one-shot phases	2026-06-15 10:49:31 +02:00
Asger F	00068948c1	yeast-macros: add .reduce_left(first -> init, acc, elem -> fold) chain A left fold over an iterable where the first element seeds the accumulator: - first -> init : converts the first element to the initial accumulator - acc, elem -> fold : fold step; acc = current accumulator, elem = next element - Empty iterable produces nothing (0-element splice) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-15 10:49:29 +02:00
Asger F	28c879f58c	yeast-macros: add .map(p -> tpl) chain syntax for tree templates After a {expr} or {..expr} placeholder, an optional chain of .<builtin>() calls may follow. Currently the only builtin is: .map(param -> template) which applies the template to each element of the iterable and collects the resulting node IDs. A chain auto-splices into the enclosing field/child position. Example: path: {parts}.map(p -> (identifier #{p})) The framework is extensible: additional builtins can be added by matching on the method name in parse_chain_suffix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-15 10:49:27 +02:00
Asger F	6000c18c24	Unified: also QLDoc for unified.qll	2026-06-12 16:48:25 +02:00
Asger F	e81a3bcbc3	Unified: Add QLDoc	2026-06-12 16:47:06 +02:00
Asger F	7d6d5bfb4a	Unified: add test for comments	2026-06-12 16:36:33 +02:00
Asger F	f83adb55ce	Unified: regenerate AST	2026-06-12 16:33:51 +02:00
Asger F	5608369abe	Extract trivia tokens from original parse tree	2026-06-12 16:32:57 +02:00
				`@@ -1 +0,0 @@`
				`Security Features/CWE-295/AcceptAnyCertificate.ql`