Compare commits

1 Commits

Author | SHA1 | Message | Date
copilot-swe-agent[bot] | 677afff7af | Add CWE-295 C# query for accepting any TLS certificate | 2026-06-10 09:12:29 +00:00
49 changed files with 1191 additions and 7057 deletions

View File

@@ -0,0 +1,22 @@
using System.Net.Http;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

public class CertificateValidation
{
    public void Bad()
    {
        var handler = new HttpClientHandler();
        // BAD: the callback always returns true, so every certificate is trusted.
        handler.ServerCertificateCustomValidationCallback =
            (request, certificate, chain, errors) => true;
    }

    public void Good()
    {
        var handler = new HttpClientHandler();
        // GOOD: the certificate is only trusted when there are no validation errors.
        handler.ServerCertificateCustomValidationCallback =
            (request, certificate, chain, errors) => errors == SslPolicyErrors.None;
    }
}

View File

@@ -0,0 +1,52 @@
<!DOCTYPE qhelp PUBLIC
"-//Semmle//qhelp//EN"
"qhelp.dtd">
<qhelp>
<overview>
<p>
A TLS/SSL certificate validation callback that always returns <code>true</code> trusts every certificate,
regardless of any validation errors that were detected. This allows an attacker to perform a machine-in-the-middle
attack against the application, thereby breaking any security that Transport Layer Security (TLS) provides.
</p>
<p>
An attack might look like this:
</p>
<ol>
<li>The vulnerable program connects to <code>https://example.com</code>.</li>
<li>The attacker intercepts this connection and presents a valid, self-signed certificate for <code>https://example.com</code>.</li>
<li>The vulnerable program calls the certificate validation callback to check whether it should trust the certificate.</li>
<li>The callback ignores the <code>SslPolicyErrors</code> argument and returns <code>true</code>.</li>
<li>The vulnerable program accepts the certificate and proceeds with the connection, since the callback indicated that the certificate is trusted.</li>
<li>The attacker can now read the data the program sends to <code>https://example.com</code> and/or alter its replies while the program thinks the connection is secure.</li>
</ol>
</overview>
<recommendation>
<p>
Do not use a certificate validation callback that unconditionally returns <code>true</code>.
Either rely on the default certificate validation, or implement a callback that inspects the
<code>SslPolicyErrors</code> argument and only trusts a specific, known certificate (for example, when
using a self-signed certificate that has been explicitly pinned).
</p>
</recommendation>
<example>
<p>
In the first (bad) example, the callback always returns <code>true</code> and therefore trusts any certificate,
which allows an attacker to perform a machine-in-the-middle attack. In the second (good) example, the callback
returns <code>true</code> only when there are no validation errors.
</p>
<sample src="AcceptAnyCertificate.cs" />
</example>
<references>
<li>Microsoft Learn:
<a href="https://learn.microsoft.com/en-us/dotnet/api/system.net.security.remotecertificatevalidationcallback">RemoteCertificateValidationCallback Delegate</a>.</li>
<li>Microsoft Learn:
<a href="https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca5359">CA5359: Do not disable certificate validation</a>.</li>
<li>OWASP:
<a href="https://owasp.org/www-community/attacks/Manipulator-in-the-middle_attack">Manipulator-in-the-middle attack</a>.</li>
</references>
</qhelp>

View File

@@ -0,0 +1,101 @@
/**
 * @name Accepting any TLS certificate during validation
 * @description A certificate validation callback that always accepts any certificate
 *              allows an attacker to perform a machine-in-the-middle attack.
 * @kind path-problem
 * @problem.severity error
 * @security-severity 7.5
 * @precision high
 * @id cs/accept-any-certificate
 * @tags security
 *       external/cwe/cwe-295
 */

import csharp
import semmle.code.csharp.dataflow.DataFlow::DataFlow
import AcceptAnyCertificate::PathGraph

/**
 * Holds if `c` always returns `true` and never returns `false`, i.e. it accepts
 * every input it is given.
 */
predicate alwaysReturnsTrue(Callable c) {
  c.getReturnType() instanceof BoolType and
  // There is at least one returned value, and every returned value is the
  // constant `true`.
  forex(Expr ret | c.canReturn(ret) | ret.getValue() = "true")
}

/**
 * A delegate type used as a TLS/SSL certificate validation callback. Such a
 * delegate returns a `bool` (whether the certificate is trusted) and takes a
 * `System.Net.Security.SslPolicyErrors` parameter describing any validation
 * errors that were found. This covers `RemoteCertificateValidationCallback` as
 * well as the `Func<..., SslPolicyErrors, bool>` callbacks used by, for example,
 * `HttpClientHandler.ServerCertificateCustomValidationCallback`.
 */
class CertificateValidationCallbackType extends DelegateType {
  CertificateValidationCallbackType() {
    this.getReturnType() instanceof BoolType and
    this.getAParameter().getType().hasFullyQualifiedName("System.Net.Security", "SslPolicyErrors")
  }
}

/**
 * Gets a callable that always accepts any certificate, referenced by the
 * delegate-producing expression `e`.
 */
Callable getAcceptingCallable(Expr e) {
  // A lambda or anonymous method, e.g. `(sender, cert, chain, errors) => true`.
  result = e and
  alwaysReturnsTrue(e)
  or
  // A method group, e.g. `AcceptAllCertificates`, possibly wrapped in an
  // (implicit or explicit) delegate creation.
  result = e.(DelegateCreation).getArgument().(CallableAccess).getTarget() and
  alwaysReturnsTrue(result)
  or
  result = e.(CallableAccess).getTarget() and
  alwaysReturnsTrue(result)
}

module AcceptAnyCertificateConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    exists(getAcceptingCallable(source.asExpr()))
    or
    // `HttpClientHandler.DangerousAcceptAnyServerCertificateValidator` is a
    // built-in callback that accepts every certificate.
    source
        .asExpr()
        .(PropertyAccess)
        .getTarget()
        .hasName("DangerousAcceptAnyServerCertificateValidator")
  }

  predicate isSink(DataFlow::Node sink) {
    // The value assigned to a property, field or local of certificate
    // validation callback type.
    exists(Assignable a |
      a.getType() instanceof CertificateValidationCallbackType and
      sink.asExpr() = a.getAnAssignedValue()
    )
    or
    // The value passed as a certificate validation callback argument, e.g. to
    // the `SslStream` constructor.
    exists(Call call, Parameter p |
      p = call.getTarget().getAParameter() and
      p.getType() instanceof CertificateValidationCallbackType and
      sink.asExpr() = call.getArgumentForParameter(p)
    )
  }

  predicate observeDiffInformedIncrementalMode() { any() }
}

module AcceptAnyCertificate = DataFlow::Global<AcceptAnyCertificateConfig>;

from AcceptAnyCertificate::PathNode source, AcceptAnyCertificate::PathNode sink
where AcceptAnyCertificate::flowPath(source, sink)
select sink.getNode(), source, sink,
  "This TLS certificate validation $@, which trusts any certificate.", source.getNode(),
  "uses a callback"

View File

@@ -0,0 +1,4 @@
---
category: newQuery
---
* Added a new query, `cs/accept-any-certificate`, to detect TLS/SSL certificate validation callbacks that always accept any certificate (CWE-295).

View File

@@ -0,0 +1,24 @@
edges
| Test.cs:64:45:64:52 | access to local variable callback : (...) => ... | Test.cs:67:48:67:55 | access to local variable callback | provenance | |
| Test.cs:65:13:65:56 | (...) => ... : (...) => ... | Test.cs:64:45:64:52 | access to local variable callback : (...) => ... | provenance | |
nodes
| Test.cs:14:13:14:57 | (...) => ... | semmle.label | (...) => ... |
| Test.cs:22:13:25:13 | (...) => ... | semmle.label | (...) => ... |
| Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | semmle.label | access to property DangerousAcceptAnyServerCertificateValidator |
| Test.cs:40:13:40:56 | (...) => ... | semmle.label | (...) => ... |
| Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | semmle.label | delegate creation of type RemoteCertificateValidationCallback |
| Test.cs:59:13:59:56 | (...) => ... | semmle.label | (...) => ... |
| Test.cs:64:45:64:52 | access to local variable callback : (...) => ... | semmle.label | access to local variable callback : (...) => ... |
| Test.cs:65:13:65:56 | (...) => ... | semmle.label | (...) => ... |
| Test.cs:65:13:65:56 | (...) => ... : (...) => ... | semmle.label | (...) => ... : (...) => ... |
| Test.cs:67:48:67:55 | access to local variable callback | semmle.label | access to local variable callback |
subpaths
#select
| Test.cs:14:13:14:57 | (...) => ... | Test.cs:14:13:14:57 | (...) => ... | Test.cs:14:13:14:57 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:14:13:14:57 | (...) => ... | uses a callback |
| Test.cs:22:13:25:13 | (...) => ... | Test.cs:22:13:25:13 | (...) => ... | Test.cs:22:13:25:13 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:22:13:25:13 | (...) => ... | uses a callback |
| Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | This TLS certificate validation $@, which trusts any certificate. | Test.cs:33:13:33:74 | access to property DangerousAcceptAnyServerCertificateValidator | uses a callback |
| Test.cs:40:13:40:56 | (...) => ... | Test.cs:40:13:40:56 | (...) => ... | Test.cs:40:13:40:56 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:40:13:40:56 | (...) => ... | uses a callback |
| Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | This TLS certificate validation $@, which trusts any certificate. | Test.cs:52:67:52:75 | delegate creation of type RemoteCertificateValidationCallback | uses a callback |
| Test.cs:59:13:59:56 | (...) => ... | Test.cs:59:13:59:56 | (...) => ... | Test.cs:59:13:59:56 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:59:13:59:56 | (...) => ... | uses a callback |
| Test.cs:65:13:65:56 | (...) => ... | Test.cs:65:13:65:56 | (...) => ... | Test.cs:65:13:65:56 | (...) => ... | This TLS certificate validation $@, which trusts any certificate. | Test.cs:65:13:65:56 | (...) => ... | uses a callback |
| Test.cs:67:48:67:55 | access to local variable callback | Test.cs:65:13:65:56 | (...) => ... : (...) => ... | Test.cs:67:48:67:55 | access to local variable callback | This TLS certificate validation $@, which trusts any certificate. | Test.cs:65:13:65:56 | (...) => ... | uses a callback |

View File

@@ -0,0 +1 @@
Security Features/CWE-295/AcceptAnyCertificate.ql

View File

@@ -0,0 +1,89 @@
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

public class CertificateValidationTests
{
    public void HttpClientHandlerBad()
    {
        var handler = new HttpClientHandler();
        // BAD: always trusts any certificate.
        handler.ServerCertificateCustomValidationCallback =
            (request, certificate, chain, errors) => true;
    }

    public void HttpClientHandlerBlockBodyBad()
    {
        var handler = new HttpClientHandler();
        // BAD: always trusts any certificate.
        handler.ServerCertificateCustomValidationCallback =
            (request, certificate, chain, errors) =>
            {
                return true;
            };
    }

    public void HttpClientHandlerDangerousBad()
    {
        var handler = new HttpClientHandler();
        // BAD: built-in callback that accepts any certificate.
        handler.ServerCertificateCustomValidationCallback =
            HttpClientHandler.DangerousAcceptAnyServerCertificateValidator;
    }

    public void ServicePointManagerBad()
    {
        // BAD: always trusts any certificate.
        ServicePointManager.ServerCertificateValidationCallback =
            (sender, certificate, chain, errors) => true;
    }

    private static bool AcceptAll(object sender, X509Certificate certificate, X509Chain chain,
        SslPolicyErrors errors)
    {
        return true;
    }

    public void MethodGroupBad()
    {
        // BAD: the referenced method always returns true.
        ServicePointManager.ServerCertificateValidationCallback = AcceptAll;
    }

    public void SslStreamBad(Stream stream)
    {
        // BAD: the validation callback always returns true.
        var ssl = new SslStream(stream, false,
            (sender, certificate, chain, errors) => true);
    }

    public void IndirectBad(Stream stream)
    {
        RemoteCertificateValidationCallback callback =
            (sender, certificate, chain, errors) => true;
        // BAD: the callback flowing here always returns true.
        var ssl = new SslStream(stream, false, callback);
    }

    public void HttpClientHandlerGood()
    {
        var handler = new HttpClientHandler();
        // GOOD: the certificate is only trusted when there are no validation errors.
        handler.ServerCertificateCustomValidationCallback =
            (request, certificate, chain, errors) => errors == SslPolicyErrors.None;
    }

    private static bool Validate(object sender, X509Certificate certificate, X509Chain chain,
        SslPolicyErrors errors)
    {
        return errors == SslPolicyErrors.None;
    }

    public void MethodGroupGood()
    {
        // GOOD: the referenced method performs real validation.
        ServicePointManager.ServerCertificateValidationCallback = Validate;
    }
}

View File

@@ -0,0 +1,2 @@
semmle-extractor-options: /nostdlib /noconfig
semmle-extractor-options: --load-sources-from-project:${testdir}/../../../../resources/stubs/_frameworks/Microsoft.NETCore.App/Microsoft.NETCore.App.csproj

View File

@@ -280,11 +280,10 @@ pub fn location_label(writer: &mut trap::Writer, location: trap::Location) -> tr
}
/// Extracts the source file at `path`, which is assumed to be canonicalized.
/// When `desugarer` is `Some`, the parsed tree is first transformed
/// through the supplied yeast desugarer before TRAP extraction. Building
/// the desugarer (which parses YAML and constructs the schema) is the
/// caller's responsibility, allowing it to be done once and shared across
/// files.
/// When `yeast_runner` is `Some`, the parsed tree is first transformed
/// through the supplied yeast `Runner` before TRAP extraction. Building the
/// `Runner` (which parses YAML and constructs the schema) is the caller's
/// responsibility, allowing it to be done once and shared across files.
#[allow(clippy::too_many_arguments)]
pub fn extract(
language: &Language,
@@ -296,7 +295,7 @@ pub fn extract(
path: &Path,
source: &[u8],
ranges: &[Range],
desugarer: Option<&dyn yeast::Desugarer>,
yeast_runner: Option<&yeast::Runner<'_>>,
) {
let path_str = file_paths::normalize_and_transform_path(path, transformer);
let source_root = std::env::current_dir()
@@ -329,14 +328,11 @@ pub fn extract(
schema,
);
if let Some(desugarer) = desugarer {
let ast = desugarer
if let Some(yeast_runner) = yeast_runner {
let ast = yeast_runner
.run_from_tree(&tree, source)
.unwrap_or_else(|e| panic!("Desugaring failed for {path_str}: {e}"));
traverse_yeast(&ast, &mut visitor);
// Comments and other `extra` nodes are not represented in the desugared
// AST, so recover them directly from the original parse tree.
traverse_extras(&tree, &mut visitor);
} else {
traverse(&tree, &mut visitor);
}
@@ -369,8 +365,6 @@ struct Visitor<'a> {
ast_node_parent_table_name: String,
/// Language-specific name of the tokeninfo table
tokeninfo_table_name: String,
/// Language-specific name of the trivia tokeninfo table
trivia_tokeninfo_table_name: String,
/// A lookup table from type name to node types
schema: &'a NodeTypeMap,
/// A stack for gathering information from child nodes. Whenever a node is
@@ -401,33 +395,11 @@ impl<'a> Visitor<'a> {
ast_node_location_table_name: format!("{language_prefix}_ast_node_location"),
ast_node_parent_table_name: format!("{language_prefix}_ast_node_parent"),
tokeninfo_table_name: format!("{language_prefix}_tokeninfo"),
trivia_tokeninfo_table_name: format!("{language_prefix}_trivia_tokeninfo"),
schema,
stack: Vec::new(),
}
}
/// Emits a `TriviaToken` for the given `extra` node (e.g. a comment) from
/// the original parse tree. Trivia tokens carry a location and their source
/// text, but are not attached to a parent in the (possibly desugared) AST.
fn emit_trivia_token(&mut self, node: &Node) {
let id = self.trap_writer.fresh_id();
let loc = location_for(self, self.file_label, node);
let loc_label = location_label(self.trap_writer, loc);
self.trap_writer.add_tuple(
&self.ast_node_location_table_name,
vec![trap::Arg::Label(id), trap::Arg::Label(loc_label)],
);
self.trap_writer.add_tuple(
&self.trivia_tokeninfo_table_name,
vec![
trap::Arg::Label(id),
trap::Arg::Int(node.kind_id() as usize),
sliced_source_arg(self.source, node),
],
);
}
fn record_parse_error(&mut self, loc: trap::Label, mesg: &diagnostics::DiagnosticMessage) {
self.diagnostics_writer.write(mesg);
let id = self.trap_writer.fresh_id();
@@ -863,24 +835,6 @@ fn traverse(tree: &Tree, visitor: &mut Visitor) {
}
}
/// Walks the original tree-sitter tree and emits a `TriviaToken` for every
/// `extra` node (e.g. a comment). Used to preserve comments that would
/// otherwise be lost after a desugaring pass rewrites the tree.
fn traverse_extras(tree: &Tree, visitor: &mut Visitor) {
emit_extras_in(visitor, tree.root_node());
}
fn emit_extras_in(visitor: &mut Visitor, node: Node<'_>) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.is_extra() {
visitor.emit_trivia_token(&child);
} else {
emit_extras_in(visitor, child);
}
}
}
fn traverse_yeast(tree: &yeast::Ast, visitor: &mut Visitor) {
use yeast::Cursor;
let mut cursor = tree.walk();

View File

@@ -13,14 +13,11 @@ pub struct LanguageSpec {
pub prefix: &'static str,
pub ts_language: tree_sitter::Language,
pub node_types: &'static str,
/// Optional desugarer. When set, the parsed tree is rewritten through
/// the desugarer before TRAP extraction. The desugarer's
/// `output_node_types_yaml()` (if set) provides the schema used both
/// at runtime (for the rewriter) and for TRAP validation.
///
/// `Box<dyn yeast::Desugarer>` so the shared extractor is agnostic to
/// the user-defined context type the desugarer uses internally.
pub desugar: Option<Box<dyn yeast::Desugarer>>,
/// Optional yeast desugaring configuration. When set, the parsed
/// tree is rewritten through yeast before TRAP extraction. The
/// config's `output_node_types_yaml` (if set) provides the schema
/// used both at runtime (for the rewriter) and for TRAP validation.
pub desugar: Option<yeast::DesugaringConfig>,
pub file_globs: Vec<String>,
}
@@ -94,22 +91,35 @@ impl Extractor {
.collect();
let mut schemas = vec![];
let mut yeast_runners = Vec::new();
for lang in &self.languages {
let effective_node_types: String = match lang
.desugar
.as_ref()
.and_then(|d| d.output_node_types_yaml())
{
Some(yaml) => yeast::node_types_yaml::convert(yaml).map_err(|e| {
std::io::Error::other(format!(
"Failed to convert YAML node-types to JSON for {}: {e}",
lang.prefix
))
})?,
None => lang.node_types.to_string(),
};
let effective_node_types: String =
match lang.desugar.as_ref().and_then(|c| c.output_node_types_yaml) {
Some(yaml) => yeast::node_types_yaml::convert(yaml).map_err(|e| {
std::io::Error::other(format!(
"Failed to convert YAML node-types to JSON for {}: {e}",
lang.prefix
))
})?,
None => lang.node_types.to_string(),
};
let schema = node_types::read_node_types_str(lang.prefix, &effective_node_types)?;
schemas.push(schema);
// Build the yeast runner once per language so the YAML schema
// isn't re-parsed for every file.
let yeast_runner = lang
.desugar
.as_ref()
.map(|config| yeast::Runner::from_config(lang.ts_language.clone(), config))
.transpose()
.map_err(|e| {
std::io::Error::other(format!(
"Failed to build desugaring runner for {}: {e}",
lang.prefix
))
})?;
yeast_runners.push(yeast_runner);
}
// Construct a single globset containing all language globs,
@@ -184,7 +194,7 @@ impl Extractor {
&path,
&source,
&[],
lang.desugar.as_deref(),
yeast_runners[i].as_ref(),
);
std::fs::create_dir_all(src_archive_file.parent().unwrap())?;
std::fs::copy(&path, &src_archive_file)?;
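
Review note: the hunks above build one runner per language before any file is extracted, using `Option::map` followed by `transpose`, so a language without a desugaring config stays `None` while a failed build surfaces as an error. A minimal, self-contained sketch of that pattern is shown here. `Config`, `Runner`, `build_runners`, and the "empty schema" failure are hypothetical stand-ins invented for illustration, not the real yeast types or behaviour.

```rust
// Hypothetical stand-in for a per-language desugaring configuration.
struct Config {
    schema_yaml: &'static str,
}

// Hypothetical stand-in for a runner that is expensive to construct.
#[derive(Debug)]
struct Runner {
    rule_count: usize,
}

impl Runner {
    // Assumed for illustration: construction fails on an empty schema.
    fn from_config(config: &Config) -> Result<Runner, String> {
        if config.schema_yaml.is_empty() {
            return Err("empty schema".to_string());
        }
        Ok(Runner {
            rule_count: config.schema_yaml.lines().count(),
        })
    }
}

// Build at most one runner per language, up front, so per-file work can
// borrow it instead of rebuilding it.
fn build_runners(configs: &[Option<Config>]) -> Result<Vec<Option<Runner>>, String> {
    configs
        .iter()
        .map(|config| {
            config
                .as_ref()
                .map(Runner::from_config) // Option<Result<Runner, String>>
                .transpose() // Result<Option<Runner>, String>
        })
        .collect() // stops at the first Err
}
```

Per-file code can then take `runners[i].as_ref()` to obtain an `Option<&Runner>`, which mirrors the shape of the `yeast_runners[i].as_ref()` argument in the diff.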

View File

@@ -68,12 +68,7 @@ pub fn generate(
let node_parent_table_name = format!("{}_ast_node_parent", &prefix);
let token_name = format!("{}_token", &prefix);
let tokeninfo_name = format!("{}_tokeninfo", &prefix);
let trivia_token_name = format!("{}_trivia_token", &prefix);
let trivia_tokeninfo_name = format!("{}_trivia_tokeninfo", &prefix);
let reserved_word_name = format!("{}_reserved_word", &prefix);
// When a desugaring is configured, comments and other `extra` nodes are
// preserved from the original parse tree as `TriviaToken`s.
let has_trivia_tokens = language.desugar.is_some();
let effective_node_types: String = match language
.desugar
.as_ref()
@@ -90,35 +85,28 @@ pub fn generate(
let nodes = node_types::read_node_types_str(&prefix, &effective_node_types)?;
let (dbscheme_entries, mut ast_node_members, token_kinds) = convert_nodes(&nodes);
ast_node_members.insert(&token_name);
if has_trivia_tokens {
ast_node_members.insert(&trivia_token_name);
}
writeln!(&mut dbscheme_writer, "/*- {} dbscheme -*/", language.name)?;
dbscheme::write(&mut dbscheme_writer, &dbscheme_entries)?;
let token_case = create_token_case(&token_name, token_kinds);
let mut dbscheme_tail = vec![
dbscheme::Entry::Table(create_tokeninfo(&tokeninfo_name, &token_name)),
dbscheme::Entry::Case(token_case),
];
if has_trivia_tokens {
dbscheme_tail.push(dbscheme::Entry::Table(create_tokeninfo(
&trivia_tokeninfo_name,
&trivia_token_name,
)));
}
dbscheme_tail.push(dbscheme::Entry::Union(dbscheme::Union {
name: &ast_node_name,
members: ast_node_members,
}));
dbscheme_tail.push(dbscheme::Entry::Table(create_ast_node_location_table(
&node_location_table_name,
&ast_node_name,
)));
dbscheme_tail.push(dbscheme::Entry::Table(create_ast_node_parent_table(
&node_parent_table_name,
&ast_node_name,
)));
dbscheme::write(&mut dbscheme_writer, &dbscheme_tail)?;
dbscheme::write(
&mut dbscheme_writer,
&[
dbscheme::Entry::Table(create_tokeninfo(&tokeninfo_name, &token_name)),
dbscheme::Entry::Case(token_case),
dbscheme::Entry::Union(dbscheme::Union {
name: &ast_node_name,
members: ast_node_members,
}),
dbscheme::Entry::Table(create_ast_node_location_table(
&node_location_table_name,
&ast_node_name,
)),
dbscheme::Entry::Table(create_ast_node_parent_table(
&node_parent_table_name,
&ast_node_name,
)),
],
)?;
let mut body = vec![
ql::TopLevel::Class(ql_gen::create_ast_node_class(
@@ -128,12 +116,6 @@ pub fn generate(
)),
ql::TopLevel::Class(ql_gen::create_token_class(&token_name, &tokeninfo_name)),
];
if has_trivia_tokens {
body.push(ql::TopLevel::Class(ql_gen::create_trivia_token_class(
&trivia_token_name,
&trivia_tokeninfo_name,
)));
}
// Only emit the ReservedWord class when there are actually unnamed token
// types in the schema (i.e., @{prefix}_reserved_word exists in the dbscheme).
// When converting from a YEAST YAML schema that has no unnamed tokens, this

View File

@@ -199,70 +199,6 @@ pub fn create_token_class<'a>(token_type: &'a str, tokeninfo: &'a str) -> ql::Cl
}
}
/// Creates the `TriviaToken` class. Trivia tokens (e.g. comments) are
/// `extra` nodes preserved from the original parse tree even when the tree has
/// been rewritten by a desugaring pass. They are not part of the regular
/// `Token` hierarchy because they do not appear in the (possibly desugared)
/// output schema.
pub fn create_trivia_token_class<'a>(
trivia_token_type: &'a str,
trivia_tokeninfo: &'a str,
) -> ql::Class<'a> {
let trivia_tokeninfo_arity = 3; // id, kind, value
let get_value = ql::Predicate {
qldoc: Some(String::from("Gets the source text of this trivia token.")),
name: "getValue",
overridden: false,
is_private: false,
is_final: true,
return_type: Some(ql::Type::String),
formal_parameters: vec![],
body: create_get_field_expr_for_column_storage(
"result",
trivia_tokeninfo,
1,
trivia_tokeninfo_arity,
),
overlay: None,
};
let to_string = ql::Predicate {
qldoc: Some(String::from(
"Gets a string representation of this element.",
)),
name: "toString",
overridden: true,
is_private: false,
is_final: true,
return_type: Some(ql::Type::String),
formal_parameters: vec![],
body: ql::Expression::Equals(
Box::new(ql::Expression::Var("result")),
Box::new(ql::Expression::Dot(
Box::new(ql::Expression::Var("this")),
"getValue",
vec![],
)),
),
overlay: None,
};
ql::Class {
qldoc: Some(String::from(
"A trivia token, such as a comment, preserved from the original parse tree.",
)),
name: "TriviaToken",
is_abstract: false,
supertypes: vec![ql::Type::At(trivia_token_type), ql::Type::Normal("AstNode")]
.into_iter()
.collect(),
characteristic_predicate: None,
predicates: vec![
get_value,
to_string,
create_get_a_primary_ql_class("TriviaToken", false),
],
}
}
// Creates the `ReservedWord` class.
pub fn create_reserved_word_class(db_name: &str) -> ql::Class<'_> {
let class_name = "ReservedWord";

View File

@@ -44,19 +44,8 @@ pub fn query(input: TokenStream) -> TokenStream {
/// {expr} - embed a Rust expression returning Id
/// {..expr} - splice an iterable of Id (in child/field position)
/// field: {..expr} - splice into a named field
/// {expr}.map(p -> tpl) - apply tpl to each element; splice result
/// {expr}.reduce_left(f -> init, acc, e -> fold)
/// - fold with per-element init; splice 0 or 1 result
/// ```
///
/// Chain syntax after `{expr}` or `{..expr}`:
/// - `.map(param -> template)` — one output node per input element.
/// - `.reduce_left(first -> init, acc, elem -> fold)` — fold left; the first
/// element is converted by `init`, subsequent elements are folded by `fold`
/// with the accumulator bound to `acc`. An empty iterable yields nothing.
/// - Chains always splice (the result is iterable).
/// - Multiple chains can be chained, e.g. `.map(...).reduce_left(...)`.
///
/// Can be called with an explicit context or using the implicit context
/// from an enclosing `rule!`:
///
@@ -121,37 +110,3 @@ pub fn rule(input: TokenStream) -> TokenStream {
Err(err) => err.to_compile_error().into(),
}
}
/// Define a desugaring rule whose transform is a hand-written Rust block.
///
/// Use `manual_rule!` when the transform needs control over capture
/// translation timing — for example, when an outer rule needs to set
/// state in `ctx` (the `BuildCtx`'s user context) before recursive
/// translation reaches inner rules that read that state.
///
/// ```text
/// manual_rule!(
/// (query_pattern field: (_) @name)
/// {
/// // `ctx` is a `&mut BuildCtx<'_, C>`; capture variables
/// // (`name: NodeRef`, etc.) are bound from the query.
/// let translated = ctx.translate(name)?;
/// Ok(translated)
/// }
/// )
/// ```
///
/// Differences from [`rule!`]:
/// - Captures are **not** auto-translated before the body runs; they
/// refer to raw input-schema nodes. Use [`BuildCtx::translate`] (or
/// [`BuildCtx::translate_opt`]) to translate them when you choose.
/// - The body is plain Rust returning `Result<Vec<Id>, String>` — no
/// tree template, no `Ok(...)` wrap.
#[proc_macro]
pub fn manual_rule(input: TokenStream) -> TokenStream {
let input2: TokenStream2 = input.into();
match parse::parse_manual_rule_top(input2) {
Ok(output) => output.into(),
Err(err) => err.to_compile_error().into(),
}
}
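
Review note: the `.reduce_left(first -> init, acc, elem -> fold)` chain described in the doc comment in the first hunk of this file can be modelled as an ordinary function. The sketch below only illustrates the semantics stated there (seed the accumulator from the first element, fold the remaining elements, yield nothing for an empty input). `reduce_left` is a hypothetical free function written for this note, not the code the macro generates.

```rust
/// Folds `items` from the left. The first element seeds the accumulator via
/// `init`; every later element is combined into it via `fold`.
/// Returns `None` when `items` is empty.
fn reduce_left<T, A>(
    items: impl IntoIterator<Item = T>,
    init: impl FnOnce(T) -> A,
    fold: impl FnMut(A, T) -> A,
) -> Option<A> {
    let mut iter = items.into_iter();
    // An empty input produces no result at all.
    let first = iter.next()?;
    Some(iter.fold(init(first), fold))
}
```

The `Option` return type corresponds to the "splice 0 or 1 result" wording in the doc comment: callers splice the value if present and nothing otherwise.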

View File

@@ -121,9 +121,9 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
std::collections::HashMap::new();
let mut bare_children: Vec<TokenStream> = Vec::new();
let push_field_elem = |order: &mut Vec<String>,
map: &mut std::collections::HashMap<String, Vec<TokenStream>>,
name: String,
elem: TokenStream| {
map: &mut std::collections::HashMap<String, Vec<TokenStream>>,
name: String,
elem: TokenStream| {
if !map.contains_key(&name) {
order.push(name.clone());
map.insert(name, vec![elem]);
@@ -141,12 +141,7 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
// Parse the field's pattern. To support repetition like
// `field: (kind)* @cap`, parse the atom first, then check for
// a quantifier, and lastly handle a trailing `@capture`.
// `field: @cap` is sugar for `field: _ @cap`.
let atom = if peek_is_at(tokens) {
quote! { yeast::query::QueryNode::Any { match_unnamed: true } }
} else {
parse_query_atom(tokens)?
};
let atom = parse_query_atom(tokens)?;
if peek_is_repetition(tokens) {
let rep = expect_repetition(tokens)?;
let elem = quote! {
@@ -160,7 +155,8 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
} else {
let child = if peek_is_at(tokens) {
tokens.next();
let capture_name = expect_ident(tokens, "expected capture name after @")?;
let capture_name =
expect_ident(tokens, "expected capture name after @")?;
let name_str = capture_name.to_string();
quote! {
yeast::query::QueryNode::Capture {
@@ -263,7 +259,6 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
yeast::query::QueryListElem::SingleNode(#node)
},
)?;
let elem = maybe_wrap_list_capture(tokens, elem)?;
elems.push(elem);
continue;
}
@@ -281,7 +276,6 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
yeast::query::QueryListElem::SingleNode(#node)
},
)?;
let elem = maybe_wrap_list_capture(tokens, elem)?;
elems.push(elem);
continue;
}
@@ -295,10 +289,10 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
// tree! / trees! parsing — direct code generation against BuildCtx
// ---------------------------------------------------------------------------
const IMPLICIT_CTX: &str = "ctx";
const IMPLICIT_CTX: &str = "__yeast_ctx";
/// Determine the context identifier: either explicit `ctx,` or the implicit
/// `ctx` from an enclosing `rule!`.
/// `__yeast_ctx` from an enclosing `rule!`.
fn parse_ctx_or_implicit(tokens: &mut Tokens) -> Ident {
// Check if first token is an ident followed by a comma
let mut lookahead = tokens.clone();
@@ -358,7 +352,7 @@ fn parse_direct_node(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStream> {
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => {
let group = expect_group(tokens, Delimiter::Brace)?;
let expr = group.stream();
Ok(quote! { ::std::convert::Into::<usize>::into({ #expr }) })
Ok(quote! { ::std::convert::Into::<usize>::into(#expr) })
}
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Parenthesis => {
let group = expect_group(tokens, Delimiter::Parenthesis)?;
@@ -395,10 +389,8 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
let expr = group.stream();
return Ok(quote! {
{
let __expr = { #expr };
let __value = yeast::YeastDisplay::yeast_to_string(&__expr, &*#ctx.ast);
let __source_range = yeast::YeastSourceRange::yeast_source_range(&__expr, &*#ctx.ast);
#ctx.literal_with_source_range(#kind_str, &__value, __source_range)
let __value = yeast::YeastDisplay::yeast_to_string(&(#expr), &*#ctx.ast);
#ctx.literal(#kind_str, &__value)
}
});
}
@@ -419,11 +411,7 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
// Named fields — compute each value into a temp, then reference it
while peek_is_field(tokens) {
let field_name = expect_ident(tokens, "expected field name")?;
let field_str = field_name
.to_string()
.strip_prefix("r#")
.unwrap_or(&field_name.to_string())
.to_string();
let field_str = field_name.to_string().strip_prefix("r#").unwrap_or(&field_name.to_string()).to_string();
expect_punct(tokens, ':', "expected `:` after field name")?;
let temp = Ident::new(
&format!("__field_{field_str}_{field_counter}"),
@@ -431,36 +419,23 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
);
field_counter += 1;
// Check for field: {..expr}.chain or field: {expr}.chain — splice a Vec<Id> into the field
// Check for field: {..expr} — splice a Vec<Id> into the field
if peek_is_group(tokens, Delimiter::Brace) {
let group_clone = tokens.clone().next().unwrap();
if let TokenTree::Group(g) = &group_clone {
let mut inner_check = g.stream().into_iter();
let is_splice = matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.')
&& matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
// Determine if a chain (.map(..)) follows the `{}` group.
let mut after = tokens.clone();
after.next(); // skip the brace group
let has_chain =
matches!(after.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
if is_splice || has_chain {
if is_splice {
let group = expect_group(tokens, Delimiter::Brace)?;
let base: TokenStream = if is_splice {
let mut inner = group.stream().into_iter().peekable();
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
quote! {
{ #expr }.into_iter().map(::std::convert::Into::<usize>::into)
}
} else {
let expr = group.stream();
quote! { { #expr }.into_iter() }
};
let chained = parse_chain_suffix(tokens, ctx, base)?;
let mut inner = group.stream().into_iter().peekable();
inner.next(); // consume first .
inner.next(); // consume second .
let expr: proc_macro2::TokenStream = inner.collect();
stmts.push(quote! {
let #temp: Vec<usize> = #chained.collect();
let #temp: Vec<usize> = (#expr).into_iter()
.map(::std::convert::Into::<usize>::into)
.collect();
});
// An empty splice means the field is absent — skip it
// entirely rather than emitting an empty named field.
@@ -497,94 +472,6 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
})
}
/// Parse a chain of `.method(args)` suffixes after a `{expr}` or `{..expr}`
/// placeholder in tree templates. Currently supports:
///
/// ```text
/// .map(param -> template) -- iterator map: produces Vec<usize>
/// ```
///
/// The chain may be empty (returns `base` unchanged). Multiple chained calls
/// are supported, e.g. `.map(p -> ...).map(q -> ...)`.
///
/// Each call expects the receiver to be an iterator. The `base` argument
/// should therefore already be an iterator (use `.into_iter()` on it before
/// calling this function).
fn parse_chain_suffix(tokens: &mut Tokens, ctx: &Ident, base: TokenStream) -> Result<TokenStream> {
let mut current = base;
while matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.') {
tokens.next(); // consume .
let method = expect_ident(tokens, "expected method name after `.`")?;
let method_str = method.to_string();
let args_group = expect_group(tokens, Delimiter::Parenthesis)?;
match method_str.as_str() {
"map" => {
let mut inner = args_group.stream().into_iter().peekable();
let param = expect_ident(&mut inner, "expected lambda parameter name")?;
expect_punct(&mut inner, '-', "expected `->` after lambda parameter")?;
expect_punct(&mut inner, '>', "expected `->` after lambda parameter")?;
let body = parse_direct_node(&mut inner, ctx)?;
if let Some(tok) = inner.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after lambda body",
));
}
current = quote! {
#current.map(|#param| #body)
};
}
"reduce_left" => {
// Syntax: reduce_left(first -> init_tpl, acc, elem -> fold_tpl)
// - first -> init_tpl : converts the first element to the initial accumulator
// - acc, elem -> fold_tpl : fold step (acc = current accumulator, elem = next element)
// Empty iterator produces an empty iterator; non-empty produces a single-element iterator.
let mut inner = args_group.stream().into_iter().peekable();
let init_param = expect_ident(&mut inner, "expected initial lambda parameter")?;
expect_punct(&mut inner, '-', "expected `->` after init parameter")?;
expect_punct(&mut inner, '>', "expected `->` after init parameter")?;
let init_body = parse_direct_node(&mut inner, ctx)?;
expect_punct(&mut inner, ',', "expected `,` after init template")?;
let acc_param = expect_ident(&mut inner, "expected accumulator parameter")?;
expect_punct(&mut inner, ',', "expected `,` after accumulator parameter")?;
let elem_param = expect_ident(&mut inner, "expected element parameter")?;
expect_punct(&mut inner, '-', "expected `->` after element parameter")?;
expect_punct(&mut inner, '>', "expected `->` after element parameter")?;
let fold_body = parse_direct_node(&mut inner, ctx)?;
if let Some(tok) = inner.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after fold template",
));
}
current = quote! {
{
let mut __iter = #current;
let __result: Option<usize> = if let Some(#init_param) = __iter.next() {
let mut __acc: usize = #init_body;
for #elem_param in __iter {
let #acc_param: usize = __acc;
__acc = #fold_body;
}
Some(__acc)
} else {
None
};
__result.into_iter()
}
};
}
_ => {
return Err(syn::Error::new_spanned(
method,
format!("unknown builtin method `.{method_str}()`"),
));
}
}
}
Ok(current)
}
/// Parse the top-level list of a `trees!` template.
/// Each item is a node template or `{expr}` splice.
fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream>> {
@@ -605,33 +492,23 @@ fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream
continue;
}
// {expr} or {..expr} (with optional .chain) — single node or splice
// {expr} or {..expr} — single node or splice
if peek_is_group(tokens, Delimiter::Brace) {
let group = expect_group(tokens, Delimiter::Brace)?;
let has_chain =
matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
let mut inner = group.stream().into_iter().peekable();
let is_splice = peek_is_dotdot(&inner);
if is_splice || has_chain {
let base: TokenStream = if is_splice {
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
quote! {
{ #expr }.into_iter().map(::std::convert::Into::<usize>::into)
}
} else {
let expr = group.stream();
quote! { { #expr }.into_iter() }
};
let chained = parse_chain_suffix(tokens, ctx, base)?;
if peek_is_dotdot(&inner) {
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
items.push(quote! {
__nodes.extend(#chained);
__nodes.extend(
(#expr).into_iter().map(::std::convert::Into::<usize>::into)
);
});
} else {
let expr = group.stream();
items.push(quote! {
__nodes.push(::std::convert::Into::<usize>::into({ #expr }));
__nodes.push(::std::convert::Into::<usize>::into(#expr));
});
}
continue;
@@ -727,11 +604,8 @@ fn extract_captures_inner(
}
last_mult = CaptureMultiplicity::Single;
}
TokenTree::Punct(p) if p.as_char() == '*' || p.as_char() == '+' => {
last_mult = CaptureMultiplicity::Repeated;
}
TokenTree::Punct(p) if p.as_char() == '?' => {
last_mult = CaptureMultiplicity::Optional;
TokenTree::Punct(p) if matches!(p.as_char(), '*' | '+' | '?') => {
// Keep last_mult — the @capture follows
}
_ => {
last_mult = CaptureMultiplicity::Single;
@@ -889,117 +763,10 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
Ok(quote! {
{
let __query = #query_code;
yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, mut __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>, __user_ctx: &mut _, __translator: yeast::TranslatorHandle<'_, _>| {
// Auto-translation prefix: recursively translate every
// captured node before invoking the user's transform body.
// For OneShot rules this preserves the legacy behaviour
// (input-schema captures translated to output-schema
// nodes); for Repeating rules it is a no-op.
__translator.auto_translate_captures(&mut __captures, __ast, __user_ctx)?;
yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>| {
#(#bindings)*
let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
let __result: Vec<usize> = { #transform_body };
Ok(__result)
}))
}
})
}
/// Parse `manual_rule!( query { body } )`.
///
/// Like [`parse_rule_top`] but:
/// - Expects a Rust block `{ ... }` after the query (no `=>` arrow).
/// - Generates code that does NOT auto-translate captures before
/// running the body. Capture variables refer to raw (input-schema)
/// nodes; the body is responsible for explicit translation via
/// `ctx.translate(...)`.
/// - The body is included verbatim and must evaluate to
/// `Result<Vec<usize>, String>`.
pub fn parse_manual_rule_top(input: TokenStream) -> Result<TokenStream> {
let mut tokens = input.into_iter().peekable();
// Collect query tokens up to the body block `{ ... }`.
let mut query_tokens = Vec::new();
loop {
match tokens.peek() {
None => {
return Err(syn::Error::new(
Span::call_site(),
"expected a Rust block `{ ... }` after the query in manual_rule!",
))
}
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => break,
_ => {
query_tokens.push(tokens.next().unwrap());
}
}
}
let query_stream: TokenStream = query_tokens.into_iter().collect();
// Extract captures from the query (same as in `rule!`).
let captures = extract_captures(&query_stream);
// Parse the query into the QueryNode-building expression.
let query_code = parse_query_top(query_stream)?;
// Generate capture bindings (same as in `rule!`).
let ctx_ident = Ident::new(IMPLICIT_CTX, Span::call_site());
let bindings: Vec<TokenStream> = captures
.iter()
.map(|cap| {
let name = Ident::new(&cap.name, Span::call_site());
let name_str = &cap.name;
match cap.multiplicity {
CaptureMultiplicity::Repeated => quote! {
let #name: Vec<yeast::NodeRef> = __captures.get_all(#name_str)
.into_iter()
.map(yeast::NodeRef)
.collect();
},
CaptureMultiplicity::Optional => quote! {
let #name: Option<yeast::NodeRef> =
__captures.get_opt(#name_str).map(yeast::NodeRef);
},
CaptureMultiplicity::Single => quote! {
let #name: yeast::NodeRef =
yeast::NodeRef(__captures.get_var(#name_str).unwrap());
},
}
})
.collect();
// Consume the body block.
let body_group = match tokens.next() {
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => g,
other => {
return Err(syn::Error::new(
Span::call_site(),
format!(
"expected a Rust block `{{ ... }}` after the query in manual_rule!, found: {other:?}"
),
))
}
};
let body_stream = body_group.stream();
// No tokens should follow the body.
if let Some(tok) = tokens.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after manual_rule! body",
));
}
Ok(quote! {
{
let __query = #query_code;
yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>, __user_ctx: &mut _, __translator: yeast::TranslatorHandle<'_, _>| {
// No auto-translate prefix for manual rules — the body
// is responsible for translating captures explicitly.
#(#bindings)*
let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
#body_stream
let mut #ctx_ident = yeast::build::BuildCtx::with_source_range(__ast, &__captures, __fresh, __source_range);
#transform_body
}))
}
})
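
Note on the removed `reduce_left` chain method: the expansion deleted above behaves like a seeded left fold. The first element is converted into the initial accumulator, each later element is folded into it, and an empty input yields an empty iterator. Below is a minimal standalone sketch of that behavior, assuming plain closures in place of the `first -> init_tpl` and `acc, elem -> fold_tpl` templates. The function name and generic signature are illustrative only and are not part of yeast.

```rust
/// Illustrative stand-in for the expansion of `.reduce_left(...)`.
/// `init` plays the role of `first -> init_tpl`;
/// `step` plays the role of `acc, elem -> fold_tpl`.
fn reduce_left<I, A>(
    mut iter: I,
    init: impl FnOnce(I::Item) -> A,
    mut step: impl FnMut(A, I::Item) -> A,
) -> impl Iterator<Item = A>
where
    I: Iterator,
{
    let result = match iter.next() {
        // Non-empty input: seed the accumulator from the first element,
        // then fold the remaining elements from left to right.
        Some(first) => {
            let mut acc = init(first);
            for elem in iter {
                acc = step(acc, elem);
            }
            Some(acc)
        }
        // Empty input: no accumulator is ever created.
        None => None,
    };
    // `Option::into_iter` yields zero or one item, matching the
    // "empty or single-element iterator" contract in the removed comment.
    result.into_iter()
}
```

Returning an iterator rather than an `Option` is what allowed the removed code to feed the result into a further `.map(...)` or into `.collect()` like any other chain step.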

@@ -265,21 +265,7 @@ occurrences of the same `$name` within one `BuildCtx` share the same value:
)
```
The contents of `{…}` are treated as a Rust block, so multi-statement
expressions (with `let` bindings) work too:
```rust
(assignment
left: {tmp}
right: {
let lit = ctx.literal("integer", "0");
tree!((binary_expr op: (operator "+") left: {tmp} right: {lit}))
})
```
`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`); the contents
are likewise a Rust block, so the splice can be the result of arbitrary
computation:
`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`):
```rust
yeast::trees!(ctx,

@@ -20,7 +20,7 @@ fn main() {
let args = Cli::parse();
let language = get_language(&args.language);
let source = std::fs::read_to_string(&args.file).unwrap();
let runner: yeast::Runner = yeast::Runner::new(language, &[]);
let runner = yeast::Runner::new(language, &[]);
let ast = runner.run(&source).unwrap();
println!("{}", ast.print(&source, ast.get_root()));
}

@@ -2,60 +2,28 @@ use std::collections::BTreeMap;
use crate::captures::Captures;
use crate::tree_builder::FreshScope;
use crate::{Ast, FieldId, Id, NodeContent, TranslatorHandle};
use crate::{Ast, FieldId, Id, NodeContent};
/// Context for building new AST nodes during a transformation.
///
/// Used by the `tree!` and `trees!` macros. Holds a mutable reference to the
/// AST, a reference to the captures from a query match, a `FreshScope` for
/// generating unique identifiers, and a mutable reference to a user-defined
/// context of type `C`.
///
/// The user context `C` is shared across rules via the framework's driver:
/// outer rules can write to it before recursive translation, and inner rules
/// can read (or further mutate) it during their transforms. The framework
/// snapshots and restores the user context around each rule application, so
/// mutations made by a rule are visible to its descendants (via recursive
/// translation) but not to its parent's siblings.
///
/// `BuildCtx` implements [`Deref`] and [`DerefMut`] targeting `C`, so user
/// context fields are accessible as `ctx.my_field` directly (provided they
/// don't collide with `BuildCtx`'s own fields like `ast`, `captures`, etc.).
///
/// The default `C = ()` means rules that don't need any user context don't
/// pay any cost.
///
/// When constructed by the framework (via the rule! macro), `BuildCtx` also
/// carries a [`TranslatorHandle`] that the [`translate`] method delegates
/// to. When constructed by hand (e.g. in tests), the translator is `None`
/// and [`translate`] returns an error.
pub struct BuildCtx<'a, C: 'a = ()> {
/// AST, a reference to the captures from a query match, and a `FreshScope` for
/// generating unique identifiers.
pub struct BuildCtx<'a> {
pub ast: &'a mut Ast,
pub captures: &'a Captures,
pub fresh: &'a FreshScope,
/// Source range of the matched node, inherited by synthetic nodes.
pub source_range: Option<tree_sitter::Range>,
/// User-supplied context, accessible directly via `ctx.field` (via Deref).
pub user_ctx: &'a mut C,
/// Optional translator handle, populated when the context is built by
/// the framework's rule driver. None when the context is built by hand.
pub(crate) translator: Option<TranslatorHandle<'a, C>>,
}
impl<'a, C> BuildCtx<'a, C> {
pub fn new(
ast: &'a mut Ast,
captures: &'a Captures,
fresh: &'a FreshScope,
user_ctx: &'a mut C,
) -> Self {
impl<'a> BuildCtx<'a> {
pub fn new(ast: &'a mut Ast, captures: &'a Captures, fresh: &'a FreshScope) -> Self {
Self {
ast,
captures,
fresh,
source_range: None,
user_ctx,
translator: None,
}
}
@@ -64,35 +32,12 @@ impl<'a, C> BuildCtx<'a, C> {
captures: &'a Captures,
fresh: &'a FreshScope,
source_range: Option<tree_sitter::Range>,
user_ctx: &'a mut C,
) -> Self {
Self {
ast,
captures,
fresh,
source_range,
user_ctx,
translator: None,
}
}
/// Construct a `BuildCtx` carrying a translator handle. Used by the
/// `rule!` macro to enable [`translate`] inside rule transforms.
pub fn with_translator(
ast: &'a mut Ast,
captures: &'a Captures,
fresh: &'a FreshScope,
source_range: Option<tree_sitter::Range>,
user_ctx: &'a mut C,
translator: TranslatorHandle<'a, C>,
) -> Self {
Self {
ast,
captures,
fresh,
source_range,
user_ctx,
translator: Some(translator),
}
}
@@ -137,83 +82,10 @@ impl<'a, C> BuildCtx<'a, C> {
.create_named_token_with_range(kind, value.to_string(), self.source_range)
}
/// Create a leaf node with fixed content and an optional preferred source range.
/// If `source_range` is `None`, falls back to this context's inherited range.
pub fn literal_with_source_range(
&mut self,
kind: &'static str,
value: &str,
source_range: Option<tree_sitter::Range>,
) -> Id {
self.ast.create_named_token_with_range(
kind,
value.to_string(),
source_range.or(self.source_range),
)
}
/// Create a leaf node with an auto-generated unique name.
pub fn fresh(&mut self, kind: &'static str, name: &str) -> Id {
let generated = self.fresh.resolve(name);
self.ast
.create_named_token_with_range(kind, generated, self.source_range)
}
/// Prepend a value to a field of an existing node.
pub fn prepend_field(&mut self, node_id: Id, field_name: &str, value_id: Id) {
let field_id = self
.ast
.field_id_for_name(field_name)
.unwrap_or_else(|| panic!("build: field '{field_name}' not found"));
self.ast.prepend_field_child(node_id, field_id, value_id);
}
}
impl<C: Clone> BuildCtx<'_, C> {
/// Recursively translate a node via the framework's rule machinery.
/// In a OneShot phase, applies OneShot rules to the given node and
/// returns the resulting node ids. In a Repeating phase, errors
/// (translation is not meaningful when input and output share a
/// schema).
///
/// Accepts any value convertible to [`Id`] (including [`crate::NodeRef`]),
/// so manual rules can pass capture bindings directly without unwrapping.
///
/// Errors if this `BuildCtx` was constructed by hand (without a
/// translator handle) — for example, in unit tests that don't go
/// through the rule driver.
pub fn translate<I: Into<Id>>(&mut self, id: I) -> Result<Vec<Id>, String> {
let id = id.into();
match &self.translator {
Some(t) => t.translate(self.ast, self.user_ctx, id),
None => Err("translate() called on a BuildCtx without a translator handle".into()),
}
}
/// Translate an optional capture, returning the first translated id or
/// `None`. Convenience for `?`-quantifier captures (`Option<NodeRef>`).
///
/// If the underlying translation produces multiple ids for a single
/// input, only the first is returned. For most use cases (e.g.
/// translating a single type annotation) this is what you want; if
/// you need all ids, use [`translate`] directly.
pub fn translate_opt<I: Into<Id>>(&mut self, id: Option<I>) -> Result<Option<Id>, String> {
match id {
Some(id) => Ok(self.translate(id)?.into_iter().next()),
None => Ok(None),
}
}
}
impl<C> std::ops::Deref for BuildCtx<'_, C> {
type Target = C;
fn deref(&self) -> &C {
&*self.user_ctx
}
}
impl<C> std::ops::DerefMut for BuildCtx<'_, C> {
fn deref_mut(&mut self) -> &mut C {
&mut *self.user_ctx
}
}
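
Note on the removed `Deref`/`DerefMut` impls: they are what let rule bodies write `ctx.my_field` instead of `ctx.user_ctx.my_field`. Below is a small self-contained sketch of that pattern. `Ctx`, `Scopes`, and `enter_scope` are hypothetical names standing in for `BuildCtx`, a user context type, and a rule body; none of them come from the yeast crate.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for `BuildCtx`: one built-in field plus a
/// mutable borrow of a user-defined context `C`.
struct Ctx<'a, C> {
    source_range: Option<(usize, usize)>,
    user_ctx: &'a mut C,
}

impl<C> Deref for Ctx<'_, C> {
    type Target = C;
    fn deref(&self) -> &C {
        &*self.user_ctx
    }
}

impl<C> DerefMut for Ctx<'_, C> {
    fn deref_mut(&mut self) -> &mut C {
        &mut *self.user_ctx
    }
}

/// Hypothetical user context carried across rules.
#[derive(Default)]
struct Scopes {
    depth: usize,
}

/// `ctx.depth` is not a field of `Ctx`, so field lookup falls through
/// `DerefMut` to `Scopes::depth`. A field that `Ctx` declares itself
/// (here `source_range`) always takes precedence over a same-named
/// field on the user context.
fn enter_scope(ctx: &mut Ctx<'_, Scopes>) -> usize {
    ctx.depth += 1;
    ctx.depth
}
```

The trade-off, as the removed doc comment points out, is that user context fields must not collide with the wrapper's own field names. With this diff applied, `BuildCtx` no longer carries a user context, so the question goes away.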

@@ -53,7 +53,12 @@ pub fn dump_ast_with_options(
///
/// Any node that does not match the expected type set for its parent field is
/// rendered with a trailing `" <-- ERROR: ..."` annotation on the same line.
pub fn dump_ast_with_type_errors(ast: &Ast, root: usize, source: &str, schema: &Schema) -> String {
pub fn dump_ast_with_type_errors(
ast: &Ast,
root: usize,
source: &str,
schema: &Schema,
) -> String {
dump_ast_with_type_errors_and_options(ast, root, source, schema, &DumpOptions::default())
}
@@ -69,15 +74,7 @@ pub fn dump_ast_with_type_errors_and_options(
options: &DumpOptions,
) -> String {
let mut out = String::new();
dump_node(
ast,
root,
source,
options,
0,
Some((schema, None, None)),
&mut out,
);
dump_node(ast, root, source, options, 0, Some((schema, None, None)), &mut out);
out
}
@@ -235,8 +232,8 @@ fn dump_node(
}
let field_name = ast.field_name_for_id(field_id).unwrap_or("?");
let child_type_check = type_check.map(|(schema, _, _)| {
let expected =
expected_for_field(schema, node.kind_name(), field_id).or(Some(EMPTY_NODE_TYPES));
let expected = expected_for_field(schema, node.kind_name(), field_id)
.or(Some(EMPTY_NODE_TYPES));
let parent_field = Some((node.kind_name(), field_name));
(schema, expected, parent_field)
});

@@ -16,7 +16,7 @@ pub mod schema;
pub mod tree_builder;
mod visitor;
pub use yeast_macros::{manual_rule, query, rule, tree, trees};
pub use yeast_macros::{query, rule, tree, trees};
use captures::Captures;
pub use cursor::Cursor;
@@ -58,30 +58,12 @@ pub trait YeastDisplay {
fn yeast_to_string(&self, ast: &Ast) -> String;
}
/// Optional source range for values used in `#{expr}` interpolations.
///
/// By default this returns `None`, so synthesized leaves inherit the matched
/// rule's source range. `NodeRef` returns the referenced node's range, letting
/// `(kind #{capture})` carry the captured node's location.
pub trait YeastSourceRange {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range>;
}
impl YeastDisplay for NodeRef {
fn yeast_to_string(&self, ast: &Ast) -> String {
ast.source_text(self.0)
}
}
impl YeastSourceRange for NodeRef {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
ast.get_node(self.0).and_then(|n| match &n.content {
NodeContent::Range(r) => Some(r.clone()),
_ => n.source_range,
})
}
}
macro_rules! impl_yeast_display_via_display {
($($t:ty),* $(,)?) => {
$(
@@ -90,12 +72,6 @@ macro_rules! impl_yeast_display_via_display {
::std::string::ToString::to_string(self)
}
}
impl YeastSourceRange for $t {
fn yeast_source_range(&self, _ast: &Ast) -> Option<tree_sitter::Range> {
None
}
}
)*
};
}
@@ -114,12 +90,6 @@ impl<T: YeastDisplay + ?Sized> YeastDisplay for &T {
}
}
impl<T: YeastSourceRange + ?Sized> YeastSourceRange for &T {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
(**self).yeast_source_range(ast)
}
}
pub const CHILD_FIELD: u16 = u16::MAX;
#[derive(Debug)]
@@ -297,9 +267,7 @@ impl Ast {
/// Returns the source text for `id`, resolving `NodeContent::Range`
/// against the stored source bytes when available.
pub fn source_text(&self, id: Id) -> String {
let Some(node) = self.get_node(id) else {
return String::new();
};
let Some(node) = self.get_node(id) else { return String::new(); };
let read_range = |range: &tree_sitter::Range| {
let start = range.start_byte;
let end = range.end_byte;
@@ -400,15 +368,6 @@ impl Ast {
is_named: bool,
source_range: Option<tree_sitter::Range>,
) -> Id {
let source_range = match &content {
// Parsed nodes already carry an exact source range in their content.
NodeContent::Range(_) => source_range,
// Synthesized nodes derive location from children when possible,
// and fall back to the inherited rule-match range otherwise.
_ => self
.union_source_range_of_children(&fields)
.or(source_range),
};
let id = self.nodes.len();
self.nodes.push(Node {
kind,
@@ -424,79 +383,10 @@ impl Ast {
id
}
fn union_source_range_of_children(
&self,
fields: &BTreeMap<FieldId, Vec<Id>>,
) -> Option<tree_sitter::Range> {
let mut start_byte: Option<usize> = None;
let mut end_byte: Option<usize> = None;
let mut start_point = tree_sitter::Point { row: 0, column: 0 };
let mut end_point = tree_sitter::Point { row: 0, column: 0 };
for child_ids in fields.values() {
for &child_id in child_ids {
let Some(child) = self.get_node(child_id) else {
continue;
};
let child_start_byte = child.start_byte();
let child_end_byte = child.end_byte();
// Skip children that carry no usable location.
if child_start_byte == 0 && child_end_byte == 0 {
continue;
}
match start_byte {
None => {
start_byte = Some(child_start_byte);
start_point = child.start_position();
}
Some(current_start) if child_start_byte < current_start => {
start_byte = Some(child_start_byte);
start_point = child.start_position();
}
_ => {}
}
match end_byte {
None => {
end_byte = Some(child_end_byte);
end_point = child.end_position();
}
Some(current_end) if child_end_byte > current_end => {
end_byte = Some(child_end_byte);
end_point = child.end_position();
}
_ => {}
}
}
}
match (start_byte, end_byte) {
(Some(start_byte), Some(end_byte)) => Some(tree_sitter::Range {
start_byte,
end_byte,
start_point,
end_point,
}),
_ => None,
}
}
pub fn create_named_token(&mut self, kind: &'static str, content: String) -> Id {
self.create_named_token_with_range(kind, content, None)
}
/// Prepend a child id to the given field of the given node.
pub fn prepend_field_child(&mut self, node_id: Id, field_id: FieldId, value_id: Id) {
let node = self
.nodes
.get_mut(node_id)
.expect("prepend_field_child: invalid node id");
node.fields.entry(field_id).or_default().insert(0, value_id);
}
pub fn create_named_token_with_range(
&mut self,
kind: &'static str,
@@ -705,118 +595,18 @@ impl From<tree_sitter::Range> for NodeContent {
}
}
/// A handle that lets a rule transform recursively translate AST nodes via
/// the framework's rule machinery. Constructed by the driver and passed as
/// the last argument of every [`Transform`] invocation.
///
/// The `rule!` macro uses [`TranslatorHandle::auto_translate_captures`] in
/// its generated prefix to translate captures before running the user's
/// transform body. Manually-written transforms (using [`Rule::new`]
/// directly) can call [`TranslatorHandle::translate`] selectively on
/// specific node ids to control when translation happens.
pub struct TranslatorHandle<'a, C> {
inner: TranslatorImpl<'a, C>,
}
/// Internal phase-specific translation state. Kept private — callers
/// interact with [`TranslatorHandle`] only.
enum TranslatorImpl<'a, C> {
/// OneShot phase translator: recursively applies OneShot rules.
OneShot {
index: &'a RuleIndex<'a, C>,
fresh: &'a tree_builder::FreshScope,
rewrite_depth: usize,
/// The id of the node the current rule is matching. Used by
/// [`auto_translate_captures`] to avoid infinite recursion when a
/// rule captures its own match root (e.g. via `(_) @_`).
matched_root: Id,
},
/// Repeating phase translator: translation is not meaningful here
/// (input and output schemas are the same). [`translate`] errors;
/// [`auto_translate_captures`] is a no-op so the macro's auto-prefix
/// works unchanged for Repeating rules.
Repeating,
}
impl<'a, C: Clone> TranslatorHandle<'a, C> {
/// Recursively apply OneShot rules to `id` and return the resulting
/// node ids. Errors in a Repeating phase (where translation is not
/// meaningful).
pub fn translate(&self, ast: &mut Ast, user_ctx: &mut C, id: Id) -> Result<Vec<Id>, String> {
match &self.inner {
TranslatorImpl::OneShot {
index,
fresh,
rewrite_depth,
..
} => apply_one_shot_rules_inner(index, ast, user_ctx, id, fresh, rewrite_depth + 1),
TranslatorImpl::Repeating => {
Err("translate() is not available in a Repeating phase".into())
}
}
}
/// Translate every captured node in `captures` in place (OneShot phase
/// only). In a Repeating phase this is a no-op — Repeating rules
/// receive raw captures.
///
/// Used by the `rule!` macro's generated prefix to preserve the
/// pre-existing "auto-translate captures before running the transform
/// body" behavior. Manually-written transforms typically translate
/// captures selectively via [`translate`] instead.
///
/// To avoid infinite recursion, a capture whose id matches the rule's
/// matched root (e.g. from a `(_) @_` pattern) is left unchanged.
pub fn auto_translate_captures(
&self,
captures: &mut Captures,
ast: &mut Ast,
user_ctx: &mut C,
) -> Result<(), String> {
match &self.inner {
TranslatorImpl::OneShot { matched_root, .. } => {
let root = *matched_root;
captures.try_map_all_captures(|cid| {
if cid == root {
Ok(vec![cid])
} else {
self.translate(ast, user_ctx, cid)
}
})
}
TranslatorImpl::Repeating => Ok(()),
}
}
}
/// The transform function for a rule.
///
/// Takes the AST, the (raw, untranslated) captured variables, a fresh-name
/// scope, the source range of the matched node, a mutable reference to the
/// user context of type `C`, and a [`TranslatorHandle`] for recursively
/// translating nodes. Returns the IDs of the replacement nodes, or an
/// error message if the transform could not be completed.
///
/// Transforms produced by [`Rule::new`] receive **raw** captures and must
/// translate them themselves (via the handle). Transforms produced by the
/// `rule!` macro have an auto-translation prefix injected for backward
/// compatibility.
pub type Transform<C = ()> = Box<
dyn Fn(
&mut Ast,
Captures,
&tree_builder::FreshScope,
Option<tree_sitter::Range>,
&mut C,
TranslatorHandle<'_, C>,
) -> Result<Vec<Id>, String>
/// The transform function for a rule: takes the AST, captured variables, a
/// fresh-name scope, and the source range of the matched node, and returns
/// the IDs of the replacement nodes.
pub type Transform = Box<
dyn Fn(&mut Ast, Captures, &tree_builder::FreshScope, Option<tree_sitter::Range>) -> Vec<Id>
+ Send
+ Sync,
>;
pub struct Rule<C = ()> {
pub struct Rule {
query: QueryNode,
transform: Transform<C>,
transform: Transform,
/// If true, after this rule fires on a node the engine will try to
/// re-apply this same rule on the result root. Defaults to false:
/// each rule fires at most once on a given node, which prevents
@@ -824,8 +614,8 @@ pub struct Rule<C = ()> {
repeated: bool,
}
impl<C> Rule<C> {
pub fn new(query: QueryNode, transform: Transform<C>) -> Self {
impl Rule {
pub fn new(query: QueryNode, transform: Transform) -> Self {
Self {
query,
transform,
@@ -847,13 +637,9 @@ impl<C> Rule<C> {
ast: &mut Ast,
node: Id,
fresh: &tree_builder::FreshScope,
user_ctx: &mut C,
translator: TranslatorHandle<'_, C>,
) -> Result<Option<Vec<Id>>, String> {
match self.try_match(ast, node)? {
Some(captures) => Ok(Some(
self.run_transform(ast, captures, node, fresh, user_ctx, translator)?,
)),
Some(captures) => Ok(Some(self.run_transform(ast, captures, node, fresh))),
None => Ok(None),
}
}
@@ -877,31 +663,29 @@ impl<C> Rule<C> {
captures: Captures,
node: Id,
fresh: &tree_builder::FreshScope,
user_ctx: &mut C,
translator: TranslatorHandle<'_, C>,
) -> Result<Vec<Id>, String> {
) -> Vec<Id> {
fresh.next_scope();
let source_range = ast.get_node(node).and_then(|n| match n.content {
NodeContent::Range(r) => Some(r),
_ => n.source_range,
});
(self.transform)(ast, captures, fresh, source_range, user_ctx, translator)
(self.transform)(ast, captures, fresh, source_range)
}
}
const MAX_REWRITE_DEPTH: usize = 100;
/// Index of rules by their root query kind for fast lookup.
struct RuleIndex<'a, C> {
struct RuleIndex<'a> {
/// Rules indexed by root node kind name.
by_kind: BTreeMap<&'static str, Vec<&'a Rule<C>>>,
by_kind: BTreeMap<&'static str, Vec<&'a Rule>>,
/// Rules with wildcard queries (Any) that apply to all nodes.
wildcard: Vec<&'a Rule<C>>,
wildcard: Vec<&'a Rule>,
}
impl<'a, C> RuleIndex<'a, C> {
fn new(rules: &'a [Rule<C>]) -> Self {
let mut by_kind: BTreeMap<&'static str, Vec<&'a Rule<C>>> = BTreeMap::new();
impl<'a> RuleIndex<'a> {
fn new(rules: &'a [Rule]) -> Self {
let mut by_kind: BTreeMap<&'static str, Vec<&'a Rule>> = BTreeMap::new();
let mut wildcard = Vec::new();
for rule in rules {
match rule.query.root_kind() {
@@ -912,7 +696,7 @@ impl<'a, C> RuleIndex<'a, C> {
Self { by_kind, wildcard }
}
fn rules_for_kind(&self, kind: &str) -> impl Iterator<Item = &&'a Rule<C>> {
fn rules_for_kind(&self, kind: &str) -> impl Iterator<Item = &&'a Rule> {
self.by_kind
.get(kind)
.into_iter()
@@ -921,25 +705,23 @@ impl<'a, C> RuleIndex<'a, C> {
}
}
fn apply_repeating_rules<C: Clone>(
rules: &[Rule<C>],
fn apply_repeating_rules(
rules: &[Rule],
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
) -> Result<Vec<Id>, String> {
let index = RuleIndex::new(rules);
apply_repeating_rules_inner(&index, ast, user_ctx, id, fresh, 0, None)
apply_repeating_rules_inner(&index, ast, id, fresh, 0, None)
}
fn apply_repeating_rules_inner<C: Clone>(
index: &RuleIndex<C>,
fn apply_repeating_rules_inner(
index: &RuleIndex,
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
rewrite_depth: usize,
skip_rule: Option<*const Rule<C>>,
skip_rule: Option<*const Rule>,
) -> Result<Vec<Id>, String> {
if rewrite_depth > MAX_REWRITE_DEPTH {
return Err(format!(
@@ -950,23 +732,11 @@ fn apply_repeating_rules_inner<C: Clone>(
let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
for rule in index.rules_for_kind(node_kind) {
let rule_ptr = *rule as *const Rule<C>;
let rule_ptr = *rule as *const Rule;
if Some(rule_ptr) == skip_rule {
continue;
}
// Snapshot the user context before invoking the rule so that any
// mutations the rule makes are visible during recursive translation
// of its result, but not leaked to the parent's siblings.
let snapshot = user_ctx.clone();
// Repeating rules don't need a real translator: their captures
// aren't auto-translated (Repeating preserves the input schema),
// and `ctx.translate(id)` errors if invoked from a Repeating
// transform.
let translator = TranslatorHandle {
inner: TranslatorImpl::Repeating,
};
let try_result = rule.try_rule(ast, id, fresh, user_ctx, translator)?;
if let Some(result_node) = try_result {
if let Some(result_node) = rule.try_rule(ast, id, fresh)? {
// For non-repeated rules, suppress further application of *this*
// rule on the result root, so a rule whose output matches its own
// query doesn't loop. Other rules and child traversal are
@@ -977,19 +747,14 @@ fn apply_repeating_rules_inner<C: Clone>(
results.extend(apply_repeating_rules_inner(
index,
ast,
user_ctx,
node,
fresh,
rewrite_depth + 1,
next_skip,
)?);
}
*user_ctx = snapshot;
return Ok(results);
}
// Rule didn't match; restore any speculative changes (none expected
// since try_rule only mutates on match, but be defensive).
*user_ctx = snapshot;
}
// Take the parent's fields by ownership: the recursion will rewrite
@@ -1004,15 +769,7 @@ fn apply_repeating_rules_inner<C: Clone>(
for children in fields.values_mut() {
let mut new_children: Option<Vec<Id>> = None;
for (i, &child_id) in children.iter().enumerate() {
let result = apply_repeating_rules_inner(
index,
ast,
user_ctx,
child_id,
fresh,
rewrite_depth,
None,
)?;
let result = apply_repeating_rules_inner(index, ast, child_id, fresh, rewrite_depth, None)?;
let unchanged = result.len() == 1 && result[0] == child_id;
match (&mut new_children, unchanged) {
(None, true) => {} // unchanged so far, no allocation needed
@@ -1041,25 +798,24 @@ fn apply_repeating_rules_inner<C: Clone>(
/// each visited node, recursion proceeds only through captured nodes (not
/// through the input node's children directly), and an error is returned if
/// no rule matches a visited node.
fn apply_one_shot_rules<C: Clone>(
rules: &[Rule<C>],
fn apply_one_shot_rules(
rules: &[Rule],
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
) -> Result<Vec<Id>, String> {
let index = RuleIndex::new(rules);
apply_one_shot_rules_inner(&index, ast, user_ctx, id, fresh, 0)
apply_one_shot_rules_inner(&index, ast, id, fresh, 0)
}
fn apply_one_shot_rules_inner<C: Clone>(
index: &RuleIndex<C>,
fn apply_one_shot_rules_inner(
index: &RuleIndex,
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
rewrite_depth: usize,
) -> Result<Vec<Id>, String> {
if rewrite_depth > MAX_REWRITE_DEPTH {
return Err(format!(
"Desugaring exceeded maximum rewrite depth ({MAX_REWRITE_DEPTH}). \
@@ -1069,28 +825,31 @@ fn apply_one_shot_rules_inner<C: Clone>(
let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
// Don't rewrite unnamed nodes (punctuation, keywords, etc.); leave them
// as-is. Rules target named nodes only.
if let Some(node) = ast.get_node(id) {
if !node.is_named() {
return Ok(vec![id]);
}
}
for rule in index.rules_for_kind(node_kind) {
if let Some(captures) = rule.try_match(ast, id)? {
// Snapshot the user context before invoking the rule so that any
// mutations the rule (or its transitively-translated captures)
// make are visible during this rule's transform, but not leaked
// to the parent's siblings.
let snapshot = user_ctx.clone();
// Build the translator handle the transform will use to
// recursively translate captures (or, for macro-generated
// rules, the auto-translate prefix uses it to translate every
// capture up front, preserving the legacy behavior).
let translator = TranslatorHandle {
inner: TranslatorImpl::OneShot {
index,
fresh,
rewrite_depth,
matched_root: id,
},
};
let result = rule.run_transform(ast, captures, id, fresh, user_ctx, translator)?;
*user_ctx = snapshot;
return Ok(result);
if let Some(mut captures) = rule.try_match(ast, id)? {
// Recursively translate every captured node before invoking the
// transform. The transform's output uses output-schema kinds, so
// we must translate captured input-schema nodes to their
// output-schema equivalents first.
captures.try_map_all_captures(|captured_id| {
// Avoid infinite recursion when a capture refers to the root
// node of the matched tree (e.g. an `@_` capture on the
// pattern root): re-analyzing it would match the same rule
// again indefinitely.
if captured_id == id {
return Ok(vec![captured_id]);
}
apply_one_shot_rules_inner(index, ast, captured_id, fresh, rewrite_depth + 1)
})?;
return Ok(rule.run_transform(ast, captures, id, fresh));
}
}
@@ -1118,15 +877,15 @@ pub enum PhaseKind {
/// starts. Rules within a phase compete for matches as usual; rules in
/// different phases never compete because each traversal only considers the
/// current phase's rules.
pub struct Phase<C = ()> {
pub struct Phase {
/// Name used in error messages.
pub name: String,
pub rules: Vec<Rule<C>>,
pub rules: Vec<Rule>,
pub kind: PhaseKind,
}
impl<C> Phase<C> {
pub fn new(name: impl Into<String>, kind: PhaseKind, rules: Vec<Rule<C>>) -> Self {
impl Phase {
pub fn new(name: impl Into<String>, kind: PhaseKind, rules: Vec<Rule>) -> Self {
Self {
name: name.into(),
rules,
@@ -1152,30 +911,17 @@ impl<C> Phase<C> {
/// .add_phase("desugar", PhaseKind::Repeating, desugar_rules)
/// .with_output_node_types_yaml(yaml);
/// ```
///
/// The optional type parameter `C` is the user context type threaded through
/// rule transforms. Defaults to `()` (no user context).
pub struct DesugaringConfig<C = ()> {
#[derive(Default)]
pub struct DesugaringConfig {
/// Phases of rule application, applied in order.
pub phases: Vec<Phase<C>>,
pub phases: Vec<Phase>,
/// Output node-types in YAML format. If `None`, the input grammar's
/// node types are used (i.e. the desugared AST has the same node types
/// as the tree-sitter grammar).
pub output_node_types_yaml: Option<&'static str>,
}
// Manual `Default` impl so users with a custom `C` that doesn't implement
// `Default` can still construct an empty config.
impl<C> Default for DesugaringConfig<C> {
fn default() -> Self {
Self {
phases: Vec::new(),
output_node_types_yaml: None,
}
}
}
impl<C> DesugaringConfig<C> {
impl DesugaringConfig {
/// Create an empty configuration. Add phases via [`add_phase`] and an
/// optional output schema via [`with_output_node_types_yaml`].
pub fn new() -> Self {
@@ -1187,7 +933,7 @@ impl<C> DesugaringConfig<C> {
mut self,
name: impl Into<String>,
kind: PhaseKind,
rules: Vec<Rule<C>>,
rules: Vec<Rule>,
) -> Self {
self.phases.push(Phase::new(name, kind, rules));
self
@@ -1209,15 +955,15 @@ impl<C> DesugaringConfig<C> {
}
}
pub struct Runner<'a, C = ()> {
pub struct Runner<'a> {
language: tree_sitter::Language,
schema: schema::Schema,
phases: &'a [Phase<C>],
phases: &'a [Phase],
}
impl<'a, C> Runner<'a, C> {
impl<'a> Runner<'a> {
/// Create a runner using the input grammar's schema for output.
pub fn new(language: tree_sitter::Language, phases: &'a [Phase<C>]) -> Self {
pub fn new(language: tree_sitter::Language, phases: &'a [Phase]) -> Self {
let schema = schema::Schema::from_language(&language);
Self {
language,
@@ -1230,7 +976,7 @@ impl<'a, C> Runner<'a, C> {
pub fn with_schema(
language: tree_sitter::Language,
schema: &schema::Schema,
phases: &'a [Phase<C>],
phases: &'a [Phase],
) -> Self {
Self {
language,
@@ -1242,7 +988,7 @@ impl<'a, C> Runner<'a, C> {
/// Create a runner from a [`DesugaringConfig`].
pub fn from_config(
language: tree_sitter::Language,
config: &'a DesugaringConfig<C>,
config: &'a DesugaringConfig,
) -> Result<Self, String> {
let schema = config.build_schema(&language)?;
Ok(Self {
@@ -1251,17 +997,11 @@ impl<'a, C> Runner<'a, C> {
phases: &config.phases,
})
}
}
impl<'a, C: Clone> Runner<'a, C> {
/// Parse `tree` against `source` and run all phases, threading
/// `user_ctx` through every rule transform. The caller owns the
/// initial context state.
pub fn run_from_tree_with_ctx(
pub fn run_from_tree(
&self,
tree: &tree_sitter::Tree,
source: &[u8],
user_ctx: &mut C,
) -> Result<Ast, String> {
let mut ast = Ast::from_tree_with_schema_and_source(
self.schema.clone(),
@@ -1269,13 +1009,11 @@ impl<'a, C: Clone> Runner<'a, C> {
&self.language,
source.to_vec(),
);
self.run_phases(&mut ast, user_ctx)?;
self.run_phases(&mut ast)?;
Ok(ast)
}
/// Parse `input` and run all phases, threading `user_ctx` through
/// every rule transform. The caller owns the initial context state.
pub fn run_with_ctx(&self, input: &str, user_ctx: &mut C) -> Result<Ast, String> {
pub fn run(&self, input: &str) -> Result<Ast, String> {
let mut parser = tree_sitter::Parser::new();
parser
.set_language(&self.language)
@@ -1289,24 +1027,20 @@ impl<'a, C: Clone> Runner<'a, C> {
&self.language,
input.as_bytes().to_vec(),
);
self.run_phases(&mut ast, user_ctx)?;
self.run_phases(&mut ast)?;
Ok(ast)
}
/// Apply each phase in turn to the AST, threading the root through.
/// A single `FreshScope` is shared across phases so that fresh
/// identifiers generated in different phases don't collide.
fn run_phases(&self, ast: &mut Ast, user_ctx: &mut C) -> Result<(), String> {
fn run_phases(&self, ast: &mut Ast) -> Result<(), String> {
let fresh = tree_builder::FreshScope::new();
let mut root = ast.get_root();
for phase in self.phases {
let res = match phase.kind {
PhaseKind::Repeating => {
apply_repeating_rules(&phase.rules, ast, user_ctx, root, &fresh)
}
PhaseKind::OneShot => {
apply_one_shot_rules(&phase.rules, ast, user_ctx, root, &fresh)
}
PhaseKind::Repeating => apply_repeating_rules(&phase.rules, ast, root, &fresh),
PhaseKind::OneShot => apply_one_shot_rules(&phase.rules, ast, root, &fresh),
}
.map_err(|e| format!("Phase `{}`: {e}", phase.name))?;
if res.len() != 1 {
@@ -1322,78 +1056,3 @@ impl<'a, C: Clone> Runner<'a, C> {
Ok(())
}
}
impl<'a, C: Clone + Default> Runner<'a, C> {
/// Parse `tree` against `source` and run all phases, using the
/// default context (`C::default()`) as the initial context state.
pub fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String> {
let mut user_ctx = C::default();
self.run_from_tree_with_ctx(tree, source, &mut user_ctx)
}
/// Parse `input` and run all phases, using the default context
/// (`C::default()`) as the initial context state.
pub fn run(&self, input: &str) -> Result<Ast, String> {
let mut user_ctx = C::default();
self.run_with_ctx(input, &mut user_ctx)
}
}
// ---------------------------------------------------------------------------
// Desugarer: type-erased view of a DesugaringConfig + Runner
// ---------------------------------------------------------------------------
/// Type-erased interface to a desugaring pipeline for a single language.
///
/// Consumers (e.g. a generic tree-sitter extractor) hold
/// `Box<dyn Desugarer>` so they can dispatch through the trait without
/// knowing the user context type `C` that's internal to yeast.
///
/// Construct one via [`ConcreteDesugarer::new`] from a
/// [`DesugaringConfig<C>`] and a [`tree_sitter::Language`].
pub trait Desugarer: Send + Sync {
/// The output AST schema (in YAML format), or `None` if the input
/// grammar's schema should be used.
fn output_node_types_yaml(&self) -> Option<&'static str>;
/// Parse `tree` against `source` and run the desugaring pipeline.
/// Each call constructs a fresh default user context internally.
fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String>;
}
/// A concrete [`Desugarer`] backed by a [`DesugaringConfig<C>`] for a
/// specific user context type `C`. Stores the language and a pre-built
/// schema so that per-call cost is bounded to constructing a transient
/// [`Runner`] and cloning the schema (no YAML re-parsing).
pub struct ConcreteDesugarer<C: Default + Clone + Send + Sync + 'static> {
language: tree_sitter::Language,
schema: schema::Schema,
config: DesugaringConfig<C>,
}
impl<C: Default + Clone + Send + Sync + 'static> ConcreteDesugarer<C> {
/// Build a desugarer for `language` from `config`. Parses the output
/// schema YAML once (if set) and stores it for reuse across files.
pub fn new(
language: tree_sitter::Language,
config: DesugaringConfig<C>,
) -> Result<Self, String> {
let schema = config.build_schema(&language)?;
Ok(Self {
language,
schema,
config,
})
}
}
impl<C: Default + Clone + Send + Sync + 'static> Desugarer for ConcreteDesugarer<C> {
fn output_node_types_yaml(&self) -> Option<&'static str> {
self.config.output_node_types_yaml
}
fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String> {
let runner = Runner::with_schema(self.language.clone(), &self.schema, &self.config.phases);
runner.run_from_tree(tree, source)
}
}
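
The one-shot traversal in this diff translates every captured node before running the transform, leaves a capture that points back at the matched root untouched, and returns an error when no rule matches. The following is a minimal, self-contained sketch of that control flow on a toy tree. `ToyAst`, `ToyNode`, `ToyRule`, and `translate` are hypothetical names invented for illustration and are not part of yeast's API; the depth limit value is likewise an assumption.

```rust
// Toy model of one-shot translation. All types here are illustrative
// stand-ins, not the real yeast `Ast`, `Rule`, or `Captures` types.
type Id = usize;

// Assumed limit for the sketch; the real constant's value is not shown in the diff.
const MAX_REWRITE_DEPTH: usize = 100;

struct ToyNode {
    kind: String,
    children: Vec<Id>,
}

struct ToyAst {
    nodes: Vec<ToyNode>,
}

impl ToyAst {
    /// Append a node and return its id.
    fn add(&mut self, kind: &str, children: Vec<Id>) -> Id {
        self.nodes.push(ToyNode {
            kind: kind.to_string(),
            children,
        });
        self.nodes.len() - 1
    }
}

/// A toy rule: matches nodes of `input_kind`, captures all children
/// (and optionally the matched root itself), and emits a node of
/// `output_kind` whose children are the translated captures.
struct ToyRule {
    input_kind: &'static str,
    output_kind: &'static str,
    capture_root: bool,
}

fn translate(
    rules: &[ToyRule],
    ast: &mut ToyAst,
    id: Id,
    depth: usize,
) -> Result<Vec<Id>, String> {
    if depth > MAX_REWRITE_DEPTH {
        return Err(format!(
            "exceeded maximum rewrite depth ({MAX_REWRITE_DEPTH})"
        ));
    }
    let kind = ast.nodes[id].kind.clone();
    for rule in rules {
        if rule.input_kind != kind.as_str() {
            continue;
        }
        let mut captures = ast.nodes[id].children.clone();
        if rule.capture_root {
            captures.push(id);
        }
        // Translate each capture first, so the transform only ever sees
        // output-schema nodes.
        let mut translated = Vec::new();
        for captured in captures {
            if captured == id {
                // A capture of the matched root would match the same rule
                // again forever, so it is passed through untranslated.
                translated.push(captured);
                continue;
            }
            translated.extend(translate(rules, ast, captured, depth + 1)?);
        }
        let new_id = ast.add(rule.output_kind, translated);
        return Ok(vec![new_id]);
    }
    // One-shot semantics: every visited node must be matched by some rule.
    Err(format!("no rule matches node kind `{kind}`"))
}
```

In this sketch the first matching rule wins and recursion proceeds only through captures, mirroring the behaviour described in the doc comment on `apply_one_shot_rules`; the real implementation additionally skips unnamed nodes and looks rules up through `RuleIndex`.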


@@ -242,7 +242,10 @@ pub fn convert(yaml_input: &str) -> Result<String, String> {
/// Apply YAML node-type definitions to a mutable Schema.
/// Registers all types, fields, and allowed types from the YAML into the schema.
fn apply_yaml_to_schema(yaml: &YamlNodeTypes, schema: &mut crate::schema::Schema) {
fn apply_yaml_to_schema(
yaml: &YamlNodeTypes,
schema: &mut crate::schema::Schema,
) {
// Register all supertypes as node kinds
for name in yaml.supertypes.keys() {
schema.register_kind(name);
@@ -304,8 +307,7 @@ fn apply_yaml_to_schema(yaml: &YamlNodeTypes, schema: &mut crate::schema::Schema
.into_vec()
.into_iter()
.map(|type_ref| {
let (kind, named) =
resolve_type_ref_pair(&type_ref, &named_types, &unnamed_types);
let (kind, named) = resolve_type_ref_pair(&type_ref, &named_types, &unnamed_types);
crate::schema::NodeType { kind, named }
})
.collect::<Vec<_>>();


@@ -198,8 +198,13 @@ impl Schema {
.insert((parent_kind.to_string(), field_id), node_types);
}
pub fn field_types(&self, parent_kind: &str, field_id: FieldId) -> Option<&Vec<NodeType>> {
self.field_types.get(&(parent_kind.to_string(), field_id))
pub fn field_types(
&self,
parent_kind: &str,
field_id: FieldId,
) -> Option<&Vec<NodeType>> {
self.field_types
.get(&(parent_kind.to_string(), field_id))
}
pub fn set_field_cardinality(


@@ -7,7 +7,7 @@ const OUTPUT_SCHEMA_YAML: &str = include_str!("node-types.yml");
/// Helper: parse Ruby source with no rules, return dump.
fn parse_and_dump(input: &str) -> String {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run(input).unwrap();
dump_ast(&ast, ast.get_root(), input)
}
@@ -18,23 +18,13 @@ fn run_and_dump(input: &str, rules: Vec<Rule>) -> String {
run_phased_and_dump(input, vec![Phase::new("test", PhaseKind::Repeating, rules)])
}
/// Helper: parse Ruby source with custom rules and return the transformed AST.
fn run_and_ast(input: &str, rules: Vec<Rule>) -> Ast {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
runner.run(input).unwrap()
}
/// Helper: parse Ruby source with a custom output schema and multiple
/// rule phases, return dump.
fn run_phased_and_dump(input: &str, phases: Vec<Phase>) -> String {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let ast = runner.run(input).unwrap();
dump_ast(&ast, ast.get_root(), input)
}
@@ -46,7 +36,7 @@ fn run_and_get_error(input: &str, rules: Vec<Rule>) -> String {
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
runner
.run(input)
.expect_err("expected runner to return an error")
@@ -54,7 +44,7 @@ fn run_and_get_error(input: &str, rules: Vec<Rule>) -> String {
/// Helper: parse Ruby source with no rules and dump with schema type errors.
fn parse_and_dump_typed(input: &str, schema_yaml: &str) -> String {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run(input).unwrap();
let schema = yeast::node_types_yaml::schema_from_yaml(schema_yaml).unwrap();
dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
@@ -64,10 +54,10 @@ fn parse_and_dump_typed(input: &str, schema_yaml: &str) -> String {
/// building schema with language IDs so field checks align with parser fields.
fn parse_and_dump_typed_with_language(input: &str, schema_yaml: &str) -> String {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let runner: Runner = Runner::new(lang.clone(), &[]);
let runner = Runner::new(lang.clone(), &[]);
let ast = runner.run(input).unwrap();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(schema_yaml, &lang).unwrap();
let schema = yeast::node_types_yaml::schema_from_yaml_with_language(schema_yaml, &lang)
.unwrap();
dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
}
@@ -76,7 +66,7 @@ fn run_and_dump_typed(input: &str, rules: Vec<Rule>, schema_yaml: &str) -> Strin
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema = yeast::node_types_yaml::schema_from_yaml(schema_yaml).unwrap();
let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let ast = runner.run(input).unwrap();
dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
}
@@ -166,7 +156,7 @@ fn test_parse_for_loop() {
#[test]
fn test_dump_highlights_type_errors_inline() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -176,13 +166,13 @@ named:
identifier:
"#;
let dump = parse_and_dump_typed("x = 1", schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
let dump = parse_and_dump_typed("x = 1", schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
}
#[test]
fn test_dump_reports_preserved_unknown_kind_after_transformation() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -192,25 +182,25 @@ named:
identifier:
"#;
// This rewrite runs and preserves the RHS node kind via capture.
// With schema above, preserving `integer` should be reported inline.
let rules: Vec<Rule> = vec![yeast::rule!(
(assignment left: (_) @left right: (_) @right)
=>
(assignment
left: {left}
right: {right}
)
)];
// This rewrite runs and preserves the RHS node kind via capture.
// With schema above, preserving `integer` should be reported inline.
let rules = vec![yeast::rule!(
(assignment left: (_) @left right: (_) @right)
=>
(assignment
left: {left}
right: {right}
)
)];
let dump = run_and_dump_typed("x = 1", rules, schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
assert!(dump.contains("node kind 'integer' not in schema"));
let dump = run_and_dump_typed("x = 1", rules, schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
assert!(dump.contains("node kind 'integer' not in schema"));
}
#[test]
fn test_dump_reports_undeclared_field_on_node() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -219,14 +209,14 @@ named:
identifier:
"#;
let dump = parse_and_dump_typed_with_language("x = y", schema_yaml);
assert!(dump.contains("right: identifier \"y\" <-- ERROR:"));
assert!(dump.contains("the node 'assignment' has no field 'right'"));
let dump = parse_and_dump_typed_with_language("x = y", schema_yaml);
assert!(dump.contains("right: identifier \"y\" <-- ERROR:"));
assert!(dump.contains("the node 'assignment' has no field 'right'"));
}
#[test]
fn test_dump_reports_disallowed_kind_in_field_type() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -237,17 +227,17 @@ named:
integer:
"#;
let dump = parse_and_dump_typed_with_language("x = 1", schema_yaml);
assert!(dump.contains("right: integer \"1\" <-- ERROR:"));
assert!(dump.contains("should contain"));
assert!(dump.contains("but got integer"));
let dump = parse_and_dump_typed_with_language("x = 1", schema_yaml);
assert!(dump.contains("right: integer \"1\" <-- ERROR:"));
assert!(dump.contains("should contain"));
assert!(dump.contains("but got integer"));
}
// ---- Query tests ----
#[test]
fn test_query_match() {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -268,7 +258,7 @@ fn test_query_match() {
#[test]
fn test_query_no_match() {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -293,7 +283,7 @@ fn test_query_skips_extras_in_positional_match() {
// captured comment to nothing (a common idiom, e.g.
// `(comment) => ()` in Swift) leaves the capture's match-list empty
// and causes the transform to fail with "Variable X has 0 matches".
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("[1, # comment\n2]").unwrap();
// Navigate to the `array` node: program -> array.
@@ -309,11 +299,15 @@ fn test_query_skips_extras_in_positional_match() {
let matched = query.do_match(&ast, array_id, &mut captures).unwrap();
assert!(matched);
assert_eq!(
ast.get_node(captures.get_var("a").unwrap()).unwrap().kind(),
ast.get_node(captures.get_var("a").unwrap())
.unwrap()
.kind(),
"integer"
);
assert_eq!(
ast.get_node(captures.get_var("b").unwrap()).unwrap().kind(),
ast.get_node(captures.get_var("b").unwrap())
.unwrap()
.kind(),
"integer"
);
}
@@ -321,14 +315,14 @@ fn test_query_skips_extras_in_positional_match() {
#[test]
fn test_reachable_nodes_excludes_orphaned_rewrite_nodes() {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let phases: Vec<Phase> = vec![Phase::new(
let schema = yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang)
.unwrap();
let phases = vec![Phase::new(
"test",
PhaseKind::Repeating,
vec![yeast::rule!((integer) => (identifier "replaced"))],
)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -346,7 +340,7 @@ fn test_reachable_nodes_excludes_orphaned_rewrite_nodes() {
#[test]
fn test_query_repeated_capture() {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x, y, z = 1").unwrap();
let query = yeast::query!(
@@ -371,7 +365,7 @@ fn test_query_repeated_capture() {
#[test]
fn test_capture_unnamed_node_parenthesized() {
// `("=") @op` captures the unnamed `=` token between left and right.
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -395,33 +389,10 @@ fn test_capture_unnamed_node_parenthesized() {
assert!(!op_node.is_named());
}
#[test]
fn test_capture_bare_underscore_repeated() {
// `_` matches named and unnamed nodes in bare-child position. On this
// assignment shape, bare children correspond to unnamed tokens (the `=`).
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!((assignment _* @all));
let mut cursor = AstCursor::new(&ast);
cursor.goto_first_child();
let assignment_id = cursor.node_id();
let mut captures = yeast::captures::Captures::new();
let matched = query.do_match(&ast, assignment_id, &mut captures).unwrap();
assert!(matched);
let all = captures.get_all("all");
assert_eq!(all.len(), 1);
assert_eq!(ast.get_node(all[0]).unwrap().kind(), "=");
assert!(!ast.get_node(all[0]).unwrap().is_named());
}
#[test]
fn test_capture_unnamed_node_bare_literal() {
// `"=" @op` (without surrounding parens) is the same as `("=") @op`.
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -450,7 +421,7 @@ fn test_bare_underscore_matches_unnamed() {
// Bare `_` matches any node, including unnamed tokens, while `(_)`
// matches only named nodes. Demonstrate by matching the unnamed `=`
// token in the implicit `child` field of an `assignment`.
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -489,7 +460,7 @@ fn test_bare_forms_in_field_position() {
// field's value, not just in the bare-children position. This is
// syntactic sugar for `(_)` / `("…")` and goes through the same
// code paths.
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -528,7 +499,7 @@ fn test_forward_scan_finds_unnamed_token_late() {
// query for `("end")` skip past the first two and match the third.
// Without forward-scan, the matcher took the first child unconditionally
// and failed.
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("for x in list do\n y\nend").unwrap();
// Navigate: program > for > do (the body wrapper).
@@ -555,7 +526,7 @@ fn test_forward_scan_preserves_order() {
// order. A query for ("end") then ("do") should fail because `do`
// appears before `end` in the source order; once forward-scan has
// consumed `end`, the iterator is exhausted.
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("for x in list do\n y\nend").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -576,7 +547,7 @@ fn test_forward_scan_preserves_order() {
#[test]
fn test_tree_builder() {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let mut ast = runner.run("x = 1").unwrap();
let input = "x = 1";
@@ -594,8 +565,7 @@ fn test_tree_builder() {
// Swap left and right
let fresh = yeast::tree_builder::FreshScope::new();
let mut user_ctx = ();
let mut ctx = yeast::build::BuildCtx::new(&mut ast, &captures, &fresh, &mut user_ctx);
let mut ctx = yeast::build::BuildCtx::new(&mut ast, &captures, &fresh);
let new_id = yeast::tree!(ctx,
(program
child: (assignment
@@ -623,7 +593,7 @@ fn test_tree_builder() {
// tree-sitter-ruby grammar with named fields for nodes that only have
// unnamed children in tree-sitter (e.g. block_body.stmt, block_parameters.parameter).
fn ruby_rules() -> Vec<Rule> {
let assign_rule: Rule = yeast::rule!(
let assign_rule = yeast::rule!(
(assignment
left: (left_assignment_list
(identifier)* @left
@@ -648,7 +618,7 @@ fn ruby_rules() -> Vec<Rule> {
)}
);
let for_rule: Rule = yeast::rule!(
let for_rule = yeast::rule!(
(for
pattern: (_) @pat
value: (in (_) @val)
@@ -730,7 +700,7 @@ fn test_desugar_for_loop() {
#[test]
fn test_shorthand_rule() {
let rule: Rule = yeast::rule!(
let rule = yeast::rule!(
(assignment
left: (_) @method
right: (_) @receiver
@@ -882,7 +852,7 @@ fn test_phase_error_includes_phase_name() {
PhaseKind::Repeating,
vec![swap_assignment_rule().repeated()],
)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let err = runner
.run("x = 1")
.expect_err("expected runner to return an error");
@@ -925,7 +895,7 @@ fn test_one_shot_phase() {
PhaseKind::OneShot,
one_shot_xeq1_rules(),
)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -951,7 +921,7 @@ fn test_one_shot_phase_errors_when_no_rule_matches() {
let mut rules = one_shot_xeq1_rules();
rules.pop();
let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let err = runner
.run("x = 1")
@@ -975,7 +945,7 @@ fn test_one_shot_recurses_into_returned_capture() {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let rules: Vec<Rule> = vec![
let rules = vec![
yeast::rule!(
(program (_)* @stmts)
=>
@@ -991,7 +961,7 @@ fn test_one_shot_recurses_into_returned_capture() {
yeast::rule!((integer) => (integer "INT")),
];
let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -1017,7 +987,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let rules: Vec<Rule> = vec![
let rules = vec![
yeast::rule!(
(program (_)* @stmts)
=>
@@ -1038,7 +1008,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
yeast::rule!((integer) => (integer "INT")),
];
let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -1062,7 +1032,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
#[test]
fn test_cursor_navigation() {
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -1136,7 +1106,7 @@ fn test_desugar_for_with_multiple_assignment() {
/// resolves to the captured node's source text via `YeastDisplay`.
#[test]
fn test_hash_brace_renders_capture_source_text() {
let rule: Rule = rule!(
let rule = rule!(
(call
method: (identifier) @name
receiver: (identifier) @recv
@@ -1165,7 +1135,7 @@ fn test_hash_brace_renders_capture_source_text() {
/// `Display` impl (covered by `YeastDisplay`'s blanket impls for primitives).
#[test]
fn test_hash_brace_renders_integer_expression() {
let rule: Rule = rule!(
let rule = rule!(
(identifier) @_
=>
(identifier #{1 + 2})
@@ -1179,39 +1149,3 @@ fn test_hash_brace_renders_integer_expression() {
"#,
);
}
/// Regression test: `(kind #{capture})` should inherit the captured node's
/// source location, not the full source range of the matched rule root.
#[test]
fn test_hash_brace_uses_capture_location_for_leaf() {
let rule: Rule = rule!(
(call
method: (identifier) @name
receiver: (identifier) @recv
)
=>
(call
method: (identifier #{name})
receiver: (identifier #{recv})
arguments: (argument_list)
)
);
let ast = run_and_ast("foo.bar()", vec![rule]);
let mut bar_ids: Vec<usize> = Vec::new();
for id in ast.reachable_node_ids() {
let Some(node) = ast.get_node(id) else {
continue;
};
if node.kind() == "identifier" && ast.source_text(id) == "bar" {
bar_ids.push(id);
}
}
assert_eq!(bar_ids.len(), 1, "expected exactly one identifier 'bar'");
let bar = ast.get_node(bar_ids[0]).unwrap();
assert_eq!(bar.start_byte(), 4);
assert_eq!(bar.end_byte(), 7);
}


@@ -3,21 +3,25 @@
This is a CodeQL extractor based on tree-sitter.
## Building
- To build the extractor, run `scripts/create-extractor-pack.sh`
To build the extractor, run `scripts/create-extractor-pack.sh`
## Swift Parser
- The Swift parser is defined by `extractor/tree-sitter-swift/grammar.js` and can be edited if needed.
## Editing the Swift grammar
The vendored tree-sitter-swift grammar lives at
`extractor/tree-sitter-swift/`. After editing `grammar.js` (or any other
grammar source), run `scripts/regenerate-grammar.sh` to:
- regenerate `extractor/tree-sitter-swift/src/{parser.c, grammar.json,
node-types.json}` (and the `src/tree_sitter/*.h` headers) via
`tree-sitter generate`; and
- refresh `extractor/tree-sitter-swift/node-types.yml`, the
human-readable companion to `src/node-types.json` produced by yeast's
`node_types_yaml` binary.
- After editing the grammar, always run `scripts/regenerate-grammar.sh`.
`node-types.yml` is the recommended review surface for grammar changes —
it shows the impact of a grammar tweak on the named node kinds, fields,
and child types in a form much easier to read than the raw JSON.
- The raw parse tree is described by `extractor/tree-sitter-swift/node-types.yml` and should be reviewed after grammar changes.
## AST Mapping
- The target AST shape is described by `extractor/ast_types.yml`.
- The mapping from the parse tree to the target AST is found in `extractor/src/languages/swift/swift.rs`
- To run tests for the parser and mapping, run `cargo test` in the `extractor` directory.
## Extractor Testing
- To run extractor tests, run `cargo test` in the `extractor` directory.
- Do not edit the printed ASTs in `extractor/test/corpus` directly. To regenerate the ASTs, run `scripts/update-corpus.sh`.


@@ -2,103 +2,36 @@ supertypes:
expr:
- name_expr
- int_literal
- float_literal
- boolean_literal
- string_literal
- regex_literal
- builtin_expr
- binary_expr
- unary_expr
- call_expr
- member_access_expr
- super_expr
- function_expr
- array_literal
- map_literal
- key_value_pair
- tuple_expr
- type_cast_expr
- type_test_expr
- if_expr
- assign_expr
- compound_assign_expr
- pattern_guard_expr
- empty_expr
- block
- break_expr
- continue_expr
- return_expr
- throw_expr
- try_expr
- switch_expr
- lambda_expr
- unsupported_node
expr_or_pattern:
- expr
- pattern
expr_or_type:
- expr
- type_expr
pattern:
- name_pattern
- tuple_pattern
- constructor_pattern
- ignore_pattern
- expr_equality_pattern
- bulk_importing_pattern
- unsupported_node
# A statement is anything that can appear in a block.
# This type contains all of 'expr' and has partial overlap with 'member'.
# For example, type_alias_declaration can appear either as a stmt or member.
# constructor_declaration and destructor_declaration appear here because
# tree-sitter-swift's error recovery for #if/#endif in class bodies can place
# init/deinit declarations at the wrong (statement) level.
stmt:
- expr
- variable_declaration
- type_alias_declaration
- function_declaration
- import_declaration
- operator_syntax_declaration
- class_like_declaration
- accessor_declaration
- constructor_declaration
- destructor_declaration
- empty_stmt
- block_stmt
- expr_stmt
- if_stmt
- variable_declaration_stmt
- guard_if_stmt
- for_each_stmt
- while_stmt
- do_while_stmt
- labeled_stmt
# A member is anything that can appear in the body of a class-like declaration
member:
- constructor_declaration
- destructor_declaration
- function_declaration
- variable_declaration
- accessor_declaration
- initializer_declaration
- class_like_declaration
- type_alias_declaration
- associated_type_declaration
- unsupported_node
type_expr:
- named_type_expr
- generic_type_expr
- tuple_type_expr
- function_type_expr
- inferred_type_expr
condition:
- expr_condition
- let_pattern_condition
- sequence_condition
- unsupported_node
pattern:
- var_pattern
- apply_pattern
- tuple_pattern
- ignore_pattern
- unsupported_node
type_constraint:
- equality_type_constraint
- bound_type_constraint
operator:
- infix_operator
- prefix_operator
- postfix_operator
named:
# Top-level is the root node, containing a single block of statements
# (which are themselves expressions or declarations).
# Top-level is the root node, currently containing a list of expressions
top_level:
body: block
body*: [expr, stmt]
# An identifier used in the context of an expression
name_expr:
@@ -107,28 +40,13 @@ named:
# An integer literal
int_literal:
# A floating-point literal
float_literal:
# A boolean literal
boolean_literal:
# A literal backed by a keyword such as `nil`, `null`, or `nullptr`.
#
# Although nil/null are keyword literals in many languages there should be
# no attempt to normalize "null-like" named entities, like Python's `None`.
builtin_expr:
# A string literal
string_literal:
# A regex literal
regex_literal:
# Application of a binary operator, such as `a + b`
binary_expr:
left: expr
operator: infix_operator
operator: operator
right: expr
# Application of a unary operator, such as `!x`
@@ -136,310 +54,86 @@ named:
operand: expr
operator: operator
# Plain assignment
assign_expr:
target: expr_or_pattern
value: expr
# Compound assignment
compound_assign_expr:
target: expr
operator: infix_operator
value: expr
# A function or method call, such as `f(x)` or `obj.m(x)`.
#
# Method calls are represented as a call whose `function` is a `member_access_expr`.
#
# Constructor calls are marked by a language-specific modifier, and the target may be
# a `type_expr` if the parser can deduce that the target is a type.
# A function or method call, such as `f(x)` or `obj.m(x)`. Method calls
# are represented as a call whose `function` is a `member_access_expr`.
call_expr:
modifier*: modifier
callee: expr_or_type
argument*: argument
argument:
modifier*: modifier
name?: identifier
value: expr
function: expr
argument*: expr
# Member access, such as `obj.member`.
#
# The base may be a type expression when it is a static member access like `Array<Int>.method`.
# In ambiguous cases where the parser cannot distinguish static and instance member access, the base
# will typically be an expression.
#
# For `super.x` the base will be an instance of `super_expr`.
member_access_expr:
base: expr_or_type
target: expr
member: identifier
# A type expression that refers to a type inferred from the contextual type.
# This is used to translate Swift's leading-dot syntax, `.foo`, which means `T.foo` where
# `T` is the contextual type of some enclosing expression. This is translated to a member_access
# with an inferred_type_expr as the base.
inferred_type_expr:
# A `super` token, which can usually only appear as the base of member access.
super_expr:
function_expr:
modifier*: modifier
capture_declaration*: variable_declaration
lambda_expr:
parameter*: parameter
return_type?: type_expr
body: block
body: [expr, stmt]
array_literal:
element*: expr
map_literal:
element*: expr
# A key-value pair, usually appearing as a named argument or as part of a map literal.
#
# For some languages, the key-value pair is a first class value and this type of expression
# may thus appear anywhere in the general case.
key_value_pair:
key: expr
value: expr
# A tuple expression, such as `(a, b, c)`.
tuple_expr:
element*: expr
# A parameter.
#
# `type` is its declared type annotation (if any)
#
# `pattern` binds the parameter's internal name(s). For a simple parameter this is a
# `name_pattern`, but may be an arbitrary pattern for languages where patterns may appear
# in the parameter list.
#
# `external_name` is the name by which call sites refer to the parameter, if the parameter
# can be passed as a named parameter. For example, the Swift function `func greet(person id: String)`
# would have `person` as the external name and a `name_pattern` wrapping `id` as the parameter's pattern.
# A parameter
parameter:
modifier*: modifier
external_name?: identifier
type?: type_expr
pattern?: pattern
default?: expr
# An expression that does nothing. Used where the grammar permits an
# empty statement (e.g. a stray `;`).
empty_expr:
# A brace-delimited sequence of statements (`{ ... }`). Blocks are the
# only nodes that can directly contain statements; every other body-like
# field holds a single `block`.
block:
stmt*: stmt
if_expr:
condition: expr
then?: expr
else?: expr
# A variable declaration or destructuring assignment that introduces new variables.
#
# Any occurrence of a `var_pattern` in 'pattern' results in fresh bindings that are
# in scope for the rest of the enclosing block.
#
# The initializer is optional (but typically cannot be omitted if combined with a non-trivial pattern).
#
# Modifiers should include 'var', 'let', 'const', etc, if they are significant.
# A grouped declaration like `let x = 1, y = 2` is emitted as a sequence of
# `variable_declaration`s directly into the enclosing stmt/member slot; every
# declaration after the first in such a group is tagged with a synthetic
# `chained_declaration` modifier so the grouping can be recovered downstream.
variable_declaration:
modifier*: modifier
pattern: pattern
type?: type_expr
empty_stmt:
block_stmt:
body*: stmt
expr_stmt:
expr: expr
if_stmt:
condition: condition
then?: stmt
else?: stmt
variable_declaration_stmt:
variable_declarator+: variable_declarator
# A variable declaration, or assignment to a pattern.
# The initializer is optional (but omitting it is typically only possible in combination with a simple variable pattern).
variable_declarator:
pattern: pattern
value?: expr
# Evaluate 'condition', and if false, execute 'else' which must break from the enclosing block scope (return, break, etc).
# Any variables bound by 'condition' will be in scope for the remainder of the enclosing block scope
# (which differs from how if_expr works).
# (which differs from how if_stmt works).
guard_if_stmt:
condition: expr
else: block
condition: condition
else: stmt
# `break` (with optional label)
break_expr:
label?: identifier
# Evaluates the given condition and interprets it as a boolean (by language conventions)
expr_condition:
expr: expr
# `continue` (with optional label)
continue_expr:
label?: identifier
# A labeled statement, such as `outer: for ... { ... }`. The labeled
# statement appears as the `stmt` field; `break`/`continue` may target
# the label.
labeled_stmt:
label: identifier
stmt: stmt
# `return value` or bare `return`
return_expr:
value?: expr
# `throw value`
throw_expr:
value?: expr
# An import declaration.
#
# The semantics of an import are generally:
# - Evaluate the 'imported_expr' to a value (possibly a compile-time value, such as namespace)
# - Filter away possible values based on modifiers (e.g. type-only imports only accept types)
# - Assign the value to the pattern, binding variables and/or type names in scope
#
import_declaration:
modifier*: modifier
imported_expr: expr # Qualified names are encoded as a chain of member_access_expr ending with a name_expr
pattern?: pattern # Binds local names in scope (possibly via bulk_importing_pattern)
# `typealias Name = Type`
type_alias_declaration:
modifier*: modifier
name: identifier
type_parameter*: type_parameter
type_constraint*: type_constraint
type: type_expr
# A top-level function declaration.
function_declaration:
modifier*: modifier
name: identifier
type_parameter*: type_parameter
type_constraint*: type_constraint
parameter*: parameter
return_type?: type_expr
body?: block
# `for pattern in iterable [where guard] { body }`.
for_each_stmt:
modifier*: modifier
pattern: pattern
iterable: expr
guard?: expr
body?: block
# `while condition { body }`.
while_stmt:
modifier*: modifier
condition: expr
body?: block
# `repeat { body } while condition`.
do_while_stmt:
modifier*: modifier
body?: block
condition: expr
# `do { body } catch pattern { ... } catch ...`. Swift uses `do`/`catch`
# for error handling; for languages with `try`/`catch`, this is the same shape.
try_expr:
modifier*: modifier
body: block
catch_clause*: catch_clause
catch_clause:
modifier*: modifier
pattern?: pattern
guard?: expr
body: block
# `switch value { case pattern: body case ...: default: body }`
switch_expr:
modifier*: modifier
value: expr
case*: switch_case
# A single `case ...:` (or `default:`) entry in a switch.
# An entry with multiple `case p1, p2:` patterns has multiple `pattern`s.
# A `default:` entry has no patterns.
# An optional `guard` corresponds to a `where`-clause on the case.
switch_case:
modifier*: modifier
pattern*: pattern
guard?: expr
body: block
# A series of statements that are executed before evaluating the trailing condition.
# Useful for languages where a conditional clause may be preceded by side-effecting
# syntactic elements (e.g. binding clauses) that don't themselves form a condition.
sequence_condition:
stmt*: stmt
condition: condition
# Evaluate 'expr' and match its result against 'pattern', and return true if it matches.
# Variables bound by the pattern will be in scope within the 'true' branch controlled by this expression.
#
# In Swift, `if case let PATTERN = EXPR` maps to this node
#
# Java: 'if (x instanceof Foo y && w ...) { ... }'
pattern_guard_expr:
# Variables bound by the pattern will be in scope within the 'true' branch controlled by this condition.
let_pattern_condition:
pattern: pattern
value: expr
# A type cast expression, such as `x as T`, `x as? T`, or `x as! T`. The
# operator distinguishes between the variants.
type_cast_expr:
expr: expr
operator: infix_operator
type: type_expr
# A type-test expression, such as `x is T`. Yields a boolean indicating
# whether `expr` is an instance of `type`.
type_test_expr:
expr: expr
operator: infix_operator
type: type_expr
# An identifier that introduces a variable.
#
# When used as a pattern, the pattern matches anything and binds its incoming value to the variable
name_pattern:
modifier*: modifier
# A pattern matching anything, binding its value to the given variable
var_pattern:
identifier: identifier
# A pattern matching anything, binding no variables, usually using the syntax "_"
ignore_pattern:
# A pattern that matches if the incoming value is equal to the value of the given expression.
# Used for literal patterns in switch (e.g. `case 1:`).
expr_equality_pattern:
expr: expr
# A pattern such as `Some(x)` where `Some` is the constructor and `x` is an argument
apply_pattern:
constructor: expr
argument*: pattern
# A tuple pattern such as `(a, b)` in `let (a, b) = pair`.
#
# Elements of the tuple pattern can have names, such as Swift's `let (foo: x, bar: y) = tuple`.
tuple_pattern:
modifier*: modifier
element*: pattern_element
# A pattern such as `Some(x)` where `Some` is the constructor and `x` is an element.
# The element names are interpreted as argument labels and/or field names.
constructor_pattern:
modifier*: modifier
constructor: expr_or_type
element*: pattern_element
# A pattern with an optional associated name.
pattern_element:
modifier*: modifier
key?: identifier
pattern: pattern
# A pattern that checks if the incoming value has the given type, and if so, the
# value is matched against the given nested pattern (and succeeds iff the nested match succeeds).
#
# In Swift: `if let y = x as? Foo` is a pattern_guard_expr containing a type_test_pattern
# In Java: `x instanceof Foo y` is a type_test_pattern wrapping a name_pattern
type_test_pattern:
pattern: pattern
type: type_expr
# A '*' pattern that imports all members of the incoming value into the local scope
# Currently this can only appear in import declarations.
bulk_importing_pattern:
modifier*: modifier
element*: pattern
# A simple unqualified identifier token
identifier:
@@ -447,129 +141,4 @@ named:
# A node that we don't yet translate
unsupported_node:
infix_operator:
prefix_operator:
postfix_operator:
# The fixity of a custom operator declaration (e.g. "prefix", "infix",
# "postfix"). The value is the keyword string.
fixity:
type_parameter:
modifier*: modifier
name: identifier
bound?: type_expr
# A generic constraint of the form `T == U`, requiring two types to be
# equal. Appears in `where` clauses on generic declarations
# (e.g. Swift `func foo<T, U>() where T == U`).
equality_type_constraint:
left: type_expr
right: type_expr
# A generic constraint of the form `T: Bound`, requiring a type parameter
# to conform to (or inherit from) some other type. Appears in `where`
# clauses on generic declarations (e.g. Swift `where T: Equatable`).
bound_type_constraint:
type: type_expr
bound: type_expr
# `infix operator +++` (and the like) — a declaration of a custom operator.
operator_syntax_declaration:
modifier*: modifier
name: identifier
# The fixity specifier (`prefix`, `infix`, `postfix`), when applicable.
fixity?: fixity
# The declared precedence level, when present (e.g. Swift's
# `infix operator +++ : AdditionPrecedence`).
precedence?: expr
# A class-like declaration: class, struct, interface (protocol), enum (or actor).
# The syntactic kind is carried as a `modifier` (e.g. "class", "struct",
# "interface", "enum", "extension"). The `"enum_case"` modifier additionally
# marks a declaration as an enum case with associated values. Extensions are
# represented as a class-like declaration with the `"extension"` modifier and
# no `name`; the extended type appears as a `base_type`.
class_like_declaration:
modifier*: modifier
name?: identifier
type_parameter*: type_parameter
type_constraint*: type_constraint
base_type*: base_type
member*: member
# One of the base types of a class declaration.
#
# If the language has multiple kinds of base classes (e.g. extends/implements) the
# kind should be included as a modifier on this node.
base_type:
modifier*: modifier
type: type_expr
constructor_declaration:
modifier*: modifier
name?: identifier
parameter*: parameter
body: block
# A destructor / finalizer (Swift `deinit`, C++ `~T()`, etc.).
destructor_declaration:
modifier*: modifier
body: block
# Declaration of a single accessor for a property (such as a getter, setter,
# or observer like Swift's `willSet`/`didSet`).
#
# Multiple accessors for the same property are emitted as a sequence of
# accessor_declaration nodes; every accessor after the first is tagged with
# a synthetic `chained_declaration` modifier so the grouping can be recovered
# downstream. Stored properties with observers are emitted as a
# variable_declaration followed by one accessor_declaration per observer
# (each observer also tagged with `chained_declaration`).
accessor_declaration:
modifier*: modifier
name: identifier
accessor_kind: accessor_kind
parameter*: parameter
type?: type_expr
body?: block
# "get", "set", or a language-specific kind like "didSet"
accessor_kind:
# Static or instance initializer block. That is, code that runs at initialization time of either the class or an instance.
initializer_declaration:
modifier*: modifier
body: block
associated_type_declaration:
modifier*: modifier
name: identifier
bound?: type_expr
named_type_expr:
qualifier?: type_expr
name: identifier
generic_type_expr:
base: type_expr
type_argument*: type_expr
# A tuple type such as `(Int, String)` or `(a: A, b: B)`.
tuple_type_expr:
element*: tuple_type_element
# An element of a `tuple_type_expr`, optionally carrying a label.
tuple_type_element:
name?: identifier
type: type_expr
# A function type such as `(Int, String) -> Bool` or `(x: Int) -> Bool`.
function_type_expr:
parameter*: parameter
return_type: type_expr
# A modifier such as 'static', 'public', or 'async'. For now this is just a leaf node with a string value.
modifier:
operator:
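
The schema comments above say that a grouped declaration such as `let x = 1, y = 2` is flattened into a sequence of `variable_declaration` nodes, with every declaration after the first tagged `chained_declaration`. A minimal sketch of how a downstream consumer might recover the grouping follows. The `Decl` type and `group_chained` function are hypothetical, and the assumption that chained declarations also repeat the `let` modifier is mine, not the schema's.

```rust
/// Hypothetical, simplified stand-in for a `variable_declaration` node.
#[derive(Debug, Clone, PartialEq)]
struct Decl {
    name: String,
    modifiers: Vec<String>,
}

impl Decl {
    fn new(name: &str, modifiers: &[&str]) -> Self {
        Decl {
            name: name.to_string(),
            modifiers: modifiers.iter().map(|m| m.to_string()).collect(),
        }
    }

    /// True if this declaration carries the synthetic `chained_declaration` modifier.
    fn is_chained(&self) -> bool {
        self.modifiers.iter().any(|m| m == "chained_declaration")
    }
}

/// Regroups a flat sequence of declarations: a chained declaration joins the
/// group of the declaration before it; any other declaration starts a new group.
fn group_chained(decls: Vec<Decl>) -> Vec<Vec<Decl>> {
    let mut groups: Vec<Vec<Decl>> = Vec::new();
    for decl in decls {
        if decl.is_chained() {
            if let Some(group) = groups.last_mut() {
                group.push(decl);
                continue;
            }
        }
        // Not chained, or chained with nothing before it: start a new group.
        groups.push(vec![decl]);
    }
    groups
}
```

Under these assumptions, the flat sequence `x`, `y` (chained), `z` regroups into `[x, y]` and `[z]`. The same approach would apply to the `accessor_declaration` sequences that use the same modifier.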


@@ -1,9 +1,9 @@
use clap::Args;
use std::path::PathBuf;
use crate::languages;
use codeql_extractor::extractor::simple;
use codeql_extractor::trap;
use crate::languages;
#[derive(Args)]
pub struct Options {
@@ -35,9 +35,7 @@ pub fn run(options: Options) -> std::io::Result<()> {
prefix: "unified".to_string(),
languages,
trap_dir: options.output_dir,
trap_compression: trap::Compression::from_env(
"CODEQL_EXTRACTOR_UNIFIED_OPTION_TRAP_COMPRESSION",
),
trap_compression: trap::Compression::from_env("CODEQL_EXTRACTOR_UNIFIED_OPTION_TRAP_COMPRESSION"),
source_archive_dir: options.source_archive_dir,
file_lists: vec![options.file_list],
};


@@ -22,19 +22,14 @@ pub fn run(options: Options) -> std::io::Result<()> {
// The QL-visible schema is the unified output AST, not the per-language
// input grammars. Pass it via `desugar.output_node_types_yaml` so the
// generator converts the YAML to JSON node-types.
let desugar =
yeast::DesugaringConfig::new().with_output_node_types_yaml(languages::OUTPUT_AST_SCHEMA);
let desugar = yeast::DesugaringConfig::new()
.with_output_node_types_yaml(languages::OUTPUT_AST_SCHEMA);
let languages = vec![Language {
name: "Unified".to_owned(),
node_types: "", // unused: generator picks up output_node_types_yaml above
node_types: "", // unused: generator picks up output_node_types_yaml above
desugar: Some(desugar),
}];
generate(
languages,
options.dbscheme,
options.library,
"run unified/scripts/create-extractor-pack.sh",
)
generate(languages, options.dbscheme, options.library, "run unified/scripts/create-extractor-pack.sh")
}

File diff suppressed because it is too large


@@ -50,35 +50,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "x"
right: int_literal "2"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
return_type:
named_type_expr
name: identifier "Int"
===
Closure with shorthand parameters
@@ -111,26 +82,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "$0"
right:
name_expr
identifier: identifier "$1"
===
Trailing closure
@@ -163,28 +114,6 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
value:
function_expr
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "$0"
right: int_literal "2"
callee:
member_access_expr
base:
name_expr
identifier: identifier "xs"
member: identifier "map"
===
Closure with capture list
@@ -234,31 +163,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
call_expr
callee:
member_access_expr
base:
name_expr
identifier: identifier "self"
member: identifier "doThing"
capture_declaration:
variable_declaration
modifier: modifier "weak"
pattern:
name_pattern
identifier: identifier "self"
===
Multi-statement closure
@@ -332,46 +236,3 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
value:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "x"
right: int_literal "1"
return_expr
value:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "y"
right: int_literal "2"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
return_type:
named_type_expr
name: identifier "Int"


@@ -28,19 +28,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "xs"
value:
array_literal
element:
int_literal "1"
int_literal "2"
int_literal "3"
===
Empty array literal with type
@@ -81,22 +68,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "xs"
type:
generic_type_expr
base:
named_type_expr
name: identifier "Array"
type_argument:
named_type_expr
name: identifier "Int"
value: array_literal "[]"
===
Dictionary literal
@@ -135,14 +106,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "d"
value: map_literal "[\"a\": 1, \"b\": 2]"
===
Set literal
@@ -192,22 +155,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "s"
type:
named_type_expr
name: identifier "Set<Int>"
value:
array_literal
element:
int_literal "1"
int_literal "2"
int_literal "3"
===
Tuple literal
@@ -244,14 +191,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "t"
value: tuple_expr "(1, \"two\", 3.0)"
===
Subscript access
@@ -293,21 +232,9 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "first"
value:
call_expr
argument:
argument
value: int_literal "0"
callee:
name_expr
identifier: identifier "xs"
unsupported_node "// TODO: tree-sitter-swift parses `xs[0]` as a call_expression (same shape"
unsupported_node "// as `xs(0)`), so the mapping currently produces a call_expr. Update the"
unsupported_node "// parser / add a separate subscript_expr node and remap when fixed."
===
Dictionary subscript
@@ -349,21 +276,8 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "v"
value:
call_expr
argument:
argument
value: string_literal "\"key\""
callee:
name_expr
identifier: identifier "d"
unsupported_node "// TODO: same parser issue as the array subscript case above —"
unsupported_node "// `d[\"key\"]` is parsed as `call_expression(d, (\"key\"))`."
===
Tuple member access
@@ -395,16 +309,3 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
member_access_expr
base:
name_expr
identifier: identifier "t"
member: identifier "0"


@@ -35,28 +35,6 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
then:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
===
If-else
@@ -112,43 +90,6 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
block
stmt:
call_expr
argument:
argument
value:
unary_expr
operand:
name_expr
identifier: identifier "x"
operator: prefix_operator "-"
callee:
name_expr
identifier: identifier "print"
then:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
===
If-else-if chain
@@ -224,55 +165,6 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
if_expr
condition:
binary_expr
operator: infix_operator "<"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
block
stmt:
call_expr
argument:
argument
value: int_literal "3"
callee:
name_expr
identifier: identifier "print"
then:
block
stmt:
call_expr
argument:
argument
value: int_literal "2"
callee:
name_expr
identifier: identifier "print"
then:
block
stmt:
call_expr
argument:
argument
value: int_literal "1"
callee:
name_expr
identifier: identifier "print"
===
If-let optional binding
@@ -315,39 +207,6 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
pattern_guard_expr
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "value"
constructor:
member_access_expr
base:
named_type_expr
name: identifier "Optional"
member: identifier "some"
value:
name_expr
identifier: identifier "optional"
then:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "value"
callee:
name_expr
identifier: identifier "print"
===
Guard let
@@ -381,30 +240,6 @@ source_file
top_level
body:
block
stmt:
guard_if_stmt
condition:
pattern_guard_expr
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "value"
constructor:
member_access_expr
base:
named_type_expr
name: identifier "Optional"
member: identifier "some"
value:
name_expr
identifier: identifier "optional"
else:
block
stmt: return_expr "return"
===
Ternary expression
@@ -442,27 +277,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
value:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
unary_expr
operand: int_literal "1"
operator: prefix_operator "-"
then: int_literal "1"
===
Switch statement
@@ -543,54 +357,6 @@ source_file
top_level
body:
block
stmt:
switch_expr
case:
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"one\""
callee:
name_expr
identifier: identifier "print"
pattern:
expr_equality_pattern
expr: int_literal "1"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"two or three\""
callee:
name_expr
identifier: identifier "print"
pattern:
expr_equality_pattern
expr: int_literal "2"
expr_equality_pattern
expr: int_literal "3"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"other\""
callee:
name_expr
identifier: identifier "print"
value:
name_expr
identifier: identifier "x"
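
The `switch_expr` output above has three `switch_case` entries: one pattern (`1`), two patterns (`2`, `3`), and none (the `default:` entry). A rough sketch of how that shape could be interpreted follows, assuming first-match semantics and integer-only `expr_equality_pattern`s and ignoring guards. `SwitchCase`, `select_case`, and `example_cases` are hypothetical names, not extractor types.

```rust
/// Hypothetical, simplified stand-in for a `switch_case` node whose patterns
/// are all integer `expr_equality_pattern`s. Guards are not modelled.
#[derive(Debug)]
struct SwitchCase {
    patterns: Vec<i64>,
    label: &'static str,
}

impl SwitchCase {
    /// A `default:` entry is encoded as a case with no patterns.
    fn is_default(&self) -> bool {
        self.patterns.is_empty()
    }
}

/// Returns the first case that is either the default or has a pattern equal to `value`.
fn select_case(cases: &[SwitchCase], value: i64) -> Option<&SwitchCase> {
    cases
        .iter()
        .find(|case| case.is_default() || case.patterns.contains(&value))
}

/// The three cases from the corpus output above.
fn example_cases() -> Vec<SwitchCase> {
    vec![
        SwitchCase { patterns: vec![1], label: "one" },
        SwitchCase { patterns: vec![2, 3], label: "two or three" },
        SwitchCase { patterns: vec![], label: "other" },
    ]
}
```

With these cases, a value of `3` selects the second entry through its second pattern, and any value other than 1, 2, or 3 falls through to the pattern-less default entry.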
===
Switch with binding pattern
@@ -630,7 +396,6 @@ source_file
pattern:
pattern
bound_identifier: simple_identifier "r"
dot: .
name: simple_identifier "circle"
statement:
call_expression
@@ -663,7 +428,6 @@ source_file
pattern:
pattern
bound_identifier: simple_identifier "s"
dot: .
name: simple_identifier "square"
statement:
call_expression
@@ -681,207 +445,3 @@ source_file
top_level
body:
block
stmt:
switch_expr
case:
switch_case
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "r"
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "r"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "circle"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "s"
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "s"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "square"
value:
name_expr
identifier: identifier "shape"
===
Switch with labeled case pattern arguments
===
switch x {
case .implicit(isAcknowledged: false):
print("yes")
case .thread(threadRowId: _, let rowId):
print(rowId)
}
---
source_file
statement:
switch_statement
entry:
switch_entry
pattern:
switch_pattern
pattern:
pattern
kind:
case_pattern
arguments:
tuple_pattern
item:
tuple_pattern_item
name: simple_identifier "isAcknowledged"
pattern:
pattern
kind:
boolean_literal
dot: .
name: simple_identifier "implicit"
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value:
line_string_literal
text: line_str_text "yes"
switch_entry
pattern:
switch_pattern
pattern:
pattern
kind:
case_pattern
arguments:
tuple_pattern
item:
tuple_pattern_item
name: simple_identifier "threadRowId"
pattern:
pattern
kind: wildcard_pattern "_"
tuple_pattern_item
pattern:
pattern
kind:
binding_pattern
binding:
value_binding_pattern
mutability: let
pattern:
pattern
bound_identifier: simple_identifier "rowId"
dot: .
name: simple_identifier "thread"
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: simple_identifier "rowId"
expr: simple_identifier "x"
---
top_level
body:
block
stmt:
switch_expr
case:
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"yes\""
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
key: identifier "isAcknowledged"
pattern:
expr_equality_pattern
expr: boolean_literal "false"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "implicit"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "rowId"
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
key: identifier "threadRowId"
pattern: ignore_pattern "_"
pattern_element
pattern:
name_pattern
identifier: identifier "rowId"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "thread"
value:
name_expr
identifier: identifier "x"

View File

@@ -17,12 +17,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left: int_literal "1"
right: int_literal "2"
===
Another additive expression is desugared
@@ -43,144 +37,3 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "foo"
right:
name_expr
identifier: identifier "bar"
===
Simple import with single name
===
import Foundation
---
source_file
statement:
import_declaration
name:
identifier
part: simple_identifier "Foundation"
---
top_level
body:
block
stmt:
import_declaration
pattern: bulk_importing_pattern "import Foundation"
imported_expr:
name_expr
identifier: identifier "Foundation"
===
Import with dotted path (two parts)
===
import Foundation.Networking
---
source_file
statement:
import_declaration
name:
identifier
part:
simple_identifier "Foundation"
simple_identifier "Networking"
---
top_level
body:
block
stmt:
import_declaration
pattern: bulk_importing_pattern "import Foundation.Networking"
imported_expr:
member_access_expr
base:
name_expr
identifier: identifier "Foundation"
member: identifier "Networking"
===
Import with deeply nested path (three parts)
===
import Foundation.Networking.URLSession
---
source_file
statement:
import_declaration
name:
identifier
part:
simple_identifier "Foundation"
simple_identifier "Networking"
simple_identifier "URLSession"
---
top_level
body:
block
stmt:
import_declaration
pattern: bulk_importing_pattern "import Foundation.Networking.URLSession"
imported_expr:
member_access_expr
base:
member_access_expr
base:
name_expr
identifier: identifier "Foundation"
member: identifier "Networking"
member: identifier "URLSession"
===
Scoped import uses name_pattern
===
import struct Foundation.Date
---
source_file
statement:
import_declaration
name:
identifier
part:
simple_identifier "Foundation"
simple_identifier "Date"
scoped_import_kind: struct
---
top_level
body:
block
stmt:
import_declaration
modifier: modifier "struct"
pattern:
name_pattern
identifier: identifier "Date"
imported_expr:
member_access_expr
base:
name_expr
identifier: identifier "Foundation"
member: identifier "Date"
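
The import cases above suggest that a dotted module path is folded left to right: the first part becomes a `name_expr`, and each further part wraps the result in a `member_access_expr`. The following is a minimal Rust sketch of that fold. The `Expr` enum and `fold_import_path` are hypothetical names used for illustration only; they are not part of the extractor or its desugaring rules.

```rust
/// Hypothetical expression type for illustration only; the real
/// desugared AST is produced by the extractor, not by this code.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Name(String),
    MemberAccess { base: Box<Expr>, member: String },
}

/// Folds the parts of a dotted import path into a left-nested
/// member-access chain. Returns `None` for an empty path.
fn fold_import_path(parts: &[&str]) -> Option<Expr> {
    let (first, rest) = parts.split_first()?;
    let root = Expr::Name((*first).to_string());
    Some(rest.iter().fold(root, |base, member| Expr::MemberAccess {
        base: Box::new(base),
        member: (*member).to_string(),
    }))
}
```

Under this sketch, `fold_import_path(&["Foundation", "Networking", "URLSession"])` puts `URLSession` at the outermost node and `Foundation` at the innermost `Name`, which is the nesting shown in the three-part case.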

View File

@@ -31,20 +31,6 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"hello\""
callee:
name_expr
identifier: identifier "print"
name: identifier "greet"
===
Function with parameters and return type
@@ -107,37 +93,6 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
name: identifier "add"
parameter:
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "a"
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "b"
return_type:
named_type_expr
name: identifier "Int"
===
Function with named parameters
@@ -183,28 +138,6 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "name"
callee:
name_expr
identifier: identifier "print"
name: identifier "greet"
parameter:
parameter
external_name: identifier "person"
pattern:
name_pattern
identifier: identifier "name"
===
Function with default parameter value
@@ -252,28 +185,6 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "name"
callee:
name_expr
identifier: identifier "print"
name: identifier "greet"
parameter:
parameter
default: string_literal "\"world\""
pattern:
name_pattern
identifier: identifier "name"
===
Variadic function
@@ -338,38 +249,6 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value:
call_expr
argument:
argument
value: int_literal "0"
argument
value:
name_expr
identifier: identifier "+"
callee:
member_access_expr
base:
name_expr
identifier: identifier "values"
member: identifier "reduce"
name: identifier "sum"
parameter:
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "values"
return_type:
named_type_expr
name: identifier "Int"
===
Function call
@@ -397,17 +276,6 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
value: int_literal "1"
argument
value: int_literal "2"
callee:
name_expr
identifier: identifier "foo"
===
Function call with labelled arguments
@@ -438,16 +306,6 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
name: identifier "person"
value: string_literal "\"Bob\""
callee:
name_expr
identifier: identifier "greet"
===
Method call
@@ -478,18 +336,6 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
value: int_literal "1"
callee:
member_access_expr
base:
name_expr
identifier: identifier "list"
member: identifier "append"
===
Generic function
@@ -541,117 +387,3 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value:
name_expr
identifier: identifier "x"
name: identifier "identity"
parameter:
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "x"
return_type:
named_type_expr
name: identifier "T"
===
Leading-dot expression value
===
let x = .foo
---
source_file
statement:
property_declaration
binding:
value_binding_pattern
mutability: let
declarator:
property_binding
name:
pattern
bound_identifier: simple_identifier "x"
value:
prefix_expression
operation: .
target: simple_identifier "foo"
---
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
value:
member_access_expr
base: inferred_type_expr ".foo"
member: identifier "foo"
===
Leading-dot expression call
===
let y = .some(1)
---
source_file
statement:
property_declaration
binding:
value_binding_pattern
mutability: let
declarator:
property_binding
name:
pattern
bound_identifier: simple_identifier "y"
value:
call_expression
function:
prefix_expression
operation: .
target: simple_identifier "some"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: integer_literal "1"
---
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
value:
call_expr
argument:
argument
value: int_literal "1"
callee:
member_access_expr
base: inferred_type_expr ".some"
member: identifier "some"

View File

@@ -13,8 +13,6 @@ source_file
top_level
body:
block
stmt: int_literal "42"
===
Negative integer literal
@@ -34,11 +32,6 @@ source_file
top_level
body:
block
stmt:
unary_expr
operand: int_literal "7"
operator: prefix_operator "-"
===
Floating-point literal
@@ -55,8 +48,6 @@ source_file
top_level
body:
block
stmt: float_literal "3.14"
===
Boolean literals
@@ -76,10 +67,6 @@ source_file
top_level
body:
block
stmt:
boolean_literal "true"
boolean_literal "false"
===
Nil literal
@@ -96,8 +83,6 @@ source_file
top_level
body:
block
stmt: builtin_expr "nil"
===
String literal
@@ -116,8 +101,6 @@ source_file
top_level
body:
block
stmt: string_literal "\"hello\""
===
String with interpolation
@@ -139,5 +122,3 @@ source_file
top_level
body:
block
stmt: string_literal "\"hello \\(name)\""

View File

@@ -37,30 +37,6 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "x"
iterable:
array_literal
element:
int_literal "1"
int_literal "2"
int_literal "3"
===
For-in over range
@@ -100,29 +76,6 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "i"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "i"
iterable:
binary_expr
operator: infix_operator "..<"
left: int_literal "0"
right: int_literal "10"
===
For-in with where clause
@@ -166,34 +119,6 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "x"
guard:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
iterable:
name_expr
identifier: identifier "xs"
===
While loop
@@ -229,25 +154,6 @@ source_file
top_level
body:
block
stmt:
while_stmt
body:
block
stmt:
compound_assign_expr
operator: infix_operator "-="
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
===
Repeat-while loop
@@ -283,25 +189,6 @@ source_file
top_level
body:
block
stmt:
do_while_stmt
body:
block
stmt:
compound_assign_expr
operator: infix_operator "-="
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
===
Break and continue
@@ -365,46 +252,3 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator "<"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
then:
block
stmt: continue_expr "continue"
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "100"
then:
block
stmt: break_expr "break"
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "x"
iterable:
name_expr
identifier: identifier "xs"

View File

@@ -17,16 +17,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Subtraction
@@ -47,16 +37,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "-"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Multiplication
@@ -77,16 +57,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Division
@@ -107,16 +77,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "/"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Operator precedence: addition and multiplication
@@ -141,22 +101,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "a"
right:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "b"
right:
name_expr
identifier: identifier "c"
===
Parenthesised expression
@@ -185,14 +129,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left: tuple_expr "(a + b)"
right:
name_expr
identifier: identifier "c"
===
Comparison
@@ -213,16 +149,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "<"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Equality
@@ -243,16 +169,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "=="
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Logical and
@@ -273,16 +189,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "&&"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Logical or
@@ -303,16 +209,6 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "||"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Logical not
@@ -332,13 +228,6 @@ source_file
top_level
body:
block
stmt:
unary_expr
operand:
name_expr
identifier: identifier "a"
operator: prefix_operator "!"
===
Range operator
@@ -359,9 +248,3 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "..."
left: int_literal "1"
right: int_literal "10"

View File

@@ -34,22 +34,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
type:
generic_type_expr
base:
named_type_expr
name: identifier "Optional"
type_argument:
named_type_expr
name: identifier "Int"
value: builtin_expr "nil"
===
Optional chaining
@@ -90,22 +74,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
member_access_expr
base:
member_access_expr
base:
name_expr
identifier: identifier "obj"
member: identifier "foo"
member: identifier "bar"
===
Force unwrap
@@ -135,19 +103,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
unary_expr
operand:
name_expr
identifier: identifier "opt"
operator: postfix_operator "!"
===
Nil-coalescing
@@ -177,20 +132,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
binary_expr
operator: infix_operator "??"
left:
name_expr
identifier: identifier "opt"
right: int_literal "0"
===
Throwing function
@@ -226,18 +167,6 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value: string_literal "\"\""
name: identifier "read"
return_type:
named_type_expr
name: identifier "String"
===
Do-catch
@@ -287,33 +216,6 @@ source_file
top_level
body:
block
stmt:
try_expr
body:
block
stmt:
unary_expr
operand:
call_expr
callee:
name_expr
identifier: identifier "foo"
operator: prefix_operator "try"
catch_clause:
catch_clause
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "error"
callee:
name_expr
identifier: identifier "print"
===
Try? expression
@@ -350,21 +252,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "result"
value:
unary_expr
operand:
call_expr
callee:
name_expr
identifier: identifier "foo"
operator: prefix_operator "try?"
===
Try! expression
@@ -401,18 +288,3 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "result"
value:
unary_expr
operand:
call_expr
callee:
name_expr
identifier: identifier "foo"
operator: prefix_operator "try!"

View File

@@ -18,11 +18,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
modifier: modifier "class"
name: identifier "Foo"
===
Class with stored properties
@@ -84,28 +79,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "y"
type:
named_type_expr
name: identifier "Int"
modifier: modifier "class"
name: identifier "Point"
===
Class with initializer
@@ -179,34 +152,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
constructor_declaration
body:
block
stmt:
assign_expr
target:
member_access_expr
base:
name_expr
identifier: identifier "self"
member: identifier "x"
value:
name_expr
identifier: identifier "x"
modifier: modifier "class"
name: identifier "Point"
===
Class with method
@@ -255,29 +200,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "n"
value: int_literal "0"
function_declaration
body:
block
stmt:
compound_assign_expr
operator: infix_operator "+="
target:
name_expr
identifier: identifier "n"
value: int_literal "1"
name: identifier "bump"
modifier: modifier "class"
name: identifier "Counter"
===
Class inheritance
@@ -306,11 +228,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
modifier: modifier "class"
name: identifier "Dog"
===
Struct
@@ -372,28 +289,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
type:
named_type_expr
name: identifier "Int"
modifier: modifier "struct"
name: identifier "Point"
===
Enum with cases
@@ -437,32 +332,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "north"
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "south"
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "east"
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "west"
modifier: modifier "enum"
name: identifier "Direction"
===
Enum with associated values
@@ -520,40 +389,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
class_like_declaration
member:
constructor_declaration
body: block "circle(radius: Double)"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "radius"
type:
named_type_expr
name: identifier "Double"
modifier: modifier "enum_case"
name: identifier "circle"
class_like_declaration
member:
constructor_declaration
body: block "square(side: Double)"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "side"
type:
named_type_expr
name: identifier "Double"
modifier: modifier "enum_case"
name: identifier "square"
modifier: modifier "enum"
name: identifier "Shape"
===
Protocol declaration
@@ -579,15 +414,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
function_declaration
body: block "func draw()"
name: identifier "draw"
modifier: modifier "protocol"
name: identifier "Drawable"
===
Extension
@@ -637,30 +463,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
function_declaration
body:
block
stmt:
return_expr
value:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "self"
right:
name_expr
identifier: identifier "self"
name: identifier "squared"
return_type:
named_type_expr
name: identifier "Int"
modifier: modifier "extension"
name: identifier "Int"
===
Computed property
@@ -753,48 +555,6 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "w"
type:
named_type_expr
name: identifier "Double"
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "h"
type:
named_type_expr
name: identifier "Double"
accessor_declaration
body:
block
stmt:
return_expr
value:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "w"
right:
name_expr
identifier: identifier "h"
modifier: modifier "var"
name: identifier "area"
type:
named_type_expr
name: identifier "Double"
accessor_kind: accessor_kind "get"
modifier: modifier "class"
name: identifier "Rect"
===
Property with getter and setter
@@ -879,204 +639,3 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "_v"
value: int_literal "0"
accessor_declaration
body:
block
stmt:
return_expr
value:
name_expr
identifier: identifier "_v"
modifier: modifier "var"
name: identifier "v"
type:
named_type_expr
name: identifier "Int"
accessor_kind: accessor_kind "get"
accessor_declaration
body:
block
stmt:
assign_expr
target:
name_expr
identifier: identifier "_v"
value:
name_expr
identifier: identifier "newValue"
modifier:
modifier "var"
modifier "chained_declaration"
name: identifier "v"
type:
named_type_expr
name: identifier "Int"
accessor_kind: accessor_kind "set"
modifier: modifier "class"
name: identifier "Box"
===
Protocol with read-only and read-write property requirements
===
protocol P {
var foo: Int { get }
var bar: String { get set }
}
---
source_file
statement:
protocol_declaration
body:
protocol_body
member:
protocol_property_declaration
name:
pattern
binding:
value_binding_pattern
mutability: var
bound_identifier: simple_identifier "foo"
requirements:
protocol_property_requirements
accessor:
getter_specifier
type:
type_annotation
type:
type
name:
user_type
part:
simple_user_type
name: type_identifier "Int"
protocol_property_declaration
name:
pattern
binding:
value_binding_pattern
mutability: var
bound_identifier: simple_identifier "bar"
requirements:
protocol_property_requirements
accessor:
getter_specifier
setter_specifier
type:
type_annotation
type:
type
name:
user_type
part:
simple_user_type
name: type_identifier "String"
name: type_identifier "P"
---
top_level
body:
block
stmt:
class_like_declaration
member:
accessor_declaration
name: identifier "foo"
type:
named_type_expr
name: identifier "Int"
accessor_kind: accessor_kind "get"
accessor_declaration
name: identifier "bar"
type:
named_type_expr
name: identifier "String"
accessor_kind: accessor_kind "get"
accessor_declaration
modifier: modifier "chained_declaration"
name: identifier "bar"
type:
named_type_expr
name: identifier "String"
accessor_kind: accessor_kind "set"
modifier: modifier "protocol"
name: identifier "P"
===
Enum with comma-separated cases (chained_declaration)
===
enum Suit {
case clubs, diamonds, hearts, spades
}
---
source_file
statement:
class_declaration
body:
enum_class_body
member:
enum_entry
case:
enum_case_entry
name: simple_identifier "clubs"
enum_case_entry
name: simple_identifier "diamonds"
enum_case_entry
name: simple_identifier "hearts"
enum_case_entry
name: simple_identifier "spades"
declaration_kind: enum
name: type_identifier "Suit"
---
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "clubs"
variable_declaration
modifier:
modifier "chained_declaration"
modifier "enum_case"
pattern:
name_pattern
identifier: identifier "diamonds"
variable_declaration
modifier:
modifier "chained_declaration"
modifier "enum_case"
pattern:
name_pattern
identifier: identifier "hearts"
variable_declaration
modifier:
modifier "chained_declaration"
modifier "enum_case"
pattern:
name_pattern
identifier: identifier "spades"
modifier: modifier "enum"
name: identifier "Suit"

View File

@@ -23,14 +23,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
value: int_literal "1"
===
Var binding
@@ -57,14 +49,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
value: int_literal "1"
===
Let with type annotation
@@ -100,17 +84,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
value: int_literal "1"
===
Var without initialiser
@@ -145,16 +118,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
===
Tuple destructuring binding
@@ -191,28 +154,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
tuple_pattern
element:
pattern_element
pattern:
expr_equality_pattern
expr:
name_expr
identifier: identifier "a"
pattern_element
pattern:
expr_equality_pattern
expr:
name_expr
identifier: identifier "b"
value:
name_expr
identifier: identifier "pair"
===
Multiple bindings on one line
@@ -244,22 +185,6 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
value: int_literal "1"
variable_declaration
modifier:
modifier "let"
modifier "chained_declaration"
pattern:
name_pattern
identifier: identifier "y"
value: int_literal "2"
===
Assignment
@@ -282,13 +207,6 @@ source_file
top_level
body:
block
stmt:
assign_expr
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
===
Compound assignment
@@ -311,138 +229,3 @@ source_file
top_level
body:
block
stmt:
compound_assign_expr
operator: infix_operator "+="
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
===
Property with willSet and didSet observers
===
class C {
var x: Int = 0 {
willSet { print(newValue) }
didSet { print(oldValue) }
}
}
---
source_file
statement:
class_declaration
body:
class_body
member:
property_declaration
binding:
value_binding_pattern
mutability: var
declarator:
property_binding
name:
pattern
bound_identifier: simple_identifier "x"
observers:
willset_didset_block
didset:
didset_clause
body:
block
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: simple_identifier "oldValue"
willset:
willset_clause
body:
block
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: simple_identifier "newValue"
type:
type_annotation
type:
type
name:
user_type
part:
simple_user_type
name: type_identifier "Int"
value: integer_literal "0"
declaration_kind: class
name: type_identifier "C"
---
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
value: int_literal "0"
accessor_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "newValue"
callee:
name_expr
identifier: identifier "print"
modifier:
modifier "var"
modifier "chained_declaration"
name: identifier "x"
accessor_kind: accessor_kind "willSet"
accessor_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "oldValue"
callee:
name_expr
identifier: identifier "print"
modifier:
modifier "var"
modifier "chained_declaration"
name: identifier "x"
accessor_kind: accessor_kind "didSet"
modifier: modifier "class"
name: identifier "C"

View File

@@ -2,7 +2,7 @@ use std::fs;
use std::path::Path;
use codeql_extractor::extractor::simple;
use yeast::{Runner, dump::dump_ast, dump::dump_ast_with_type_errors};
use yeast::{dump::dump_ast, dump::dump_ast_with_type_errors, Runner};
#[path = "../src/languages/mod.rs"]
mod languages;
@@ -146,36 +146,29 @@ fn render_corpus(cases: &[CorpusCase]) -> String {
out
}
fn run_desugaring(lang: &simple::LanguageSpec, input: &str) -> Result<yeast::Ast, String> {
match lang.desugar.as_deref() {
Some(desugarer) => {
// Parse the input ourselves so we don't depend on the desugarer
// knowing about the language.
let mut parser = tree_sitter::Parser::new();
parser
.set_language(&lang.ts_language)
.map_err(|e| format!("Failed to set language: {e}"))?;
let tree = parser
.parse(input, None)
.ok_or_else(|| "Failed to parse input".to_string())?;
desugarer
.run_from_tree(&tree, input.as_bytes())
.map_err(|e| format!("Desugaring failed: {e}"))
}
None => {
let runner: Runner = Runner::new(lang.ts_language.clone(), &[]);
runner
.run(input)
.map_err(|e| format!("Failed to parse input: {e}"))
}
}
fn run_desugaring(
lang: &simple::LanguageSpec,
input: &str,
) -> Result<yeast::Ast, String> {
let runner = match lang.desugar.as_ref() {
Some(config) => Runner::from_config(lang.ts_language.clone(), config)
.map_err(|e| format!("Failed to create yeast runner: {e}"))?,
None => Runner::new(lang.ts_language.clone(), &[]),
};
runner
.run(input)
.map_err(|e| format!("Failed to parse input: {e}"))
}
/// Produce the raw tree-sitter parse tree dump for `input`, with no
/// desugaring rules applied. Uses a `Runner` with an empty phase list and
/// the input grammar's own schema.
fn dump_raw_parse(lang: &simple::LanguageSpec, input: &str) -> Result<String, String> {
let runner: Runner = Runner::new(lang.ts_language.clone(), &[]);
fn dump_raw_parse(
lang: &simple::LanguageSpec,
input: &str,
) -> Result<String, String> {
let runner = Runner::new(lang.ts_language.clone(), &[]);
let ast = runner
.run(input)
.map_err(|e| format!("Failed to parse input: {e}"))?;
@@ -279,7 +272,11 @@ fn test_corpus() {
}
}
assert!(failures.is_empty(), "{}", failures.join("\n\n") + "\n\n");
assert!(
failures.is_empty(),
"{}",
failures.join("\n\n") + "\n\n"
);
if update_mode {
let updated = render_corpus(&cases);
@@ -288,9 +285,7 @@ fn test_corpus() {
write_result.is_ok(),
"Failed to update corpus file {}: {}",
corpus_path.display(),
write_result
.err()
.map_or_else(String::new, |e| e.to_string())
write_result.err().map_or_else(String::new, |e| e.to_string())
);
}
}
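
The `Runner::from_config` variant of `run_desugaring` in the hunk above first selects a runner with a `match` on the optional configuration and then has a single `run` call site, so both branches share the same error mapping. The sketch below isolates that control flow. `Runner`, its methods, and `run_with_optional_config` are hypothetical stand-ins written for this example; they do not reproduce the `yeast` API.

```rust
/// Hypothetical stand-in for the real runner type; only the control
/// flow mirrors the `run_desugaring` variant discussed above.
struct Runner {
    rules: Vec<String>,
}

impl Runner {
    /// Builds a runner with an explicit (possibly empty) rule list.
    fn new(rules: &[&str]) -> Runner {
        Runner { rules: rules.iter().map(|r| r.to_string()).collect() }
    }

    /// Builds a runner from a comma-separated config; fails if it is blank.
    fn from_config(config: &str) -> Result<Runner, String> {
        if config.trim().is_empty() {
            return Err("empty config".to_string());
        }
        Ok(Runner { rules: config.split(',').map(|r| r.trim().to_string()).collect() })
    }

    /// Pretends to run: reports how many rules were applied to the input.
    fn run(&self, input: &str) -> Result<String, String> {
        Ok(format!("{} rule(s) applied to {:?}", self.rules.len(), input))
    }
}

/// Selects the runner once, then runs it through a single call site.
fn run_with_optional_config(config: Option<&str>, input: &str) -> Result<String, String> {
    let runner = match config {
        Some(cfg) => Runner::from_config(cfg)
            .map_err(|e| format!("Failed to create runner: {e}"))?,
        None => Runner::new(&[]),
    };
    runner.run(input).map_err(|e| format!("Failed to parse input: {e}"))
}
```

One trade-off of this shape is that construction errors and run errors are reported at different points, so the two messages stay distinct even though `run` is called in only one place.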

View File

@@ -1368,7 +1368,7 @@ module.exports = grammar({
seq(
field("modifiers", optional($.modifiers)),
"import",
optional(field("scoped_import_kind", $._import_kind)),
optional($._import_kind),
field("name", $.identifier)
),
_import_kind: ($) =>
@@ -1930,7 +1930,7 @@ module.exports = grammar({
seq(
optional("case"),
optional(field("type", $.user_type)), // XXX this should just be _type but that creates ambiguity
field("dot", $._dot),
$._dot,
field("name", $.simple_identifier),
optional(field("arguments", $.tuple_pattern))
),

View File

@@ -173,7 +173,6 @@ named:
value?: expression
case_pattern:
arguments?: tuple_pattern
dot: "."
name: simple_identifier
type?: user_type
catch_block:
@@ -352,7 +351,6 @@ named:
import_declaration:
modifiers?: modifiers
name: identifier
scoped_import_kind?: ["class", "enum", "func", "let", "protocol", "struct", "typealias", "var"]
infix_expression:
lhs: expression
op: custom_operator

File diff suppressed because it is too large

View File

@@ -1,18 +0,0 @@
/** Provides classes for working with comments. */
private import unified
/**
* A comment appearing in the source code.
*/
class Comment extends TriviaToken {
// At the moment, comments are the only type of trivia token we extract
/**
* Gets the text inside this comment, not counting the delimiters.
*/
string getCommentText() {
result = this.getValue().regexpCapture("//(.*)", 1)
or
result = this.getValue().regexpCapture("(?s)/\\*(.*)\\*/", 1)
}
}
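
The two patterns in `getCommentText` strip the `//` prefix from a line comment and the `/*` ... `*/` pair from a block comment. The Rust sketch below approximates the same extraction with standard-library string methods, assuming the patterns must match the whole token and that `.` excludes line breaks unless `(?s)` is set. `comment_text` is a hypothetical helper for illustration, not part of the QL library.

```rust
/// Extracts the text inside a comment, excluding the delimiters.
/// Approximates whole-string matching of `//(.*)` and `(?s)/\*(.*)\*/`.
fn comment_text(raw: &str) -> Option<&str> {
    if let Some(rest) = raw.strip_prefix("//") {
        // Without `(?s)`, `.` does not match line breaks, so a line
        // comment only qualifies if the remainder is a single line.
        if !rest.contains(|c: char| c == '\n' || c == '\r') {
            return Some(rest);
        }
    }
    // Block comment: both delimiters must be present and must not overlap.
    raw.strip_prefix("/*")?.strip_suffix("*/")
}
```

Note that the captured text keeps its leading whitespace, so a token such as `// Hello this is swift` yields the text with the space after `//` still attached.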

File diff suppressed because it is too large

View File

@@ -1,8 +0,0 @@
/**
* Provides classes for working with the AST, as well as files and locations.
*/
import codeql.Locations
import codeql.files.FileSystem
import codeql.unified.Ast::Unified
import codeql.unified.Comments

View File

@@ -1,37 +1,9 @@
nameExpr
| name_expr.swift:1:9:1:9 | NameExpr | y |
| test.swift:1:8:1:17 | NameExpr | Foundation |
| test.swift:8:9:8:13 | NameExpr | items |
| test.swift:8:22:8:25 | NameExpr | item |
| test.swift:12:16:12:20 | NameExpr | items |
| test.swift:12:31:12:34 | NameExpr | item |
| test.swift:25:18:25:22 | NameExpr | Array |
| test.swift:25:24:25:28 | NameExpr | first |
| test.swift:26:17:26:22 | NameExpr | second |
| test.swift:27:13:27:18 | NameExpr | result |
| test.swift:27:29:27:32 | NameExpr | item |
| test.swift:28:13:28:18 | NameExpr | result |
| test.swift:28:27:28:30 | NameExpr | item |
| test.swift:31:12:31:17 | NameExpr | result |
| test.swift:40:16:40:19 | NameExpr | data |
| test.swift:44:9:44:12 | NameExpr | data |
| test.swift:48:15:48:19 | NameExpr | index |
| test.swift:48:29:48:33 | NameExpr | index |
| test.swift:48:37:48:40 | NameExpr | data |
| test.swift:49:16:49:19 | NameExpr | data |
| test.swift:49:21:49:25 | NameExpr | index |
| test.swift:53:9:53:12 | NameExpr | data |
| test.swift:53:21:53:24 | NameExpr | item |
| test.swift:63:16:63:19 | NameExpr | self |
| test.swift:65:29:65:37 | NameExpr | transform |
| test.swift:65:39:65:43 | NameExpr | value |
| test.swift:67:29:67:33 | NameExpr | error |
| test.swift:76:16:76:19 | NameExpr | self |
| test.swift:76:21:76:21 | NameExpr | i |
| test.swift:76:26:76:29 | NameExpr | self |
| test.swift:76:31:76:31 | NameExpr | i |
| test.swift:86:12:86:17 | NameExpr | values |
| test.swift:87:12:87:17 | NameExpr | values |
| test.swift:87:38:87:43 | NameExpr | values |
| test.swift:87:49:87:57 | NameExpr | transform |
unsupported
| test.swift:3:1:3:38 | | |
| test.swift:16:1:16:32 | | |
| test.swift:23:1:23:37 | | |
| test.swift:34:1:34:49 | | |
| test.swift:57:1:57:30 | | |
| test.swift:72:1:72:37 | | |
| test.swift:84:1:84:24 | | |


@@ -1,3 +0,0 @@
| comments.swift:1:1:1:22 | // Hello this is swift | Hello this is swift |
| comments.swift:3:1:6:3 | /*\n * This is a multi-line comment\n * It should be ignored by the parser\n */ | \n * This is a multi-line comment\n * It should be ignored by the parser\n |
| comments.swift:9:5:9:36 | // This is a single-line comment | This is a single-line comment |


@@ -1,3 +0,0 @@
import unified
query predicate comments(Comment c, string text) { text = c.getCommentText() }


@@ -1,11 +0,0 @@
// Hello this is swift
/*
* This is a multi-line comment
* It should be ignored by the parser
*/
func hello() {
// This is a single-line comment
print("Hello, world!")
}