codeql

mirror of https://github.com/github/codeql.git synced 2026-07-04 19:15:29 +02:00

Author	SHA1	Message	Date
copilot-swe-agent[bot]	041a8e6adc	Fix source_text call in @@raw_lhs documentation example	2026-06-29 11:26:07 +00:00
Taus	fb424020af	yeast: Delete the `Cursor` trait, inline its methods on `AstCursor` The trait had a single implementor (`AstCursor`), three type parameters of which one (`T`) was never used in any method signature, and one external consumer that needed `use yeast::Cursor;` in scope just to call methods on the cursor. The abstraction was overhead without a second implementor to justify it. Move the six trait methods to an inherent `impl AstCursor` block; delete `shared/yeast/src/cursor.rs`, the `pub mod cursor;` and `pub use cursor::Cursor;` lines in `lib.rs`, and the `use yeast::Cursor;` in `tree-sitter-extractor`'s `traverse_yeast`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 10:34:36 +00:00
Taus	807bb51df7	yeast: Unify `Node::kind()` and `Node::kind_name()` Both accessors returned the same private `kind_name: &'static str` field; `kind_name()` is widely used (mainly by dump.rs and schema diagnostics) and `kind()` had only 2 internal callers in lib.rs and a handful in tests. Pick the more descriptive name and update the callers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 10:34:36 +00:00
Taus	b6abfe6e5c	yeast: Remove dead `prepend_field` / `prepend_field_child` `BuildCtx::prepend_field` and the underlying `Ast::prepend_field_child` existed to support the create-then-mutate pattern in swift.rs (build an output node, then prepend modifiers to its `modifier:` field). The SwiftContext-based refactor on the previous branches eliminated all such call sites: every emitted declaration now carries its modifiers from birth, so the in-place prepend operation has no users. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 10:34:35 +00:00
Taus	b3dc7009a4	yeast: Remove dead `BuildCtx::translate_opt` `translate_opt` was a convenience for the manual_rule! body code, collapsing `Option<I>` to `Option<Id>` via `translate`. Since the `@@` raw-capture migration replaced manual_rule! with rule!, no callers remain — the auto-translate prefix handles `Option<Id>` captures directly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 10:34:35 +00:00
Taus	e59f646870	yeast: Remove dead Captures methods `Captures::map_captures`, `Captures::map_captures_to`, and `Captures::try_map_all_captures` had no callers. The last one was subsumed by `try_map_captures_except` (which takes a skip list and degenerates to the old behaviour when the list is empty). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 10:34:35 +00:00
Taus	cc3c232631	yeast: Replace `{..expr}` splice syntax with trait-dispatched `{expr}` In the initial implementation of yeast, the splice syntax was needed do distinguish between splicing multiple nodes or just a single node. However, this was always an ugly "wart" in the syntax, since the user shouldn't have to worry about these things. To fix this, we add an `IntoFieldIds` trait that dispatches on the value's type: `Id` pushes a single id, and a blanket impl for `IntoIterator<Item: Into<Id>>` handles `Vec<Id>`, `Option<Id>`, and arbitrary iterator chains. With this, we no longer need to use the special splice syntax, and hence we can get rid of it.	2026-06-29 10:34:35 +00:00
Taus	9a5cc3c5e3	yeast: Make `Id` a newtype, delete `NodeRef` Previously, the `Id` type was a bare usize alias. The `NodeRef` newtype existed solely to carry the AST-aware `YeastDisplay` / `YeastSourceRange` impls (so that `#{captured_node}` rendered source text rather than the numeric id) without colliding with the impls for raw integer types. This commit promotes `Id` itself to a (transparent) newtype struct and moves the AST-aware trait impls directly onto it. With `Id` and `usize` now being different types, the integer-display impl (for `usize`) and the source-text impl (for `Id`) coexist without conflict, and `NodeRef` becomes redundant (and so we remove it).	2026-06-29 10:33:32 +00:00
Taus	70ca7af04c	Address PR review comments - unified/swift: Mark `binding_kind` as a raw `@@` capture in the property_declaration rule. It is only used to read its source text (`ctx.ast.source_text`), never as a translated node. With `@` the auto-translate prefix would route the unnamed `let`/`var` token through the catch-all `_ @node => {node}` fallback for a no-op roundtrip; `@@` makes the intent explicit and removes that reliance. - shared/yeast/tests: Reword a stale comment in test_raw_capture_marker. The text claimed a "second assertion" exists in this test, but the explicit-translation check actually lives in the companion test_raw_capture_marker_explicit_translate. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-26 13:30:01 +00:00
Taus	664f0125b9	yeast: Remove now-unused `manual_rule!` The `manual_rule!` macro is now fully subsumed by `rule!` + `@@name`, so this commit simply gets rid of the now no longer needed code.	2026-06-26 12:07:22 +00:00
Taus	1b7f589000	unified/swift: Migrate `manual_rule!` sites to `rule!` + `@@` With `@@name` available, there's no longer a need to use `manual_rule!`. Every place where it is used, we can instead just mark the relevant raw captures as such. This results in quite a lot of cleanup! (Also, to me at least, it makes these rules a lot easier to reason about.) A first iteration of this approach resulted in a lot of `.map(Into::into)` being needed, because `SwiftContext` stores `Id`s, but captures produce `NodeRef`s. To avoid this, I swapped it around so that the context stores `NodeRef`s. This does require adding `.into()` in a few places, but it makes the rest of the code a lot more ergonomic.	2026-06-26 12:07:22 +00:00
Taus	eb7f8cc43d	yeast: Add `@@name` raw-capture syntax to `rule!` The `@@name` capture marker in `rule!` queries skips the auto-translate prefix for that specific capture, letting the body see the original capture (and thus delay its translation using `ctx.translate` until it becomes convenient). Regular `@name` captures continue to be auto-translated as before. Specifically these are translated _eagerly_, before the main body of the rewrite rule is run. I settled on `@@` as the syntax because it did not add new symbols that the user has to keep track of (it's still a kind of capture), but it's still visually distinct enough that the user should be able to tell that there's something special going on. In principle one could accidentally write one form of capture where the other was intended, but in practice this would result in code that did not compile (because the types would not match).	2026-06-26 12:07:21 +00:00
Taus	af7ae8c4cb	Apply rustfmt Format the touched Rust crates (shared/tree-sitter-extractor, shared/yeast, shared/yeast-macros, unified/extractor) so the tree-sitter-extractor CI fmt check passes. No functional changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-25 17:28:24 +02:00
Taus	1c4552edb0	unified/swift: Use `tree!` instead of ctx.node Cleans up a few places where we were constructing trees piece by piece rather than using the `tree!` macro. In the process, Copilot noticed an issue that should probably be addressed: the labeled_statement rule can never fire, since there are no such nodes in the input. This is possibly a simple as making _labeled_statement (which _does_ exist) named, but I haven't attempted this. Finally, a small change to yeast makes it so that the contents of a {} interpolation can be a Rust block (previously it could only be a single expression). This avoids the need to double-wrap instances where you want to interpolate a single node produced as the final value of some block.	2026-06-25 17:28:24 +02:00
Taus	85c39c04e0	yeast: Hide desugaring behind Desugarer trait This was necessary since otherwise the generic type of the user-specified context (which should only be a concern for yeast) starts to bleed out into the shared extractor. Instead, we type-erase it by putting it inside the aforementioned trait.	2026-06-25 17:28:24 +02:00
Taus	1ee142d8bd	yeast: Add macro for fine-grained rules Adds `manual_rule!` which provides a more low-level interface for defining rewrites. (I'm not entirely sold on the name, so any suggestions would be welcome.) Notably, the captures bound in the body of such rules have _not_ been translated yet -- they still come from the _input_ tree. It is the user's duty to call ctx.translate on these (which has the effect of recursively invoking the translation) before substituting them into the output. For _truly_ low-level access, the user can still construct a Rule directly, but this is now somewhat cumbersome as the closure contained therein takes quite a few parameters. Still, the possibility remains.	2026-06-25 17:28:24 +02:00
Taus	a523c7f47f	yeast: Pass raw captures to `Rule::new` rules This enables users to specify how and when these captures get translated. In conjunction with the context mechanism, this can be used to e.g. translate some piece of information (e.g. the type of something), record it in the context, and then recursively translate some other capture that relies on this information. This allows information to be cleanly passed into descendants (which can be written using context accesses in the `rule!` macro form). As a consequence of this change, we now need to pass around a TranslatorHandle to perform the manual translation. For Repeating rules, it doesn't really make sense to translate things, so in this case we simply signal an error. Also, the implementation of the `rule!` macro changes slightly (without changing semantics): it now essentially delegates to `Rule::new`, receiving raw captures, but then immediately applies the translation to those captures (which, for the majority of cases, is likely the desired behaviour).	2026-06-25 17:28:24 +02:00
Taus	5f73754b95	yeast: Make transforms return `Result` This will enable us to actually capture and log errors in complicated rules (e.g. ones written in Rust) rather than just panicking.	2026-06-25 17:28:24 +02:00
Taus	e0fa6cf785	yeast: Reify the context and allow user-defined data in it Renames what was previously called `__yeast_ctx` into just `ctx`, and adds a new field `user_ctx` to this context. Said field can contain a struct of any user type (necessitating making various parts of the implementation generic in said type). Through some Deref magic, field accesses are delegated to the inner struct (assuming they are not already defined on `ctx`), which should hopefully make the interface a bit more ergonomic.	2026-06-25 17:28:24 +02:00
Asger F	6c74cd31e4	Yeast: use child locations instead of rule target Previously, when a node was synthesized it would always take the location from the node that matched the current rule. This resulted in overly broad locations however. For (foo #{bar}) we now take the location of the 'bar' node. For non-leaf nodes we merge all its child node locations.	2026-06-18 14:26:30 +02:00
Asger F	0baa126473	Add ability to prepend fields in Yeast	2026-06-15 10:49:35 +02:00
Asger F	ddc9516e92	Yeast: better support for rewriting unnamed nodes - Ensure the full wildcard _ supports quantifiers - Also rewrite unnamed nodes in one-shot phases	2026-06-15 10:49:31 +02:00
Asger F	3f3bed62d3	yeast: type-check for missing required fields Add FieldCardinality to Schema to track required/multiple per field, populated from the ast_types.yml suffixes (bare = required single, ? = optional single, + = required multiple, * = optional multiple). dump_ast_with_type_errors now emits: <-- ERROR: missing required field 'name' for any node in the output AST whose declared schema requires a field that is absent from the actual node. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-01 14:18:37 +02:00
Asger F	1ecdc3614f	Yeast: Fix matching against extras like comments	2026-06-01 14:18:37 +02:00
Asger F	e3b3888bee	Yeast: Fix handling of captures with multiple results	2026-06-01 14:18:36 +02:00
Asger F	6b58482dfb	Yeast: Fix text associated with synthesized nodes	2026-05-13 10:35:22 +02:00
Asger F	5772ee4d9b	YEAST: add NodeRef type, YeastDisplay trait, and source text storage Introduce NodeRef as a typed wrapper around node arena IDs. Captures in desugaring rules are now bound as NodeRef instead of raw usize, which prevents accidental misuse and enables source-text-aware rendering. Add the YeastDisplay trait as an alternative to Display: its yeast_to_string method receives the Ast, allowing NodeRef to resolve to the captured node's source text instead of printing a numeric ID. Store the original source bytes in the Ast so that NodeContent::Range values (from synthesized literal nodes) can be resolved back to text. Update yeast-macros to emit NodeRef-typed capture bindings and use Into::<usize>::into where raw IDs are needed. The #{expr} template syntax now uses YeastDisplay instead of Display. The effect is visible in the corpus tests: operator nodes now correctly render as e.g. operator "+" instead of operator "3". Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 10:35:17 +02:00
Asger F	5d0cb9e805	YEAST: fix one-shot rules for unnamed nodes and self-captures One-shot desugaring rules now skip unnamed nodes (punctuation, keywords, etc.) since rules are intended to target named nodes only. Also prevent infinite recursion when a capture refers to the root node of the matched tree (e.g. an @_ capture on the pattern root). Additionally fix the swift.rs add_phase call to match the updated 3-arg signature introduced by the one-shot phase kind commit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 10:35:12 +02:00
Asger F	c3a9218dcf	Yeast: Add one-shot phase kind	2026-05-13 10:35:09 +02:00
Asger F	a049850c51	Yeast: add type-checking errors in AST dump	2026-05-13 10:35:07 +02:00
Asger F	49f19092fb	Yeast: add reachable_node_ids()	2026-05-13 10:35:05 +02:00
Taus	15936a5f8d	yeast: Take fields by ownership in apply_rules_inner Previously, apply_rules_inner snapshotted a node's fields by cloning the BTreeMap into a Vec<(FieldId, Vec<Id>)>, then built a fresh BTreeMap of new_fields for the rewritten Ids. For a node with N fields, this allocated 2N+1 things per visit (the snapshot Vec, N cloned children Vecs, the new BTreeMap entries) — even when nothing in the subtree was rewritten. Use std::mem::take to swap the parent's fields out by ownership: the recursion can mutate the AST (including pushing new nodes from rule firings) without any conflict, since we hold the owned BTreeMap locally. Iterate values_mut() and only allocate a fresh children Vec on the first divergence (lazy alloc): unchanged children stay in the existing slot. When done, swap the fields back. For a subtree with no rewrites, this is now zero allocations per node (modulo the recursion itself). For nodes with rewrites, it's one Vec allocation per field that contains a rewritten child, instead of two plus the BTreeMap rebuild.	2026-05-08 12:48:10 +00:00
Taus	7bd27b83e0	yeast: Mutate parent fields in place; remove redundant Node::id apply_rules_inner used to handle the "child was rewritten, so the parent needs new field IDs" case by cloning the parent node, swapping in the new fields, pushing the clone onto the arena, and returning the new Id. Every ancestor on the path from the rewrite up to the root was duplicated this way, with the originals retained as garbage in the arena. Switch to in-place mutation: assign `ast.nodes[id].fields = new_fields` and return the same Id. Rule firings still produce genuinely new nodes via BuildCtx (their structure differs from the input), but the ancestor-rebuild spine no longer copies anything. This is safe because apply_rules_inner already works entirely by Id: the field snapshot is cloned out before recursing, no &Node references are held across mutations of the arena, and captures are scoped to a single rule firing so the now-stable Ids do not break anything. Memory effect: a desugaring pass that rewrites R leaves of a tree of average depth d previously appended R*d ancestor clones to the arena. Now appends 0. With Ids stable for the lifetime of an Ast, the Node::id field becomes truly redundant and is removed (along with the Node::id() accessor). AstCursor switches from caching `node: &Node` to tracking `node_id: Id` and looking the node up via the arena on each access; ChildrenIter now yields Ids directly. A new AstCursor::node_id() method gives callers access to the cursor position by Id.	2026-05-08 12:47:22 +00:00
Taus	5a4dee50f7	Merge pull request #21810 from github/tausbn/yeast-forward-scan-queries yeast: Align query semantics more closely with tree-sitter	2026-05-08 13:30:43 +02:00
Taus	af6e921da5	yeast: Forward-scan bare child patterns instead of strict positional Previously, a bare child pattern in a query took whatever the next child of the iterator was and either matched or failed: it would not scan ahead to find a match. So `(foo ("baz"))` against a `foo` whose implicit `child` field was `["bar", "baz"]` would fail (the pattern took "bar" first). Switch to forward-scan semantics: a SingleNode matcher advances through the iterator until it finds a child that matches its sub-query. Patterns that are named-only continue to skip past unnamed children for free. Order is preserved across multiple bare patterns at the same level — each pattern advances the shared iterator past whatever it consumed — so a query cannot match children out of source order. Captures from a failed match attempt are rolled back via a snapshot, so partial captures from a complex sub-query do not leak across attempts. Add two regression tests against the `do` body wrapper in a Ruby for-loop, whose implicit `child` field contains [do, identifier, end]: - a query for ("end") matches by skipping past `do` and the identifier - a query for ("end") then ("do") fails, demonstrating order preservation	2026-05-07 15:08:22 +00:00
Taus	6f643a3604	yeast: Use canonical ID when registering unnamed kinds in Schema Schema::from_language registered unnamed kinds via or_insert(id), where `id` came from iterating 0..node_kind_count. For names with multiple unnamed IDs (notably "end" in tree-sitter-ruby has IDs 0 and 13, where ID 0 is the reserved error token), this picked the first encountered ID — typically the wrong one. The visitor sets node.kind via language.id_for_node_kind(name, false), which returns the canonical ID. So a query for ("end") would compare node.kind=13 against schema=0 and silently fail to match, with no diagnostic. Use language.id_for_node_kind(name, false) to obtain the canonical ID when registering, mirroring the named-kind path that already does the same with id_for_node_kind(name, true).	2026-05-07 15:08:21 +00:00
Taus	a4df96aad6	yeast: Support capturing unnamed nodes in queries Three improvements to the query parser, all aimed at allowing query patterns to refer to unnamed tokens: 1. Bare-literal capture: `"=" @op` now captures the unnamed `=` token, matching the parenthesized form `("=") @op`. Previously the literal branch in parse_query_list skipped the maybe_wrap_capture call, so the `@op` was a leftover token and would error. 2. Bare `_` matches any node, named or unnamed. Previously bare `_` and `(_)` both produced QueryNode::Any with the same matches_named_only behaviour, so bare `_` would skip unnamed children. Now Any carries a match_unnamed flag: false for `(_)` (named-only, tree-sitter default) and true for bare `_` (any node). 3. Named fields and bare child patterns may be intermixed in any order. Previously, once parse_query_fields saw a bare pattern it would stop accepting named fields. The fix accumulates bare patterns into the implicit `child` field and keeps parsing. Each named field independently selects its target field for matching, so the source-order of fields in the query is purely cosmetic and intermixing is safe. Add tests covering parenthesized capture, bare-literal capture, and the named-vs-any distinction between `(_)` and bare `_`. Update query-syntax docs to reflect all three.	2026-05-07 15:08:21 +00:00
copilot-swe-agent[bot]	e0d663f79b	yeast: address review wording in phase docs Agent-Logs-Url: https://github.com/github/codeql/sessions/6d23db05-a6e9-4de4-8951-b465980fd0ef Co-authored-by: tausbn <1104778+tausbn@users.noreply.github.com>	2026-05-07 12:35:46 +00:00
Taus	957c89b478	yeast: Support multi-phase desugaring via DesugaringConfig::add_phase Extend the desugaring config from a single flat list of rules to an ordered sequence of named Phases. Each phase runs to completion (a full traversal applying its rules) before the next phase starts. Rules in different phases never compete for matches. The config is built via the new chainable API: DesugaringConfig::new() .add_phase("cleanup", cleanup_rules) .add_phase("desugar", desugar_rules) .with_output_node_types_yaml(yaml); Single-phase configs are just .add_phase(...) called once. A single FreshScope is shared across phases so generated identifier names (e.g. $tmp-N) are unique throughout the run. Phase names appear in error messages, e.g. "Phase `desugar`: exceeded maximum rewrite depth". Add two regression tests: one verifying basic two-phase chained desugaring, and one verifying that errors include the failing phase name.	2026-05-06 21:17:31 +00:00
Taus	9a94836974	yeast: Add per-rule .repeated() flag to opt into iterative matching Previously, after a rule fired the engine would always re-try that same rule on the result root. A rule whose output matched its own query (intentionally or by accident) would loop until the global MAX_REWRITE_DEPTH safety net kicked in. Make the default behavior fire-once-per-node: after a rule fires on node N, the engine no longer tries that same rule on the result root. Other rules and child traversal are unaffected. Rules that intentionally rewrite iteratively can opt into the old behavior via the new Rule::repeated() builder method. Add two regression tests using a self-swapping assignment rule: - with .repeated(), the swap loops and trips the depth limit - without it (default), the swap fires once and terminates	2026-05-06 12:33:18 +00:00
Taus	a0a0e9e9a7	yeast: Add test for chained rules with output-only kinds Adds a regression test verifying that desugaring rules can chain across output-only node kinds: a first rule rewrites an input kind to an output-only kind, and a second rule then rewrites that output-only kind into another output-only kind. This exercises the schema lookup for query patterns whose root kind is not present in the input tree-sitter grammar.	2026-05-06 11:45:53 +00:00
Taus	60dcf88b50	yeast: Add Bazel build rules for yeast crates Add BUILD.bazel files for the yeast and yeast-macros crates, register them as dependencies of the shared tree-sitter extractor, and refresh the vendored crate dependencies via update_tree_sitter_extractors_deps.sh.	2026-05-06 11:34:09 +00:00
Taus	cc28ff9a48	yeast: Add yeast documentation Covers architecture, query language, template language (tree!/trees!/rule!), capture semantics, fresh identifiers, and extractor integration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	6e580446fd	yeast: Add yeast test suite 12 tests covering parsing, queries, tree building, desugaring rules, cursor navigation, and the shorthand rule! syntax. Tests use a custom output node-types.yml with named fields for all children (parameter, stmt, index), loaded via schema_from_yaml_with_language. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	4c5548363c	yeast: Add AST dumper for human-readable tree output Produces indented text showing node kinds, named fields, and leaf content. Unnamed tokens are hidden unless inside a named field. Used by tests for readable assertions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	8a9e53cc58	yeast: Add YAML node-types format and converter Human-friendly YAML alternative to tree-sitter node-types.json with three sections: supertypes, named, unnamed. Supports bidirectional conversion and building Schema objects from YAML. Includes CLI binary (node_types_yaml) and documentation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	04f587190e	yeast: AST desugaring framework with proc-macro DSL YEAST (YEAST Elaborates Abstract Syntax Trees) is a framework for transforming tree-sitter parse trees before CodeQL extraction. Core components: - shared/yeast/ — Ast, Node, Schema, query matching engine, captures, FreshScope, BuildCtx - shared/yeast-macros/ — proc macros: query!, tree!, trees!, rule! The query language is inspired by tree-sitter queries: (assignment left: (_) @lhs right: (_) @rhs) Templates support embedded Rust ({expr}), splicing ({..expr}), computed literals (#{expr}), and fresh identifiers ($name). The rule! macro combines query and transform: rule!((for pattern: (_) @pat ...) => (call receiver: {val} ...)) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00

47 Commits