codeql

mirror of https://github.com/github/codeql.git synced 2026-06-09 23:14:13 +02:00

Author	SHA1	Message	Date
Asger F	1ecdc3614f	Yeast: Fix matching against extras like comments	2026-06-01 14:18:37 +02:00
Asger F	5772ee4d9b	YEAST: add NodeRef type, YeastDisplay trait, and source text storage Introduce NodeRef as a typed wrapper around node arena IDs. Captures in desugaring rules are now bound as NodeRef instead of raw usize, which prevents accidental misuse and enables source-text-aware rendering. Add the YeastDisplay trait as an alternative to Display: its yeast_to_string method receives the Ast, allowing NodeRef to resolve to the captured node's source text instead of printing a numeric ID. Store the original source bytes in the Ast so that NodeContent::Range values (from synthesized literal nodes) can be resolved back to text. Update yeast-macros to emit NodeRef-typed capture bindings and use Into::<usize>::into where raw IDs are needed. The #{expr} template syntax now uses YeastDisplay instead of Display. The effect is visible in the corpus tests: operator nodes now correctly render as e.g. operator "+" instead of operator "3". Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-13 10:35:17 +02:00
Asger F	c3a9218dcf	Yeast: Add one-shot phase kind	2026-05-13 10:35:09 +02:00
Asger F	a049850c51	Yeast: add type-checking errors in AST dump	2026-05-13 10:35:07 +02:00
Asger F	49f19092fb	Yeast: add reachable_node_ids()	2026-05-13 10:35:05 +02:00
Taus	7bd27b83e0	yeast: Mutate parent fields in place; remove redundant Node::id apply_rules_inner used to handle the "child was rewritten, so the parent needs new field IDs" case by cloning the parent node, swapping in the new fields, pushing the clone onto the arena, and returning the new Id. Every ancestor on the path from the rewrite up to the root was duplicated this way, with the originals retained as garbage in the arena. Switch to in-place mutation: assign `ast.nodes[id].fields = new_fields` and return the same Id. Rule firings still produce genuinely new nodes via BuildCtx (their structure differs from the input), but the ancestor-rebuild spine no longer copies anything. This is safe because apply_rules_inner already works entirely by Id: the field snapshot is cloned out before recursing, no &Node references are held across mutations of the arena, and captures are scoped to a single rule firing so the now-stable Ids do not break anything. Memory effect: a desugaring pass that rewrites R leaves of a tree of average depth d previously appended R*d ancestor clones to the arena. Now appends 0. With Ids stable for the lifetime of an Ast, the Node::id field becomes truly redundant and is removed (along with the Node::id() accessor). AstCursor switches from caching `node: &Node` to tracking `node_id: Id` and looking the node up via the arena on each access; ChildrenIter now yields Ids directly. A new AstCursor::node_id() method gives callers access to the cursor position by Id.	2026-05-08 12:47:22 +00:00
Taus	5a4dee50f7	Merge pull request #21810 from github/tausbn/yeast-forward-scan-queries yeast: Align query semantics more closely with tree-sitter	2026-05-08 13:30:43 +02:00
Taus	af6e921da5	yeast: Forward-scan bare child patterns instead of strict positional Previously, a bare child pattern in a query took whatever the next child of the iterator was and either matched or failed: it would not scan ahead to find a match. So `(foo ("baz"))` against a `foo` whose implicit `child` field was `["bar", "baz"]` would fail (the pattern took "bar" first). Switch to forward-scan semantics: a SingleNode matcher advances through the iterator until it finds a child that matches its sub-query. Patterns that are named-only continue to skip past unnamed children for free. Order is preserved across multiple bare patterns at the same level — each pattern advances the shared iterator past whatever it consumed — so a query cannot match children out of source order. Captures from a failed match attempt are rolled back via a snapshot, so partial captures from a complex sub-query do not leak across attempts. Add two regression tests against the `do` body wrapper in a Ruby for-loop, whose implicit `child` field contains [do, identifier, end]: - a query for ("end") matches by skipping past `do` and the identifier - a query for ("end") then ("do") fails, demonstrating order preservation	2026-05-07 15:08:22 +00:00
Taus	a4df96aad6	yeast: Support capturing unnamed nodes in queries Three improvements to the query parser, all aimed at allowing query patterns to refer to unnamed tokens: 1. Bare-literal capture: `"=" @op` now captures the unnamed `=` token, matching the parenthesized form `("=") @op`. Previously the literal branch in parse_query_list skipped the maybe_wrap_capture call, so the `@op` was a leftover token and would error. 2. Bare `_` matches any node, named or unnamed. Previously bare `_` and `(_)` both produced QueryNode::Any with the same matches_named_only behaviour, so bare `_` would skip unnamed children. Now Any carries a match_unnamed flag: false for `(_)` (named-only, tree-sitter default) and true for bare `_` (any node). 3. Named fields and bare child patterns may be intermixed in any order. Previously, once parse_query_fields saw a bare pattern it would stop accepting named fields. The fix accumulates bare patterns into the implicit `child` field and keeps parsing. Each named field independently selects its target field for matching, so the source-order of fields in the query is purely cosmetic and intermixing is safe. Add tests covering parenthesized capture, bare-literal capture, and the named-vs-any distinction between `(_)` and bare `_`. Update query-syntax docs to reflect all three.	2026-05-07 15:08:21 +00:00
Taus	957c89b478	yeast: Support multi-phase desugaring via DesugaringConfig::add_phase Extend the desugaring config from a single flat list of rules to an ordered sequence of named Phases. Each phase runs to completion (a full traversal applying its rules) before the next phase starts. Rules in different phases never compete for matches. The config is built via the new chainable API: DesugaringConfig::new() .add_phase("cleanup", cleanup_rules) .add_phase("desugar", desugar_rules) .with_output_node_types_yaml(yaml); Single-phase configs are just .add_phase(...) called once. A single FreshScope is shared across phases so generated identifier names (e.g. $tmp-N) are unique throughout the run. Phase names appear in error messages, e.g. "Phase `desugar`: exceeded maximum rewrite depth". Add two regression tests: one verifying basic two-phase chained desugaring, and one verifying that errors include the failing phase name.	2026-05-06 21:17:31 +00:00
Taus	9a94836974	yeast: Add per-rule .repeated() flag to opt into iterative matching Previously, after a rule fired the engine would always re-try that same rule on the result root. A rule whose output matched its own query (intentionally or by accident) would loop until the global MAX_REWRITE_DEPTH safety net kicked in. Make the default behavior fire-once-per-node: after a rule fires on node N, the engine no longer tries that same rule on the result root. Other rules and child traversal are unaffected. Rules that intentionally rewrite iteratively can opt into the old behavior via the new Rule::repeated() builder method. Add two regression tests using a self-swapping assignment rule: - with .repeated(), the swap loops and trips the depth limit - without it (default), the swap fires once and terminates	2026-05-06 12:33:18 +00:00
Taus	a0a0e9e9a7	yeast: Add test for chained rules with output-only kinds Adds a regression test verifying that desugaring rules can chain across output-only node kinds: a first rule rewrites an input kind to an output-only kind, and a second rule then rewrites that output-only kind into another output-only kind. This exercises the schema lookup for query patterns whose root kind is not present in the input tree-sitter grammar.	2026-05-06 11:45:53 +00:00
Taus	6e580446fd	yeast: Add yeast test suite 12 tests covering parsing, queries, tree building, desugaring rules, cursor navigation, and the shorthand rule! syntax. Tests use a custom output node-types.yml with named fields for all children (parameter, stmt, index), loaded via schema_from_yaml_with_language. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00

13 Commits