Commit Graph

87423 Commits

Author SHA1 Message Date
Asger F
cbe4c81ca6 Unified: add tuple_pattern and sequence_condition; refine if-let/guard mapping
ast_types.yml additions:
- tuple_pattern { element*: pattern } in the pattern supertype.
- sequence_condition { stmt*: stmt, condition: condition } in the
  condition supertype.

swift.rs:
- Map Swift tuple destructuring (e.g. `let (a, b) = pair`) to the new
  tuple_pattern instead of synthesizing an apply_pattern.
- if-let / guard-let: explicitly match the value_binding_pattern
  (the `let` keyword) and bind the source expression as the next
  condition child, so `let` no longer leaks into the output.
2026-05-13 10:35:29 +02:00
Asger F
3b7a53f678 yeast-macros: merge repeated field declarations and support repetition in field patterns
Two changes to parse_query_fields:

- Allow `field: (kind)* @cap` (repetition + optional capture) in field
  position, mirroring how it works for bare children.
- When the same field name is declared multiple times in a query (e.g.
  `condition: (foo) condition: (bar)`), merge them into a single
  ordered list of children rather than emitting duplicate field
  entries (which at runtime restart the iterator for the field and
  cause the second declaration to re-match from the first child).
2026-05-13 10:35:27 +02:00
Asger F
ccc1dd5d3e Unified: Add tuple_pattern 2026-05-13 10:35:26 +02:00
Asger F
a966dff76e Unified: Add more patterns and some fixes to the AST 2026-05-13 10:35:24 +02:00
Asger F
6b58482dfb Yeast: Fix text associated with synthesized nodes 2026-05-13 10:35:22 +02:00
Asger F
2307839050 Yeast: Change how patterns with repetition are parsed 2026-05-13 10:35:21 +02:00
Asger F
92838011dd Unified: Add some more AST nodes and rules 2026-05-13 10:35:19 +02:00
Asger F
5772ee4d9b YEAST: add NodeRef type, YeastDisplay trait, and source text storage
Introduce NodeRef as a typed wrapper around node arena IDs. Captures in
desugaring rules are now bound as NodeRef instead of raw usize, which
prevents accidental misuse and enables source-text-aware rendering.

Add the YeastDisplay trait as an alternative to Display: its
yeast_to_string method receives the Ast, allowing NodeRef to resolve to
the captured node's source text instead of printing a numeric ID.

Store the original source bytes in the Ast so that NodeContent::Range
values (from synthesized literal nodes) can be resolved back to text.

Update yeast-macros to emit NodeRef-typed capture bindings and use
Into::<usize>::into where raw IDs are needed. The #{expr} template
syntax now uses YeastDisplay instead of Display.

The effect is visible in the corpus tests: operator nodes now correctly
render as e.g. operator "+" instead of operator "3".

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 10:35:17 +02:00
Asger F
72b683d63c Unified: Add Swift corpus tests
Add corpus test cases for Swift covering closures, collections, control
flow, functions, literals, loops, operators, optionals/errors, types,
and variables. Update existing desugar.txt with raw parse sections.

Note: operator nodes currently render their node ID instead of the actual
operator text (e.g. operator "3" instead of operator "+"). This will be
fixed in the next commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 10:35:16 +02:00
Asger F
8a2a48d2dd Unified extractor: add AST schema, swift translation rules, and corpus framework
Add ast_types.yml defining the unified output AST schema with supertypes
(expr, stmt, condition, pattern) and named nodes (top_level, binary_expr,
name_expr, etc.).

Rewrite swift translation rules to map from tree-sitter Swift grammar to
the unified AST, using one-shot phase rules.

Update the generator to use the output AST schema for dbscheme/QL
generation, and normalize the extraction table prefix to 'unified'.

Improve the corpus test framework to include raw tree-sitter parse output,
type-error checking against the output schema, and better failure
reporting.

Regenerate Ast.qll, unified.dbscheme, and update BasicTest accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 10:35:14 +02:00
Asger F
5d0cb9e805 YEAST: fix one-shot rules for unnamed nodes and self-captures
One-shot desugaring rules now skip unnamed nodes (punctuation, keywords,
etc.) since rules are intended to target named nodes only.

Also prevent infinite recursion when a capture refers to the root node of
the matched tree (e.g. an @_ capture on the pattern root).

Additionally fix the swift.rs add_phase call to match the updated 3-arg
signature introduced by the one-shot phase kind commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 10:35:12 +02:00
Asger F
bb9e996cb6 Shared: Do not emit ReservedWord class when there are no unnamed tokens 2026-05-13 10:35:11 +02:00
Asger F
c3a9218dcf Yeast: Add one-shot phase kind 2026-05-13 10:35:09 +02:00
Asger F
a049850c51 Yeast: add type-checking errors in AST dump 2026-05-13 10:35:07 +02:00
Asger F
49f19092fb Yeast: add reachable_node_ids() 2026-05-13 10:35:05 +02:00
Asger F
f668b99d6d Unified: Add support for tree-sitter-style corpus tests
This adds tests consisting of source code and a printout of its rewritten AST.
2026-05-13 10:35:02 +02:00
Asger F
cfa175357b Merge pull request #21815 from asgerf/asgerf/missing-node-kind-error
Shared: Nicer panic message if node kind is missing
2026-05-13 10:11:14 +02:00
Owen Mansel-Chan
0b808e1170 Merge pull request #21807 from owen-mc/java/improve-qhelp-unsafe-deserialization
Shared: improve qhelp for unsafe deserialization queries
2026-05-12 22:22:49 +01:00
Taus
5508b1576f Merge pull request #21821 from github/tausbn/unified-swift-grammar-cleanup-phase-1
unified: Swift grammar cleanup part 1
2026-05-12 16:12:09 +02:00
Taus
911e59caef unified: regenerate files 2026-05-12 12:57:26 +00:00
Taus
ff5c0b40f1 unified: add supertypes for various kinds of declarations
Hides a bunch of huge unions under (hopefully) sensible supertypes.
2026-05-12 12:57:26 +00:00
Taus
a5a1312e51 unified: regenerate files 2026-05-12 12:57:25 +00:00
Taus
2608db9fd9 unified: Prevent field bleed-through from _if_let_binding
Same procedure as before -- we change the anonymous node to a named
node, and the problem magically goes away.
2026-05-12 12:57:25 +00:00
Taus
f9e7f90896 unified: regenerate files 2026-05-12 12:57:25 +00:00
Taus
31386f566c unified: drop element field on _parenthesized_type
Same pattern we've seen many times before: a field on an anonymous node
gets attached to the parent node instead.

I'm not 100% sure this is the right solution, but it seemed wrong to
just make `_parenthesized_type` named instead (we don't usually name
parentheticals). At the very least, this cleans up the spurious
navigation_expression.element and tuple_type_item.element fields.
2026-05-12 12:57:25 +00:00
Taus
e9822f67ee unified: regenerate files 2026-05-12 12:57:25 +00:00
Taus
994b27bdbd unified: convert _type into a named rule
Because `_type` was anonymous, its body was inlined in all of the places
it appeared. Because this body contained a `name` field, this field was
_also_ inlined. This caused a bunch of nodes to have spurious `name`
fields, and for some of them (that already had such a field) it caused
that field have multiplicity greater than one.

To fix this, we make the `_type` node named, which prevents the errant
field from escaping.
2026-05-12 12:57:25 +00:00
Taus
a720e258ac unified: regenerate files 2026-05-12 12:57:25 +00:00
Taus
8b977ef8e1 unified: Get rid of some "." bleed
Adds a new type `nested_type_identifier`, which contains the
choice-branch that previously allowed those tokens to bleed through into
the closest parent field.
2026-05-12 12:57:25 +00:00
Taus
caa9b04ad8 unified: regenerate files 2026-05-12 12:57:25 +00:00
Taus
91a46f0340 unified: stop "!" bleeding through
You know the drill. We just make an anonymous node named instead. In
this case, however, we have to be a bit more clever about how to rewrite
it. We turn the sequence of a type followed by an optional ! into a
_choice_ between mere type or type followed by bang (the latter being
our new named node).
2026-05-12 12:57:24 +00:00
Taus
37e1e3c879 unified: regenerate files 2026-05-12 12:57:24 +00:00
Taus
70f3fd1158 unified: make unannotated_type named and supertype
Gets rid of a bunch of ad-hoc node type unions.
2026-05-12 12:57:24 +00:00
Taus
9abfaca98c unified: regenerate files 2026-05-12 12:57:24 +00:00
Taus
38473f9e0b unified: make expression named and a supertype
Supertypes are a honking great idea. We should use more of them.

This massively cleans up the node types, without polluting the AST with
`expression` nodes.
2026-05-12 12:57:24 +00:00
Taus
c7c6e45254 unified: regenerate files 2026-05-12 12:57:24 +00:00
Taus
c0efc52cc7 unified: make if-condition nodes named, to stop bleed
Before, the `condition` field of an if statement supposedly could
contain things like parentheses and commas, due to bleeding from
referenced anonymous nodes. Making the node named makes this issue go
away.
2026-05-12 12:57:24 +00:00
Taus
5c16b0faf9 unified: regenerate files 2026-05-12 12:57:24 +00:00
Taus
7854a534fd unified: stop operators bleeding through everywhere
We make _referenceable_operator a named node. This prevents it from
bleeding through to the _expression definition. It likely also makes the
output easier to deal with, as bare operators used as arguments now have
a named node wrapping them in the AST.

Also removes a duplicated inclusion of _comparison_operator that served
no purpose.
2026-05-12 12:57:24 +00:00
Taus
76a1a87c41 unified: regenerate files 2026-05-12 12:57:23 +00:00
Taus
9062bba168 unified: get rid of undesirable self-recursion in _expression
This caused any field containing an _expression to appear as if it could
countain any number of such nodes. It also threw away the information
that there was a `?` marker there.

To fix it, we simply move the definition into its own named node.
2026-05-12 12:57:23 +00:00
Taus
e709650449 unified: Rebuild generated files
The astute reader will note that we seem to _lose_ some node types in
the process. Apparently, these were unreachable in the grammar, and the
newer version of tree-sitter removes such "dead code".
2026-05-12 12:57:23 +00:00
Taus
513c7bb30b unified: Add scripts for automatically rebuilding Swift grammar 2026-05-12 12:57:23 +00:00
Taus
9c958a420a Merge pull request #21819 from github/tausbn/unified-vendor-in-tree-sitter-swift
unified: use a vendored-in copy of tree-sitter-swift
2026-05-12 14:55:35 +02:00
Taus
2e9de7878b unified: update build dependencies 2026-05-12 11:25:15 +00:00
Taus
c5ae315dbe unified: auto-generate parser files
Uses the `tree-sitter-generate` crate to generate these files on the
fly.
2026-05-12 11:24:35 +00:00
Owen Mansel-Chan
592c7c0437 Merge pull request #21826 from AriehSchneier/fix/go-extractor-root-test-files
Go: Fix extractor to extract root internal test files
2026-05-12 10:34:42 +01:00
Owen Mansel-Chan
c0798f7b1d Merge pull request #21829 from owen-mc/static/update-framework-report-sink-kinds
C#, Go, Java: Use all path injection sinks when generating docs
2026-05-12 10:16:31 +01:00
Jeroen Ketema
cac7262a45 Merge pull request #21831 from jketema/jketema/swift-declared-interface-type
Swift: Expose the declared interface type of a type decl
2026-05-12 09:47:39 +02:00
Owen Mansel-Chan
6b65866ff4 Merge branch 'main' into fix/go-extractor-root-test-files 2026-05-11 17:18:43 +01:00