Compare commits

..

19 Commits

Author SHA1 Message Date
Anders Fugmann
831e87b957 Kotlin extractor: scope Object-method redeclaration recovery
Why this is needed:
- library-tests/java-kotlin-collection-type-generic-methods/test.ql regressed with extra equals(Object) rows on generic collection/map/list declaration variants.
- At the same time, java-interface-redeclares-tostring must still recover Object-method redeclarations for Java binary interfaces under K2.

What changed:
- In K2 ASM probing, treat classes with kotlin.Metadata as non-Java binaries for javaBinaryDeclaresMethod, so Java-redeclaration recovery does not fire on Kotlin binary classes.
- Keep equals(Object) K2 Any/Any? compatibility handling, but constrain the workaround to non-generic parent classes and skip it when a concrete sibling declaration already exists.
- Preserve the existing toString/hashCode redeclaration recovery path for affected Java binaries.

Effect:
- Removes the spurious equals(Object) rows in java-kotlin-collection-type-generic-methods while retaining expected Object-method extraction in java-interface-redeclares-tostring.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:23:24 +02:00
Anders Fugmann
4b71b704ae Kotlin extractor: keep synthetic locations for unresolved file classes
Why this is needed:
- library-tests/multiple_files/method_accesses.ql regressed because receiver class locations for external file-class members became concrete file paths.
- For stdlib-style unresolved container-source paths, forcing a concrete location changed stable output from synthetic unknown location to external path-based locations.

What changed:
- Added shouldUseConcreteExternalFileClassLocation to distinguish reliable concrete paths from unresolved placeholders.
- In external package-fragment parent handling, only write an external file-class location when the normalized path is concrete and stable.
- If no reliable path is available, keep prior synthetic behaviour by not forcing a concrete location.

Effect:
- Restores stable receiver-location output for method_accesses while preserving concrete locations when we have trustworthy binary-path information.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:23:24 +02:00
Anders Fugmann
9f29100d7c Kotlin extractor: disambiguate binary overload probing
Why this is needed:
- library-tests/exprs/DB-CHECK was failing with INVALID_KEY and INVALID_KEY_SET in params for kotlin.jvm.internal.Intrinsics.areEqual.
- The Java binary probing code matched methods by name plus arity and used the first match, which is ambiguous when both primitive and boxed overloads exist.
- Under that ambiguity, callable labels could be boxed while extracted params remained primitive (or vice versa), creating conflicting rows for the same key.

What changed:
- For both parameter and return-type probing, gather all matching overloads and compute classifier-vs-primitive from the full candidate set.
- Return a concrete answer only when all matches agree; return null when matches disagree.
- Apply the same unambiguous matching rule in both K1 metadata and K2 ASM fallback paths.

Effect:
- The boxing fallback now activates only when the Java binary evidence is deterministic, preventing callable-label collisions and restoring DB integrity in the affected Kotlin2 dataset check.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:23:23 +02:00
Anders Fugmann
1eefc06c7a Kotlin tests: roll Kotlin1 suites forward to language-version 2.0
Why this is needed:
- The extractor compatibility fixes now preserve the information these Kotlin1-era
  tests were protecting, even when compiled with Kotlin 2.4 and
  `-language-version 2.0`.
- Keeping mixed legacy language-version wiring in individual tests is no longer
  necessary and obscures the intended steady-state execution mode.

What this changes:
- Update all affected Kotlin1 compatibility integration tests to run with
  `-language-version 2.0` directly.
- Keep the expected extraction signal aligned for extractor information output.
- Remove the obsolete CODEOWNERS entry for the retired `java/ql/test-kotlin1/`
  path.

This consolidates the language-version transition into a single test rollup
commit, as requested.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:23:05 +02:00
Anders Fugmann
3f0bb894c2 Kotlin extractor: reconcile Java binary signatures under K2
Why this is needed:
- Under K2, binary Java symbols are represented differently from K1:
  JavaSourceElement metadata is often absent and sources are exposed through
  VirtualFileBasedSourceElement.
- Without recovery logic, callable matching can miss declared Java methods,
  callable labels can drift (primitive vs boxed reference types), and external
  Java declaration stubs can gain wildcard noise when Java signatures are not
  available.
- These differences produced Kotlin 2.0 parity drift in tests that rely on
  stable Java/Kotlin cross-extractor callable identity.

What this changes:
- Add K2-aware Java binary inspection helpers (ASM-based fallback) to detect
  declared methods and parameter/return reference-vs-primitive shape when
  JavaSourceElement metadata is unavailable.
- Recover Java callables more reliably in KotlinUsesExtractor, including a
  binary-class fallback path.
- Normalise callable labels and call result typing to boxed Java classes when
  K2 enhanced reference types appear as Kotlin primitives.
- Accept K2's `Any` form for Object.equals(Object) and keep binary declaration
  checks stable.
- Suppress default wildcard insertion for external Java declaration stubs when
  no Java callable metadata is available, preventing synthetic wildcard drift.

This commit restores Java interop parity for Kotlin 2.0 extraction paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:21:26 +02:00
Anders Fugmann
572e096ed3 Kotlin extractor: anchor local variable locations to the identifier
Why this is needed:
- With Kotlin 2.0 analysis, some local-variable locations resolve to a wider
  declaration span than before.
- The previous extractor logic used provider-based ranges that can cover type,
  annotations, and modifiers, which shifts expected variable location facts.
- This caused parity drift in tests that expect the location to point at the
  variable name token itself.

What this changes:
- Cache current source text per file during extraction.
- Derive variable-name offsets by scanning the declaration slice and locating
  the declared identifier token.
- Emit local-variable declaration/expr locations from that identifier span,
  with fallback to the previous provider when source offsets are unavailable.

This restores stable name-anchored variable locations under Kotlin 2.0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:21:26 +02:00
Anders Fugmann
c5e1f38583 Kotlin extractor: restore external file-class locations under K2
Why this is needed:
- Under K2, top-level declarations from external binaries are attached directly
  to IrExternalPackageFragment rather than to an IrClass file-class parent.
- That bypassed the normal class-source location path, so some external file-class
  entities ended up without stable binary file locations.
- Missing/unstable locations caused drift in tests that depend on external file
  class member resolution and location facts.

What this changes:
- Resolve binary paths from IrMemberWithContainerSource (JvmPackagePartSource)
  via a dedicated getContainerSourceBinaryPath helper.
- In KotlinUsesExtractor, when extracting top-level external declarations,
  attach file-class location from container-source binary path when available.
- Track external file classes whose locations were emitted to avoid duplicate
  hasLocation facts.

This targets the K2 external file-class location gap (for example file_classes and
external-property-overloads parity).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:21:25 +02:00
Anders Fugmann
0921cd71ec Kotlin wrapper: keep selected compiler install available after cleanups
Why this is needed:
- The dev wrapper persisted the selected version in .kotlinc_version, but only installed binaries when the selected version changed.
- After a clean working directory (which can remove .kotlinc_installed), the version file can still point at an already-selected compiler, causing forward execution to fail because the binary directory no longer exists.

What this changes:
- Make install() idempotent by returning early when install dir already exists.
- Call install() unconditionally from main() so the selected version is always materialised before forwarding.
- Keep explicit reinstall behaviour on version switches by removing the old install directory when selection changes.

This is an independent reliability fix and not tied to Kotlin 1.x test routing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 16:21:25 +02:00
Taus
f1cc1e5c47 Merge pull request #22084 from github/tausbn/yeast-miscellaneous-cleanup
yeast: Miscellaneous cleanup
2026-06-29 14:14:24 +02:00
copilot-swe-agent[bot]
041a8e6adc Fix source_text call in @@raw_lhs documentation example 2026-06-29 11:26:07 +00:00
Taus
fb424020af yeast: Delete the Cursor trait, inline its methods on AstCursor
The trait had a single implementor (`AstCursor`), three type parameters
of which one (`T`) was never used in any method signature, and one
external consumer that needed `use yeast::Cursor;` in scope just to
call methods on the cursor. The abstraction was overhead without a
second implementor to justify it.

Move the six trait methods to an inherent `impl AstCursor` block;
delete `shared/yeast/src/cursor.rs`, the `pub mod cursor;` and
`pub use cursor::Cursor;` lines in `lib.rs`, and the `use yeast::Cursor;`
in `tree-sitter-extractor`'s `traverse_yeast`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:36 +00:00
Taus
bda8e7dae1 yeast-macros: Remove unused .map and .reduce_left chain syntax
The `{expr}.map(p -> tpl)` and `{expr}.reduce_left(first -> init, acc,
elem -> fold)` post-fix chains on `{expr}` placeholders had no
remaining users in the codebase: `.map` was never used, and the
4 `.reduce_left` sites in `swift.rs` were rewritten to plain
`Iterator::reduce` via an `and_chain` helper in an earlier commit.

Removes the entire `parse_chain_suffix` function (~90 lines) and the
`has_chain` detection / dispatch branches at the two call sites
(field-position in `parse_direct_node_inner` and body-position in
`parse_direct_list`). The remaining `{expr}` path is the
trait-dispatched one introduced by the splice-syntax cleanup, which
handles single ids and iterables uniformly via `IntoFieldIds`.

Also strips the chain syntax from the `tree!` macro doc comment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:36 +00:00
Taus
37c8111c18 yeast-macros: Add error message to defensive expect_ident in parse_ctx_or_implicit
The empty error string passed to `expect_ident` was dead code (the
preceding lookahead has already confirmed the token is an ident),
but it would have been a confusing message if it ever fired. Replace
with an explicit "unreachable" string that makes the intent
clearer to readers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:36 +00:00
Taus
807bb51df7 yeast: Unify Node::kind() and Node::kind_name()
Both accessors returned the same private `kind_name: &'static str`
field; `kind_name()` is widely used (mainly by dump.rs and schema
diagnostics) and `kind()` had only 2 internal callers in lib.rs and
a handful in tests. Pick the more descriptive name and update the
callers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:36 +00:00
Taus
b6abfe6e5c yeast: Remove dead prepend_field / prepend_field_child
`BuildCtx::prepend_field` and the underlying `Ast::prepend_field_child`
existed to support the create-then-mutate pattern in swift.rs (build
an output node, then prepend modifiers to its `modifier:` field). The
SwiftContext-based refactor on the previous branches eliminated all
such call sites: every emitted declaration now carries its modifiers
from birth, so the in-place prepend operation has no users.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:35 +00:00
Taus
b3dc7009a4 yeast: Remove dead BuildCtx::translate_opt
`translate_opt` was a convenience for the manual_rule! body code,
collapsing `Option<I>` to `Option<Id>` via `translate`. Since the
`@@` raw-capture migration replaced manual_rule! with rule!, no
callers remain — the auto-translate prefix handles `Option<Id>`
captures directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:35 +00:00
Taus
e59f646870 yeast: Remove dead Captures methods
`Captures::map_captures`, `Captures::map_captures_to`, and
`Captures::try_map_all_captures` had no callers. The last one was
subsumed by `try_map_captures_except` (which takes a skip list and
degenerates to the old behaviour when the list is empty).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-29 10:34:35 +00:00
Taus
cc3c232631 yeast: Replace {..expr} splice syntax with trait-dispatched {expr}
In the initial implementation of yeast, the splice syntax was needed do
distinguish between splicing multiple nodes or just a single node.
However, this was always an ugly "wart" in the syntax, since the user
shouldn't have to worry about these things.

To fix this, we add an `IntoFieldIds` trait that dispatches on the
value's type: `Id` pushes a single id, and a blanket impl for
`IntoIterator<Item: Into<Id>>` handles `Vec<Id>`, `Option<Id>`, and
arbitrary iterator chains.

With this, we no longer need to use the special splice syntax, and hence
we can get rid of it.
2026-06-29 10:34:35 +00:00
Taus
9a5cc3c5e3 yeast: Make Id a newtype, delete NodeRef
Previously, the `Id` type  was a bare usize alias. The `NodeRef` newtype
existed solely to carry the AST-aware `YeastDisplay` /
`YeastSourceRange` impls (so that `#{captured_node}` rendered source
text rather than the numeric id) without colliding with the impls for
raw integer types.

This commit promotes `Id` itself to a (transparent) newtype struct and
moves the AST-aware trait impls directly onto it. With `Id` and `usize`
now being different types, the integer-display impl (for `usize`) and
the source-text impl (for `Id`) coexist without conflict, and `NodeRef`
becomes redundant (and so we remove it).
2026-06-29 10:33:32 +00:00
77 changed files with 879 additions and 4337 deletions

View File

@@ -28,7 +28,6 @@
/swift/extractor/ @github/codeql-swift @github/code-scanning-language-coverage
/misc/codegen/ @github/codeql-swift
/java/kotlin-extractor/ @github/codeql-kotlin @github/code-scanning-language-coverage
/java/ql/test-kotlin1/ @github/codeql-kotlin
/java/ql/test-kotlin2/ @github/codeql-kotlin
# Experimental CodeQL cryptography

View File

@@ -75,6 +75,9 @@ def get_version():
def install(version: str, quiet: bool):
if install_dir.exists():
return
if quiet:
info_out = subprocess.DEVNULL
info = lambda *args: None
@@ -83,8 +86,6 @@ def install(version: str, quiet: bool):
info = lambda *args: print(*args, file=sys.stderr)
file = file_template.format(version=version)
url = url_template.format(version=version)
if install_dir.exists():
shutil.rmtree(install_dir)
install_dir.mkdir()
zips_dir.mkdir(exist_ok=True)
zip = zips_dir / file
@@ -156,8 +157,11 @@ def main(opts, forwarded_opts):
selected_version = current_version or DEFAULT_VERSION
if selected_version != current_version:
# don't print information about install procedure unless explicitly using --select
install(selected_version, quiet=opts.select is None)
if install_dir.exists():
shutil.rmtree(install_dir)
version_file.write_text(selected_version)
# don't print information about install procedure unless explicitly using --select
install(selected_version, quiet=opts.select is None)
if opts.select and not forwarded_opts and not opts.version:
print(f"selected {selected_version}")
return

View File

@@ -6,6 +6,8 @@ import com.github.codeql.utils.*
import com.github.codeql.utils.versions.*
import com.semmle.extractor.java.OdasaOutput
import java.io.Closeable
import java.nio.file.Files
import java.nio.file.Path
import java.util.*
import kotlin.collections.ArrayList
import org.jetbrains.kotlin.backend.common.extensions.IrPluginContext
@@ -50,6 +52,7 @@ import org.jetbrains.kotlin.load.java.structure.JavaMethod
import org.jetbrains.kotlin.load.java.structure.JavaTypeParameter
import org.jetbrains.kotlin.load.java.structure.JavaTypeParameterListOwner
import org.jetbrains.kotlin.load.java.structure.impl.classFiles.BinaryJavaClass
import org.jetbrains.kotlin.fir.java.VirtualFileBasedSourceElement
import org.jetbrains.kotlin.name.FqName
import org.jetbrains.kotlin.types.Variance
import org.jetbrains.kotlin.util.OperatorNameConventions
@@ -161,23 +164,100 @@ open class KotlinFileExtractor(
}
}
private fun javaBinaryDeclaresMethod(c: IrClass, name: String) =
((c.source as? JavaSourceElement)?.javaElement as? BinaryJavaClass)?.methods?.any {
it.name.asString() == name
private fun javaBinaryDeclaresMethod(c: IrClass, name: String): Boolean? {
// K1 path: source is JavaSourceElement wrapping a BinaryJavaClass - inspect class metadata
val binaryJavaClass = (c.source as? JavaSourceElement)?.javaElement as? BinaryJavaClass
if (binaryJavaClass != null) {
return binaryJavaClass.methods.any { it.name.asString() == name }
}
// K2 path: binary Java classes use VirtualFileBasedSourceElement instead of
// JavaSourceElement. The BinaryJavaClass is not stored in the source element, so we parse
// the class bytes directly using ASM to check if the method is explicitly declared.
if (c.source is VirtualFileBasedSourceElement) {
val virtualFile = (c.source as VirtualFileBasedSourceElement).virtualFile
if (!virtualFile.name.endsWith(".class")) return null
return try {
val bytes = virtualFile.contentsToByteArray()
var found = false
var hasKotlinMetadata = false
val reader = org.jetbrains.org.objectweb.asm.ClassReader(bytes)
reader.accept(
object : org.jetbrains.org.objectweb.asm.ClassVisitor(
org.jetbrains.org.objectweb.asm.Opcodes.ASM9
) {
override fun visitAnnotation(
descriptor: String,
visible: Boolean
): org.jetbrains.org.objectweb.asm.AnnotationVisitor? {
if (descriptor == "Lkotlin/Metadata;") hasKotlinMetadata = true
return null
}
override fun visitMethod(
access: Int,
methodName: String,
descriptor: String,
signature: String?,
exceptions: Array<String>?
): org.jetbrains.org.objectweb.asm.MethodVisitor? {
if (methodName == name) found = true
return null
}
},
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_CODE or
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_DEBUG or
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_FRAMES
)
if (hasKotlinMetadata) false else found
} catch (e: Exception) {
logger.warn("Failed to check binary class methods for ${c.fqNameWhenAvailable}: $e")
null
}
}
return null
}
private fun isJavaBinaryDeclaration(f: IrFunction) =
f.parentClassOrNull?.let { javaBinaryDeclaresMethod(it, f.name.asString()) } ?: false
private fun hasConcreteSiblingObjectMethod(f: IrFunction): Boolean {
val parentClass = f.parentClassOrNull ?: return false
return parentClass.declarations
.asSequence()
.filterIsInstance<IrFunction>()
.filter { sibling ->
sibling !== f &&
sibling.name == f.name &&
sibling.codeQlValueParameters.size == f.codeQlValueParameters.size
}
.any { sibling ->
val hasInvisibleFakeVisibility =
sibling.visibility.let {
it is DelegatedDescriptorVisibility && it.delegate == Visibilities.InvisibleFake
}
!sibling.isFakeOverride && !hasInvisibleFakeVisibility
}
}
private fun isJavaBinaryObjectMethodRedeclaration(d: IrDeclaration) =
when (d) {
is IrFunction ->
d.parentClassOrNull?.typeParameters?.isEmpty() == true &&
when (d.name.asString()) {
"toString" -> d.codeQlValueParameters.isEmpty()
"hashCode" -> d.codeQlValueParameters.isEmpty()
"equals" -> d.codeQlValueParameters.singleOrNull()?.type?.isNullableAny() ?: false
// Under K2 (language version 2.0+), the Object.equals(Object) parameter is
// typed as Any (non-nullable) rather than Any? (nullable). Accept both.
"equals" ->
d.codeQlValueParameters
.singleOrNull()
?.type
?.let { it.isNullableAny() || it.isAny() } ?: false
else -> false
} && isJavaBinaryDeclaration(d)
} &&
!hasConcreteSiblingObjectMethod(d) &&
isJavaBinaryDeclaration(d)
else -> false
}
@@ -1312,27 +1392,28 @@ open class KotlinFileExtractor(
): TypeResults {
with("value parameter", vp) {
val location = locOverride ?: getLocation(vp, classTypeArgsIncludingOuterClasses)
val parentFunction = vp.parent as? IrFunction
val javaCallable = parentFunction?.let { getJavaCallable(it) }
val maybeAlteredType =
(vp.parent as? IrFunction)?.let {
parentFunction?.let {
if (overridesCollectionsMethodWithAlteredParameterTypes(it))
eraseCollectionsMethodParameterType(vp.type, it.name.asString(), idx)
else if (
(vp.parent as? IrConstructor)?.parentClassOrNull?.kind ==
(parentFunction as? IrConstructor)?.parentClassOrNull?.kind ==
ClassKind.ANNOTATION_CLASS
)
kClassToJavaClass(vp.type)
else null
} ?: vp.type
val javaType =
(vp.parent as? IrFunction)?.let {
getJavaCallable(it)?.let { jCallable ->
getJavaValueParameterType(jCallable, idx)
}
}
val javaType = javaCallable?.let { jCallable -> getJavaValueParameterType(jCallable, idx) }
val addParameterWildcardsByDefault =
!getInnermostWildcardSupppressionAnnotation(vp) &&
!(javaCallable == null &&
parentFunction?.origin == IrDeclarationOrigin.IR_EXTERNAL_JAVA_DECLARATION_STUB)
val typeWithWildcards =
addJavaLoweringWildcards(
maybeAlteredType,
!getInnermostWildcardSupppressionAnnotation(vp),
addParameterWildcardsByDefault,
javaType
)
val substitutedType =
@@ -1346,9 +1427,9 @@ open class KotlinFileExtractor(
vp.origin == IrDeclarationOrigin.UNDERSCORE_PARAMETER ||
((vp.parent as? IrFunction)?.let { hasSynthesizedParameterNames(it) } ?: true)
val javaParameter =
when (val callable = (vp.parent as? IrFunction)?.let { getJavaCallable(it) }) {
is JavaConstructor -> callable.valueParameters.getOrNull(idx)
is JavaMethod -> callable.valueParameters.getOrNull(idx)
when (javaCallable) {
is JavaConstructor -> javaCallable.valueParameters.getOrNull(idx)
is JavaMethod -> javaCallable.valueParameters.getOrNull(idx)
else -> null
}
val extraAnnotations =
@@ -2874,6 +2955,45 @@ open class KotlinFileExtractor(
return v
}
private val sourceTextCache = mutableMapOf<String, String?>()
private fun getCurrentFileSourceText() =
sourceTextCache.getOrPut(filePath) {
runCatching { Files.readString(Path.of(filePath)) }.getOrNull()
}
private fun getVariableNameLocation(v: IrVariable): Label<DbLocation>? {
if (v.startOffset < 0 || v.endOffset < v.startOffset) return null
val source = getCurrentFileSourceText() ?: return null
if (v.startOffset >= source.length) return null
val name = v.name.asString()
if (name.isEmpty()) return null
val endExclusive = minOf(v.endOffset + 1, source.length)
val declarationText = source.substring(v.startOffset, endExclusive)
val nameOffsetInDeclaration = declarationText.indexOf(name)
if (nameOffsetInDeclaration < 0) return null
val nameStartOffset = v.startOffset + nameOffsetInDeclaration
val nameEndOffset = nameStartOffset + name.length - 1
return tw.getLocation(nameStartOffset, nameEndOffset)
}
private fun shouldUseVariableNameLocation(v: IrVariable): Boolean {
val initializer = v.initializer
return initializer is IrTypeOperatorCall && initializer.operator == IrTypeOperator.IMPLICIT_NOTNULL
}
private fun getVariableLocation(v: IrVariable): Label<DbLocation> {
if (shouldUseVariableNameLocation(v)) {
val nameLocation = getVariableNameLocation(v)
if (nameLocation != null) return nameLocation
}
return tw.getLocation(getVariableLocationProvider(v))
}
private fun extractVariable(
v: IrVariable,
callable: Label<out DbCallable>,
@@ -2882,7 +3002,7 @@ open class KotlinFileExtractor(
) {
with("variable", v) {
val stmtId = tw.getFreshIdLabel<DbLocalvariabledeclstmt>()
val locId = tw.getLocation(getVariableLocationProvider(v))
val locId = getVariableLocation(v)
tw.writeStmts_localvariabledeclstmt(stmtId, parent, idx, callable)
tw.writeHasLocation(stmtId, locId)
extractVariableExpr(v, callable, stmtId, 1, stmtId)
@@ -2900,7 +3020,7 @@ open class KotlinFileExtractor(
with("variable expr", v) {
val varId = useVariable(v)
val exprId = tw.getFreshIdLabel<DbLocalvariabledeclexpr>()
val locId = tw.getLocation(getVariableLocationProvider(v))
val locId = getVariableLocation(v)
val type = useType(v.type)
tw.writeLocalvars(varId, v.name.asString(), type.javaResult.id, exprId)
tw.writeLocalvarsKotlinType(varId, type.kotlinResult.id)
@@ -4066,6 +4186,28 @@ open class KotlinFileExtractor(
else -> false
}
private fun getCallResultType(c: IrCall, syntacticCallTarget: IrFunction): IrType {
if (syntacticCallTarget.origin != IrDeclarationOrigin.IR_EXTERNAL_JAVA_DECLARATION_STUB) {
return c.type
}
val primitiveInfo =
(c.type as? IrSimpleType)?.let { primitiveTypeMapping.getPrimitiveInfo(it) } ?: return c.type
val parentClass = syntacticCallTarget.parentClassOrNull ?: return c.type
val returnIsClassifier =
javaBinaryMethodReturnIsClassifierType(
parentClass,
getFunctionShortName(syntacticCallTarget).nameInDB,
syntacticCallTarget.codeQlValueParameters.size,
syntacticCallTarget is IrConstructor
)
return if (returnIsClassifier == true) {
primitiveInfo.javaClass.symbol.typeWith()
} else {
c.type
}
}
private fun isGenericArrayType(typeName: String) =
when (typeName) {
"Array" -> true
@@ -4111,7 +4253,7 @@ open class KotlinFileExtractor(
extractRawMethodAccess(
syntacticCallTarget,
c,
c.type,
getCallResultType(c, syntacticCallTarget),
callable,
parent,
idx,

View File

@@ -36,6 +36,7 @@ import org.jetbrains.kotlin.load.java.BuiltinMethodsWithSpecialGenericSignature
import org.jetbrains.kotlin.load.java.JvmAbi
import org.jetbrains.kotlin.load.java.sources.JavaSourceElement
import org.jetbrains.kotlin.load.java.structure.*
import org.jetbrains.kotlin.load.java.structure.impl.classFiles.BinaryJavaClass
import org.jetbrains.kotlin.load.java.typeEnhancement.hasEnhancedNullability
import org.jetbrains.kotlin.name.FqName
import org.jetbrains.kotlin.name.NameUtils
@@ -996,7 +997,20 @@ open class KotlinUsesExtractor(
)
return null
}
return extractFileClass(fqName)
val fileClassId = extractFileClass(fqName)
// Under K2, external file class members sit directly under IrExternalPackageFragment
// rather than under their IrClass parent. In that case the file class entity won't
// get a location set through the normal extractClassSource path.
if (d is IrMemberWithContainerSource && tw.lm.externalFileClassLocationsExtracted.add(fqName)) {
val binaryPath =
getContainerSourceBinaryPath(d.containerSource)
?.let { normalizeExternalFileClassBinaryPath(it, fqName) }
if (binaryPath != null && shouldUseConcreteExternalFileClassLocation(binaryPath)) {
val fileId = tw.mkFileId(binaryPath, true)
tw.writeHasLocation(fileClassId, tw.getWholeFileLocation(fileId))
}
}
return fileClassId
}
return useDeclarationParent(parent, canBeTopLevel, classTypeArguments, inReceiverContext)
}
@@ -1371,8 +1385,13 @@ open class KotlinUsesExtractor(
parentId: Label<out DbElement>,
classTypeArgsIncludingOuterClasses: List<IrTypeArgument>?,
maybeParameterList: List<IrValueParameter>? = null
): String =
getFunctionLabel(
): String {
val javaCallable = getJavaCallable(f)
val addParameterWildcardsByDefault =
!getInnermostWildcardSupppressionAnnotation(f) &&
!(javaCallable == null && f.origin == IrDeclarationOrigin.IR_EXTERNAL_JAVA_DECLARATION_STUB)
return getFunctionLabel(
f.parent,
parentId,
getFunctionShortName(f).nameInDB,
@@ -1382,9 +1401,10 @@ open class KotlinUsesExtractor(
getFunctionTypeParameters(f),
classTypeArgsIncludingOuterClasses,
overridesCollectionsMethodWithAlteredParameterTypes(f),
getJavaCallable(f),
!getInnermostWildcardSupppressionAnnotation(f)
javaCallable,
addParameterWildcardsByDefault
)
}
/*
* This function actually generates the label for a function.
@@ -1471,15 +1491,41 @@ open class KotlinUsesExtractor(
// Finally, mimic the Java extractor's behaviour by naming functions with type
// parameters for their erased types;
// those without type parameters are named for the generic type.
val maybeErased =
var maybeErased =
if (functionTypeParameters.isEmpty()) maybeSubbed else erase(maybeSubbed)
// K2 compatibility: under K2, Java @NotNull reference types such as @NotNull Integer
// are enhanced to Kotlin primitives (e.g. kotlin.Int). But the Java extractor uses
// the original reference type (java.lang.Integer) in callable labels. When we detect
// that the original Java parameter type is a reference (classifier) type but the
// Kotlin IR type is a primitive, revert to the boxed Java class so both extractors
// produce matching callable IDs.
if (functionTypeParameters.isEmpty()) {
val primitiveInfo = (maybeErased as? IrSimpleType)?.let {
primitiveTypeMapping.getPrimitiveInfo(it)
}
if (primitiveInfo != null) {
val parentClass = parent as? IrClass
if (parentClass != null) {
val isClassifierType = javaBinaryMethodParamIsClassifierType(
parentClass,
name,
allParamTypes.size,
name == "<init>",
it.index
)
if (isClassifierType == true) {
maybeErased = primitiveInfo.javaClass.symbol.typeWith()
}
}
}
}
"{${useType(maybeErased).javaResult.id}}"
}
val paramTypeIds =
allParamTypes
.withIndex()
.joinToString(separator = ",", transform = getIdForFunctionLabel)
val labelReturnType =
var labelReturnType =
if (name == "<init>") pluginContext.irBuiltIns.unitType
else
erase(
@@ -1489,6 +1535,28 @@ open class KotlinUsesExtractor(
pluginContext
)
)
// K2 compatibility: same as for parameters, if the Java binary method return type is a
// reference type but K2 enhanced it to a Kotlin primitive, use the boxed Java class.
if (functionTypeParameters.isEmpty() && name != "<init>") {
val primitiveInfo = (labelReturnType as? IrSimpleType)?.let {
primitiveTypeMapping.getPrimitiveInfo(it)
}
if (primitiveInfo != null) {
val parentClass = parent as? IrClass
if (parentClass != null) {
val returnIsClassifier =
javaBinaryMethodReturnIsClassifierType(
parentClass,
name,
allParamTypes.size,
false
)
if (returnIsClassifier == true) {
labelReturnType = primitiveInfo.javaClass.symbol.typeWith()
}
}
}
}
// Note that `addJavaLoweringWildcards` is not required here because the return type used to
// form the function
// label is always erased.
@@ -1594,9 +1662,23 @@ open class KotlinUsesExtractor(
}
@OptIn(ObsoleteDescriptorBasedAPI::class)
fun getJavaCallable(f: IrFunction) =
(f.descriptor.source as? JavaSourceElement)?.javaElement as? JavaMember
fun getJavaCallable(f: IrFunction): JavaMember? {
val fromDescriptor = (f.descriptor.source as? JavaSourceElement)?.javaElement as? JavaMember
if (fromDescriptor != null) return fromDescriptor
// K2 fallback: under K2, descriptor.source may not carry JavaSourceElement for binary Java
// methods. Try to get the JavaMember from the parent class's binary class directly.
val parentClass = f.parentClassOrNull ?: return null
val binaryJavaClass = (parentClass.source as? JavaSourceElement)?.javaElement as? BinaryJavaClass
?: return null
val name = getFunctionShortName(f).nameInDB
val nParams = f.codeQlValueParameters.size
return if (f is IrConstructor) {
binaryJavaClass.constructors.find { it.valueParameters.size == nParams }
} else {
binaryJavaClass.methods.find { it.name.asString() == name && it.valueParameters.size == nParams }
}
}
fun getJavaValueParameterType(m: JavaMember, idx: Int) =
when (m) {
is JavaMethod -> m.valueParameters[idx].type

View File

@@ -51,6 +51,13 @@ class TrapLabelManager {
* to avoid duplication.
*/
val fileClassLocationsExtracted = HashSet<IrFile>()
/**
* Tracks external file classes (by FqName) whose location has been set from a binary path.
* Used to avoid writing duplicate hasLocation facts for external file class entities extracted
* through the K2 code path where declarations sit directly under IrExternalPackageFragment.
*/
val externalFileClassLocationsExtracted = HashSet<org.jetbrains.kotlin.name.FqName>()
}
/**

View File

@@ -17,6 +17,7 @@ import org.jetbrains.kotlin.load.kotlin.JvmPackagePartSource
import org.jetbrains.kotlin.load.kotlin.KotlinJvmBinarySourceElement
import org.jetbrains.kotlin.load.kotlin.VirtualFileKotlinClass
import org.jetbrains.kotlin.name.FqName
import org.jetbrains.kotlin.serialization.deserialization.descriptors.DeserializedContainerSource
// Adapted from Kotlin's interpreter/Utils.kt function 'internalName'
// Translates class names into their JLS section 13.1 binary name,
@@ -176,15 +177,238 @@ fun getIrDeclarationBinaryPath(d: IrDeclaration): String? {
// This is in a file class.
val fqName = getFileClassFqName(d)
if (fqName != null) {
if (d is IrMemberWithContainerSource) {
val containerBinaryPath = getContainerSourceBinaryPath(d.containerSource)
if (containerBinaryPath != null) {
return normalizeExternalFileClassBinaryPath(containerBinaryPath, fqName)
}
}
return getUnknownBinaryLocation(fqName.asString())
}
}
return null
}
/**
* Attempts to get the binary file path from a container source (typically a
* [JvmPackagePartSource]). Returns null if the path is unavailable.
*/
fun getContainerSourceBinaryPath(containerSource: org.jetbrains.kotlin.serialization.deserialization.descriptors.DeserializedContainerSource?): String? {
if (containerSource !is JvmPackagePartSource) return null
val binaryClass = containerSource.knownJvmBinaryClass ?: return null
return when (binaryClass) {
is VirtualFileKotlinClass -> {
val vf = binaryClass.file
val path = vf.path
if (vf.fileSystem.protocol == StandardFileSystems.JRT_PROTOCOL)
"/${path.split("!/", limit = 2)[1]}"
else path
}
else -> binaryClass.location.takeIf { it.isNotEmpty() }
}
}
private fun getUnknownBinaryLocation(s: String): String {
return "/!unknown-binary-location/${s.replace(".", "/")}.class"
}
fun normalizeExternalFileClassBinaryPath(path: String, fqName: FqName): String {
if (path.contains(".kotlinc_installed")) {
return getUnknownBinaryLocation(fqName.asString())
}
val normalizedPath = path.replace('\\', '/')
val classInternalPath = "${fqName.asString().replace(".", "/")}.class"
val classSuffix = "/$classInternalPath"
if (normalizedPath.endsWith(classSuffix)) {
val classpathRoot = normalizedPath.removeSuffix(classSuffix).substringAfterLast('/')
if (classpathRoot.isNotEmpty()) {
return "$classpathRoot/$classInternalPath"
}
}
return path
}
fun shouldUseConcreteExternalFileClassLocation(path: String): Boolean {
val normalizedPath = path.replace('\\', '/')
return normalizedPath.contains("/") &&
!normalizedPath.startsWith("/!unknown-binary-location/")
}
fun getJavaEquivalentClassId(c: IrClass) =
c.fqNameWhenAvailable?.toUnsafe()?.let { JavaToKotlinClassMap.mapKotlinToJava(it) }
/**
* Checks whether a specific parameter of a Java binary method (identified by [methodName] and
* [paramIndex]) is a reference type (as opposed to a Java primitive). This is used to detect
* cases where K2 FIR has enhanced a reference type parameter (e.g. `@NotNull Integer`) to a
* Kotlin primitive (e.g. `kotlin.Int`), so that callable labels can use the original reference
* type and remain compatible with the Java extractor's callable IDs.
*
* Under K1, binary Java classes use [JavaSourceElement] and we can check [BinaryJavaClass.methods]
* directly. Under K2, they use [VirtualFileBasedSourceElement] and we fall back to reading the
* class bytes with ASM.
*
* Returns `null` if the information cannot be determined.
*/
fun javaBinaryMethodParamIsClassifierType(
parentClass: IrClass,
methodName: String,
nParams: Int,
isConstructor: Boolean,
paramIndex: Int
): Boolean? {
// K1 path: binary Java class has JavaSourceElement with a BinaryJavaClass.
val k1ParamKinds =
((parentClass.source as? JavaSourceElement)?.javaElement as? BinaryJavaClass)?.let {
binaryJavaClass ->
if (isConstructor)
binaryJavaClass.constructors
.asSequence()
.filter { it.valueParameters.size == nParams }
.mapNotNull { it.valueParameters.getOrNull(paramIndex)?.type }
.map { it is org.jetbrains.kotlin.load.java.structure.JavaClassifierType }
.toSet()
else
binaryJavaClass.methods
.asSequence()
.filter { it.name.asString() == methodName && it.valueParameters.size == nParams }
.mapNotNull { it.valueParameters.getOrNull(paramIndex)?.type }
.map { it is org.jetbrains.kotlin.load.java.structure.JavaClassifierType }
.toSet()
}
if (k1ParamKinds != null && k1ParamKinds.isNotEmpty()) {
return k1ParamKinds.singleOrNull()
}
// K2 path: binary Java class has VirtualFileBasedSourceElement
if (parentClass.source !is VirtualFileBasedSourceElement) return null
val vf = (parentClass.source as VirtualFileBasedSourceElement).virtualFile
if (!vf.name.endsWith(".class")) return null
return try {
val bytes = vf.contentsToByteArray()
val expectedMethodName = if (isConstructor) "<init>" else methodName
val descriptorKinds = mutableSetOf<Boolean>()
val reader = org.jetbrains.org.objectweb.asm.ClassReader(bytes)
reader.accept(
object : org.jetbrains.org.objectweb.asm.ClassVisitor(
org.jetbrains.org.objectweb.asm.Opcodes.ASM9
) {
override fun visitMethod(
access: Int,
name: String,
descriptor: String,
signature: String?,
exceptions: Array<String>?
): org.jetbrains.org.objectweb.asm.MethodVisitor? {
if (name != expectedMethodName) return null
val paramDescriptors = parseAsmMethodDescriptorParams(descriptor)
if (paramDescriptors.size != nParams) return null
val paramDesc = paramDescriptors.getOrNull(paramIndex) ?: return null
// Reference types start with 'L' or '['; Java primitives are single chars
descriptorKinds.add(paramDesc.startsWith("L") || paramDesc.startsWith("["))
return null
}
},
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_CODE or
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_DEBUG or
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_FRAMES
)
descriptorKinds.singleOrNull()
} catch (e: Exception) {
null
}
}
/**
* Checks whether the return type of a Java binary method (identified by [methodName] and
* [nParams]) is a reference type (as opposed to a Java primitive).
*
* Returns `null` if the information cannot be determined.
*/
fun javaBinaryMethodReturnIsClassifierType(
parentClass: IrClass,
methodName: String,
nParams: Int,
isConstructor: Boolean
): Boolean? {
if (isConstructor) return false
// K1 path: binary Java class has JavaSourceElement with a BinaryJavaClass.
val k1ReturnKinds =
((parentClass.source as? JavaSourceElement)?.javaElement as? BinaryJavaClass)?.methods
?.asSequence()
?.filter { it.name.asString() == methodName && it.valueParameters.size == nParams }
?.map { it.returnType is org.jetbrains.kotlin.load.java.structure.JavaClassifierType }
?.toSet()
if (k1ReturnKinds != null && k1ReturnKinds.isNotEmpty()) {
return k1ReturnKinds.singleOrNull()
}
// K2 path: binary Java class has VirtualFileBasedSourceElement
if (parentClass.source !is VirtualFileBasedSourceElement) return null
val vf = (parentClass.source as VirtualFileBasedSourceElement).virtualFile
if (!vf.name.endsWith(".class")) return null
return try {
val bytes = vf.contentsToByteArray()
val returnKinds = mutableSetOf<Boolean>()
val reader = org.jetbrains.org.objectweb.asm.ClassReader(bytes)
reader.accept(
object : org.jetbrains.org.objectweb.asm.ClassVisitor(
org.jetbrains.org.objectweb.asm.Opcodes.ASM9
) {
override fun visitMethod(
access: Int,
name: String,
descriptor: String,
signature: String?,
exceptions: Array<String>?
): org.jetbrains.org.objectweb.asm.MethodVisitor? {
if (name != methodName) return null
if (parseAsmMethodDescriptorParams(descriptor).size != nParams) return null
val returnDescriptor = descriptor.substring(descriptor.lastIndexOf(')') + 1)
returnKinds.add(
returnDescriptor.startsWith("L") || returnDescriptor.startsWith("[")
)
return null
}
},
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_CODE or
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_DEBUG or
org.jetbrains.org.objectweb.asm.ClassReader.SKIP_FRAMES
)
returnKinds.singleOrNull()
} catch (e: Exception) {
null
}
}
fun parseAsmMethodDescriptorParams(descriptor: String): List<String> {
val params = mutableListOf<String>()
var i = descriptor.indexOf('(') + 1
val end = descriptor.lastIndexOf(')')
while (i < end) {
when (val c = descriptor[i]) {
'L' -> {
val semi = descriptor.indexOf(';', i)
params.add(descriptor.substring(i, semi + 1))
i = semi + 1
}
'[' -> {
var j = i + 1
while (j < end && descriptor[j] == '[') j++
if (descriptor[j] == 'L') {
val semi = descriptor.indexOf(';', j)
params.add(descriptor.substring(i, semi + 1))
i = semi + 1
} else {
params.add(descriptor.substring(i, j + 1))
i = j + 1
}
}
else -> { params.add(c.toString()); i++ }
}
}
return params
}

View File

@@ -1,11 +1,11 @@
import pathlib
def test(codeql, java_full, kotlinc_2_3_20):
def test(codeql, java_full):
java_srcs = " ".join([str(s) for s in pathlib.Path().glob("*.java")])
codeql.database.create(
command=[
f"javac {java_srcs} -d build",
"kotlinc -language-version 1.9 user.kt -cp build",
"kotlinc -language-version 2.0 user.kt -cp build",
]
)

View File

@@ -1,6 +1,6 @@
import commands
def test(codeql, java_full, kotlinc_2_3_20):
commands.run("kotlinc -language-version 1.9 test.kt -d lib")
codeql.database.create(command="kotlinc -language-version 1.9 user.kt -cp lib")
def test(codeql, java_full):
commands.run("kotlinc -language-version 2.0 test.kt -d lib")
codeql.database.create(command="kotlinc -language-version 2.0 user.kt -cp lib")

View File

@@ -9,4 +9,4 @@
| Percentage of calls with call target | 100 |
| Total number of lines | 3 |
| Total number of lines with extension kt | 3 |
| Uses Kotlin 2: false | 1 |
| Uses Kotlin 2: true | 1 |

View File

@@ -1,2 +1,2 @@
def test(codeql, java_full, kotlinc_2_3_20):
codeql.database.create(command=f"kotlinc -J-Xmx2G -language-version 1.9 SomeClass.kt")
def test(codeql, java_full):
codeql.database.create(command="kotlinc -J-Xmx2G -language-version 2.0 SomeClass.kt")

View File

@@ -1,6 +1,6 @@
import commands
def test(codeql, java_full, kotlinc_2_3_20):
commands.run("kotlinc -language-version 1.9 A.kt")
codeql.database.create(command="kotlinc -cp . -language-version 1.9 B.kt C.kt")
def test(codeql, java_full):
commands.run("kotlinc -language-version 2.0 A.kt")
codeql.database.create(command="kotlinc -cp . -language-version 2.0 B.kt C.kt")

View File

@@ -1,6 +1,6 @@
import commands
def test(codeql, java_full, kotlinc_2_3_20):
def test(codeql, java_full):
commands.run(["javac", "Test.java", "-d", "bin"])
codeql.database.create(command="kotlinc -language-version 1.9 user.kt -cp bin")
codeql.database.create(command="kotlinc -language-version 2.0 user.kt -cp bin")

View File

@@ -1,13 +1,13 @@
import commands
def test(codeql, java_full, kotlinc_2_3_20):
def test(codeql, java_full):
# Compile the JavaDefns2 copy outside tracing, to make sure the Kotlin view of it matches the Java view seen by the traced javac compilation of JavaDefns.java below.
commands.run(["javac", "JavaDefns2.java"])
codeql.database.create(
command=[
"kotlinc kotlindefns.kt",
"javac JavaUser.java JavaDefns.java -cp .",
"kotlinc -language-version 1.9 -cp . kotlinuser.kt",
"kotlinc -language-version 2.0 -cp . kotlinuser.kt",
]
)

View File

@@ -1,2 +0,0 @@
import semmle.python.controlflow.internal.AstNodeImpl
import ControlFlow::Consistency

View File

@@ -1,4 +0,0 @@
---
category: minorAnalysis
---
* A new Python control flow graph implementation has been added under `semmle.python.controlflow.internal.Cfg` (backed by `AstNodeImpl.qll`), built on the shared `codeql.controlflow.ControlFlowGraph` library. It is not yet used by the dataflow library or any production query; the legacy CFG in `semmle/python/Flow.qll` remains the default. The new library is exposed for tests and for upcoming migrations.

View File

@@ -1,4 +0,0 @@
---
category: minorAnalysis
---
* The new (shared-CFG-based) Python control flow graph now visits parameter and return type annotations as CFG nodes for function definitions, matching the legacy CFG. This restores annotation-based type tracking through framework models such as FastAPI's `Depends()`, Pydantic request models, Starlette `WebSocket` handlers, and any other models that flow a class reference through `Parameter.getAnnotation()` to identify instances of the annotated class.

View File

@@ -1,42 +0,0 @@
/**
* @name Print CFG
* @description Produces a representation of a file's Control Flow Graph.
* This query is used by the VS Code extension.
* @id py/print-cfg
* @kind graph
* @tags ide-contextual-queries/print-cfg
*/
import semmle.python.Files as Files
// import semmle.python.Scope
import semmle.python.controlflow.internal.AstNodeImpl
external string selectedSourceFile();
private predicate selectedSourceFileAlias = selectedSourceFile/0;
external int selectedSourceLine();
private predicate selectedSourceLineAlias = selectedSourceLine/0;
external int selectedSourceColumn();
private predicate selectedSourceColumnAlias = selectedSourceColumn/0;
module ViewCfgQueryInput implements ControlFlow::ViewCfgQueryInputSig<Files::File> {
predicate selectedSourceFile = selectedSourceFileAlias/0;
predicate selectedSourceLine = selectedSourceLineAlias/0;
predicate selectedSourceColumn = selectedSourceColumnAlias/0;
predicate cfgScopeSpan(
Ast::Callable scope, Files::File file, int startLine, int startColumn, int endLine,
int endColumn
) {
file = scope.getLocation().getFile() and
scope.getLocation().hasLocationInfo(_, startLine, startColumn, endLine, endColumn)
}
}
import ControlFlow::ViewCfgQuery<Files::File, ViewCfgQueryInput>

File diff suppressed because it is too large Load Diff

View File

@@ -1,4 +0,0 @@
consistencyOverview
| deadEnd | 1 |
deadEnd
| without_loop.py:7:5:7:9 | Break |

View File

@@ -1,32 +0,0 @@
/**
* Phase -1 of the dataflow CFG migration: verifies that every variable
* binding visible to the AST (`Name.defines(v)`) corresponds to a CFG node
* in the new CFG (`semmle.python.controlflow.internal.AstNodeImpl`).
*
* The expected tag is `cfgdefines=<name>`. Each binding annotation in the
* test sources looks like `# $ cfgdefines=x` for a binding currently
* covered by the new CFG, or `# $ MISSING: cfgdefines=x` for a binding
* that is known to be uncovered (a "red" test case that should be
* green-flipped once the corresponding `cfg-ext-*` extension lands).
*/
import python
import semmle.python.controlflow.internal.AstNodeImpl as CfgImpl
import utils.test.InlineExpectationsTest
module CfgBindingsTest implements TestSig {
string getARelevantTag() { result = "cfgdefines" }
predicate hasActualResult(Location location, string element, string tag, string value) {
exists(Name n, Variable v, CfgImpl::ControlFlowNode cfg |
n.defines(v) and
cfg.getAstNode().asExpr() = n and
location = n.getLocation() and
element = n.toString() and
tag = "cfgdefines" and
value = v.getId()
)
}
}
import MakeTest<CfgBindingsTest>

View File

@@ -1,13 +0,0 @@
# Annotated assignment (PEP 526). Both with and without an initializer.
a: int = 1 # $ cfgdefines=a
b: str = "hi" # $ cfgdefines=b
# Annotation without value: the AST records `c` as defined,
# and the new CFG now visits it via the AnnAssignStmt wrapper.
c: int # $ cfgdefines=c
class K: # $ cfgdefines=K
field: int = 0 # $ cfgdefines=field

View File

@@ -1,14 +0,0 @@
# Compound (tuple/list) assignment targets — actually wired in the new CFG.
a, b = (1, 2) # $ cfgdefines=a cfgdefines=b
[c, d] = [3, 4] # $ cfgdefines=c cfgdefines=d
# Nested unpacking.
(e, (f, g)) = (1, (2, 3)) # $ cfgdefines=e cfgdefines=f cfgdefines=g
# Star unpacking.
h, *i = [1, 2, 3] # $ cfgdefines=h cfgdefines=i
# Chained assignment with compound target.
j = k, l = (5, 6) # $ cfgdefines=j cfgdefines=k cfgdefines=l

View File

@@ -1,21 +0,0 @@
# Comprehension and `for` loop targets — wired in the new CFG.
# Comprehensions are nested function scopes with a synthetic `.0` parameter
# bound to the iterable.
# Bare-name `for` target.
for i in range(3): # $ cfgdefines=i
pass
# Compound `for` target.
for k, v in [(1, 2)]: # $ cfgdefines=k cfgdefines=v
pass
# Comprehension targets.
_ = [x for x in range(3)] # $ cfgdefines=_ cfgdefines=x cfgdefines=.0
_ = {y: z for y, z in []} # $ cfgdefines=_ cfgdefines=y cfgdefines=z cfgdefines=.0
_ = (a for a in []) # $ cfgdefines=_ cfgdefines=a cfgdefines=.0
# Nested comprehensions.
_ = [b for c in [] for b in c] # $ cfgdefines=_ cfgdefines=c cfgdefines=b cfgdefines=.0

View File

@@ -1,53 +0,0 @@
# Reachability of code following a try whose body always returns.
#
# The new CFG models exception edges for raise-prone expressions when
# they appear inside a `try` (or `with`) statement, mirroring Java's
# `mayThrow`. This means the body of a `try` has both a normal
# completion edge and an exception edge to its handlers, so code
# following the try-statement is reachable via the except-handler path
# even when the try-body would otherwise always return.
#
# Code that is not reachable under either normal or exception flow
# (for example, the `else` clause of a try whose body unconditionally
# raises) remains correctly classified as dead.
def f(obj): # $ cfgdefines=f cfgdefines=obj
try:
return len(obj)
except TypeError:
pass
# The try-body always returns, but `len(obj)` can raise (it is
# inside the try, so we model its exception edge). The
# `except TypeError: pass` handler falls through to here, making
# the code below reachable.
try:
hint = type(obj).__length_hint__ # $ cfgdefines=hint
except AttributeError:
return None
return hint
def g(): # $ cfgdefines=g
try:
raise Exception("inner")
except:
raise Exception("outer")
else:
# Unreachable: the inner try body always raises (via an explicit
# `raise`, which is modelled unconditionally), so the `else:`
# clause never runs.
hit_inner_else = True
def h(cache, key): # $ cfgdefines=h cfgdefines=cache cfgdefines=key
try:
return cache[key]
except KeyError:
pass
# Same pattern as `f`: reachable via the except-handler fall-through.
value = compute(key) # $ cfgdefines=value
cache[key] = value
return value

View File

@@ -1,30 +0,0 @@
# Decorated `def`/`class` — wired in the new CFG.
def deco(f): # $ cfgdefines=deco cfgdefines=f
return f
@deco
def decorated_func(): # $ cfgdefines=decorated_func
pass
@deco
class DecoratedClass: # $ cfgdefines=DecoratedClass
pass
# Stacked decorators.
@deco
@deco
def doubly(): # $ cfgdefines=doubly
pass
# Inside a class body.
class Outer: # $ cfgdefines=Outer
@staticmethod
def inner(): # $ cfgdefines=inner
pass

View File

@@ -1,19 +0,0 @@
# Exception-handler name bindings. These are already wired in the new
# CFG provided the try body can raise; `raise` statements are reliably
# treated as exception sources.
try:
raise ValueError("oops")
except ValueError as e: # $ cfgdefines=e
pass
try:
raise TypeError("oops")
except (TypeError, KeyError) as err: # $ cfgdefines=err
pass
# Exception groups (Python 3.11+).
try:
raise ValueError("oops")
except* ValueError as eg: # $ cfgdefines=eg
pass

View File

@@ -1,14 +0,0 @@
# Import aliases — all bound names below are now reachable via the new
# CFG's `ImportStmt` wrapper.
import os # $ cfgdefines=os
import os.path # $ cfgdefines=os
import os as o # $ cfgdefines=o
from os import path # $ cfgdefines=path
from os import path as p # $ cfgdefines=p
from os import sep, linesep # $ cfgdefines=sep cfgdefines=linesep
from os import (
getcwd, # $ cfgdefines=getcwd
getcwdb, # $ cfgdefines=getcwdb
)

View File

@@ -1,24 +0,0 @@
# Match-statement pattern bindings — wired in the new CFG.
def f(subject): # $ cfgdefines=f cfgdefines=subject
match subject:
case x: # $ cfgdefines=x
pass
case [a, b]: # $ cfgdefines=a cfgdefines=b
pass
case {"k": v}: # $ cfgdefines=v
pass
case Point(p, q): # $ cfgdefines=p cfgdefines=q
pass
case [_, *rest]: # $ cfgdefines=rest
pass
case (1 | 2) as n: # $ cfgdefines=n
pass
class Point: # $ cfgdefines=Point
__match_args__ = ("x", "y") # $ cfgdefines=__match_args__
x: int # $ cfgdefines=x
y: int # $ cfgdefines=y

View File

@@ -1,42 +0,0 @@
# Function parameters.
def positional(a, b): # $ cfgdefines=positional cfgdefines=a cfgdefines=b
pass
def with_default(x=1, y=2): # $ cfgdefines=with_default cfgdefines=x cfgdefines=y
pass
def with_vararg(*args): # $ cfgdefines=with_vararg cfgdefines=args
pass
def with_kwarg(**kwargs): # $ cfgdefines=with_kwarg cfgdefines=kwargs
pass
def with_kwonly(*, k1, k2=5): # $ cfgdefines=with_kwonly cfgdefines=k1 cfgdefines=k2
pass
def kitchen_sink(a, b=2, *args, k1, k2=5, **kw): # $ cfgdefines=kitchen_sink cfgdefines=a cfgdefines=b cfgdefines=args cfgdefines=k1 cfgdefines=k2 cfgdefines=kw
pass
# Methods get `self` / `cls`.
class C: # $ cfgdefines=C
def method(self, x): # $ cfgdefines=method cfgdefines=self cfgdefines=x
pass
@classmethod
def cmethod(cls, x): # $ cfgdefines=cmethod cfgdefines=cls cfgdefines=x
pass
# Lambda parameter.
_ = lambda p: p + 1 # $ cfgdefines=_ cfgdefines=p
# PEP 570 positional-only.
def pos_only(a, b, /, c): # $ cfgdefines=pos_only cfgdefines=a cfgdefines=b cfgdefines=c
pass

View File

@@ -1,14 +0,0 @@
# Simple bindings that should already work in the new CFG.
# No MISSING annotations expected.
x = 1 # $ cfgdefines=x
y = x + 1 # $ cfgdefines=y
def f(): # $ cfgdefines=f
pass
class C: # $ cfgdefines=C
pass
# Re-assignment.
x = 2 # $ cfgdefines=x

View File

@@ -1,21 +0,0 @@
# PEP 695 type parameters (Python 3.12+).
# PEP 695 type-param names on `def`/`class` bind in an annotation scope
# that nests the function/class body — they have no CFG node in the
# enclosing scope (matching the legacy CFG).
def func[T](x: T) -> T: # $ cfgdefines=func cfgdefines=x
return x
class Box[T]: # $ cfgdefines=Box
item: T # $ cfgdefines=item
# Multi-parameter, with bound and variadics.
def multi[T: int, *Ts, **P](x: T, *args: *Ts, **kwargs: P.kwargs) -> T: # $ cfgdefines=multi cfgdefines=x cfgdefines=args cfgdefines=kwargs
return x
# `type` statement (PEP 695).
type Alias[T] = list[T] # $ cfgdefines=Alias cfgdefines=T

View File

@@ -1,14 +0,0 @@
# Walrus and starred-target edge cases — wired in the new CFG.
# Walrus in expression context.
if (y := 5) > 0: # $ cfgdefines=y
pass
# Walrus in a comprehension. The comprehension introduces a synthetic
# `.0` parameter bound to the iterable.
_ = [w for _ in range(3) if (w := 1)] # $ cfgdefines=_ cfgdefines=w cfgdefines=.0
# Starred target in a Tuple LHS.
*head, tail = [1, 2, 3] # $ cfgdefines=head cfgdefines=tail

View File

@@ -1,21 +0,0 @@
# `with cm() as x:` bindings — wired in the new CFG.
class CM: # $ cfgdefines=CM
def __enter__(self): return self # $ cfgdefines=__enter__ cfgdefines=self
def __exit__(self, *a): pass # $ cfgdefines=__exit__ cfgdefines=self cfgdefines=a
with CM() as x: # $ cfgdefines=x
pass
# Multiple items.
with CM() as a, CM() as b: # $ cfgdefines=a cfgdefines=b
pass
# Parenthesised form (Python 3.10+).
with (CM() as p, CM() as q): # $ cfgdefines=p cfgdefines=q
pass
# Compound target in `with`.
with CM() as (m, n): # $ cfgdefines=m cfgdefines=n
pass

View File

@@ -1,14 +0,0 @@
/** New-CFG version of AllLiveReachable. */
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerCfgNode a, TestFunction f
where allLiveReachable(a, f)
select a, "Unreachable live annotation; entry of $@ does not reach this node", f, f.getName()

View File

@@ -1,18 +0,0 @@
/**
* New-CFG version of AnnotationHasCfgNode.
*
* Checks that every timer annotation has a corresponding CFG node.
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils::CfgTests
from TimerAnnotation ann
where annotationWithoutCfgNode(ann)
select ann, "Annotation in $@ has no CFG node", ann.getTestFunction(),
ann.getTestFunction().getName()

View File

@@ -1,26 +0,0 @@
/**
* New-CFG version of BasicBlockAnnotationGap.
*
* Original:
* Checks that within a basic block, if a node is annotated then its
* successor is also annotated (or excluded). A gap in annotations
* within a basic block indicates a missing annotation, since there
* are no branches to justify the gap.
*
* Nodes with exceptional successors are excluded, as the exception
* edge leaves the basic block and the normal successor may be dead.
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerCfgNode a, CfgNode succ
where basicBlockAnnotationGap(a, succ)
select a, "Annotated node followed by unannotated $@ in the same basic block", succ,
succ.getNode().toString()

View File

@@ -1,21 +0,0 @@
/**
* New-CFG version of BasicBlockOrdering.
*
* Original:
* Checks that within a single basic block, annotations appear in
* increasing minimum-timestamp order.
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerCfgNode a, TimerCfgNode b, int minA, int minB
where basicBlockOrdering(a, b, minA, minB)
select a, "Basic block ordering: $@ appears before $@", a.getTimestampExpr(minA),
"timestamp " + minA, b.getTimestampExpr(minB), "timestamp " + minB

View File

@@ -1,80 +0,0 @@
/**
* New-CFG version of BranchTimestamps.
*
* Checks that when a node has both a true and false successor, the
* live timestamps on one branch are marked as dead on the other.
* This ensures that boolean branches are fully annotated with dead()
* markers for the paths not taken.
*
* Limitation: the `@ t[ts, ...]` / `dead(ts)` annotation scheme can only
* model branch-dead-ness for plain boolean control flow that reconverges
* linearly after the split — i.e. `if`-with-else and `if`-expression.
* It cannot model:
*
* * loops (`while` / `for`): body timestamps repeat across iterations,
* so the loop-exit annotation can't list them as dead;
* * `match` statements: each `case` body is a syntactically distinct
* sub-tree, and the branches don't reconverge through a common
* annotation point in the timeline;
* * `try` / `with` and `raise` / `assert`: exception edges are modelled
* as true/false but flow to syntactically distinct handlers, with no
* reconvergence in the linear annotation order;
* * short-circuit `and` / `or` (`BoolExpr`): the branches reconverge at
* the BoolExpr's after-node, so timestamps on one branch are live
* downstream of the other rather than dead;
* * `if` without an `else` clause, and `if`/`elif` chains: the false
* branch reconverges with the true branch at the post-if statement
* (no-else) or fans out across multiple elif-test annotations,
* neither of which fit the binary annotation scheme.
*
* Branch nodes inside those constructs are therefore whitelisted out
* below. The check still fires (and is useful) for plain `if`/`else`
* and conditional-expression branching.
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
/**
* Holds if `f` contains a construct whose branches the linear-timestamp
* annotation scheme cannot describe (see file-level comment).
*/
private predicate hasUnmodellableBranching(Function f) {
exists(AstNode bad |
bad.getScope() = f and
(
bad instanceof While
or
bad instanceof For
or
bad instanceof MatchStmt
or
bad instanceof Try
or
bad instanceof With
or
bad instanceof Raise
or
bad instanceof Assert
or
bad instanceof BoolExpr
or
bad instanceof If and
(not exists(bad.(If).getAnOrelse()) or bad.(If).isElif())
)
)
}
from TimerCfgNode node, int ts, string branch
where
missingBranchTimestamp(node, ts, branch) and
not hasUnmodellableBranching(node.getTestFunction())
select node,
"Timestamp " + ts + " on true/false branch is missing a dead() annotation on the " + branch +
" successor in $@", node.getTestFunction(), node.getTestFunction().getName()

View File

@@ -1,22 +0,0 @@
/**
* New-CFG version of ConsecutivePredecessorTimestamps.
*
* Checks that each annotated node (except the minimum timestamp) has
* a predecessor annotation with timestamp `a - 1`. This is the reverse
* of ConsecutiveTimestamps: it catches nodes that are reachable but
* arrived at from the wrong place (skipping an intermediate node).
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerAnnotation ann, int a
where consecutivePredecessorTimestamps(ann, a)
select ann, "$@ in $@ has no consecutive predecessor (expected " + (a - 1) + ")",
ann.getTimestampExpr(a), "Timestamp " + a, ann.getTestFunction(), ann.getTestFunction().getName()

View File

@@ -1,29 +0,0 @@
/**
* New-CFG version of ConsecutiveTimestamps.
*
* Original:
* Checks that consecutive annotated nodes have consecutive timestamps:
* for each annotation with timestamp `a`, some CFG node for that annotation
* must have a next annotation containing `a + 1`.
*
* Handles CFG splitting (e.g., finally blocks duplicated for normal/exceptional
* flow) by checking that at least one split has the required successor.
*
* Only applies to functions where all annotations are in the function's
* own scope (excludes tests with generators, async, comprehensions, or
* lambdas that have annotations in nested scopes).
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerAnnotation ann, int a
where consecutiveTimestamps(ann, a)
select ann, "$@ in $@ has no consecutive successor (expected " + (a + 1) + ")",
ann.getTimestampExpr(a), "Timestamp " + a, ann.getTestFunction(), ann.getTestFunction().getName()

View File

@@ -1,120 +0,0 @@
/**
* Implementation of the evaluation-order CFG signature using the new
* shared control flow graph from AstNodeImpl.
*/
private import python as Py
import TimerUtils
private import semmle.python.controlflow.internal.AstNodeImpl as CfgImpl
private import codeql.controlflow.SuccessorType
private class NewControlFlowNode = CfgImpl::ControlFlowNode;
private class NewBasicBlock = CfgImpl::BasicBlock;
/** New (shared) CFG implementation of the evaluation-order signature. */
module NewCfg implements EvalOrderCfgSig {
class CfgNode instanceof NewControlFlowNode {
// We must pick a *unique* representative CFG node for each AST node. The
// shared CFG has several nodes per AST node (before / in-post-order / after
// / after-value splits), but the timer test framework keys annotations on
// `getNode()` and assumes one CFG node per annotated AST node. Without a
// filter, an annotated `f()` would map to both `f()` and `After f()`, which
// breaks two framework invariants: (1) the "no shared reachable" check
// requires that two distinct nodes sharing a timestamp be mutually
// unreachable (true/false branches of a condition), but `Before f()`,
// `f()` and `After f()` share the annotation's timestamp *and* lie on one
// linear path; and (2) the annotation walk (`nextTimerAnnotation`) halts at
// the first reachable representative, so a second node for the same AST
// node would stall the walk on the same timestamp instead of advancing to
// the next evaluation event.
//
// We use the "after" node (`isAfter`) rather than the canonical `injects`
// node, because `injects` represents short-circuit / conditional
// expressions (`and`/`or`/`not`/ternary) by their *before* node, placing
// them ahead of their operands — wrong for evaluation order. `isAfter`
// instead picks the post-evaluation node: the merged before/after node for
// simple leaves, the `TAfterNode` for post-order expressions, and the
// `AfterValueNode`(s) for pre-order conditionals, all positioned after the
// operands. The two value-split nodes of a conditional are genuinely
// distinct evaluation outcomes (handled by `getATrueSuccessor` /
// `getAFalseSuccessor`), so they do not violate the uniqueness assumption.
CfgNode() { NewControlFlowNode.super.isAfter(_) }
string toString() { result = NewControlFlowNode.super.toString() }
Py::Location getLocation() { result = NewControlFlowNode.super.getLocation() }
Py::AstNode getNode() {
result = CfgImpl::astNodeToPyNode(NewControlFlowNode.super.getAstNode())
}
CfgNode getASuccessor() { nextCfgNode(this, result) }
CfgNode getATrueSuccessor() {
NewControlFlowNode.super.isAfterTrue(_) and
// Only where there's also a false branch (true boolean split)
exists(NewControlFlowNode other | other.isAfterFalse(NewControlFlowNode.super.getAstNode())) and
nextCfgNodeFrom(this, result)
}
CfgNode getAFalseSuccessor() {
NewControlFlowNode.super.isAfterFalse(_) and
// Only where there's also a true branch (true boolean split)
exists(NewControlFlowNode other | other.isAfterTrue(NewControlFlowNode.super.getAstNode())) and
nextCfgNodeFrom(this, result)
}
CfgNode getAnExceptionalSuccessor() {
exists(NewControlFlowNode mid |
mid = NewControlFlowNode.super.getAnExceptionSuccessor() and
nextCfgNodeFrom(mid, result)
)
}
Py::Scope getScope() { result = NewControlFlowNode.super.getEnclosingCallable().asScope() }
BasicBlock getBasicBlock() {
exists(NewBasicBlock bb, int i | bb.getNode(i) = this and result = bb)
}
}
/**
* Holds if `next` is the nearest CfgNode reachable from `n` via
* one or more raw CFG successor edges, skipping non-CfgNode intermediaries.
*/
private predicate nextCfgNodeFrom(NewControlFlowNode n, CfgNode next) {
next = n.getASuccessor()
or
exists(NewControlFlowNode mid |
mid = n.getASuccessor() and
not mid instanceof CfgNode and
nextCfgNodeFrom(mid, next)
)
}
/**
* Holds if `next` is the nearest CfgNode successor of `n`,
* skipping synthetic intermediate nodes.
*/
private predicate nextCfgNode(CfgNode n, CfgNode next) { nextCfgNodeFrom(n, next) }
class BasicBlock instanceof NewBasicBlock {
string toString() { result = NewBasicBlock.super.toString() }
CfgNode getNode(int n) { result = NewBasicBlock.super.getNode(n) }
predicate reaches(BasicBlock bb) { this = bb or this.strictlyReaches(bb) }
predicate strictlyReaches(BasicBlock bb) { NewBasicBlock.super.getASuccessor+() = bb }
predicate strictlyDominates(BasicBlock bb) { NewBasicBlock.super.strictlyDominates(bb) }
}
CfgNode scopeGetEntryNode(Py::Scope s) {
exists(CfgImpl::ControlFlow::EntryNode entry |
entry.getEnclosingCallable().asScope() = s and
nextCfgNodeFrom(entry, result)
)
}
}

View File

@@ -1,21 +0,0 @@
/**
* New-CFG version of NeverReachable.
*
* Original:
* Checks that expressions annotated with `t.never` either have no CFG
* node, or if they do, that the node is not reachable from its scope's
* entry (including within the same basic block).
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils::CfgTests
from TimerAnnotation ann
where neverReachable(ann)
select ann, "Node annotated with t.never is reachable in $@", ann.getTestFunction(),
ann.getTestFunction().getName()

View File

@@ -1,22 +0,0 @@
/**
* New-CFG version of NoBackwardFlow.
*
* Original:
* Checks that time never flows backward between consecutive timer annotations
* in the CFG. For each pair of consecutive annotated nodes (A -> B), there must
* exist timestamps a in A and b in B with a < b.
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerCfgNode a, TimerCfgNode b, int minA, int maxB
where noBackwardFlow(a, b, minA, maxB)
select a, "Backward flow: $@ flows to $@ (max timestamp $@)", a.getTimestampExpr(minA),
minA.toString(), b, b.getNode().toString(), b.getTimestampExpr(maxB), maxB.toString()

View File

@@ -1,18 +0,0 @@
/**
* New-CFG version of NoBasicBlock.
*
* Checks that every annotated CFG node belongs to a basic block.
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from CfgNode n, TestFunction f
where noBasicBlock(n, f)
select n, "CFG node in $@ does not belong to any basic block", f, f.getName()

View File

@@ -1,21 +0,0 @@
/**
* New-CFG version of NoSharedReachable.
*
* Original:
* Checks that two annotations sharing a timestamp value are on
* mutually exclusive CFG paths (neither can reach the other).
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerCfgNode a, TimerCfgNode b, int ts
where noSharedReachable(a, b, ts)
select a, "Shared timestamp $@ but this node reaches $@", a.getTimestampExpr(ts), ts.toString(), b,
b.getNode().toString()

View File

@@ -1,22 +0,0 @@
/**
* New-CFG version of StrictForward.
*
* Original:
* Stronger version of NoBackwardFlow: for consecutive annotated nodes
* A -> B that both have a single timestamp (non-loop code) and B does
* NOT dominate A (forward edge), requires max(A) < min(B).
*/
import python
import TimerUtils
import NewCfgImpl
private module Utils = EvalOrderCfgUtils<NewCfg>;
private import Utils
private import Utils::CfgTests
from TimerCfgNode a, TimerCfgNode b, int maxA, int minB
where strictForward(a, b, maxA, minB)
select a, "Strict forward violation: $@ flows to $@", a.getTimestampExpr(maxA), "timestamp " + maxA,
b.getTimestampExpr(minB), "timestamp " + minB

View File

@@ -3,14 +3,14 @@
* Python control flow graph.
*/
private import python as Py
private import python as PY
import TimerUtils
/** Existing Python CFG implementation of the evaluation-order signature. */
module OldCfg implements EvalOrderCfgSig {
class CfgNode = Py::ControlFlowNode;
class CfgNode = PY::ControlFlowNode;
class BasicBlock = Py::BasicBlock;
class BasicBlock = PY::BasicBlock;
CfgNode scopeGetEntryNode(Py::Scope s) { result = s.getEntryNode() }
CfgNode scopeGetEntryNode(PY::Scope s) { result = s.getEntryNode() }
}

View File

@@ -85,7 +85,7 @@ def test_nested_if_else(t):
else:
z = 2 @ t[dead(4)]
else:
z = 3 @ t[dead(3), dead(4)]
z = 3 @ t[dead(4)]
w = 0 @ t[5]

View File

@@ -1,41 +0,0 @@
/**
* Inline-expectations test for the store/load/delete/parameter
* classification predicates on the new-CFG facade.
*
* Each tag fires when the corresponding predicate (`isLoad`,
* `isStore`, `isDelete`, `isParameter`, `isAugLoad`, `isAugStore`)
* holds on the canonical CFG node wrapping a `Py::Name` with the
* given identifier. Subscript and attribute stores are not covered
* by these tags — only the `Name`-typed targets/loads they involve.
*/
import python
import semmle.python.controlflow.internal.Cfg as Cfg
import utils.test.InlineExpectationsTest
module StoreLoadTest implements TestSig {
string getARelevantTag() { result = ["load", "store", "delete", "param", "augload", "augstore"] }
predicate hasActualResult(Location location, string element, string tag, string value) {
exists(Cfg::NameNode n |
location = n.getLocation() and
element = n.toString() and
value = n.getId() and
(
n.isLoad() and not n.isAugLoad() and tag = "load"
or
n.isStore() and not n.isAugStore() and tag = "store"
or
n.isDelete() and tag = "delete"
or
n.isParameter() and tag = "param"
or
n.isAugLoad() and tag = "augload"
or
n.isAugStore() and tag = "augstore"
)
)
}
}
import MakeTest<StoreLoadTest>

View File

@@ -1,56 +0,0 @@
# Store/load/delete/parameter classification on the new-CFG facade.
#
# Each annotated location carries the (sorted, deduplicated) set of
# kinds the CFG facade reports there. Comparing against the legacy
# 'semmle.python.Flow' classification is done by the comparison query
# 'StoreLoadParity.ql' — annotations here are only the positive
# assertions for the new facade.
#
# Tags:
# load=<id> -- isLoad() fires on the Name
# store=<id> -- isStore() fires
# delete=<id> -- isDelete() fires
# param=<id> -- isParameter() fires
# augload=<id> -- isAugLoad() fires (the LHS of x += ... when read)
# augstore=<id> -- isAugStore() fires (the LHS of x += ... when written)
# --- plain load / store / delete ---
x = 1 # $ store=x
y = x + 1 # $ store=y load=x
print(y) # $ load=print load=y
del x # $ delete=x
# --- function definitions (parameters) ---
def f(a, b=2, *args, c, **kwargs): # $ store=f param=a param=b param=args param=c param=kwargs
return a + b + c # $ load=a load=b load=c
# --- augmented assignment splits one Name into load + store halves ---
def aug(): # $ store=aug
n = 0 # $ store=n
n += 1 # $ augload=n augstore=n
return n # $ load=n
# --- subscript / attribute stores ---
class C: # $ store=C
pass
def stores(obj, container, idx): # $ store=stores param=obj param=container param=idx
obj.attr = 1 # $ load=obj
container[idx] = 2 # $ load=container load=idx
return obj # $ load=obj
# --- tuple unpacking ---
def unpack(pair): # $ store=unpack param=pair
a, b = pair # $ store=a store=b load=pair
return a + b # $ load=a load=b

View File

@@ -66,7 +66,7 @@ impl<'a> AstNode for Node<'a> {
impl AstNode for yeast::Node {
fn kind(&self) -> &str {
yeast::Node::kind(self)
yeast::Node::kind_name(self)
}
fn is_named(&self) -> bool {
yeast::Node::is_named(self)
@@ -882,7 +882,6 @@ fn emit_extras_in(visitor: &mut Visitor, node: Node<'_>) {
}
fn traverse_yeast(tree: &yeast::Ast, visitor: &mut Visitor) {
use yeast::Cursor;
let mut cursor = tree.walk();
visitor.enter_node(cursor.node());
let mut recurse = true;

View File

@@ -41,22 +41,14 @@ pub fn query(input: TokenStream) -> TokenStream {
/// (kind "literal") - leaf with static content
/// (kind #{expr}) - leaf with computed content (expr.to_string())
/// (kind $fresh) - leaf with auto-generated unique name
/// {expr} - embed a Rust expression returning Id
/// {..expr} - splice an iterable of Id (in child/field position)
/// field: {..expr} - splice into a named field
/// {expr}.map(p -> tpl) - apply tpl to each element; splice result
/// {expr}.reduce_left(f -> init, acc, e -> fold)
/// - fold with per-element init; splice 0 or 1 result
/// {expr} - embed a Rust expression, dispatched via
/// the `IntoFieldIds` trait: `Id` pushes a
/// single id; iterables (`Vec<Id>`,
/// `Option<Id>`, iterator chains) splice
/// their elements
/// field: {expr} - extend a named field with `{expr}`'s ids
/// ```
///
/// Chain syntax after `{expr}` or `{..expr}`:
/// - `.map(param -> template)` — one output node per input element.
/// - `.reduce_left(first -> init, acc, elem -> fold)` — fold left; the first
/// element is converted by `init`, subsequent elements are folded by `fold`
/// with the accumulator bound to `acc`. An empty iterable yields nothing.
/// - Chains always splice (the result is iterable).
/// - Multiple chains can be chained, e.g. `.map(...).reduce_left(...)`.
///
/// Can be called with an explicit context or using the implicit context
/// from an enclosing `rule!`:
///
@@ -100,7 +92,7 @@ pub fn trees(input: TokenStream) -> TokenStream {
/// rule!(
/// (query_pattern field: (_) @name (kind)* @repeated (_)? @optional)
/// =>
/// (output_template field: {name} {..repeated})
/// (output_template field: {name} {repeated})
/// )
///
/// // Shorthand: captures become fields on the output node

View File

@@ -304,7 +304,8 @@ fn parse_ctx_or_implicit(tokens: &mut Tokens) -> Ident {
&& matches!(lookahead.next(), Some(TokenTree::Punct(p)) if p.as_char() == ',');
if is_explicit {
let ctx = expect_ident(tokens, "").unwrap();
let ctx = expect_ident(tokens, "unreachable: ident was just peeked")
.expect("unreachable: ident was just peeked");
let _ = tokens.next(); // consume comma
ctx
} else {
@@ -342,7 +343,7 @@ pub fn parse_trees_top(input: TokenStream) -> Result<TokenStream> {
}
Ok(quote! {
{
let mut __nodes: Vec<usize> = Vec::new();
let mut __nodes: Vec<yeast::Id> = Vec::new();
#(#items)*
__nodes
}
@@ -356,7 +357,7 @@ fn parse_direct_node(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStream> {
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => {
let group = expect_group(tokens, Delimiter::Brace)?;
let expr = group.stream();
Ok(quote! { ::std::convert::Into::<usize>::into({ #expr }) })
Ok(quote! { ::std::convert::Into::<yeast::Id>::into({ #expr }) })
}
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Parenthesis => {
let group = expect_group(tokens, Delimiter::Parenthesis)?;
@@ -429,49 +430,24 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
);
field_counter += 1;
// Check for field: {..expr}.chain or field: {expr}.chain — splice a Vec<Id> into the field
// Plain `field: {expr}` — trait-dispatched extend.
if peek_is_group(tokens, Delimiter::Brace) {
let group_clone = tokens.clone().next().unwrap();
if let TokenTree::Group(g) = &group_clone {
let mut inner_check = g.stream().into_iter();
let is_splice = matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.')
&& matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
// Determine if a chain (.map(..)) follows the `{}` group.
let mut after = tokens.clone();
after.next(); // skip the brace group
let has_chain =
matches!(after.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
if is_splice || has_chain {
let group = expect_group(tokens, Delimiter::Brace)?;
let base: TokenStream = if is_splice {
let mut inner = group.stream().into_iter().peekable();
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
quote! {
{ #expr }.into_iter().map(::std::convert::Into::<usize>::into)
}
} else {
let expr = group.stream();
quote! { { #expr }.into_iter() }
};
let chained = parse_chain_suffix(tokens, ctx, base)?;
stmts.push(quote! {
let #temp: Vec<usize> = #chained.collect();
});
// An empty splice means the field is absent — skip it
// entirely rather than emitting an empty named field.
field_args.push(quote! {
if !#temp.is_empty() { __fields.push((#field_str, #temp)); }
});
continue;
}
}
let group = expect_group(tokens, Delimiter::Brace)?;
let expr = group.stream();
stmts.push(quote! {
let mut #temp: Vec<yeast::Id> = Vec::new();
yeast::IntoFieldIds::extend_into({ #expr }, &mut #temp);
});
// An empty `{expr}` means the field is absent — skip it
// entirely rather than emitting an empty named field.
field_args.push(quote! {
if !#temp.is_empty() { __fields.push((#field_str, #temp)); }
});
continue;
}
let value = parse_direct_node(tokens, ctx)?;
stmts.push(quote! { let #temp: usize = #value; });
stmts.push(quote! { let #temp: yeast::Id = #value; });
field_args.push(quote! { __fields.push((#field_str, vec![#temp])); });
}
@@ -488,101 +464,13 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
Ok(quote! {
{
#(#stmts)*
let mut __fields: Vec<(&str, Vec<usize>)> = Vec::new();
let mut __fields: Vec<(&str, Vec<yeast::Id>)> = Vec::new();
#(#field_args)*
#ctx.node(#kind_str, __fields)
}
})
}
/// Parse a chain of `.method(args)` suffixes after a `{expr}` or `{..expr}`
/// placeholder in tree templates. Currently supports:
///
/// ```text
/// .map(param -> template) -- iterator map: produces Vec<usize>
/// ```
///
/// The chain may be empty (returns `base` unchanged). Multiple chained calls
/// are supported, e.g. `.map(p -> ...).map(q -> ...)`.
///
/// Each call expects the receiver to be an iterator. The `base` argument
/// should therefore already be an iterator (use `.into_iter()` on it before
/// calling this function).
fn parse_chain_suffix(tokens: &mut Tokens, ctx: &Ident, base: TokenStream) -> Result<TokenStream> {
let mut current = base;
while matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.') {
tokens.next(); // consume .
let method = expect_ident(tokens, "expected method name after `.`")?;
let method_str = method.to_string();
let args_group = expect_group(tokens, Delimiter::Parenthesis)?;
match method_str.as_str() {
"map" => {
let mut inner = args_group.stream().into_iter().peekable();
let param = expect_ident(&mut inner, "expected lambda parameter name")?;
expect_punct(&mut inner, '-', "expected `->` after lambda parameter")?;
expect_punct(&mut inner, '>', "expected `->` after lambda parameter")?;
let body = parse_direct_node(&mut inner, ctx)?;
if let Some(tok) = inner.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after lambda body",
));
}
current = quote! {
#current.map(|#param| #body)
};
}
"reduce_left" => {
// Syntax: reduce_left(first -> init_tpl, acc, elem -> fold_tpl)
// - first -> init_tpl : converts the first element to the initial accumulator
// - acc, elem -> fold_tpl : fold step (acc = current accumulator, elem = next element)
// Empty iterator produces an empty iterator; non-empty produces a single-element iterator.
let mut inner = args_group.stream().into_iter().peekable();
let init_param = expect_ident(&mut inner, "expected initial lambda parameter")?;
expect_punct(&mut inner, '-', "expected `->` after init parameter")?;
expect_punct(&mut inner, '>', "expected `->` after init parameter")?;
let init_body = parse_direct_node(&mut inner, ctx)?;
expect_punct(&mut inner, ',', "expected `,` after init template")?;
let acc_param = expect_ident(&mut inner, "expected accumulator parameter")?;
expect_punct(&mut inner, ',', "expected `,` after accumulator parameter")?;
let elem_param = expect_ident(&mut inner, "expected element parameter")?;
expect_punct(&mut inner, '-', "expected `->` after element parameter")?;
expect_punct(&mut inner, '>', "expected `->` after element parameter")?;
let fold_body = parse_direct_node(&mut inner, ctx)?;
if let Some(tok) = inner.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after fold template",
));
}
current = quote! {
{
let mut __iter = #current;
let __result: Option<usize> = if let Some(#init_param) = __iter.next() {
let mut __acc: usize = #init_body;
for #elem_param in __iter {
let #acc_param: usize = __acc;
__acc = #fold_body;
}
Some(__acc)
} else {
None
};
__result.into_iter()
}
};
}
_ => {
return Err(syn::Error::new_spanned(
method,
format!("unknown builtin method `.{method_str}()`"),
));
}
}
}
Ok(current)
}
/// Parse the top-level list of a `trees!` template.
/// Each item is a node template or `{expr}` splice.
fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream>> {
@@ -603,35 +491,14 @@ fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream
continue;
}
// {expr} or {..expr} (with optional .chain) — single node or splice
// `{expr}` — extend `__nodes` via `IntoFieldIds`, which handles
// single ids and iterables uniformly.
if peek_is_group(tokens, Delimiter::Brace) {
let group = expect_group(tokens, Delimiter::Brace)?;
let has_chain =
matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
let mut inner = group.stream().into_iter().peekable();
let is_splice = peek_is_dotdot(&inner);
if is_splice || has_chain {
let base: TokenStream = if is_splice {
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
quote! {
{ #expr }.into_iter().map(::std::convert::Into::<usize>::into)
}
} else {
let expr = group.stream();
quote! { { #expr }.into_iter() }
};
let chained = parse_chain_suffix(tokens, ctx, base)?;
items.push(quote! {
__nodes.extend(#chained);
});
} else {
let expr = group.stream();
items.push(quote! {
__nodes.push(::std::convert::Into::<usize>::into({ #expr }));
});
}
let expr = group.stream();
items.push(quote! {
yeast::IntoFieldIds::extend_into({ #expr }, &mut __nodes);
});
continue;
}
@@ -649,7 +516,7 @@ struct CaptureInfo {
name: String,
multiplicity: CaptureMultiplicity,
/// `true` for `@@name` captures: the auto-translate prefix skips them,
/// so the bound `NodeRef` refers to the raw (input-schema) node.
/// so the bound `Id` refers to the raw (input-schema) node.
raw: bool,
}
@@ -804,22 +671,17 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
match cap.multiplicity {
CaptureMultiplicity::Repeated => {
quote! {
let #name: Vec<yeast::NodeRef> = __captures.get_all(#name_str)
.into_iter()
.map(yeast::NodeRef)
.collect();
let #name: Vec<yeast::Id> = __captures.get_all(#name_str);
}
}
CaptureMultiplicity::Optional => {
quote! {
let #name: Option<yeast::NodeRef> =
__captures.get_opt(#name_str).map(yeast::NodeRef);
let #name: Option<yeast::Id> = __captures.get_opt(#name_str);
}
}
CaptureMultiplicity::Single => {
quote! {
let #name: yeast::NodeRef =
yeast::NodeRef(__captures.get_var(#name_str).unwrap());
let #name: yeast::Id = __captures.get_var(#name_str).unwrap();
}
}
}
@@ -850,7 +712,7 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
__fields.insert(
__field_id,
#name.into_iter()
.map(::std::convert::Into::<usize>::into)
.map(::std::convert::Into::<yeast::Id>::into)
.collect(),
);
},
@@ -859,14 +721,14 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
.unwrap_or_else(|| panic!("field '{}' not found", #name_str));
if let Some(__id) = #name {
__fields.entry(__field_id).or_insert_with(Vec::new)
.push(::std::convert::Into::<usize>::into(__id));
.push(::std::convert::Into::<yeast::Id>::into(__id));
}
},
CaptureMultiplicity::Single => quote! {
let __field_id = #ctx_ident.ast.field_id_for_name(#name_str)
.unwrap_or_else(|| panic!("field '{}' not found", #name_str));
__fields.entry(__field_id).or_insert_with(Vec::new)
.push(::std::convert::Into::<usize>::into(#name));
.push(::std::convert::Into::<yeast::Id>::into(#name));
},
}
})
@@ -898,7 +760,7 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
}
quote! {
let mut __nodes: Vec<usize> = Vec::new();
let mut __nodes: Vec<yeast::Id> = Vec::new();
#(#transform_items)*
__nodes
}
@@ -919,7 +781,7 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
__translator.auto_translate_captures(&mut __captures, __ast, __user_ctx, __skip)?;
#(#bindings)*
let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
let __result: Vec<usize> = { #transform_body };
let __result: Vec<yeast::Id> = { #transform_body };
Ok(__result)
}))
}
@@ -956,13 +818,6 @@ fn peek_is_hash(tokens: &mut Tokens) -> bool {
matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '#')
}
/// Check for `..` (two consecutive dot punctuation tokens).
fn peek_is_dotdot(tokens: &Tokens) -> bool {
let mut lookahead = tokens.clone();
matches!(lookahead.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.')
&& matches!(lookahead.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.')
}
fn peek_is_underscore(tokens: &mut Tokens) -> bool {
matches!(tokens.peek(), Some(TokenTree::Ident(id)) if *id == "_")
}

View File

@@ -214,7 +214,7 @@ yeast::tree!(ctx,
```rust
yeast::trees!(ctx,
(assignment left: {tmp} right: {right})
{..body}
{body}
)
```
@@ -256,12 +256,26 @@ occurrences of the same `$name` within one `BuildCtx` share the same value:
### Embedded Rust expressions
`{expr}` embeds a Rust expression that returns a single node `Id`:
`{expr}` embeds a Rust expression whose value is appended to the
enclosing field (or to the rule body's id list). Dispatch happens via
the [`IntoFieldIds`] trait, which is implemented for:
- `Id` — pushes the single id.
- Any `IntoIterator<Item: Into<Id>>` — extends with all yielded ids
(covers `Vec<Id>`, `Option<Id>`, iterator chains, etc.).
So the same `{expr}` syntax handles single ids, splices, and zero-or-many
options uniformly:
```rust
(assignment
left: {some_node_id} // insert a pre-built node
right: {rhs} // insert a captured value (inside rule!)
left: {some_node_id} // a single Id
right: {rhs} // a captured value (inside rule!)
)
yeast::trees!(ctx,
(assignment left: {tmp} right: {right})
{extra_nodes} // splices a Vec<Id>
)
```
@@ -277,21 +291,17 @@ expressions (with `let` bindings) work too:
})
```
`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`); the contents
are likewise a Rust block, so the splice can be the result of arbitrary
computation:
Inside `rule!`, captures are Rust variables — `{name}` works for
single, optional, and repeated captures alike:
```rust
yeast::trees!(ctx,
(assignment left: {tmp} right: {right})
{..extra_nodes} // splice a Vec<Id>
rule!(
(assignment left: @lhs right: _* @parts)
=>
(assignment left: {lhs} right: (block stmt: {parts}))
)
```
Inside `rule!`, captures are Rust variables, so `{name}` inserts a
single capture (`Id`) and `{..name}` splices a repeated capture
(`Vec<Id>`).
### Raw captures (`@@name`)
The default `@name` capture marker is *auto-translated*: in OneShot
@@ -302,7 +312,7 @@ already conforms to the output schema.
For rules that need the raw (input-schema) capture — typically to read
its source text or to translate it explicitly with mutable context
state between calls — use `@@name` instead. The body sees the original
input-schema `NodeRef`:
input-schema `Id`:
```rust
yeast::rule!(
@@ -310,7 +320,7 @@ yeast::rule!(
=>
{
// raw_lhs is untranslated: read its original source text.
let text = ctx.ast.source_text(raw_lhs.into());
let text = ctx.ast.source_text(raw_lhs);
// rhs is already translated by the auto-translate prefix.
tree!((call
method: (identifier #{text.as_str()})

View File

@@ -158,15 +158,6 @@ impl<'a, C> BuildCtx<'a, C> {
self.ast
.create_named_token_with_range(kind, generated, self.source_range)
}
/// Prepend a value to a field of an existing node.
pub fn prepend_field(&mut self, node_id: Id, field_name: &str, value_id: Id) {
let field_id = self
.ast
.field_id_for_name(field_name)
.unwrap_or_else(|| panic!("build: field '{field_name}' not found"));
self.ast.prepend_field_child(node_id, field_id, value_id);
}
}
impl<C: Clone> BuildCtx<'_, C> {
@@ -176,9 +167,6 @@ impl<C: Clone> BuildCtx<'_, C> {
/// (translation is not meaningful when input and output share a
/// schema).
///
/// Accepts any value convertible to [`Id`] (including [`crate::NodeRef`]),
/// so manual rules can pass capture bindings directly without unwrapping.
///
/// Errors if this `BuildCtx` was constructed by hand (without a
/// translator handle) — for example, in unit tests that don't go
/// through the rule driver.
@@ -189,20 +177,6 @@ impl<C: Clone> BuildCtx<'_, C> {
None => Err("translate() called on a BuildCtx without a translator handle".into()),
}
}
/// Translate an optional capture, returning the first translated id or
/// `None`. Convenience for `?`-quantifier captures (`Option<NodeRef>`).
///
/// If the underlying translation produces multiple ids for a single
/// input, only the first is returned. For most use cases (e.g.
/// translating a single type annotation) this is what you want; if
/// you need all ids, use [`translate`] directly.
pub fn translate_opt<I: Into<Id>>(&mut self, id: Option<I>) -> Result<Option<Id>, String> {
match id {
Some(id) => Ok(self.translate(id)?.into_iter().next()),
None => Ok(None),
}
}
}
impl<C> std::ops::Deref for BuildCtx<'_, C> {

View File

@@ -54,37 +54,15 @@ impl Captures {
self.captures.entry(key).or_default().push(id);
}
pub fn map_captures(&mut self, kind: &str, f: &mut impl FnMut(Id) -> Id) {
if let Some(ids) = self.captures.get_mut(kind) {
for id in ids {
*id = f(*id);
}
}
}
/// Apply a fallible function to every captured id (across all keys),
/// replacing each id with the results. A function returning an empty
/// vector removes the capture; returning multiple ids splices them
/// into the capture's value list (suitable for `*`/`+` captures).
/// Stops and returns the error on the first failure.
pub fn try_map_all_captures<E>(
&mut self,
mut f: impl FnMut(Id) -> Result<Vec<Id>, E>,
) -> Result<(), E> {
for ids in self.captures.values_mut() {
let mut new_ids = Vec::with_capacity(ids.len());
for &id in ids.iter() {
new_ids.extend(f(id)?);
}
*ids = new_ids;
}
Ok(())
}
/// Like [`try_map_all_captures`] but leaves captures whose name appears
/// in `skip` untouched. Used by the `rule!` macro to support `@@name`
/// (raw) captures alongside the default auto-translated `@name`
/// captures.
/// Apply a fallible function to every captured id, replacing each id
/// with the results. A function returning an empty vector removes
/// the capture; returning multiple ids splices them into the
/// capture's value list (suitable for `*`/`+` captures). Captures
/// whose name appears in `skip` are left untouched. Stops and
/// returns the error on the first failure.
///
/// Used by the `rule!` macro's auto-translate prefix to translate
/// every capture except those marked `@@name` (raw).
pub fn try_map_captures_except<E>(
&mut self,
skip: &[&str],
@@ -102,12 +80,6 @@ impl Captures {
}
Ok(())
}
pub fn map_captures_to(&mut self, from: &str, to: &'static str, f: &mut impl FnMut(Id) -> Id) {
if let Some(from_ids) = self.captures.get(from) {
let new_values = from_ids.iter().copied().map(f).collect();
self.captures.insert(to, new_values);
}
}
pub fn merge(&mut self, other: &Captures) {
for (key, ids) in &other.captures {

View File

@@ -1,8 +0,0 @@
pub trait Cursor<'a, T, N, F> {
fn node(&self) -> &'a N;
fn field_id(&self) -> Option<F>;
fn field_name(&self) -> Option<&'static str>;
fn goto_first_child(&mut self) -> bool;
fn goto_next_sibling(&mut self) -> bool;
fn goto_parent(&mut self) -> bool;
}

View File

@@ -1,6 +1,6 @@
use std::fmt::Write;
use crate::{schema::Schema, Ast, Node, NodeContent, CHILD_FIELD};
use crate::{schema::Schema, Ast, Id, Node, NodeContent, CHILD_FIELD};
/// Options for controlling AST dump output.
pub struct DumpOptions {
@@ -34,16 +34,11 @@ impl Default for DumpOptions {
/// method:
/// identifier "foo"
/// ```
pub fn dump_ast(ast: &Ast, root: usize, source: &str) -> String {
pub fn dump_ast(ast: &Ast, root: Id, source: &str) -> String {
dump_ast_with_options(ast, root, source, &DumpOptions::default())
}
pub fn dump_ast_with_options(
ast: &Ast,
root: usize,
source: &str,
options: &DumpOptions,
) -> String {
pub fn dump_ast_with_options(ast: &Ast, root: Id, source: &str, options: &DumpOptions) -> String {
let mut out = String::new();
dump_node(ast, root, source, options, 0, None, &mut out);
out
@@ -53,7 +48,7 @@ pub fn dump_ast_with_options(
///
/// Any node that does not match the expected type set for its parent field is
/// rendered with a trailing `" <-- ERROR: ..."` annotation on the same line.
pub fn dump_ast_with_type_errors(ast: &Ast, root: usize, source: &str, schema: &Schema) -> String {
pub fn dump_ast_with_type_errors(ast: &Ast, root: Id, source: &str, schema: &Schema) -> String {
dump_ast_with_type_errors_and_options(ast, root, source, schema, &DumpOptions::default())
}
@@ -63,7 +58,7 @@ pub fn dump_ast_with_type_errors(ast: &Ast, root: usize, source: &str, schema: &
/// rendered with a trailing `" <-- ERROR: ..."` annotation on the same line.
pub fn dump_ast_with_type_errors_and_options(
ast: &Ast,
root: usize,
root: Id,
source: &str,
schema: &Schema,
options: &DumpOptions,
@@ -176,7 +171,7 @@ fn expected_for_field<'a>(
fn dump_node(
ast: &Ast,
id: usize,
id: Id,
source: &str,
options: &DumpOptions,
indent: usize,
@@ -315,7 +310,7 @@ fn dump_node(
/// Dump a leaf node inline (no newline prefix, caller provides context).
fn dump_node_inline(
ast: &Ast,
id: usize,
id: Id,
source: &str,
options: &DumpOptions,
type_check: Option<(

View File

@@ -7,7 +7,6 @@ use serde_json::{json, Value};
pub mod build;
pub mod captures;
pub mod cursor;
pub mod dump;
pub mod node_types_yaml;
pub mod query;
@@ -19,38 +18,61 @@ mod visitor;
pub use yeast_macros::{query, rule, tree, trees};
use captures::Captures;
pub use cursor::Cursor;
use query::QueryNode;
/// Node ids are indexes into the arena
pub type Id = usize;
/// Node id: an index into the [`Ast`] arena. A newtype around `usize`
/// rather than a bare alias so that it can carry its own
/// [`YeastDisplay`] / [`YeastSourceRange`] / [`IntoFieldIds`] impls
/// without colliding with the impls for plain integers.
///
/// Use `id.0` (or `id.into()`) to obtain the raw arena index.
#[repr(transparent)]
#[derive(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Debug, Hash, Serialize)]
pub struct Id(pub usize);
impl From<usize> for Id {
fn from(value: usize) -> Self {
Id(value)
}
}
impl From<Id> for usize {
fn from(value: Id) -> Self {
value.0
}
}
/// Field and Kind ids are provided by tree-sitter
type FieldId = u16;
type KindId = u16;
/// A typed reference to a node in an [`Ast`] arena. Wraps an [`Id`] but
/// deliberately does not implement [`std::fmt::Display`]: rendering a node
/// requires the [`Ast`] it lives in (to resolve [`NodeContent::Range`] back
/// to source text). Use [`YeastDisplay::yeast_to_string`] to format it.
#[derive(Copy, Clone, Eq, PartialEq, Debug, Hash)]
pub struct NodeRef(pub Id);
/// Trait for values that can be appended to a field's id list inside a
/// `tree!`/`trees!`/`rule!` template (in `{expr}` placeholders).
///
/// `Id` pushes a single id; the blanket impl for
/// `IntoIterator<Item: Into<Id>>` handles `Vec<Id>`, `Option<Id>`,
/// arbitrary iterators yielding `Id`, etc.
///
/// This lets `{expr}` interpolate any of these shapes without a
/// dedicated splice syntax — the macro emits the same trait-dispatched
/// call regardless of the value's type.
pub trait IntoFieldIds {
fn extend_into(self, out: &mut Vec<Id>);
}
impl NodeRef {
pub fn id(self) -> Id {
self.0
impl IntoFieldIds for Id {
fn extend_into(self, out: &mut Vec<Id>) {
out.push(self);
}
}
impl From<NodeRef> for Id {
fn from(value: NodeRef) -> Self {
value.0
}
}
impl From<Id> for NodeRef {
fn from(value: Id) -> Self {
NodeRef(value)
impl<I, T> IntoFieldIds for I
where
I: IntoIterator<Item = T>,
T: Into<Id>,
{
fn extend_into(self, out: &mut Vec<Id>) {
out.extend(self.into_iter().map(Into::into));
}
}
@@ -67,21 +89,21 @@ pub trait YeastDisplay {
/// Optional source range for values used in `#{expr}` interpolations.
///
/// By default this returns `None`, so synthesized leaves inherit the matched
/// rule's source range. `NodeRef` returns the referenced node's range, letting
/// rule's source range. `Id` returns the referenced node's range, letting
/// `(kind #{capture})` carry the captured node's location.
pub trait YeastSourceRange {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range>;
}
impl YeastDisplay for NodeRef {
impl YeastDisplay for Id {
fn yeast_to_string(&self, ast: &Ast) -> String {
ast.source_text(self.0)
ast.source_text(*self)
}
}
impl YeastSourceRange for NodeRef {
impl YeastSourceRange for Id {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
ast.get_node(self.0).and_then(|n| match &n.content {
ast.get_node(*self).and_then(|n| match &n.content {
NodeContent::Range(r) => Some(r.clone()),
_ => n.source_range,
})
@@ -150,6 +172,36 @@ impl<'a> AstCursor<'a> {
self.node_id
}
pub fn node(&self) -> &'a Node {
&self.ast.nodes[self.node_id.0]
}
pub fn field_id(&self) -> Option<FieldId> {
let (_, children) = self.parents.last()?;
children.current_field()
}
pub fn field_name(&self) -> Option<&'static str> {
if self.field_id() == Some(CHILD_FIELD) {
None
} else {
self.field_id()
.and_then(|id| self.ast.field_name_for_id(id))
}
}
pub fn goto_first_child(&mut self) -> bool {
self.goto_first_child_opt().is_some()
}
pub fn goto_next_sibling(&mut self) -> bool {
self.goto_next_sibling_opt().is_some()
}
pub fn goto_parent(&mut self) -> bool {
self.goto_parent_opt().is_some()
}
fn goto_next_sibling_opt(&mut self) -> Option<()> {
self.node_id = self.parents.last_mut()?.1.next()?;
Some(())
@@ -170,37 +222,6 @@ impl<'a> AstCursor<'a> {
Some(())
}
}
impl<'a> Cursor<'a, Ast, Node, FieldId> for AstCursor<'a> {
fn node(&self) -> &'a Node {
&self.ast.nodes[self.node_id]
}
fn field_id(&self) -> Option<FieldId> {
let (_, children) = self.parents.last()?;
children.current_field()
}
fn field_name(&self) -> Option<&'static str> {
if self.field_id() == Some(CHILD_FIELD) {
None
} else {
self.field_id()
.and_then(|id| self.ast.field_name_for_id(id))
}
}
fn goto_first_child(&mut self) -> bool {
self.goto_first_child_opt().is_some()
}
fn goto_next_sibling(&mut self) -> bool {
self.goto_next_sibling_opt().is_some()
}
fn goto_parent(&mut self) -> bool {
self.goto_parent_opt().is_some()
}
}
/// An iterator over the child Ids of a node.
#[derive(Debug)]
@@ -347,16 +368,16 @@ impl Ast {
///
/// This reflects the effective AST after desugaring and excludes orphaned
/// arena nodes left behind by rewrite operations.
pub fn reachable_node_ids(&self) -> Vec<usize> {
pub fn reachable_node_ids(&self) -> Vec<Id> {
let mut reachable = Vec::new();
let mut stack = vec![self.root];
let mut seen = vec![false; self.nodes.len()];
while let Some(id) = stack.pop() {
if id >= self.nodes.len() || seen[id] {
if id.0 >= self.nodes.len() || seen[id.0] {
continue;
}
seen[id] = true;
seen[id.0] = true;
reachable.push(id);
if let Some(node) = self.get_node(id) {
@@ -380,11 +401,11 @@ impl Ast {
}
pub fn get_node(&self, id: Id) -> Option<&Node> {
self.nodes.get(id)
self.nodes.get(id.0)
}
pub fn print(&self, source: &str, root_id: Id) -> Value {
let root = &self.nodes()[root_id];
let root = &self.nodes()[root_id.0];
self.print_node(root, source)
}
@@ -427,7 +448,7 @@ impl Ast {
is_named,
source_range,
});
id
Id(id)
}
fn union_source_range_of_children(
@@ -494,15 +515,6 @@ impl Ast {
self.create_named_token_with_range(kind, content, None)
}
/// Prepend a child id to the given field of the given node.
pub fn prepend_field_child(&mut self, node_id: Id, field_id: FieldId, value_id: Id) {
let node = self
.nodes
.get_mut(node_id)
.expect("prepend_field_child: invalid node id");
node.fields.entry(field_id).or_default().insert(0, value_id);
}
pub fn create_named_token_with_range(
&mut self,
kind: &'static str,
@@ -524,7 +536,7 @@ impl Ast {
fields: BTreeMap::new(),
content: NodeContent::DynamicString(content),
});
id
Id(id)
}
pub fn field_name_for_id(&self, id: FieldId) -> Option<&'static str> {
@@ -608,10 +620,6 @@ pub struct Node {
}
impl Node {
pub fn kind(&self) -> &'static str {
self.kind_name
}
pub fn kind_name(&self) -> &'static str {
self.kind_name
}
@@ -956,7 +964,7 @@ fn apply_repeating_rules_inner<C: Clone>(
));
}
let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
let node_kind = ast.get_node(id).map(|n| n.kind_name()).unwrap_or("");
for rule in index.rules_for_kind(node_kind) {
let rule_ptr = *rule as *const Rule<C>;
if Some(rule_ptr) == skip_rule {
@@ -1008,7 +1016,7 @@ fn apply_repeating_rules_inner<C: Clone>(
//
// Child traversal does not increment rewrite depth and starts fresh
// (no rule is skipped on child subtrees).
let mut fields = std::mem::take(&mut ast.nodes[id].fields);
let mut fields = std::mem::take(&mut ast.nodes[id.0].fields);
for children in fields.values_mut() {
let mut new_children: Option<Vec<Id>> = None;
for (i, &child_id) in children.iter().enumerate() {
@@ -1041,7 +1049,7 @@ fn apply_repeating_rules_inner<C: Clone>(
*children = new;
}
}
ast.nodes[id].fields = fields;
ast.nodes[id.0].fields = fields;
Ok(vec![id])
}
@@ -1075,7 +1083,7 @@ fn apply_one_shot_rules_inner<C: Clone>(
));
}
let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
let node_kind = ast.get_node(id).map(|n| n.kind_name()).unwrap_or("");
for rule in index.rules_for_kind(node_kind) {
if let Some(captures) = rule.try_match(ast, id)? {

View File

@@ -49,7 +49,7 @@ impl Visitor {
pub fn build_with_schema(self, schema: crate::schema::Schema) -> Ast {
Ast {
root: 0,
root: Id(0),
schema,
nodes: self.nodes.into_iter().map(|n| n.inner).collect(),
source: Vec::new(),
@@ -72,7 +72,7 @@ impl Visitor {
},
parent: self.current,
});
id
Id(id)
}
fn enter_node(&mut self, node: tree_sitter::Node<'_>) -> bool {
@@ -83,10 +83,10 @@ impl Visitor {
fn leave_node(&mut self, field_name: Option<&'static str>, _node: tree_sitter::Node<'_>) {
let node_id = self.current.unwrap();
let node_parent = self.nodes[node_id].parent;
let node_parent = self.nodes[node_id.0].parent;
if let Some(parent_id) = node_parent {
let parent = self.nodes.get_mut(parent_id).unwrap();
let parent = self.nodes.get_mut(parent_id.0).unwrap();
if let Some(field) = field_name {
let field_id = self.language.field_id_for_name(field).unwrap().get();
parent

View File

@@ -300,7 +300,7 @@ fn test_query_skips_extras_in_positional_match() {
let mut cursor = AstCursor::new(&ast);
cursor.goto_first_child();
let array_id = cursor.node_id();
assert_eq!(ast.get_node(array_id).unwrap().kind(), "array");
assert_eq!(ast.get_node(array_id).unwrap().kind_name(), "array");
// Two positional wildcards should bind to the two integers, skipping
// the comment that sits between them.
@@ -309,11 +309,15 @@ fn test_query_skips_extras_in_positional_match() {
let matched = query.do_match(&ast, array_id, &mut captures).unwrap();
assert!(matched);
assert_eq!(
ast.get_node(captures.get_var("a").unwrap()).unwrap().kind(),
ast.get_node(captures.get_var("a").unwrap())
.unwrap()
.kind_name(),
"integer"
);
assert_eq!(
ast.get_node(captures.get_var("b").unwrap()).unwrap().kind(),
ast.get_node(captures.get_var("b").unwrap())
.unwrap()
.kind_name(),
"integer"
);
}
@@ -391,7 +395,7 @@ fn test_capture_unnamed_node_parenthesized() {
assert!(matched);
let op_id = captures.get_var("op").unwrap();
let op_node = ast.get_node(op_id).unwrap();
assert_eq!(op_node.kind(), "=");
assert_eq!(op_node.kind_name(), "=");
assert!(!op_node.is_named());
}
@@ -414,7 +418,7 @@ fn test_capture_bare_underscore_repeated() {
let all = captures.get_all("all");
assert_eq!(all.len(), 1);
assert_eq!(ast.get_node(all[0]).unwrap().kind(), "=");
assert_eq!(ast.get_node(all[0]).unwrap().kind_name(), "=");
assert!(!ast.get_node(all[0]).unwrap().is_named());
}
@@ -441,7 +445,7 @@ fn test_capture_unnamed_node_bare_literal() {
assert!(matched);
let op_id = captures.get_var("op").unwrap();
let op_node = ast.get_node(op_id).unwrap();
assert_eq!(op_node.kind(), "=");
assert_eq!(op_node.kind_name(), "=");
assert!(!op_node.is_named());
}
@@ -479,7 +483,7 @@ fn test_bare_underscore_matches_unnamed() {
.unwrap();
assert!(matched, "_ should match the unnamed `=`");
let any_node = ast.get_node(captures.get_var("any").unwrap()).unwrap();
assert_eq!(any_node.kind(), "=");
assert_eq!(any_node.kind_name(), "=");
assert!(!any_node.is_named());
}
@@ -506,7 +510,7 @@ fn test_bare_forms_in_field_position() {
assert_eq!(
ast.get_node(captures.get_var("lhs").unwrap())
.unwrap()
.kind(),
.kind_name(),
"identifier"
);
@@ -516,7 +520,7 @@ fn test_bare_forms_in_field_position() {
let matched = query.do_match(&ast, assignment_id, &mut captures).unwrap();
assert!(matched);
let op = ast.get_node(captures.get_var("op").unwrap()).unwrap();
assert_eq!(op.kind(), "=");
assert_eq!(op.kind_name(), "=");
assert!(!op.is_named());
}
@@ -535,7 +539,7 @@ fn test_forward_scan_finds_unnamed_token_late() {
let mut cursor = AstCursor::new(&ast);
cursor.goto_first_child(); // for
cursor.goto_first_child(); // do (the body)
while cursor.node().kind() != "do" || !cursor.node().is_named() {
while cursor.node().kind_name() != "do" || !cursor.node().is_named() {
assert!(cursor.goto_next_sibling(), "expected to find named `do`");
}
let do_id = cursor.node_id();
@@ -545,7 +549,7 @@ fn test_forward_scan_finds_unnamed_token_late() {
let matched = query.do_match(&ast, do_id, &mut captures).unwrap();
assert!(matched, "forward-scan should find the `end` keyword");
let kw = ast.get_node(captures.get_var("kw").unwrap()).unwrap();
assert_eq!(kw.kind(), "end");
assert_eq!(kw.kind_name(), "end");
assert!(!kw.is_named());
}
@@ -561,7 +565,7 @@ fn test_forward_scan_preserves_order() {
let mut cursor = AstCursor::new(&ast);
cursor.goto_first_child();
cursor.goto_first_child();
while cursor.node().kind() != "do" || !cursor.node().is_named() {
while cursor.node().kind_name() != "do" || !cursor.node().is_named() {
assert!(cursor.goto_next_sibling(), "expected to find named `do`");
}
let do_id = cursor.node_id();
@@ -635,7 +639,7 @@ fn ruby_rules() -> Vec<Rule> {
left: (identifier $tmp)
right: {right}
)
{..left.iter().enumerate().map(|(i, &lhs)|
{left.iter().enumerate().map(|(i, &lhs)|
yeast::tree!(
(assignment
left: {lhs}
@@ -667,7 +671,7 @@ fn ruby_rules() -> Vec<Rule> {
left: {pat}
right: (identifier $tmp)
)
stmt: {..body}
stmt: {body}
)
)
)
@@ -903,7 +907,7 @@ fn one_shot_xeq1_rules() -> Vec<Rule> {
yeast::rule!(
(program (_)* @stmts)
=>
(program stmt: {..stmts})
(program stmt: {stmts})
),
yeast::rule!(
(assignment left: (_) @left right: (_) @right)
@@ -979,7 +983,7 @@ fn test_one_shot_recurses_into_returned_capture() {
yeast::rule!(
(program (_)* @stmts)
=>
(program stmt: {..stmts})
(program stmt: {stmts})
),
// Returns the captured `left` verbatim, discarding `right`.
yeast::rule!(
@@ -1021,7 +1025,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
yeast::rule!(
(program (_)* @stmts)
=>
(program stmt: {..stmts})
(program stmt: {stmts})
),
// Wraps `left` in nested `first_node`/`second_node` output kinds.
// Neither wrapper kind has a matching rule, so a buggy implementation
@@ -1059,7 +1063,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
}
/// Verify that `@@name` capture markers skip the auto-translate prefix:
/// the body sees the *raw* (input-schema) NodeRef and can read its
/// the body sees the *raw* (input-schema) `Id` and can read its
/// source text or call `ctx.translate(...)` explicitly. Compare with
/// the bare `@name` form, where the auto-translate prefix runs the
/// same translation up front and the body sees the post-translate id.
@@ -1072,7 +1076,7 @@ fn test_raw_capture_marker() {
yeast::rule!(
(program (_)* @stmts)
=>
(program stmt: {..stmts})
(program stmt: {stmts})
),
// `@@raw_lhs` is untranslated: the body reads its source text
// ("x") and embeds it directly as the identifier content. `@rhs`
@@ -1081,7 +1085,7 @@ fn test_raw_capture_marker() {
(assignment left: (_) @@raw_lhs right: (_) @rhs)
=>
{
let text = ctx.ast.source_text(raw_lhs.into());
let text = ctx.ast.source_text(raw_lhs);
tree!((call
method: (identifier #{text.as_str()})
receiver: {rhs}))
@@ -1116,7 +1120,7 @@ fn test_raw_capture_marker() {
}
/// Companion to `test_raw_capture_marker`: confirms that calling
/// `ctx.translate(raw)` on a `@@`-captured NodeRef from the rule body
/// `ctx.translate(raw)` on a `@@`-captured `Id` from the rule body
/// produces the correctly-translated output-schema node. With `@`, the
/// translation has already happened, so `ctx.translate(...)` inside the
/// body would attempt to re-translate an output node (which has no
@@ -1130,7 +1134,7 @@ fn test_raw_capture_marker_explicit_translate() {
yeast::rule!(
(program (_)* @stmts)
=>
(program stmt: {..stmts})
(program stmt: {stmts})
),
yeast::rule!(
(assignment left: (_) @@raw_lhs right: (_) @rhs)
@@ -1138,7 +1142,7 @@ fn test_raw_capture_marker_explicit_translate() {
{
let translated_lhs = ctx.translate(raw_lhs)?;
tree!((call
method: {..translated_lhs}
method: {translated_lhs}
receiver: {rhs}))
}
),
@@ -1172,11 +1176,11 @@ fn test_cursor_navigation() {
let mut cursor = AstCursor::new(&ast);
// Start at root
assert_eq!(cursor.node().kind(), "program");
assert_eq!(cursor.node().kind_name(), "program");
// Go to first child (assignment)
assert!(cursor.goto_first_child());
assert_eq!(cursor.node().kind(), "assignment");
assert_eq!(cursor.node().kind_name(), "assignment");
// No sibling
assert!(!cursor.goto_next_sibling());
@@ -1187,10 +1191,10 @@ fn test_cursor_navigation() {
// Go back up
assert!(cursor.goto_parent());
assert_eq!(cursor.node().kind(), "assignment");
assert_eq!(cursor.node().kind_name(), "assignment");
assert!(cursor.goto_parent());
assert_eq!(cursor.node().kind(), "program");
assert_eq!(cursor.node().kind_name(), "program");
// Can't go further up
assert!(!cursor.goto_parent());
@@ -1235,10 +1239,8 @@ fn test_desugar_for_with_multiple_assignment() {
}
/// Regression test: `#{capture}` in a template must render the *source text*
/// of the captured node, not its arena `Id`. Previously, captures were bound
/// as `usize`, so `#{cap}` printed the integer id (e.g. `"3"`) via `Display`.
/// Captures are now bound as `NodeRef`, which has no `Display` impl and
/// resolves to the captured node's source text via `YeastDisplay`.
/// of the captured node, not its arena `Id`. Captures are bound as `Id`,
/// whose `YeastDisplay` impl resolves to the captured node's source text.
#[test]
fn test_hash_brace_renders_capture_source_text() {
let rule: Rule = rule!(
@@ -1266,7 +1268,7 @@ fn test_hash_brace_renders_capture_source_text() {
);
}
/// Regression test: non-`NodeRef` values in `#{expr}` still render via their
/// Regression test: non-`Id` values in `#{expr}` still render via their
/// `Display` impl (covered by `YeastDisplay`'s blanket impls for primitives).
#[test]
fn test_hash_brace_renders_integer_expression() {
@@ -1304,12 +1306,12 @@ fn test_hash_brace_uses_capture_location_for_leaf() {
let ast = run_and_ast("foo.bar()", vec![rule]);
let mut bar_ids: Vec<usize> = Vec::new();
let mut bar_ids: Vec<yeast::Id> = Vec::new();
for id in ast.reachable_node_ids() {
let Some(node) = ast.get_node(id) else {
continue;
};
if node.kind() == "identifier" && ast.source_text(id) == "bar" {
if node.kind_name() == "identifier" && ast.source_text(id) == "bar" {
bar_ids.push(id);
}
}

View File

@@ -15,26 +15,26 @@ struct SwiftContext {
/// (`computed_getter`/`computed_setter`/`computed_modify`/
/// `willset_clause`/`didset_clause`/`getter_specifier`/
/// `setter_specifier`).
property_name: Option<yeast::NodeRef>,
property_name: Option<yeast::Id>,
/// Translated type node for the property type. Set by the outer
/// `property_binding` rule (computed accessors variant) and
/// `protocol_property_declaration` when present; read by the
/// accessor inner rules.
property_type: Option<yeast::NodeRef>,
property_type: Option<yeast::Id>,
/// Default-value expression for the next translated `parameter`. Set
/// by the outer `function_parameter` rule; read by the `parameter`
/// rules.
default_value: Option<yeast::NodeRef>,
default_value: Option<yeast::Id>,
/// Translated outer modifiers (e.g. visibility, attributes) to
/// attach to each child of a flattening outer rule. Set by
/// `property_declaration`, `enum_entry`, and
/// `protocol_property_declaration`.
outer_modifiers: Vec<yeast::NodeRef>,
outer_modifiers: Vec<yeast::Id>,
/// The `let`/`var` binding modifier for a `property_declaration`.
/// Set by `property_declaration`; read by the inner declaration
/// rules (`property_binding` variants, accessor rules) so they
/// emit it as part of the output node's `modifier:` field.
binding_modifier: Option<yeast::NodeRef>,
binding_modifier: Option<yeast::Id>,
/// True when the current child of a flattening outer rule is not
/// the first one — its inner rule should emit a
/// `chained_declaration` modifier so the original grouping can be
@@ -45,10 +45,10 @@ struct SwiftContext {
/// Build a freshly-created `chained_declaration` modifier node if
/// `ctx.is_chained`, else `None`. Used by inner declaration rules to
/// emit the chained tag for non-first children of a flattening outer
/// rule. Returns `Option<NodeRef>` so it splices via `{..…}` to 0 or 1 ids.
fn chained_modifier(ctx: &mut yeast::build::BuildCtx<'_, SwiftContext>) -> Option<yeast::NodeRef> {
/// rule. Returns `Option<Id>` so it splices via `{…}` to 0 or 1 ids.
fn chained_modifier(ctx: &mut yeast::build::BuildCtx<'_, SwiftContext>) -> Option<yeast::Id> {
if ctx.is_chained {
Some(ctx.literal("modifier", "chained_declaration").into())
Some(ctx.literal("modifier", "chained_declaration"))
} else {
None
}
@@ -63,10 +63,10 @@ fn chained_modifier(ctx: &mut yeast::build::BuildCtx<'_, SwiftContext>) -> Optio
/// condition.
fn and_chain(
ctx: &mut yeast::build::BuildCtx<'_, SwiftContext>,
conds: Vec<yeast::NodeRef>,
conds: Vec<yeast::Id>,
) -> yeast::Id {
conds.into_iter()
.map(yeast::Id::from)
conds
.into_iter()
.reduce(|acc, elem| {
tree!((binary_expr operator: (infix_operator "&&") left: {acc} right: {elem}))
})
@@ -79,7 +79,7 @@ fn and_chain(
/// guarantees at least one part.
fn member_chain(
ctx: &mut yeast::build::BuildCtx<'_, SwiftContext>,
parts: Vec<yeast::NodeRef>,
parts: Vec<yeast::Id>,
) -> yeast::Id {
let mut iter = parts.into_iter();
let first = iter
@@ -100,7 +100,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(source_file statement: _* @children)
=>
(top_level
body: (block stmt: {..children})
body: (block stmt: {children})
)
),
// Declarations may be wrapped in local/global wrapper nodes.
@@ -144,12 +144,12 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
rule!(
(operator_declaration "prefix" (referenceable_operator _ @op) (simple_identifier)? @prec)
=>
(operator_syntax_declaration name: (identifier #{op}) fixity: (fixity "prefix") precedence: {..prec})
(operator_syntax_declaration name: (identifier #{op}) fixity: (fixity "prefix") precedence: {prec})
),
rule!(
(operator_declaration "postfix" (referenceable_operator _ @op) (simple_identifier)? @prec)
=>
(operator_syntax_declaration name: (identifier #{op}) fixity: (fixity "postfix") precedence: {..prec})
(operator_syntax_declaration name: (identifier #{op}) fixity: (fixity "postfix") precedence: {prec})
),
rule!(
(operator_declaration "infix" (referenceable_operator _ @op) (simple_identifier)? @prec)
@@ -157,7 +157,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(operator_syntax_declaration
name: (identifier #{op})
fixity: (fixity "infix")
precedence: {..prec})
precedence: {prec})
),
rule!((bitwise_operation lhs: @l op: @op rhs: @r) => (binary_expr left: {l} operator: (infix_operator #{op}) right: {r})),
rule!((nil_coalescing_expression value: @l if_nil: @r) => (binary_expr left: {l} operator: (infix_operator "??") right: {r})),
@@ -170,9 +170,9 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
rule!((postfix_expression operation: @op target: @operand) => (unary_expr operator: (postfix_operator #{op}) operand: {operand})),
// TODO: Parenthesised single-value tuple is a grouping expression and should pass through.
// Multi-value tuples become tuple_expr.
rule!((tuple_expression value: _* @v) => (tuple_expr element: {..v})),
rule!((tuple_expression value: _* @v) => (tuple_expr element: {v})),
// Blocks contain statement* directly.
rule!((block statement: _+ @stmts) => (block stmt: {..stmts})),
rule!((block statement: _+ @stmts) => (block stmt: {stmts})),
rule!((block) => (block)),
// ---- Variables ----
// property_binding rules — these produce variable_declaration and/or accessor_declaration
@@ -198,8 +198,8 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
type: _? @ty
computed_value: (computed_property accessor: _+ @@accessors))
=>
{..{
ctx.property_name = Some(tree!((identifier #{pattern})).into());
{{
ctx.property_name = Some(tree!((identifier #{pattern})));
ctx.property_type = ty;
let mut result = Vec::new();
@@ -223,13 +223,13 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
computed_value: (computed_property statement: _* @body))
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: (identifier #{name})
type: {..ty}
type: {ty}
accessor_kind: (accessor_kind "get")
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Stored property with willSet/didSet observers (initializer
// optional) → a `variable_declaration` followed by one
@@ -249,19 +249,19 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
value: _? @val
observers: (willset_didset_block willset: _? @@ws didset: _? @@ds))
=>
{..{
{{
let var_decl = tree!(
(variable_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
pattern: (name_pattern identifier: (identifier #{name}))
type: {..ty}
value: {..val})
type: {ty}
value: {val})
);
// Publish the property name for the observer rules.
ctx.property_name = Some(tree!((identifier #{name})).into());
ctx.property_name = Some(tree!((identifier #{name})));
// Observers are subsequent outputs of this flattening
// rule, so they always get `chained_declaration`.
ctx.is_chained = true;
@@ -282,12 +282,12 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
value: _? @val)
=>
(variable_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
pattern: {pattern}
type: {..ty}
value: {..val})
type: {ty}
value: {val})
),
// property_declaration: flatten declarators (each may translate
// to multiple nodes — variable_declaration and/or
@@ -305,9 +305,9 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
declarator: _* @@decls
(modifiers)* @mods)
=>
{..{
let binding_text = ctx.ast.source_text(binding_kind.into());
ctx.binding_modifier = Some(ctx.literal("modifier", &binding_text).into());
{{
let binding_text = ctx.ast.source_text(binding_kind);
ctx.binding_modifier = Some(ctx.literal("modifier", &binding_text));
ctx.outer_modifiers = mods;
let mut result = Vec::new();
@@ -342,19 +342,19 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
data_contents: (enum_type_parameters parameter: _* @params))
=>
(class_like_declaration
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
modifier: (modifier "enum_case")
name: (identifier #{name})
member: (constructor_declaration parameter: {..params} body: (block)))
member: (constructor_declaration parameter: {params} body: (block)))
),
// enum_case_entry with explicit raw value → variable_declaration with that value.
rule!(
(enum_case_entry name: @name raw_value: @val)
=>
(variable_declaration
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
modifier: (modifier "enum_case")
pattern: (name_pattern identifier: (identifier #{name}))
value: {val})
@@ -364,8 +364,8 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(enum_case_entry name: @name)
=>
(variable_declaration
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
modifier: (modifier "enum_case")
pattern: (name_pattern identifier: (identifier #{name})))
),
@@ -376,7 +376,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
rule!(
(enum_entry case: _+ @@cases (modifiers)* @mods)
=>
{..{
{{
ctx.outer_modifiers = mods;
let mut result = Vec::new();
@@ -418,7 +418,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(constructor_pattern
constructor: (member_access_expr base: {typ} member: (identifier #{name}))
element: {..items})
element: {items})
),
// case .foo(x,y) pattern
rule!(
@@ -426,10 +426,10 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(constructor_pattern
constructor: (member_access_expr base: (inferred_type_expr #{dot}) member: (identifier #{name}))
element: {..items})
element: {items})
),
// Tuple pattern and its (optionally named) items
rule!((pattern kind: (tuple_pattern item: _* @elems)) => (tuple_pattern element: {..elems})),
rule!((pattern kind: (tuple_pattern item: _* @elems)) => (tuple_pattern element: {elems})),
rule!((tuple_pattern_item name: @key pattern: @pat) => (pattern_element key: (identifier #{key}) pattern: {pat})),
rule!((tuple_pattern_item pattern: @pat) => (pattern_element pattern: {pat})),
// Type casting pattern (TODO)
@@ -452,9 +452,9 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(function_declaration
name: (identifier #{name})
parameter: {..params}
return_type: {..ret}
body: (block stmt: {..body_stmts}))
parameter: {params}
return_type: {ret}
body: (block stmt: {body_stmts}))
),
// Parameters are wrapped in function_parameter, which also carries
// optional default values. Publishes the default value into `ctx`
@@ -463,7 +463,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
rule!(
(function_parameter parameter: @@p default_value: _? @def)
=>
{..{
{{
ctx.default_value = def;
ctx.translate(p)?
}}
@@ -475,7 +475,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(parameter
external_name: (identifier #{ext})
pattern: (name_pattern identifier: (identifier #{name}))
default: {..ctx.default_value})
default: {ctx.default_value})
),
rule!(
(parameter external_name: @ext name: @name type: @ty)
@@ -484,7 +484,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
external_name: (identifier #{ext})
pattern: (name_pattern identifier: (identifier #{name}))
type: {ty}
default: {..ctx.default_value})
default: {ctx.default_value})
),
// Parameter with just name and type (no external name)
rule!(
@@ -492,7 +492,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(parameter
pattern: (name_pattern identifier: (identifier #{name}))
default: {..ctx.default_value})
default: {ctx.default_value})
),
rule!(
(parameter name: @name type: @ty)
@@ -500,7 +500,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(parameter
pattern: (name_pattern identifier: (identifier #{name}))
type: {ty}
default: {..ctx.default_value})
default: {ctx.default_value})
),
// Reference to a function, f(x:y:z:). This is parsed as a call with a single argument with multiple reference_specifier labels.
// We don't want downstream QL to try to handle this as a call_expr with a weird argument, so explicitly mark it as unsupported for now.
@@ -514,7 +514,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
rule!(
(call_expression function: @func suffix: (call_suffix arguments: (value_arguments argument: (value_argument)* @args)))
=>
(call_expr callee: {func} argument: {..args})
(call_expr callee: {func} argument: {args})
),
// Value argument with label (value: _ matches both named nodes and anonymous tokens like nil)
rule!(
@@ -537,7 +537,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
// Return / break / continue, one rule per keyword.
// The anonymous "return"/"break"/"continue" keywords are matched as
// string literals.
rule!((control_transfer_statement kind: "return" result: _? @val) => (return_expr value: {..val})),
rule!((control_transfer_statement kind: "return" result: _? @val) => (return_expr value: {val})),
rule!((control_transfer_statement kind: "break" result: @lbl) => (break_expr label: (identifier #{lbl}))),
rule!((control_transfer_statement kind: "break") => (break_expr)),
rule!((control_transfer_statement kind: "continue" result: @lbl) => (continue_expr label: (identifier #{lbl}))),
@@ -556,20 +556,20 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
statement: _* @body)
=>
(function_expr
modifier: {..attrs}
capture_declaration: {..captures}
parameter: {..params}
return_type: {..ret}
body: (block stmt: {..body}))
modifier: {attrs}
capture_declaration: {captures}
parameter: {params}
return_type: {ret}
body: (block stmt: {body}))
),
// capture_list_item with ownership modifier (e.g. [weak self], [unowned x])
rule!(
(capture_list_item ownership: _? @ownership name: @name value: _? @val)
=>
(variable_declaration
modifier: {..ownership}
modifier: {ownership}
pattern: (name_pattern identifier: (identifier #{name}))
value: {..val})
value: {val})
),
// Lambda parameter with type and optional external name
rule!(
@@ -615,7 +615,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(if_expr
condition: {and_chain(&mut ctx, cond)}
then: {then_body}
else: {..else_stmts})
else: {else_stmts})
),
// Guard statement
rule!(
@@ -623,7 +623,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(guard_if_stmt
condition: {and_chain(&mut ctx, cond)}
else: (block stmt: {..else_stmts}))
else: (block stmt: {else_stmts}))
),
// Ternary expression → if_expr
rule!(
@@ -635,7 +635,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
rule!(
(switch_statement expr: @val entry: (switch_entry)* @cases)
=>
(switch_expr value: {val} case: {..cases})
(switch_expr value: {val} case: {cases})
),
// Switch entry with multiple patterns and body
rule!(
@@ -644,19 +644,19 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
pattern: (switch_pattern pattern: @rest)+
statement: _* @body)
=>
(switch_case pattern: (or_pattern pattern: {first} pattern: {..rest}) body: (block stmt: {..body}))
(switch_case pattern: (or_pattern pattern: {first} pattern: {rest}) body: (block stmt: {body}))
),
// Switch entry with exactly one pattern and body
rule!(
(switch_entry pattern: (switch_pattern pattern: @pat) statement: _* @body)
=>
(switch_case pattern: {pat} body: (block stmt: {..body}))
(switch_case pattern: {pat} body: (block stmt: {body}))
),
// Switch entry: default case (no patterns)
rule!(
(switch_entry default: (default_keyword) statement: _* @body)
=>
(switch_case body: (block stmt: {..body}))
(switch_case body: (block stmt: {body}))
),
// if case PATTERN = expr — preserve the pattern directly (no Optional wrapping)
rule!(
@@ -702,8 +702,8 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(for_each_stmt
pattern: {pat}
iterable: {iter}
guard: {..guard}
body: (block stmt: {..body}))
guard: {guard}
body: (block stmt: {body}))
),
// While loop
rule!(
@@ -711,7 +711,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(while_stmt
condition: {and_chain(&mut ctx, cond)}
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Repeat-while loop
rule!(
@@ -719,28 +719,28 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(do_while_stmt
condition: {and_chain(&mut ctx, cond)}
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Labeled statement (e.g. `outer: for ...`). Strip the trailing ':' from the label token.
rule!((labeled_statement label: (statement_label) @lbl statement: @stmt) => {
let text = ctx.ast.source_text(lbl.into());
let text = ctx.ast.source_text(lbl);
let name = &text[..text.len() - 1];
tree!((labeled_stmt label: (identifier #{name}) stmt: {stmt}))
}),
// ---- Collections ----
// Array literal
rule!((array_literal element: _* @elems) => (array_literal element: {..elems})),
rule!((array_literal element: _* @elems) => (array_literal element: {elems})),
// Empty array literal
rule!((array_literal) => (array_literal)),
// Dictionary literal — zip keys and values into key_value_pairs
rule!(
(dictionary_literal key: _* @keys value: _* @vals)
=>
(map_literal element: {..keys.into_iter().zip(vals).map(|(k, v)|
(map_literal element: {keys.into_iter().zip(vals).map(|(k, v)|
tree!((key_value_pair key: {k} value: {v}))
)})
),
rule!((dictionary_literal element: _* @elems) => (map_literal element: {..elems})),
rule!((dictionary_literal element: _* @elems) => (map_literal element: {elems})),
rule!((dictionary_literal_item key: @k value: @v) => (key_value_pair key: {k} value: {v})),
// ---- Optionals and errors ----
// Optional chaining — unwrap the marker
@@ -753,8 +753,8 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(do_statement body: (block statement: _* @body) catch: (catch_block)* @catches)
=>
(try_expr
body: (block stmt: {..body})
catch_clause: {..catches})
body: (block stmt: {body})
catch_clause: {catches})
),
// Catch block with bound identifier; optional where-clause guard.
rule!(
@@ -766,14 +766,14 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(catch_clause
pattern: {pattern}
guard: {..guard}
body: (block stmt: {..body}))
guard: {guard}
body: (block stmt: {body}))
),
// Catch block without error binding
rule!(
(catch_block keyword: (catch_keyword) body: (block statement: _* @body))
=>
(catch_clause body: (block stmt: {..body}))
(catch_clause body: (block stmt: {body}))
),
// Empty catch block: catch {}
rule!(
@@ -787,7 +787,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(catch_clause
pattern: {pat}
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// As expression (type cast) — as?, as!
rule!((as_expression (as_operator) @op expr: @val type: @ty) => (type_cast_expr expr: {val} operator: (infix_operator #{op}) type: {ty})),
@@ -812,7 +812,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
pattern: (name_pattern identifier: (identifier #{parts.last().unwrap()}))
imported_expr: {name}
modifier: (modifier #{kind})
modifier: {..mods})
modifier: {mods})
),
// Non-scoped import declaration (for example `import Foundation`):
// flatten the identifier parts into a member_access_expr and use a
@@ -823,7 +823,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(import_declaration
pattern: (bulk_importing_pattern)
imported_expr: {name}
modifier: {..mods})
modifier: {mods})
),
// ---- Types and classes ----
// Self expression → name_expr
@@ -831,7 +831,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
// Super expression → super_expr
rule!((super_expression) => (super_expr)),
// Modifiers — unwrap to individual modifier children
rule!((modifiers _* @mods) => {..mods}),
rule!((modifiers _* @mods) => {mods}),
rule!((attribute) @m => (modifier #{m})),
rule!((visibility_modifier) @m => (modifier #{m})),
rule!((function_modifier) @m => (modifier #{m})),
@@ -848,7 +848,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
// Keep a conservative textual fallback to avoid dropping type information.
rule!((user_type) @ty => (named_type_expr name: (identifier #{ty}))),
// Tuple type → tuple_type_expr
rule!((tuple_type element: _* @elems) => (tuple_type_expr element: {..elems})),
rule!((tuple_type element: _* @elems) => (tuple_type_expr element: {elems})),
rule!((tuple_type_item name: @name type: @ty) => (tuple_type_element name: (identifier #{name}) type: {ty})),
rule!((tuple_type_item type: @ty) => (tuple_type_element type: {ty})),
// Array type `[T]` → generic_type_expr with Array base
@@ -865,7 +865,7 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
base: (named_type_expr name: (identifier "Optional"))
type_argument: {w})),
// Function type `(Params) -> Ret` → function_type_expr.
rule!((function_type parameter: _* @ps return_type: @ret) => (function_type_expr parameter: {..ps} return_type: {ret})),
rule!((function_type parameter: _* @ps return_type: @ret) => (function_type_expr parameter: {ps} return_type: {ret})),
rule!((function_type_parameter name: @name type: @ty) => (parameter external_name: (identifier #{name}) type: {ty})),
rule!((function_type_parameter type: @ty) => (parameter type: {ty})),
// Selector expression: `#selector(inner)` -- not yet supported
@@ -889,10 +889,10 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(class_like_declaration
modifier: (modifier #{kind})
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
base_type: {..bases}
member: {..members})
base_type: {bases}
member: {members})
),
// Enum class declaration: same as a regular class but with an enum body.
rule!(
@@ -905,10 +905,10 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(class_like_declaration
modifier: (modifier #{kind})
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
base_type: {..bases}
member: {..members})
base_type: {bases}
member: {members})
),
// Class declaration with empty body
rule!(
@@ -921,9 +921,9 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(class_like_declaration
modifier: (modifier #{kind})
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
base_type: {..bases})
base_type: {bases})
),
// Protocol declaration
rule!(
@@ -935,10 +935,10 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(class_like_declaration
modifier: (modifier "protocol")
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
base_type: {..bases}
member: {..members})
base_type: {bases}
member: {members})
),
// Protocol function — return type and body statements both optional.
rule!(
@@ -950,11 +950,11 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(modifiers)* @mods)
=>
(function_declaration
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
parameter: {..params}
return_type: {..ret}
body: (block stmt: {..body_stmts}))
parameter: {params}
return_type: {ret}
body: (block stmt: {body_stmts}))
),
// Init declaration → constructor_declaration. Body statements optional;
// body itself is also optional (protocol requirement).
@@ -965,9 +965,9 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(modifiers)* @mods)
=>
(constructor_declaration
modifier: {..mods}
parameter: {..params}
body: (block stmt: {..body_stmts}))
modifier: {mods}
parameter: {params}
body: (block stmt: {body_stmts}))
),
// Deinit declaration → destructor_declaration. Body statements optional.
rule!(
@@ -976,15 +976,15 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(modifiers)* @mods)
=>
(destructor_declaration
modifier: {..mods}
body: (block stmt: {..body_stmts}))
modifier: {mods}
body: (block stmt: {body_stmts}))
),
// Typealias declaration
rule!(
(typealias_declaration name: @name value: @val (modifiers)* @mods)
=>
(type_alias_declaration
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
r#type: {val})
),
@@ -999,9 +999,9 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(associatedtype_declaration name: @name inherits_from: _? @bound (modifiers)* @mods)
=>
(associated_type_declaration
modifier: {..mods}
modifier: {mods}
name: (identifier #{name})
bound: {..bound})
bound: {bound})
),
// Protocol property declaration: translate each accessor
// requirement to an `accessor_declaration` carrying the property
@@ -1018,8 +1018,8 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
type: _? @ty
(modifiers)* @mods)
=>
{..{
ctx.property_name = Some(tree!((identifier #{name})).into());
{{
ctx.property_name = Some(tree!((identifier #{name})));
ctx.property_type = ty;
ctx.outer_modifiers = mods;
@@ -1040,23 +1040,23 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
=>
(accessor_declaration
name: {ctx.property_name.ok_or("getter_specifier outside protocol_property_declaration context")?}
type: {..ctx.property_type}
type: {ctx.property_type}
accessor_kind: (accessor_kind "get")
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)})
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)})
),
rule!(
(setter_specifier)
=>
(accessor_declaration
name: {ctx.property_name.ok_or("setter_specifier outside protocol_property_declaration context")?}
type: {..ctx.property_type}
type: {ctx.property_type}
accessor_kind: (accessor_kind "set")
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)})
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)})
),
// protocol_property_requirements wrapper — should be consumed by above; fallback
rule!((protocol_property_requirements accessor: _* @accs) => {..accs}),
rule!((protocol_property_requirements accessor: _* @accs) => {accs}),
// Computed getter → accessor_declaration (body optional).
// Reads property name/type from the outer property_binding rule
// and binding/outer modifiers + chained tag from the outer
@@ -1065,58 +1065,58 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(computed_getter body: (block statement: _* @body)?)
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: {ctx.property_name.ok_or("computed_getter outside property_binding context")?}
type: {..ctx.property_type}
type: {ctx.property_type}
accessor_kind: (accessor_kind "get")
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Computed setter with explicit parameter name.
rule!(
(computed_setter parameter: @param body: (block statement: _* @body))
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: {ctx.property_name.ok_or("computed_setter outside property_binding context")?}
type: {..ctx.property_type}
type: {ctx.property_type}
accessor_kind: (accessor_kind "set")
parameter: (parameter pattern: (name_pattern identifier: (identifier #{param})))
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Computed setter without explicit parameter name; body optional.
rule!(
(computed_setter body: (block statement: _* @body)?)
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: {ctx.property_name.ok_or("computed_setter outside property_binding context")?}
type: {..ctx.property_type}
type: {ctx.property_type}
accessor_kind: (accessor_kind "set")
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Computed modify → accessor_declaration
rule!(
(computed_modify body: (block statement: _* @body))
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: {ctx.property_name.ok_or("computed_modify outside property_binding context")?}
type: {..ctx.property_type}
type: {ctx.property_type}
accessor_kind: (accessor_kind "modify")
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// willset/didset block — spread to children (only reachable as a
// fallback; the outer property_binding manual rule normally
// captures the willset/didset clauses directly).
rule!((willset_didset_block _* @clauses) => {..clauses}),
rule!((willset_didset_block _* @clauses) => {clauses}),
// willset clause → accessor_declaration (body optional). Reads
// `ctx.property_name` set by the outer property_binding rule and
// binding/outer modifiers + chained tag from the outer
@@ -1125,24 +1125,24 @@ fn translation_rules() -> Vec<Rule<SwiftContext>> {
(willset_clause body: (block statement: _* @body)?)
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: {ctx.property_name.ok_or("willset_clause outside property_binding context")?}
accessor_kind: (accessor_kind "willSet")
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// didset clause → accessor_declaration (body optional).
rule!(
(didset_clause body: (block statement: _* @body)?)
=>
(accessor_declaration
modifier: {..ctx.binding_modifier}
modifier: {..ctx.outer_modifiers.clone()}
modifier: {..chained_modifier(&mut ctx)}
modifier: {ctx.binding_modifier}
modifier: {ctx.outer_modifiers.clone()}
modifier: {chained_modifier(&mut ctx)}
name: {ctx.property_name.ok_or("didset_clause outside property_binding context")?}
accessor_kind: (accessor_kind "didSet")
body: (block stmt: {..body}))
body: (block stmt: {body}))
),
// Preprocessor conditionals — unsupported
rule!((diagnostic) => (unsupported_node)),