Compare commits

..

103 Commits

Author SHA1 Message Date
Taus
f75cf95db5 Apply rustfmt
Format the touched Rust crates (shared/tree-sitter-extractor,
shared/yeast, shared/yeast-macros, unified/extractor) so the
tree-sitter-extractor CI fmt check passes. No functional changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-25 12:26:52 +00:00
Taus
8a3bb915e4 unified/swift: Use tree! instead of ctx.node
Cleans up a few places where we were constructing trees piece by piece
rather than using the `tree!` macro.

In the process, Copilot noticed an issue that should probably be
addressed: the labeled_statement rule can never fire, since there are no
such nodes in the input. This is possibly a simple as making
_labeled_statement (which _does_ exist) named, but I haven't attempted
this.

Finally, a small change to yeast makes it so that the contents of a {}
interpolation can be a Rust block (previously it could only be a single
expression). This avoids the need to double-wrap instances where you
want to interpolate a single node produced as the final value of some
block.
2026-06-25 12:02:39 +00:00
Taus
ded5cc2901 unified/swift: Replace reduce_left with Rust helpers
(Both reduce_left and map are still supported, but we could remove them
at this point.)

I think this way of writing things makes the intent a lot clearer -- it
avoids extending the yeast rule language with complicated constructs,
pushing the complexity (such as it is) into Rust instead.
2026-06-25 12:02:39 +00:00
Taus
d9484e6196 unified/swift: Propagate property_declaration modifiers via context
Gets rid of the final uses of mutation (via prepend_field). The approach
is the same as in the preceding commits: we set the appropriate fields
on the context when processing the outer node, and then access these
fields on the inner nodes.

The repeated use of `modifier` fields is a _bit_ clunky, but since we're
likely moving to an out-of-band modifier mechanism at some point, I
think it's good enough for now.
2026-06-25 12:02:39 +00:00
Taus
7793702e8a unified/swift: Propagate enum_entry outer modifiers via context
Same as in the preceding commit, we added a test beforehand for testing
this syntax, and verified that it was unchanged by the cleanup in this
commit.
2026-06-25 12:02:39 +00:00
Taus
4730898b2f unified/swift: Translate protocol properties using context
Avoids more "mutation after creation" via prepend_field.

Also adds a test to the corpus for exercising this syntax. Although it's
not evident, the test output was unchanged by this refactoring.
2026-06-25 12:02:39 +00:00
Taus
86feaeff4e unified/swift: Propagate parameter default values via context
Extends the context with a field for keeping track of the default value.

In the process, we also rename the context to SwiftContext as it now
doesn't only concern itself with properties.
2026-06-25 12:02:39 +00:00
Taus
0c85c31129 yeast: Simplify Swift rules using the new machinery
Propagates in name and type information for various property
declarations, using the context mechanism. This avoids mutating
already-translated nodes in-place, and is generally much easier to read.
2026-06-25 12:02:39 +00:00
Taus
919c5b8c53 yeast: Hide desugaring behind Desugarer trait
This was necessary since otherwise the generic type of the
user-specified context (which should only be a concern for yeast) starts
to bleed out into the shared extractor. Instead, we type-erase it by
putting it inside the aforementioned trait.
2026-06-25 12:02:39 +00:00
Taus
c39bfa555d yeast: Add macro for fine-grained rules
Adds `manual_rule!` which provides a more low-level interface for
defining rewrites. (I'm not entirely sold on the name, so any
suggestions would be welcome.)

Notably, the captures bound in the body of such rules have _not_ been
translated yet -- they still come from the _input_ tree. It is the
user's duty to call ctx.translate on these (which has the effect of
recursively invoking the translation) before substituting them into the
output.

For _truly_ low-level access, the user can still construct a Rule
directly, but this is now somewhat cumbersome as the closure contained
therein takes quite a few parameters. Still, the possibility remains.
2026-06-25 12:02:39 +00:00
Taus
03350bf8d7 yeast: Pass raw captures to Rule::new rules
This enables users to specify how and when these captures get
translated. In conjunction with the context mechanism, this can be used
to e.g. translate some piece of information (e.g. the type of
something), record it in the context, and then recursively translate
some other capture that relies on this information. This allows
information to be cleanly passed into descendants (which can be written
using context accesses in the `rule!` macro form).

As a consequence of this change, we now need to pass around a
TranslatorHandle to perform the manual translation. For Repeating rules,
it doesn't really make sense to translate things, so in this case we
simply signal an error.

Also, the implementation of the `rule!` macro changes slightly (without
changing semantics): it now essentially delegates to `Rule::new`,
receiving raw captures, but then immediately applies the translation to
those captures (which, for the majority of cases, is likely the desired
behaviour).
2026-06-25 12:02:39 +00:00
Taus
d38ffe0ad5 yeast: Make transforms return Result
This will enable us to actually capture and log errors in complicated
rules (e.g. ones written in Rust) rather than just panicking.
2026-06-25 12:02:38 +00:00
Taus
d6373eaef7 yeast: Reify the context and allow user-defined data in it
Renames what was previously called `__yeast_ctx` into just `ctx`, and
adds a new field `user_ctx` to this context. Said field can contain a
struct of any user type (necessitating making various parts of the
implementation generic in said type).

Through some Deref magic, field accesses are delegated to the inner
struct (assuming they are not already defined on `ctx`), which should
hopefully make the interface a bit more ergonomic.
2026-06-25 12:02:38 +00:00
Asger F
89cd6770ae Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-25 13:18:27 +02:00
Asger F
66c1f037f5 Add TODO 2026-06-19 12:19:51 +02:00
Asger F
2675070291 unified/swift: Clean up translation of patterns
Patterns have an unusual parse tree, but now the matching should
at least be a bit easier to follow.

The TODO regarding not being able to pass down context to handle
var/let is still relevant, and can't be solved in the mapping alone.
2026-06-19 11:35:06 +02:00
Asger F
c01264d05c Coerce pattern_element.key to be an identifier 2026-06-19 10:31:34 +02:00
Asger F
63e1cc90e9 Test: add corpus test for switch case patterns with labeled arguments
Adds a test case 'Switch with labeled case pattern arguments' covering:
- case .implicit(isAcknowledged: false) — labeled bool literal
- case .thread(threadRowId: _, let rowId) — labeled wildcard + binding

The current output contains type errors: pattern_element::key is being
produced as name_expr instead of identifier. These will be fixed in the
following commit.
2026-06-19 10:27:20 +02:00
Asger F
2182265120 unified/swift: Better source range for inferred_type_expr 2026-06-18 14:57:55 +02:00
Asger F
0b666d47db Preserve the dot token in case patterns 2026-06-18 14:55:54 +02:00
Asger F
142ac47166 Refactor: map switch case patterns to constructor_pattern instead of tuple_pattern
Changed the desugaring rules to properly map case patterns with binding (e.g.,
'case .circle(let r):') to constructor_pattern nodes instead of tuple_pattern.

New rules added:
- tuple_pattern_item → pattern_element (preserves optional name/key)
- pattern.kind: binding_pattern → name_pattern (extracts bound identifier)
- pattern.kind: case_pattern → constructor_pattern (creates proper constructor
  with bound arguments as pattern_elements)

This provides a more semantically correct AST representation:
- Constructor name: name_expr identifier 'circle'
- Elements: pattern_element containing name_pattern identifier 'r'

Instead of the previous tuple_pattern string representation.

Updated control-flow.txt corpus outputs.
2026-06-18 14:54:59 +02:00
Asger F
2470c1388a Fix: preserve switch case patterns in desugared output
The switch_entry rule was capturing switch_pattern wrapper nodes instead of
drilling into them to extract the actual pattern nodes. This caused patterns
from switch cases to be lost during desugaring.

Changed the pattern match from:
  (switch_entry pattern: (switch_pattern)* @pats ...)
to:
  (switch_entry pattern: (switch_pattern pattern: @pats)* ...)

This now correctly extracts the pattern field from each switch_pattern node,
ensuring that patterns from cases like 'case 1:' and 'case .circle(let r):'
are preserved in the switch_case AST nodes.

Updated control-flow.txt corpus outputs to reflect the new behavior.
2026-06-18 14:37:42 +02:00
Asger F
fa98557dd9 Update QL test output 2026-06-18 14:26:49 +02:00
Asger F
1e167dfa6b unified/swift: add type and declaration-family mappings 2026-06-18 14:26:47 +02:00
Asger F
f362707493 unified/swift: Imports 2026-06-18 14:26:45 +02:00
Asger F
15208b70aa Unified: Add import_declaration.scoped_import_kind 2026-06-18 14:26:43 +02:00
Asger F
3522f35ab2 unified/swift: add collections, optionals/errors 2026-06-18 14:26:42 +02:00
Asger F
938396a751 unified/swift: add control-flow and loop mappings 2026-06-18 14:26:40 +02:00
Asger F
790d4f11be unified/swift: add closure and capture mappings 2026-06-18 14:26:38 +02:00
Asger F
8f747a355c unified/swift: add function and parameter mappings 2026-06-18 14:26:37 +02:00
Asger F
d17fd2d964 unified/swift: add variable/property/accessor and enum mappings 2026-06-18 14:26:35 +02:00
Asger F
4e9c3fb436 unified/swift: add literals, names, and operator expression mappings 2026-06-18 14:26:33 +02:00
Asger F
0e9d17b59c unified/swift: add top-level normalization and fallback scaffold 2026-06-18 14:26:31 +02:00
Asger F
6c74cd31e4 Yeast: use child locations instead of rule target
Previously, when a node was synthesized it would always take the
location from the node that matched the current rule. This resulted
in overly broad locations however.

For (foo #{bar}) we now take the location of the 'bar' node.

For non-leaf nodes we merge all its child node locations.
2026-06-18 14:26:30 +02:00
Asger F
166406acbb Unified: Elaborate a bit more on AGENTS.md 2026-06-18 14:26:28 +02:00
Asger F
b40cb5dedd Regenerate QL 2026-06-18 14:26:26 +02:00
Asger F
6dd7dedc19 Rewrite AST 2026-06-18 14:26:22 +02:00
Asger F
1d8e682e5f Reset mappings 2026-06-15 10:49:37 +02:00
Asger F
0baa126473 Add ability to prepend fields in Yeast 2026-06-15 10:49:35 +02:00
Asger F
d11b428292 yeast-macros: desugar 'field: @cap' to 'field: _ @cap'
When a field pattern has a bare capture with no preceding pattern
atom (i.e. `foo: @bar`), implicitly use a true wildcard (`_`,
match_unnamed: true) as the node pattern, making it equivalent to
`foo: _ @bar`.

This is a convenience shorthand: in practice every `field: _ @cap`
in the Swift rules can now be written more concisely as `field: @cap`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-15 10:49:33 +02:00
Asger F
ddc9516e92 Yeast: better support for rewriting unnamed nodes
- Ensure the full wildcard _ supports quantifiers
- Also rewrite unnamed nodes in one-shot phases
2026-06-15 10:49:31 +02:00
Asger F
00068948c1 yeast-macros: add .reduce_left(first -> init, acc, elem -> fold) chain
A left fold over an iterable where the first element seeds the accumulator:
- first -> init  : converts the first element to the initial accumulator
- acc, elem -> fold : fold step; acc = current accumulator, elem = next element
- Empty iterable produces nothing (0-element splice)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-15 10:49:29 +02:00
Asger F
28c879f58c yeast-macros: add .map(p -> tpl) chain syntax for tree templates
After a {expr} or {..expr} placeholder, an optional chain of
.<builtin>() calls may follow. Currently the only builtin is:

  .map(param -> template)

which applies the template to each element of the iterable and
collects the resulting node IDs. A chain auto-splices into the
enclosing field/child position.

Example:
  path: {parts}.map(p -> (identifier #{p}))

The framework is extensible: additional builtins can be added by
matching on the method name in parse_chain_suffix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-15 10:49:27 +02:00
Asger F
6000c18c24 Unified: also QLDoc for unified.qll 2026-06-12 16:48:25 +02:00
Asger F
e81a3bcbc3 Unified: Add QLDoc 2026-06-12 16:47:06 +02:00
Asger F
7d6d5bfb4a Unified: add test for comments 2026-06-12 16:36:33 +02:00
Asger F
f83adb55ce Unified: regenerate AST 2026-06-12 16:33:51 +02:00
Asger F
5608369abe Extract trivia tokens from original parse tree 2026-06-12 16:32:57 +02:00
Tom Hvitved
f5919875b7 Merge pull request #21941 from hvitved/python/content-approx
Python: Implement `ContentApprox`
2026-06-09 15:46:04 +02:00
Owen Mansel-Chan
8d456df26f Merge pull request #21960 from github/dependabot/go_modules/go/extractor/extractor-dependencies-28a04969f3
Bump golang.org/x/mod from 0.36.0 to 0.37.0 in /go/extractor in the extractor-dependencies group
2026-06-09 05:30:45 +01:00
dependabot[bot]
72fcf27d1a Bump golang.org/x/mod
Bumps the extractor-dependencies group in /go/extractor with 1 update: [golang.org/x/mod](https://github.com/golang/mod).


Updates `golang.org/x/mod` from 0.36.0 to 0.37.0
- [Commits](https://github.com/golang/mod/compare/v0.36.0...v0.37.0)

---
updated-dependencies:
- dependency-name: golang.org/x/mod
  dependency-version: 0.37.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: extractor-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-06-09 03:03:37 +00:00
yoff
0cea01c22f Merge pull request #21926 from github/yoff/python-simplify-decorator-predicates
Python: simplify decorator-detection predicates to pure AST match
2026-06-08 22:04:33 +02:00
Anders Schack-Mulligen
a473565256 Merge pull request #21954 from aschackmull/cfg/consistency-child-idx
Cfg: Add consistency check for relevant child indices.
2026-06-08 14:44:20 +02:00
Anders Schack-Mulligen
c47135a40b Cfg: Add consistency check for relevant child indices. 2026-06-08 13:40:33 +02:00
Owen Mansel-Chan
3cbc8f0262 Merge pull request #21951 from github/workflow/go-version-update
Go: Update to 1.26.4
2026-06-08 11:47:47 +01:00
Tom Hvitved
cc1ea25856 Python: Implement ContentApprox 2026-06-08 08:41:28 +02:00
github-actions[bot]
5a38cbd5d5 Go: Update to 1.26.4 2026-06-08 04:30:10 +00:00
Owen Mansel-Chan
cf6d94cf8a Merge pull request #21324 from github/copilot/automate-go-version-updates-again
Automate Go version updates via scheduled workflow
2026-06-06 03:03:03 +01:00
Owen Mansel-Chan
292fc8b777 Fix detection of failed text replacement
I checked and the comment seems to be correct.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-06 02:52:21 +01:00
Owen Mansel-Chan
a1759d9834 Use --force-with-lease for slightly improved safety
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-06-06 02:51:36 +01:00
Owen Mansel-Chan
6b74874372 Minor improvement to PR text 2026-06-06 02:32:43 +01:00
copilot-swe-agent[bot]
ef29d22c75 Update Go version workflow to include patch numbers in messages 2026-06-06 01:03:44 +00:00
Owen Mansel-Chan
1f91f915c7 Merge pull request #21888 from owen-mc/py/remove-imprecise-container-steps
Python: Remove imprecise container steps #2
2026-06-04 22:16:24 +01:00
Jon Janego
ba8eebe2b5 Merge pull request #21948 from github/codeql-spark-run-26974832191
Update changelog documentation site for codeql-cli-2.25.6
2026-06-04 14:55:17 -05:00
github-actions[bot]
dc1409e5f4 update codeql documentation 2026-06-04 19:36:45 +00:00
Mario Campos
284f42bb9e Merge pull request #21945 from github/codeql-spark-run-26947645690
Update changelog documentation site for codeql-cli-2.25.6
2026-06-04 13:09:04 -05:00
Henry Mercer
2f3524de74 Merge branch 'rc/3.22' into codeql-spark-run-26947645690 2026-06-04 16:01:11 +01:00
github-actions[bot]
b32573b060 update codeql documentation 2026-06-04 14:57:38 +00:00
Owen Mansel-Chan
da999ee440 Address review comments 2026-06-03 21:24:16 +01:00
Henry Mercer
93a4b427e3 Merge pull request #21933 from github/post-release-prep/codeql-cli-2.25.6
Post-release preparation for codeql-cli-2.25.6
2026-06-03 16:57:48 +01:00
Owen Mansel-Chan
6f2cc43f32 Remove imprecise model for tuple() 2026-06-02 21:59:48 +01:00
Owen Mansel-Chan
5042fdee84 Remove imprecise model for list() 2026-06-02 21:59:46 +01:00
Owen Mansel-Chan
04341c47bd Tweak model for str.join 2026-06-02 21:59:44 +01:00
Owen Mansel-Chan
b27d08ee32 Update edges in expected test output 2026-06-02 18:29:56 +01:00
Owen Mansel-Chan
20ce679d61 Accept changed edges in test output
No changes to alerts
2026-06-02 16:15:08 +01:00
Owen Mansel-Chan
f62ebef9e0 Adjust expected test output 2026-06-02 16:15:06 +01:00
Owen Mansel-Chan
c3ef1ddd64 Add MaD models for lxml and xml etree.fromstringlist 2026-06-02 16:15:01 +01:00
Owen Mansel-Chan
dede5bc49b Track flow through tuple() with list with tainted elements 2026-06-02 16:14:59 +01:00
Owen Mansel-Chan
ad97b6dd64 Use access path for str.join model 2026-06-02 16:14:56 +01:00
yoff
5fb75ac987 Python: simplify decorator-detection predicates to pure AST match
The internal predicates that identify `@staticmethod`, `@classmethod` and
`@property` decorators previously required the decorator's `NameNode` to
satisfy `isGlobal()` (i.e. no SSA def reaches the decorator's name use).
That filter was correct but unnecessarily indirect: these three names
are builtins, and even when a class body redefines one, the class body
has not started executing at the decorator position, so Python uses the
builtin.

Match the decorator's AST `Name` directly instead, dropping the CFG/SSA
detour. The slight semantic change — `isGlobal()` would have rejected
module-level shadowing of these builtins — is negligible in practice
and explicitly documented in the change note.

`hasContextmanagerDecorator` and `hasOverloadDecorator` keep the
`NameNode.isGlobal()` check because their target names (`contextmanager`,
`overload`) are imported, not builtin, and local shadowing is a real
concern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-01 14:04:43 +00:00
Owen Mansel-Chan
b38440490a Address review comment 2026-05-31 21:47:44 +01:00
Owen Mansel-Chan
aee33a0cc9 Add missing code for TAnyTupleOrDictionaryElement 2026-05-29 10:26:24 +01:00
Owen Mansel-Chan
df15a719cb Add a ContentSet for any tuple or dictionary element 2026-05-28 16:48:23 +01:00
Owen Mansel-Chan
812e8e6b34 Add change note 2026-05-28 11:37:54 +01:00
Owen Mansel-Chan
80c6f082d1 Fix TODO in containerStep 2026-05-28 11:34:02 +01:00
Owen Mansel-Chan
ec13e1bcd3 Add wildcard ContentSets to avoid performance problems 2026-05-27 15:28:07 +01:00
Owen Mansel-Chan
e8779295ee Update test results 2026-05-22 11:43:18 +01:00
Rasmus Lerchedahl Petersen
fa758d6bf5 python: fix test 2026-05-21 16:59:19 +01:00
Rasmus Lerchedahl Petersen
fa9426c749 Python: extra tests for comprehension 2026-05-21 16:59:18 +01:00
Rasmus Lerchedahl Petersen
0ecca91dea Python: typo 2026-05-21 16:59:16 +01:00
Rasmus Lerchedahl Petersen
f669a4f3bf Python: Make sure all imprecise taint bubbles up 2026-05-21 16:59:14 +01:00
Rasmus Lerchedahl Petersen
3275c814bd Python: reset test expectations 2026-05-21 16:59:11 +01:00
Rasmus Lerchedahl Petersen
9a180036a5 Python: conversion step for format_map
and adjust collection test
2026-05-21 16:59:08 +01:00
Rasmus Lerchedahl Petersen
93e7ab52b7 Python: adjust test expectations
We now find an alert on this line as we hope to
It is not an alert for _full_ SSRF, though, since that configuration cannot handle multiple substitutions.
2026-05-21 16:58:51 +01:00
Rasmus Lerchedahl Petersen
facb3b681d Python: recover taint for % format strings 2026-05-21 16:57:50 +01:00
Rasmus Lerchedahl Petersen
b67694b2ab Python: Remove imprecise container steps
- remove `tupleStoreStep` and `dictStoreStep` from `containerStep`
   These are imprecise compared to the content being precise.
- add implicit reads to recover taint at sinks
- add implicit read steps for decoders
  to supplement the `AdditionalTaintStep`
  that now only covers when the full container is tainted.
2026-05-21 16:57:44 +01:00
Owen Mansel-Chan
a367294c23 Merge branch 'main' into copilot/automate-go-version-updates-again 2026-04-23 14:41:46 +01:00
copilot-swe-agent[bot]
b6004045bd Clean up Go version workflow - remove unnecessary escaping and checks
Co-authored-by: mbg <278086+mbg@users.noreply.github.com>
2026-02-13 11:23:44 +00:00
copilot-swe-agent[bot]
cc7e03b0f5 Add error handling and validation to Go version workflow
Co-authored-by: mbg <278086+mbg@users.noreply.github.com>
2026-02-13 11:22:36 +00:00
copilot-swe-agent[bot]
1cbd423251 Improve portability and fix PR detection in Go version workflow
Co-authored-by: mbg <278086+mbg@users.noreply.github.com>
2026-02-13 11:21:13 +00:00
copilot-swe-agent[bot]
437244fe90 Fix portability issues in Go version update workflow
Co-authored-by: mbg <278086+mbg@users.noreply.github.com>
2026-02-13 11:19:56 +00:00
copilot-swe-agent[bot]
f7cf24d1f9 Add Go version update workflow
Co-authored-by: mbg <278086+mbg@users.noreply.github.com>
2026-02-13 11:17:57 +00:00
copilot-swe-agent[bot]
c3bafacf81 Initial plan 2026-02-13 11:15:15 +00:00
82 changed files with 7861 additions and 1110 deletions

208
.github/workflows/go-version-update.yml vendored Normal file
View File

@@ -0,0 +1,208 @@
name: Update Go version
on:
workflow_dispatch:
schedule:
- cron: "0 3 * * 1" # Run weekly on Mondays at 3 AM UTC (1 = Monday)
permissions:
contents: write
pull-requests: write
jobs:
update-go-version:
name: Check and update Go version
if: github.repository == 'github/codeql'
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Set up Git
run: |
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
- name: Fetch latest Go version
id: fetch-version
run: |
LATEST_GO_VERSION=$(curl -s https://go.dev/dl/?mode=json | jq -r '.[0].version')
if [ -z "$LATEST_GO_VERSION" ] || [ "$LATEST_GO_VERSION" = "null" ]; then
echo "Error: Failed to fetch latest Go version from go.dev"
exit 1
fi
echo "Latest Go version from go.dev: $LATEST_GO_VERSION"
echo "version=$LATEST_GO_VERSION" >> $GITHUB_OUTPUT
# Extract version numbers (e.g., go1.26.0 -> 1.26.0)
LATEST_VERSION_NUM=$(echo $LATEST_GO_VERSION | sed 's/^go//')
echo "version_num=$LATEST_VERSION_NUM" >> $GITHUB_OUTPUT
# Extract major.minor version (e.g., 1.26.0 -> 1.26)
LATEST_MAJOR_MINOR=$(echo $LATEST_VERSION_NUM | sed -E 's/^([0-9]+\.[0-9]+).*/\1/')
echo "major_minor=$LATEST_MAJOR_MINOR" >> $GITHUB_OUTPUT
- name: Check current Go version
id: current-version
run: |
CURRENT_VERSION=$(sed -n 's/.*go_sdk\.download(version = \"\([^\"]*\)\".*/\1/p' MODULE.bazel)
if [ -z "$CURRENT_VERSION" ]; then
echo "Error: Could not extract Go version from MODULE.bazel"
exit 1
fi
echo "Current Go version in MODULE.bazel: $CURRENT_VERSION"
echo "version=$CURRENT_VERSION" >> $GITHUB_OUTPUT
# Extract major.minor version
CURRENT_MAJOR_MINOR=$(echo $CURRENT_VERSION | sed -E 's/^([0-9]+\.[0-9]+).*/\1/')
echo "major_minor=$CURRENT_MAJOR_MINOR" >> $GITHUB_OUTPUT
- name: Compare versions
id: compare
run: |
LATEST="${{ steps.fetch-version.outputs.version_num }}"
CURRENT="${{ steps.current-version.outputs.version }}"
echo "Latest: $LATEST"
echo "Current: $CURRENT"
if [ "$LATEST" = "$CURRENT" ]; then
echo "Go version is up to date"
echo "needs_update=false" >> $GITHUB_OUTPUT
else
echo "Go version needs update from $CURRENT to $LATEST"
echo "needs_update=true" >> $GITHUB_OUTPUT
fi
- name: Update Go version in files
if: steps.compare.outputs.needs_update == 'true'
run: |
LATEST_VERSION_NUM="${{ steps.fetch-version.outputs.version_num }}"
LATEST_MAJOR_MINOR="${{ steps.fetch-version.outputs.major_minor }}"
CURRENT_VERSION="${{ steps.current-version.outputs.version }}"
CURRENT_MAJOR_MINOR="${{ steps.current-version.outputs.major_minor }}"
echo "Updating from $CURRENT_VERSION to $LATEST_VERSION_NUM"
# Escape dots in current version strings for use in sed patterns
CURRENT_VERSION_ESCAPED=$(echo "$CURRENT_VERSION" | sed 's/\./\\./g')
CURRENT_MAJOR_MINOR_ESCAPED=$(echo "$CURRENT_MAJOR_MINOR" | sed 's/\./\\./g')
# Update MODULE.bazel
sed -i "s/go_sdk\.download(version = \"$CURRENT_VERSION_ESCAPED\")/go_sdk.download(version = \"$LATEST_VERSION_NUM\")/" MODULE.bazel
if ! grep -q "go_sdk.download(version = \"$LATEST_VERSION_NUM\")" MODULE.bazel; then
echo "Error: Failed to update MODULE.bazel"
exit 1
fi
# Update go/extractor/go.mod
if ! sed -i "s/^go $CURRENT_MAJOR_MINOR_ESCAPED\$/go $LATEST_MAJOR_MINOR/" go/extractor/go.mod; then
echo "Warning: Failed to update go directive in go.mod"
fi
if ! sed -i "s/^toolchain go$CURRENT_VERSION_ESCAPED\$/toolchain go$LATEST_VERSION_NUM/" go/extractor/go.mod; then
echo "Warning: Failed to update toolchain in go.mod"
fi
# Update go/extractor/autobuilder/build-environment.go
if ! sed -i "s/var maxGoVersion = util\.NewSemVer(\"$CURRENT_MAJOR_MINOR_ESCAPED\")/var maxGoVersion = util.NewSemVer(\"$LATEST_MAJOR_MINOR\")/" go/extractor/autobuilder/build-environment.go; then
echo "Warning: Failed to update build-environment.go"
fi
# Update go/actions/test/action.yml
if ! sed -i "s/default: \"~$CURRENT_VERSION_ESCAPED\"/default: \"~$LATEST_VERSION_NUM\"/" go/actions/test/action.yml; then
echo "Warning: Failed to update action.yml"
fi
# Show what changed
git diff
- name: Check for changes
id: check-changes
if: steps.compare.outputs.needs_update == 'true'
run: |
if git diff --quiet; then
echo "No changes detected"
echo "has_changes=false" >> $GITHUB_OUTPUT
else
echo "Changes detected"
echo "has_changes=true" >> $GITHUB_OUTPUT
fi
- name: Check for existing PR
if: steps.check-changes.outputs.has_changes == 'true'
id: check-pr
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
BRANCH_NAME="workflow/go-version-update"
PR_NUMBER=$(gh pr list --head "$BRANCH_NAME" --state open --json number --jq '.[0].number')
if [ -n "$PR_NUMBER" ]; then
echo "Existing PR found: #$PR_NUMBER"
echo "pr_exists=true" >> $GITHUB_OUTPUT
echo "pr_number=$PR_NUMBER" >> $GITHUB_OUTPUT
else
echo "No existing PR found"
echo "pr_exists=false" >> $GITHUB_OUTPUT
fi
- name: Commit and push changes
if: steps.check-changes.outputs.has_changes == 'true'
run: |
BRANCH_NAME="workflow/go-version-update"
LATEST_VERSION_NUM="${{ steps.fetch-version.outputs.version_num }}"
LATEST_MAJOR_MINOR="${{ steps.fetch-version.outputs.major_minor }}"
# Create or switch to branch
git checkout -B "$BRANCH_NAME"
# Stage and commit changes
git add MODULE.bazel go/extractor/go.mod go/extractor/autobuilder/build-environment.go go/actions/test/action.yml
git commit -m "Go: Update to $LATEST_VERSION_NUM"
# Push changes
git push --force-with-lease origin "$BRANCH_NAME"
- name: Create or update PR
if: steps.check-changes.outputs.has_changes == 'true'
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
BRANCH_NAME="workflow/go-version-update"
LATEST_VERSION_NUM="${{ steps.fetch-version.outputs.version_num }}"
CURRENT_VERSION="${{ steps.current-version.outputs.version }}"
PR_TITLE="Go: Update to $LATEST_VERSION_NUM"
PR_BODY=$(cat <<EOF
This PR updates Go from $CURRENT_VERSION to $LATEST_VERSION_NUM.
Updated files:
- \`MODULE.bazel\` - go_sdk.download version
- \`go/extractor/go.mod\` - go directive and toolchain
- \`go/extractor/autobuilder/build-environment.go\` - maxGoVersion (only if MAJOR.MINOR changes)
- \`go/actions/test/action.yml\` - default go-test-version
This PR was automatically created by the [Go version update workflow](https://github.com/${{ github.repository }}/blob/main/.github/workflows/go-version-update.yml).
EOF
)
if [ "${{ steps.check-pr.outputs.pr_exists }}" = "true" ]; then
echo "Updating existing PR #${{ steps.check-pr.outputs.pr_number }}"
gh pr edit "${{ steps.check-pr.outputs.pr_number }}" --title "$PR_TITLE" --body "$PR_BODY"
else
echo "Creating new PR"
gh pr create \
--title "$PR_TITLE" \
--body "$PR_BODY" \
--base main \
--head "$BRANCH_NAME" \
--label "Go"
fi

View File

@@ -273,7 +273,7 @@ use_repo(
)
go_sdk = use_extension("@rules_go//go:extensions.bzl", "go_sdk")
go_sdk.download(version = "1.26.0")
go_sdk.download(version = "1.26.4")
go_deps = use_extension("@gazelle//:extensions.bzl", "go_deps")
go_deps.from_file(go_mod = "//go/extractor:go.mod")

View File

@@ -38,7 +38,7 @@ Bug Fixes
GitHub Actions
""""""""""""""
* Adjusted (minor) help file descriptions for queries: :code:`actions/untrusted-checkout/critical`, :code:`actions/untrusted-checkout/high`, :code:`actions/untrusted-checkout/medium`. Clarified wording on in minor point, added one more listed resource and added one more recommendation for things to check.
* Adjusted (minor) help file descriptions for queries: :code:`actions/untrusted-checkout/critical`, :code:`actions/untrusted-checkout/high`, :code:`actions/untrusted-checkout/medium`. Clarified wording on a minor point, added one more listed resource and added one more recommendation for things to check.
Major Analysis Improvements
~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -104,7 +104,7 @@ JavaScript/TypeScript
Python
""""""
* The sensitive data heuristics used to identify code that handles passwords and private data have been improved. Most of the changes permit more variations of established patterns, thereby finding more sensitive data. Queries that use the sensitive data library (for example :code:`py/clear-text-logging-sensitive-data`) may find more correct results and less fewer positive results after these changes.
* The sensitive data heuristics used to identify code that handles passwords and private data have been improved. Most of the changes permit more variations of established patterns, thereby finding more sensitive data. Queries that use the sensitive data library (for example :code:`py/clear-text-logging-sensitive-data`) may find more correct results and fewer false positive results after these changes.
Swift
"""""
@@ -114,7 +114,7 @@ Swift
GitHub Actions
""""""""""""""
* The GitHub Actions analysis now recognizes more Bash regex checks that restrict a value to alphanumeric characters, include regexes like :code:`^[0-9a-zA-Z]{40}([0-9a-zA-Z]{24})?$` which check for a sha1 or sha256 hash. This may reduce false positive results where command output is validated with grouped or optional alphanumeric patterns before being used.
* The GitHub Actions analysis now recognizes more Bash regex checks that restrict a value to alphanumeric characters, including regexes like :code:`^[0-9a-zA-Z]{40}([0-9a-zA-Z]{24})?$` which check for a SHA-1 or SHA-256 hash. This may reduce false positive results where command output is validated with grouped or optional alphanumeric patterns before being used.
Rust
""""

View File

@@ -4,7 +4,7 @@ inputs:
go-test-version:
description: Which Go version to use for running the tests
required: false
default: "~1.26.0"
default: "~1.26.4"
run-code-checks:
description: Whether to run formatting, code and qhelp generation checks
required: false

View File

@@ -2,14 +2,14 @@ module github.com/github/codeql-go/extractor
go 1.26
toolchain go1.26.0
toolchain go1.26.4
// when updating this, run
// bazel run @rules_go//go -- mod tidy
// when adding or removing dependencies, run
// bazel mod tidy
require (
golang.org/x/mod v0.36.0
golang.org/x/mod v0.37.0
golang.org/x/tools v0.45.0
)

View File

@@ -6,8 +6,8 @@ github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZb
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
golang.org/x/mod v0.36.0 h1:JJjpVx6myfUsUdAzZuOSTTmRE0PfZeNWzzvKrP7amb4=
golang.org/x/mod v0.36.0/go.mod h1:moc6ELqsWcOw5Ef3xVprK5ul/MvtVvkIXLziUOICjUQ=
golang.org/x/mod v0.37.0 h1:vF1DjpVEshcIqoEaauuHebaLk1O1forxjxBaVn884JQ=
golang.org/x/mod v0.37.0/go.mod h1:m8S8VeM9r4dzDwjrKO0a1sZP3YjeMamRRlD+fmR2Q/0=
golang.org/x/sync v0.20.0 h1:e0PTpb7pjO8GAtTs2dQ6jYa5BWYlMuX047Dco/pItO4=
golang.org/x/sync v0.20.0/go.mod h1:9xrNwdLfx4jkKbNva9FpL6vEN7evnE43NNNJQ2LF3+0=
golang.org/x/tools v0.45.0 h1:18qN3FAooORvApf5XjCXgsuayZOEtXf6JK18I3+ONa8=

View File

@@ -36,6 +36,8 @@ private module Input implements InputSig<Location, PythonDataFlow> {
// parameter, but dataflow-consistency queries should _not_ complain about there not
// being a post-update node for the synthetic `**kwargs` parameter.
n instanceof SynthDictSplatParameterNode
or
Private::Conversions::readStep(n, _, _)
}
predicate uniqueParameterNodePositionExclude(DataFlowCallable c, ParameterPosition pos, Node p) {

View File

@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* Python taint tracking is now more precise for values flowing through container contents, such as list, set, tuple, and dictionary elements. This may remove some false positive alerts.

View File

@@ -0,0 +1,4 @@
---
category: minorAnalysis
---
* Simplified the internal predicates that detect `@staticmethod`, `@classmethod` and `@property` decorators to match the decorator's AST `Name` directly, rather than going through the CFG and requiring the name to resolve globally. Code that shadows these three builtin decorators at the module-scope will now be classified by the decorator name alone; in practice, shadowing these names is extremely rare and the call-graph results are unchanged.

View File

@@ -256,9 +256,12 @@ predicate parameterMatch(ParameterPosition ppos, ArgumentPosition apos) {
*/
overlay[local]
predicate isStaticmethod(Function func) {
exists(NameNode id | id.getId() = "staticmethod" and id.isGlobal() |
func.getADecorator() = id.getNode()
)
// The decorator is *syntactically* a `Name` "staticmethod" — we don't
// care which variable it resolves to. `staticmethod` is a builtin and
// is almost never shadowed in a module-level scope; even if a class
// redefines `staticmethod` in its body, the class body has not started
// executing yet at the decorator position, so Python uses the builtin.
func.getADecorator().(Name).getId() = "staticmethod"
}
/**
@@ -268,9 +271,9 @@ predicate isStaticmethod(Function func) {
*/
overlay[local]
predicate isClassmethod(Function func) {
exists(NameNode id | id.getId() = "classmethod" and id.isGlobal() |
func.getADecorator() = id.getNode()
)
// See `isStaticmethod` for the rationale for matching on the AST `Name`
// rather than going via the CFG and `isGlobal()`.
func.getADecorator().(Name).getId() = "classmethod"
or
exists(Class cls |
cls.getAMethod() = func and
@@ -285,9 +288,8 @@ predicate isClassmethod(Function func) {
/** Holds if the function `func` has a `property` decorator. */
overlay[local]
predicate hasPropertyDecorator(Function func) {
exists(NameNode id | id.getId() = "property" and id.isGlobal() |
func.getADecorator() = id.getNode()
)
// See `isStaticmethod` for the rationale for matching on the AST `Name`.
func.getADecorator().(Name).getId() = "property"
}
/**

View File

@@ -753,7 +753,7 @@ predicate jumpStepNotSharedWithTypeTracker(Node nodeFrom, Node nodeTo) {
* As of 2024-04-02 the type-tracking library only supports precise content, so there is
* no reason to include steps for list content right now.
*/
predicate storeStepCommon(Node nodeFrom, ContentSet c, Node nodeTo) {
predicate storeStepCommon(Node nodeFrom, Content c, Node nodeTo) {
tupleStoreStep(nodeFrom, c, nodeTo)
or
dictStoreStep(nodeFrom, c, nodeTo)
@@ -767,29 +767,31 @@ predicate storeStepCommon(Node nodeFrom, ContentSet c, Node nodeTo) {
* Holds if data can flow from `nodeFrom` to `nodeTo` via an assignment to
* content `c`.
*/
predicate storeStep(Node nodeFrom, ContentSet c, Node nodeTo) {
storeStepCommon(nodeFrom, c, nodeTo)
predicate storeStep(Node nodeFrom, ContentSet cs, Node nodeTo) {
exists(Content c | cs = singleton(c) |
storeStepCommon(nodeFrom, c, nodeTo)
or
listStoreStep(nodeFrom, c, nodeTo)
or
setStoreStep(nodeFrom, c, nodeTo)
or
attributeStoreStep(nodeFrom, c, nodeTo)
or
matchStoreStep(nodeFrom, c, nodeTo)
or
any(Orm::AdditionalOrmSteps es).storeStep(nodeFrom, c, nodeTo)
or
synthStarArgsElementParameterNodeStoreStep(nodeFrom, c, nodeTo)
or
synthDictSplatArgumentNodeStoreStep(nodeFrom, c, nodeTo)
or
yieldStoreStep(nodeFrom, c, nodeTo)
or
VariableCapture::storeStep(nodeFrom, c, nodeTo)
)
or
listStoreStep(nodeFrom, c, nodeTo)
or
setStoreStep(nodeFrom, c, nodeTo)
or
attributeStoreStep(nodeFrom, c, nodeTo)
or
matchStoreStep(nodeFrom, c, nodeTo)
or
any(Orm::AdditionalOrmSteps es).storeStep(nodeFrom, c, nodeTo)
or
FlowSummaryImpl::Private::Steps::summaryStoreStep(nodeFrom.(FlowSummaryNode).getSummaryNode(), c,
FlowSummaryImpl::Private::Steps::summaryStoreStep(nodeFrom.(FlowSummaryNode).getSummaryNode(), cs,
nodeTo.(FlowSummaryNode).getSummaryNode())
or
synthStarArgsElementParameterNodeStoreStep(nodeFrom, c, nodeTo)
or
synthDictSplatArgumentNodeStoreStep(nodeFrom, c, nodeTo)
or
yieldStoreStep(nodeFrom, c, nodeTo)
or
VariableCapture::storeStep(nodeFrom, c, nodeTo)
}
/**
@@ -985,7 +987,7 @@ predicate attributeStoreStep(Node nodeFrom, AttributeContent c, Node nodeTo) {
/**
* Subset of `readStep` that should be shared with type-tracking.
*/
predicate readStepCommon(Node nodeFrom, ContentSet c, Node nodeTo) {
predicate readStepCommon(Node nodeFrom, Content c, Node nodeTo) {
subscriptReadStep(nodeFrom, c, nodeTo)
or
iterableUnpackingReadStep(nodeFrom, c, nodeTo)
@@ -994,21 +996,25 @@ predicate readStepCommon(Node nodeFrom, ContentSet c, Node nodeTo) {
/**
* Holds if data can flow from `nodeFrom` to `nodeTo` via a read of content `c`.
*/
predicate readStep(Node nodeFrom, ContentSet c, Node nodeTo) {
readStepCommon(nodeFrom, c, nodeTo)
predicate readStep(Node nodeFrom, ContentSet cs, Node nodeTo) {
exists(Content c | cs = singleton(c) |
readStepCommon(nodeFrom, c, nodeTo)
or
matchReadStep(nodeFrom, c, nodeTo)
or
forReadStep(nodeFrom, c, nodeTo)
or
attributeReadStep(nodeFrom, c, nodeTo)
or
synthDictSplatParameterNodeReadStep(nodeFrom, c, nodeTo)
or
VariableCapture::readStep(nodeFrom, c, nodeTo)
)
or
matchReadStep(nodeFrom, c, nodeTo)
or
forReadStep(nodeFrom, c, nodeTo)
or
attributeReadStep(nodeFrom, c, nodeTo)
or
FlowSummaryImpl::Private::Steps::summaryReadStep(nodeFrom.(FlowSummaryNode).getSummaryNode(), c,
FlowSummaryImpl::Private::Steps::summaryReadStep(nodeFrom.(FlowSummaryNode).getSummaryNode(), cs,
nodeTo.(FlowSummaryNode).getSummaryNode())
or
synthDictSplatParameterNodeReadStep(nodeFrom, c, nodeTo)
or
VariableCapture::readStep(nodeFrom, c, nodeTo)
Conversions::readStep(nodeFrom, cs, nodeTo)
}
/** Data flows from a sequence to a subscript of the sequence. */
@@ -1064,23 +1070,68 @@ predicate attributeReadStep(Node nodeFrom, AttributeContent c, AttrRead nodeTo)
nodeTo.accesses(nodeFrom, c.getAttribute())
}
module Conversions {
private import semmle.python.Concepts
predicate decoderReadStep(Node nodeFrom, ContentSet c, Node nodeTo) {
exists(Decoding decoding |
nodeFrom = decoding.getAnInput() and
nodeTo = decoding.getOutput()
) and
c.isAnyTupleOrDictionaryElement()
}
predicate encoderReadStep(Node nodeFrom, ContentSet c, Node nodeTo) {
exists(Encoding encoding |
nodeFrom = encoding.getAnInput() and
nodeTo = encoding.getOutput()
) and
c.isAnyTupleOrDictionaryElement()
}
predicate formatReadStep(Node nodeFrom, ContentSet c, Node nodeTo) {
// % formatting
exists(BinaryExprNode fmt | fmt = nodeTo.asCfgNode() |
fmt.getOp() instanceof Mod and
fmt.getRight() = nodeFrom.asCfgNode()
) and
c.isAnyTupleElement()
or
// format_map
// see https://docs.python.org/3/library/stdtypes.html#str.format_map
nodeTo.(MethodCallNode).calls(_, "format_map") and
nodeTo.(MethodCallNode).getArg(0) = nodeFrom and
c.isAnyDictionaryElement()
}
predicate readStep(Node nodeFrom, ContentSet c, Node nodeTo) {
decoderReadStep(nodeFrom, c, nodeTo)
or
encoderReadStep(nodeFrom, c, nodeTo)
or
formatReadStep(nodeFrom, c, nodeTo)
}
}
/**
* Holds if values stored inside content `c` are cleared at node `n`. For example,
* any value stored inside `f` is cleared at the pre-update node associated with `x`
* in `x.f = newValue`.
*/
predicate clearsContent(Node n, ContentSet c) {
matchClearStep(n, c)
predicate clearsContent(Node n, ContentSet cs) {
exists(Content c | cs = singleton(c) |
matchClearStep(n, c)
or
attributeClearStep(n, c)
or
dictClearStep(n, c)
or
dictSplatParameterNodeClearStep(n, c)
or
VariableCapture::clearsContent(n, c)
)
or
attributeClearStep(n, c)
or
dictClearStep(n, c)
or
FlowSummaryImpl::Private::Steps::summaryClearsContent(n.(FlowSummaryNode).getSummaryNode(), c)
or
dictSplatParameterNodeClearStep(n, c)
or
VariableCapture::clearsContent(n, c)
FlowSummaryImpl::Private::Steps::summaryClearsContent(n.(FlowSummaryNode).getSummaryNode(), cs)
}
/**
@@ -1198,12 +1249,65 @@ predicate allowParameterReturnInSelf(ParameterNode p) {
)
}
bindingset[s]
private string getFirstChar(string s) {
result =
min(int i, string c |
c = s.charAt(i) and c != "_"
or
c = "" and i = s.length()
|
c order by i
)
}
private string getAttributeContentFirstChar(AttributeContent ac) {
result = getFirstChar(ac.getAttribute())
}
private string getDictionaryElementContentKeyFirstChar(DictionaryElementContent dec) {
result = getFirstChar(dec.getKey())
}
private newtype TContentApprox =
TListElementContentApprox() or
TSetElementContentApprox() or
TTupleElementContentApprox() or
TDictionaryElementContentApprox(string first) {
first = "" // for `TDictionaryElementAnyContent`
or
first = getDictionaryElementContentKeyFirstChar(_)
} or
TAttributeContentApprox(string first) { first = getAttributeContentFirstChar(_) } or
TCapturedVariableContentApprox()
/** An approximated `Content`. */
class ContentApprox = Unit;
class ContentApprox extends TContentApprox {
/** Gets a textual representation of this element. */
string toString() { result = "" }
}
/** Gets an approximated value for content `c`. */
pragma[inline]
ContentApprox getContentApprox(Content c) { any() }
ContentApprox getContentApprox(Content c) {
c = TListElementContent() and
result = TListElementContentApprox()
or
c = TSetElementContent() and
result = TSetElementContentApprox()
or
c = TTupleElementContent(_) and
result = TTupleElementContentApprox()
or
result = TDictionaryElementContentApprox(getDictionaryElementContentKeyFirstChar(c))
or
c = TDictionaryElementAnyContent() and
result = TDictionaryElementContentApprox("")
or
result = TAttributeContentApprox(getAttributeContentFirstChar(c))
or
c = TCapturedVariableContent(_) and
result = TCapturedVariableContentApprox()
}
/** Helper for `.getEnclosingCallable`. */
DataFlowCallable getCallableScope(Scope s) {

View File

@@ -898,19 +898,78 @@ class CapturedVariableContent extends Content, TCapturedVariableContent {
override string getMaDRepresentation() { none() }
}
/**
* An entity that represents a set of `Content`s.
*
* Most `ContentSet`s are singletons (i.e. they consist of a single `Content`),
* but `AnyDictionaryElement` and `AnyTupleElement` act as wildcards on the
* read side: a read at such a `ContentSet` matches any specific dictionary
* key / tuple index store, as well as (for dictionaries) the
* "unknown-bucket" Content `DictionaryElementAnyContent`.
*
* Keeping these as wildcard `ContentSet`s (rather than enumerating one
* `ContentSet` per key/index) keeps the dataflow `readSetEx` relation small
* when implicit reads are used (e.g. at sinks via `defaultImplicitTaintRead`).
*/
private newtype TContentSet =
TSingletonContent(Content c) or
TAnyTupleElement() or
TAnyDictionaryElement() or
TAnyTupleOrDictionaryElement()
/**
* An entity that represents a set of `Content`s.
*
* The set may be interpreted differently depending on whether it is
* stored into (`getAStoreContent`) or read from (`getAReadContent`).
*/
class ContentSet instanceof Content {
class ContentSet extends TContentSet {
/** Holds if this content set is the singleton `{c}`. */
predicate isSingleton(Content c) { this = TSingletonContent(c) }
/** Holds if this content set is the wildcard for all tuple elements. */
predicate isAnyTupleElement() { this = TAnyTupleElement() }
/** Holds if this content set is the wildcard for all dictionary elements. */
predicate isAnyDictionaryElement() { this = TAnyDictionaryElement() }
/** Holds if this content set is the wildcard for all tuple elements or dictionary elements. */
predicate isAnyTupleOrDictionaryElement() { this = TAnyTupleOrDictionaryElement() }
/** Gets a content that may be stored into when storing into this set. */
Content getAStoreContent() { result = this }
Content getAStoreContent() { this = TSingletonContent(result) }
/** Gets a content that may be read from when reading from this set. */
Content getAReadContent() { result = this }
Content getAReadContent() {
this = TSingletonContent(result)
or
// Wildcard expansion: a read at "any tuple element" matches a store at any
// specific tuple index. (Stores always target a specific index, so we don't
// need a `TupleElementAnyContent` Content kind here.)
this = TAnyTupleElement() and result instanceof TupleElementContent
or
this = TAnyDictionaryElement() and
(result instanceof DictionaryElementContent or result instanceof DictionaryElementAnyContent)
or
this = TAnyTupleOrDictionaryElement() and
(
result instanceof TupleElementContent or
result instanceof DictionaryElementContent or
result instanceof DictionaryElementAnyContent
)
}
/** Gets a textual representation of this content set. */
string toString() { result = super.toString() }
string toString() {
exists(Content c | this = TSingletonContent(c) | result = c.toString())
or
this = TAnyTupleElement() and result = "Any tuple element"
or
this = TAnyDictionaryElement() and result = "Any dictionary element"
or
this = TAnyTupleOrDictionaryElement() and result = "Any tuple or dictionary element"
}
}
/** Gets the singleton `ContentSet` wrapping the `Content` `c`. */
ContentSet singleton(Content c) { result = TSingletonContent(c) }

View File

@@ -66,21 +66,29 @@ module Input implements InputSig<Location, DataFlowImplSpecific::PythonDataFlow>
}
string encodeContent(ContentSet cs, string arg) {
cs = TListElementContent() and result = "ListElement" and arg = ""
or
cs = TSetElementContent() and result = "SetElement" and arg = ""
or
exists(int index |
cs = TTupleElementContent(index) and result = "TupleElement" and arg = index.toString()
exists(Content c | cs.isSingleton(c) |
c = TListElementContent() and result = "ListElement" and arg = ""
or
c = TSetElementContent() and result = "SetElement" and arg = ""
or
exists(int index |
c = TTupleElementContent(index) and result = "TupleElement" and arg = index.toString()
)
or
exists(string key |
c = TDictionaryElementContent(key) and result = "DictionaryElement" and arg = key
)
or
c = TDictionaryElementAnyContent() and result = "DictionaryElementAny" and arg = ""
or
exists(string attr | c = TAttributeContent(attr) and result = "Attribute" and arg = attr)
)
or
exists(string key |
cs = TDictionaryElementContent(key) and result = "DictionaryElement" and arg = key
)
cs.isAnyTupleElement() and result = "AnyTupleElement" and arg = ""
or
cs = TDictionaryElementAnyContent() and result = "DictionaryElementAny" and arg = ""
cs.isAnyDictionaryElement() and result = "AnyDictionaryElement" and arg = ""
or
exists(string attr | cs = TAttributeContent(attr) and result = "Attribute" and arg = attr)
cs.isAnyTupleOrDictionaryElement() and result = "AnyTupleOrDictionaryElement" and arg = ""
}
bindingset[token]
@@ -139,27 +147,29 @@ module Private {
predicate withContent = SC::withContent/1;
/** Gets a summary component that represents a list element. */
SummaryComponent listElement() { result = content(any(ListElementContent c)) }
SummaryComponent listElement() { result = content(singleton(any(ListElementContent c))) }
/** Gets a summary component that represents a set element. */
SummaryComponent setElement() { result = content(any(SetElementContent c)) }
SummaryComponent setElement() { result = content(singleton(any(SetElementContent c))) }
/** Gets a summary component that represents a tuple element. */
SummaryComponent tupleElement(int index) {
exists(TupleElementContent c | c.getIndex() = index and result = content(c))
exists(TupleElementContent c | c.getIndex() = index and result = content(singleton(c)))
}
/** Gets a summary component that represents a dictionary element. */
SummaryComponent dictionaryElement(string key) {
exists(DictionaryElementContent c | c.getKey() = key and result = content(c))
exists(DictionaryElementContent c | c.getKey() = key and result = content(singleton(c)))
}
/** Gets a summary component that represents a dictionary element at any key. */
SummaryComponent dictionaryElementAny() { result = content(any(DictionaryElementAnyContent c)) }
SummaryComponent dictionaryElementAny() {
result = content(singleton(any(DictionaryElementAnyContent c)))
}
/** Gets a summary component that represents an attribute element. */
SummaryComponent attribute(string attr) {
exists(AttributeContent c | c.getAttribute() = attr and result = content(c))
exists(AttributeContent c | c.getAttribute() = attr and result = content(singleton(c)))
}
/** Gets a summary component that represents the return value of a call. */

View File

@@ -11,12 +11,34 @@ private import semmle.python.ApiGraphs
*/
predicate defaultTaintSanitizer(DataFlow::Node node) { none() }
/**
* Holds if default taint tracking should read content `contentSet` implicitly and
* propagate taint from a container to reads of that content.
*/
private predicate defaultTaintReadContent(DataFlow::ContentSet contentSet) {
// Tuple and dictionary content is precise, so use wildcard content sets to avoid
// blowing up the size of `Stage1::readSetEx` (otherwise this predicate would
// expand to one row per (node, distinct key or index) and the framework's
// read-set relation grows quadratically). `ContentSet.getAReadContent` expands
// these wildcards back to the specific contents when matching against stores.
contentSet.isAnyTupleOrDictionaryElement()
or
// List and set element content is already imprecise, so no wildcard expansion is
// needed.
contentSet.getAStoreContent() instanceof DataFlow::ListElementContent
or
contentSet.getAStoreContent() instanceof DataFlow::SetElementContent
}
/**
* Holds if default `TaintTracking::Configuration`s should allow implicit reads
* of `c` at sinks and inputs to additional taint steps.
*/
bindingset[node]
predicate defaultImplicitTaintRead(DataFlow::Node node, DataFlow::ContentSet c) { none() }
predicate defaultImplicitTaintRead(DataFlow::Node node, DataFlow::ContentSet c) {
exists(node) and
defaultTaintReadContent(c)
}
private module Cached {
/**
@@ -128,11 +150,6 @@ predicate stringManipulation(DataFlow::CfgNode nodeFrom, DataFlow::CfgNode nodeT
nodeFrom.getNode() = object and
method_name in ["partition", "rpartition", "rsplit", "split", "splitlines"]
or
// Iterable[str] -> str
// TODO: check if these should be handled differently in regards to content
method_name = "join" and
nodeFrom.getNode() = call.getArg(0)
or
// Mapping[str, Any] -> str
method_name = "format_map" and
nodeFrom.getNode() = call.getArg(0)
@@ -161,32 +178,21 @@ predicate stringManipulation(DataFlow::CfgNode nodeFrom, DataFlow::CfgNode nodeT
}
/**
* Holds if taint can flow from `nodeFrom` to `nodeTo` with a step related to containers
* (lists/sets/dictionaries): literals, constructor invocation, methods. Note that this
* is currently very imprecise, as an example, since we model `dict.get`, we treat any
* `<tainted object>.get(<arg>)` will be tainted, whether it's true or not.
* Holds if taint can flow from `nodeFrom` to `nodeTo` with a step related to reading
* content from containers (lists/sets/dictionaries/tuples): subscripts, iteration,
* constructor invocation, methods.
*/
predicate containerStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
// construction by literal
//
// TODO: once we have proper flow-summary modeling, we might not need this step any
// longer -- but there needs to be a matching read-step for the store-step, and we
// don't provide that right now.
DataFlowPrivate::listStoreStep(nodeFrom, _, nodeTo)
or
DataFlowPrivate::setStoreStep(nodeFrom, _, nodeTo)
or
DataFlowPrivate::tupleStoreStep(nodeFrom, _, nodeTo)
or
DataFlowPrivate::dictStoreStep(nodeFrom, _, nodeTo)
or
// comprehension, so there is taint-flow from `x` in `[x for x in xs]` to the
// resulting list of the list-comprehension.
//
// TODO: once we have proper flow-summary modeling, we might not need this step any
// longer -- but there needs to be a matching read-step for the store-step, and we
// don't provide that right now.
DataFlowPrivate::yieldStoreStep(nodeFrom, _, nodeTo)
exists(DataFlow::ContentSet contentSet |
DataFlowPrivate::readStep(nodeFrom, contentSet, nodeTo) and
exists(DataFlow::Content c | c = contentSet.getAReadContent() |
c instanceof DataFlow::TupleElementContent or
c instanceof DataFlow::DictionaryElementContent or
c instanceof DataFlow::DictionaryElementAnyContent or
c instanceof DataFlow::ListElementContent or
c instanceof DataFlow::SetElementContent
)
)
}
/**

View File

@@ -241,7 +241,7 @@ module TypeTrackingInput implements Shared::TypeTrackingInput<Location> {
// is only fed set/list content)
not nodeFrom instanceof DataFlowPublic::IterableElementNode
or
TypeTrackerSummaryFlow::basicStoreStep(nodeFrom, nodeTo, content)
TypeTrackerSummaryFlow::basicStoreStep(nodeFrom, nodeTo, DataFlowPublic::singleton(content))
}
/**
@@ -272,14 +272,15 @@ module TypeTrackingInput implements Shared::TypeTrackingInput<Location> {
nodeFrom.asCfgNode() instanceof SequenceNode
)
or
TypeTrackerSummaryFlow::basicLoadStep(nodeFrom, nodeTo, content)
TypeTrackerSummaryFlow::basicLoadStep(nodeFrom, nodeTo, DataFlowPublic::singleton(content))
}
/**
* Holds if the `loadContent` of `nodeFrom` is stored in the `storeContent` of `nodeTo`.
*/
predicate loadStoreStep(Node nodeFrom, Node nodeTo, Content loadContent, Content storeContent) {
TypeTrackerSummaryFlow::basicLoadStoreStep(nodeFrom, nodeTo, loadContent, storeContent)
TypeTrackerSummaryFlow::basicLoadStoreStep(nodeFrom, nodeTo,
DataFlowPublic::singleton(loadContent), DataFlowPublic::singleton(storeContent))
}
/**

View File

@@ -4244,6 +4244,7 @@ module StdlibPrivate {
)
// TODO: Once we have DictKeyContent, we need to transform that into ListElementContent
) and
// Element content is mutated into list element content
output = "ReturnValue.ListElement" and
preservesValue = true
or
@@ -4270,11 +4271,9 @@ module StdlibPrivate {
preservesValue = true
)
or
// TODO: We need to also translate iterable content such as list element
// but we currently lack TupleElementAny
input = "Argument[0]" and
input = "Argument[0].ListElement" and
output = "ReturnValue" and
preservesValue = false
preservesValue = true
}
}
@@ -4969,6 +4968,26 @@ module StdlibPrivate {
}
}
/** A flow summary for `str.join`. */
class StrJoinSummary extends SummarizedCallable::Range {
StrJoinSummary() { this = "str.join" }
override DataFlow::CallCfgNode getACall() { result.(DataFlow::MethodCallNode).calls(_, "join") }
override DataFlow::ArgumentNode getACallback() {
result.(DataFlow::AttrRead).getAttributeName() = "join"
}
override predicate propagatesFlow(string input, string output, boolean preservesValue) {
(
// For code like `" ".join([name])`
input = "Argument[0,iterable:].ListElement" and
preservesValue = true
) and
output = "ReturnValue"
}
}
// ---------------------------------------------------------------------------
// asyncio
// ---------------------------------------------------------------------------

View File

@@ -0,0 +1,6 @@
extensions:
- addsTo:
pack: codeql/python-all
extensible: summaryModel
data:
- ['lxml', 'Member[etree].Member[fromstringlist]', 'Argument[0,strings:].ListElement', 'ReturnValue', 'taint']

View File

@@ -0,0 +1,6 @@
extensions:
- addsTo:
pack: codeql/python-all
extensible: summaryModel
data:
- ['xml', 'Member[etree].Member[fromstringlist]', 'Argument[0,strings:].ListElement', 'ReturnValue', 'taint']

View File

@@ -61,10 +61,11 @@ module EscapingCaptureFlowConfig implements DataFlow::ConfigSig {
predicate allowImplicitRead(DataFlow::Node node, DataFlow::ContentSet cs) {
isSink(node) and
(
cs.(DataFlow::TupleElementContent).getIndex() in [0 .. 10] or
cs instanceof DataFlow::ListElementContent or
cs instanceof DataFlow::SetElementContent or
cs instanceof DataFlow::DictionaryElementAnyContent
cs.isAnyTupleOrDictionaryElement()
or
cs.getAStoreContent() instanceof DataFlow::ListElementContent
or
cs.getAStoreContent() instanceof DataFlow::SetElementContent
)
}
}

View File

@@ -3,11 +3,15 @@ edges
| TarSlipImprov.py:15:7:15:39 | ControlFlowNode for Attribute() | TarSlipImprov.py:15:1:15:3 | ControlFlowNode for tar | provenance | |
| TarSlipImprov.py:17:5:17:10 | ControlFlowNode for member | TarSlipImprov.py:20:19:20:24 | ControlFlowNode for member | provenance | |
| TarSlipImprov.py:20:5:20:10 | [post] ControlFlowNode for result | TarSlipImprov.py:22:35:22:40 | ControlFlowNode for result | provenance | |
| TarSlipImprov.py:20:5:20:10 | [post] ControlFlowNode for result [List element] | TarSlipImprov.py:22:35:22:40 | ControlFlowNode for result | provenance | |
| TarSlipImprov.py:20:19:20:24 | ControlFlowNode for member | TarSlipImprov.py:20:5:20:10 | [post] ControlFlowNode for result | provenance | list.append |
| TarSlipImprov.py:20:19:20:24 | ControlFlowNode for member | TarSlipImprov.py:20:5:20:10 | [post] ControlFlowNode for result [List element] | provenance | list.append |
| TarSlipImprov.py:26:21:26:27 | ControlFlowNode for tarfile | TarSlipImprov.py:28:9:28:14 | ControlFlowNode for member | provenance | |
| TarSlipImprov.py:28:9:28:14 | ControlFlowNode for member | TarSlipImprov.py:35:23:35:28 | ControlFlowNode for member | provenance | |
| TarSlipImprov.py:35:9:35:14 | [post] ControlFlowNode for result | TarSlipImprov.py:36:12:36:17 | ControlFlowNode for result | provenance | |
| TarSlipImprov.py:35:9:35:14 | [post] ControlFlowNode for result [List element] | TarSlipImprov.py:36:12:36:17 | ControlFlowNode for result [List element] | provenance | |
| TarSlipImprov.py:35:23:35:28 | ControlFlowNode for member | TarSlipImprov.py:35:9:35:14 | [post] ControlFlowNode for result | provenance | list.append |
| TarSlipImprov.py:35:23:35:28 | ControlFlowNode for member | TarSlipImprov.py:35:9:35:14 | [post] ControlFlowNode for result [List element] | provenance | list.append |
| TarSlipImprov.py:38:1:38:3 | ControlFlowNode for tar | TarSlipImprov.py:39:65:39:67 | ControlFlowNode for tar | provenance | |
| TarSlipImprov.py:38:7:38:39 | ControlFlowNode for Attribute() | TarSlipImprov.py:38:1:38:3 | ControlFlowNode for tar | provenance | |
| TarSlipImprov.py:39:65:39:67 | ControlFlowNode for tar | TarSlipImprov.py:26:21:26:27 | ControlFlowNode for tarfile | provenance | |
@@ -34,16 +38,19 @@ edges
| TarSlipImprov.py:142:9:142:13 | ControlFlowNode for entry | TarSlipImprov.py:143:36:143:40 | ControlFlowNode for entry | provenance | |
| TarSlipImprov.py:151:14:151:50 | ControlFlowNode for closing() | TarSlipImprov.py:151:55:151:56 | ControlFlowNode for tf | provenance | |
| TarSlipImprov.py:151:22:151:49 | ControlFlowNode for Attribute() | TarSlipImprov.py:151:14:151:50 | ControlFlowNode for closing() | provenance | Config |
| TarSlipImprov.py:151:55:151:56 | ControlFlowNode for tf | TarSlipImprov.py:152:13:152:20 | ControlFlowNode for Yield | provenance | |
| TarSlipImprov.py:151:55:151:56 | ControlFlowNode for tf | TarSlipImprov.py:152:19:152:20 | ControlFlowNode for tf | provenance | |
| TarSlipImprov.py:152:13:152:20 | ControlFlowNode for Yield | TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() | provenance | |
| TarSlipImprov.py:152:13:152:20 | ControlFlowNode for Yield [List element] | TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() [List element] | provenance | |
| TarSlipImprov.py:152:19:152:20 | ControlFlowNode for tf | TarSlipImprov.py:152:13:152:20 | ControlFlowNode for Yield [List element] | provenance | |
| TarSlipImprov.py:152:19:152:20 | ControlFlowNode for tf | TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() | provenance | |
| TarSlipImprov.py:157:9:157:14 | ControlFlowNode for tar_cm | TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc | provenance | |
| TarSlipImprov.py:157:9:157:14 | ControlFlowNode for tar_cm [List element] | TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc [List element] | provenance | |
| TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() | TarSlipImprov.py:157:9:157:14 | ControlFlowNode for tar_cm | provenance | |
| TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() [List element] | TarSlipImprov.py:157:9:157:14 | ControlFlowNode for tar_cm [List element] | provenance | |
| TarSlipImprov.py:159:9:159:14 | ControlFlowNode for tar_cm | TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc | provenance | |
| TarSlipImprov.py:159:18:159:52 | ControlFlowNode for closing() | TarSlipImprov.py:159:9:159:14 | ControlFlowNode for tar_cm | provenance | |
| TarSlipImprov.py:159:26:159:51 | ControlFlowNode for Attribute() | TarSlipImprov.py:159:18:159:52 | ControlFlowNode for closing() | provenance | Config |
| TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc | TarSlipImprov.py:169:9:169:12 | ControlFlowNode for tarc | provenance | |
| TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc [List element] | TarSlipImprov.py:169:9:169:12 | ControlFlowNode for tarc | provenance | |
| TarSlipImprov.py:176:6:176:31 | ControlFlowNode for Attribute() | TarSlipImprov.py:176:36:176:38 | ControlFlowNode for tar | provenance | |
| TarSlipImprov.py:176:36:176:38 | ControlFlowNode for tar | TarSlipImprov.py:177:9:177:13 | ControlFlowNode for entry | provenance | |
| TarSlipImprov.py:177:9:177:13 | ControlFlowNode for entry | TarSlipImprov.py:178:36:178:40 | ControlFlowNode for entry | provenance | |
@@ -60,7 +67,9 @@ edges
| TarSlipImprov.py:231:43:231:52 | ControlFlowNode for corpus_tar | TarSlipImprov.py:233:9:233:9 | ControlFlowNode for f | provenance | |
| TarSlipImprov.py:233:9:233:9 | ControlFlowNode for f | TarSlipImprov.py:235:28:235:28 | ControlFlowNode for f | provenance | |
| TarSlipImprov.py:235:13:235:19 | [post] ControlFlowNode for members | TarSlipImprov.py:236:44:236:50 | ControlFlowNode for members | provenance | |
| TarSlipImprov.py:235:13:235:19 | [post] ControlFlowNode for members [List element] | TarSlipImprov.py:236:44:236:50 | ControlFlowNode for members | provenance | |
| TarSlipImprov.py:235:28:235:28 | ControlFlowNode for f | TarSlipImprov.py:235:13:235:19 | [post] ControlFlowNode for members | provenance | list.append |
| TarSlipImprov.py:235:28:235:28 | ControlFlowNode for f | TarSlipImprov.py:235:13:235:19 | [post] ControlFlowNode for members [List element] | provenance | list.append |
| TarSlipImprov.py:258:6:258:26 | ControlFlowNode for Attribute() | TarSlipImprov.py:258:31:258:33 | ControlFlowNode for tar | provenance | |
| TarSlipImprov.py:258:31:258:33 | ControlFlowNode for tar | TarSlipImprov.py:259:9:259:13 | ControlFlowNode for entry | provenance | |
| TarSlipImprov.py:259:9:259:13 | ControlFlowNode for entry | TarSlipImprov.py:261:25:261:29 | ControlFlowNode for entry | provenance | |
@@ -85,19 +94,24 @@ edges
| TarSlipImprov.py:304:7:304:39 | ControlFlowNode for Attribute() | TarSlipImprov.py:304:1:304:3 | ControlFlowNode for tar | provenance | |
| TarSlipImprov.py:306:5:306:10 | ControlFlowNode for member | TarSlipImprov.py:309:19:309:24 | ControlFlowNode for member | provenance | |
| TarSlipImprov.py:309:5:309:10 | [post] ControlFlowNode for result | TarSlipImprov.py:310:49:310:54 | ControlFlowNode for result | provenance | |
| TarSlipImprov.py:309:5:309:10 | [post] ControlFlowNode for result [List element] | TarSlipImprov.py:310:49:310:54 | ControlFlowNode for result | provenance | |
| TarSlipImprov.py:309:19:309:24 | ControlFlowNode for member | TarSlipImprov.py:309:5:309:10 | [post] ControlFlowNode for result | provenance | list.append |
| TarSlipImprov.py:309:19:309:24 | ControlFlowNode for member | TarSlipImprov.py:309:5:309:10 | [post] ControlFlowNode for result [List element] | provenance | list.append |
nodes
| TarSlipImprov.py:15:1:15:3 | ControlFlowNode for tar | semmle.label | ControlFlowNode for tar |
| TarSlipImprov.py:15:7:15:39 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| TarSlipImprov.py:17:5:17:10 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| TarSlipImprov.py:20:5:20:10 | [post] ControlFlowNode for result | semmle.label | [post] ControlFlowNode for result |
| TarSlipImprov.py:20:5:20:10 | [post] ControlFlowNode for result [List element] | semmle.label | [post] ControlFlowNode for result [List element] |
| TarSlipImprov.py:20:19:20:24 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| TarSlipImprov.py:22:35:22:40 | ControlFlowNode for result | semmle.label | ControlFlowNode for result |
| TarSlipImprov.py:26:21:26:27 | ControlFlowNode for tarfile | semmle.label | ControlFlowNode for tarfile |
| TarSlipImprov.py:28:9:28:14 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| TarSlipImprov.py:35:9:35:14 | [post] ControlFlowNode for result | semmle.label | [post] ControlFlowNode for result |
| TarSlipImprov.py:35:9:35:14 | [post] ControlFlowNode for result [List element] | semmle.label | [post] ControlFlowNode for result [List element] |
| TarSlipImprov.py:35:23:35:28 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| TarSlipImprov.py:36:12:36:17 | ControlFlowNode for result | semmle.label | ControlFlowNode for result |
| TarSlipImprov.py:36:12:36:17 | ControlFlowNode for result [List element] | semmle.label | ControlFlowNode for result [List element] |
| TarSlipImprov.py:38:1:38:3 | ControlFlowNode for tar | semmle.label | ControlFlowNode for tar |
| TarSlipImprov.py:38:7:38:39 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| TarSlipImprov.py:39:49:39:68 | ControlFlowNode for members_filter1() | semmle.label | ControlFlowNode for members_filter1() |
@@ -133,14 +147,17 @@ nodes
| TarSlipImprov.py:151:14:151:50 | ControlFlowNode for closing() | semmle.label | ControlFlowNode for closing() |
| TarSlipImprov.py:151:22:151:49 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| TarSlipImprov.py:151:55:151:56 | ControlFlowNode for tf | semmle.label | ControlFlowNode for tf |
| TarSlipImprov.py:152:13:152:20 | ControlFlowNode for Yield | semmle.label | ControlFlowNode for Yield |
| TarSlipImprov.py:152:13:152:20 | ControlFlowNode for Yield [List element] | semmle.label | ControlFlowNode for Yield [List element] |
| TarSlipImprov.py:152:19:152:20 | ControlFlowNode for tf | semmle.label | ControlFlowNode for tf |
| TarSlipImprov.py:157:9:157:14 | ControlFlowNode for tar_cm | semmle.label | ControlFlowNode for tar_cm |
| TarSlipImprov.py:157:9:157:14 | ControlFlowNode for tar_cm [List element] | semmle.label | ControlFlowNode for tar_cm [List element] |
| TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() | semmle.label | ControlFlowNode for py2_tarxz() |
| TarSlipImprov.py:157:18:157:40 | ControlFlowNode for py2_tarxz() [List element] | semmle.label | ControlFlowNode for py2_tarxz() [List element] |
| TarSlipImprov.py:159:9:159:14 | ControlFlowNode for tar_cm | semmle.label | ControlFlowNode for tar_cm |
| TarSlipImprov.py:159:18:159:52 | ControlFlowNode for closing() | semmle.label | ControlFlowNode for closing() |
| TarSlipImprov.py:159:26:159:51 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc | semmle.label | ControlFlowNode for tarc |
| TarSlipImprov.py:162:20:162:23 | ControlFlowNode for tarc [List element] | semmle.label | ControlFlowNode for tarc [List element] |
| TarSlipImprov.py:169:9:169:12 | ControlFlowNode for tarc | semmle.label | ControlFlowNode for tarc |
| TarSlipImprov.py:176:6:176:31 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| TarSlipImprov.py:176:36:176:38 | ControlFlowNode for tar | semmle.label | ControlFlowNode for tar |
@@ -163,6 +180,7 @@ nodes
| TarSlipImprov.py:231:43:231:52 | ControlFlowNode for corpus_tar | semmle.label | ControlFlowNode for corpus_tar |
| TarSlipImprov.py:233:9:233:9 | ControlFlowNode for f | semmle.label | ControlFlowNode for f |
| TarSlipImprov.py:235:13:235:19 | [post] ControlFlowNode for members | semmle.label | [post] ControlFlowNode for members |
| TarSlipImprov.py:235:13:235:19 | [post] ControlFlowNode for members [List element] | semmle.label | [post] ControlFlowNode for members [List element] |
| TarSlipImprov.py:235:28:235:28 | ControlFlowNode for f | semmle.label | ControlFlowNode for f |
| TarSlipImprov.py:236:44:236:50 | ControlFlowNode for members | semmle.label | ControlFlowNode for members |
| TarSlipImprov.py:254:1:254:31 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
@@ -198,11 +216,13 @@ nodes
| TarSlipImprov.py:304:7:304:39 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| TarSlipImprov.py:306:5:306:10 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| TarSlipImprov.py:309:5:309:10 | [post] ControlFlowNode for result | semmle.label | [post] ControlFlowNode for result |
| TarSlipImprov.py:309:5:309:10 | [post] ControlFlowNode for result [List element] | semmle.label | [post] ControlFlowNode for result [List element] |
| TarSlipImprov.py:309:19:309:24 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| TarSlipImprov.py:310:49:310:54 | ControlFlowNode for result | semmle.label | ControlFlowNode for result |
| TarSlipImprov.py:316:1:316:46 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
subpaths
| TarSlipImprov.py:39:65:39:67 | ControlFlowNode for tar | TarSlipImprov.py:26:21:26:27 | ControlFlowNode for tarfile | TarSlipImprov.py:36:12:36:17 | ControlFlowNode for result | TarSlipImprov.py:39:49:39:68 | ControlFlowNode for members_filter1() |
| TarSlipImprov.py:39:65:39:67 | ControlFlowNode for tar | TarSlipImprov.py:26:21:26:27 | ControlFlowNode for tarfile | TarSlipImprov.py:36:12:36:17 | ControlFlowNode for result [List element] | TarSlipImprov.py:39:49:39:68 | ControlFlowNode for members_filter1() |
#select
| TarSlipImprov.py:22:35:22:40 | ControlFlowNode for result | TarSlipImprov.py:15:7:15:39 | ControlFlowNode for Attribute() | TarSlipImprov.py:22:35:22:40 | ControlFlowNode for result | Extraction of tarfile from $@ to a potentially untrusted source $@. | TarSlipImprov.py:15:7:15:39 | ControlFlowNode for Attribute() | ControlFlowNode for Attribute() | TarSlipImprov.py:22:35:22:40 | ControlFlowNode for result | ControlFlowNode for result |
| TarSlipImprov.py:39:49:39:68 | ControlFlowNode for members_filter1() | TarSlipImprov.py:38:7:38:39 | ControlFlowNode for Attribute() | TarSlipImprov.py:39:49:39:68 | ControlFlowNode for members_filter1() | Extraction of tarfile from $@ to a potentially untrusted source $@. | TarSlipImprov.py:38:7:38:39 | ControlFlowNode for Attribute() | ControlFlowNode for Attribute() | TarSlipImprov.py:39:49:39:68 | ControlFlowNode for members_filter1() | ControlFlowNode for members_filter1() |

View File

@@ -93,7 +93,9 @@ edges
| UnsafeUnpack.py:163:23:163:28 | ControlFlowNode for member | UnsafeUnpack.py:166:37:166:42 | ControlFlowNode for member | provenance | |
| UnsafeUnpack.py:163:33:163:35 | ControlFlowNode for tar | UnsafeUnpack.py:163:23:163:28 | ControlFlowNode for member | provenance | |
| UnsafeUnpack.py:166:23:166:28 | [post] ControlFlowNode for result | UnsafeUnpack.py:167:67:167:72 | ControlFlowNode for result | provenance | |
| UnsafeUnpack.py:166:23:166:28 | [post] ControlFlowNode for result [List element] | UnsafeUnpack.py:167:67:167:72 | ControlFlowNode for result | provenance | |
| UnsafeUnpack.py:166:37:166:42 | ControlFlowNode for member | UnsafeUnpack.py:166:23:166:28 | [post] ControlFlowNode for result | provenance | list.append |
| UnsafeUnpack.py:166:37:166:42 | ControlFlowNode for member | UnsafeUnpack.py:166:23:166:28 | [post] ControlFlowNode for result [List element] | provenance | list.append |
| UnsafeUnpack.py:171:1:171:8 | ControlFlowNode for response | UnsafeUnpack.py:174:15:174:22 | ControlFlowNode for response | provenance | |
| UnsafeUnpack.py:171:12:171:50 | ControlFlowNode for Attribute() | UnsafeUnpack.py:171:1:171:8 | ControlFlowNode for response | provenance | |
| UnsafeUnpack.py:173:11:173:17 | ControlFlowNode for tarpath | UnsafeUnpack.py:176:17:176:23 | ControlFlowNode for tarpath | provenance | |
@@ -189,6 +191,7 @@ nodes
| UnsafeUnpack.py:163:23:163:28 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| UnsafeUnpack.py:163:33:163:35 | ControlFlowNode for tar | semmle.label | ControlFlowNode for tar |
| UnsafeUnpack.py:166:23:166:28 | [post] ControlFlowNode for result | semmle.label | [post] ControlFlowNode for result |
| UnsafeUnpack.py:166:23:166:28 | [post] ControlFlowNode for result [List element] | semmle.label | [post] ControlFlowNode for result [List element] |
| UnsafeUnpack.py:166:37:166:42 | ControlFlowNode for member | semmle.label | ControlFlowNode for member |
| UnsafeUnpack.py:167:67:167:72 | ControlFlowNode for result | semmle.label | ControlFlowNode for result |
| UnsafeUnpack.py:171:1:171:8 | ControlFlowNode for response | semmle.label | ControlFlowNode for response |

View File

@@ -3,8 +3,10 @@ edges
| Netmiko.py:18:16:18:18 | ControlFlowNode for cmd | Netmiko.py:20:45:20:47 | ControlFlowNode for cmd | provenance | |
| Netmiko.py:18:16:18:18 | ControlFlowNode for cmd | Netmiko.py:21:52:21:54 | ControlFlowNode for cmd | provenance | |
| Netmiko.py:18:16:18:18 | ControlFlowNode for cmd | Netmiko.py:22:52:22:54 | ControlFlowNode for cmd | provenance | |
| Netmiko.py:18:16:18:18 | ControlFlowNode for cmd | Netmiko.py:23:41:23:57 | ControlFlowNode for List | provenance | |
| Netmiko.py:18:16:18:18 | ControlFlowNode for cmd | Netmiko.py:23:43:23:45 | ControlFlowNode for cmd | provenance | |
| Netmiko.py:18:16:18:18 | ControlFlowNode for cmd | Netmiko.py:24:48:24:50 | ControlFlowNode for cmd | provenance | |
| Netmiko.py:23:42:23:56 | ControlFlowNode for List [List element] | Netmiko.py:23:41:23:57 | ControlFlowNode for List | provenance | |
| Netmiko.py:23:43:23:45 | ControlFlowNode for cmd | Netmiko.py:23:42:23:56 | ControlFlowNode for List [List element] | provenance | |
| Pexpect.py:15:16:15:18 | ControlFlowNode for cmd | Pexpect.py:16:14:16:16 | ControlFlowNode for cmd | provenance | |
| Pexpect.py:15:16:15:18 | ControlFlowNode for cmd | Pexpect.py:18:18:18:20 | ControlFlowNode for cmd | provenance | |
| Scrapli.py:13:16:13:18 | ControlFlowNode for cmd | Scrapli.py:24:42:24:44 | ControlFlowNode for cmd | provenance | |
@@ -32,6 +34,8 @@ nodes
| Netmiko.py:21:52:21:54 | ControlFlowNode for cmd | semmle.label | ControlFlowNode for cmd |
| Netmiko.py:22:52:22:54 | ControlFlowNode for cmd | semmle.label | ControlFlowNode for cmd |
| Netmiko.py:23:41:23:57 | ControlFlowNode for List | semmle.label | ControlFlowNode for List |
| Netmiko.py:23:42:23:56 | ControlFlowNode for List [List element] | semmle.label | ControlFlowNode for List [List element] |
| Netmiko.py:23:43:23:45 | ControlFlowNode for cmd | semmle.label | ControlFlowNode for cmd |
| Netmiko.py:24:48:24:50 | ControlFlowNode for cmd | semmle.label | ControlFlowNode for cmd |
| Pexpect.py:15:16:15:18 | ControlFlowNode for cmd | semmle.label | ControlFlowNode for cmd |
| Pexpect.py:16:14:16:16 | ControlFlowNode for cmd | semmle.label | ControlFlowNode for cmd |

View File

@@ -7,6 +7,7 @@ edges
| xslt.py:10:17:10:43 | ControlFlowNode for Attribute() | xslt.py:10:5:10:13 | ControlFlowNode for xsltQuery | provenance | |
| xslt.py:11:5:11:13 | ControlFlowNode for xslt_root | xslt.py:14:29:14:37 | ControlFlowNode for xslt_root | provenance | |
| xslt.py:11:17:11:36 | ControlFlowNode for Attribute() | xslt.py:11:5:11:13 | ControlFlowNode for xslt_root | provenance | |
| xslt.py:11:27:11:35 | ControlFlowNode for xsltQuery | xslt.py:11:17:11:36 | ControlFlowNode for Attribute() | provenance | |
| xslt.py:11:27:11:35 | ControlFlowNode for xsltQuery | xslt.py:11:17:11:36 | ControlFlowNode for Attribute() | provenance | Config |
| xslt.py:11:27:11:35 | ControlFlowNode for xsltQuery | xslt.py:11:17:11:36 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:3:26:3:32 | ControlFlowNode for ImportMember | xsltInjection.py:3:26:3:32 | ControlFlowNode for request | provenance | |
@@ -21,6 +22,7 @@ edges
| xsltInjection.py:10:17:10:43 | ControlFlowNode for Attribute() | xsltInjection.py:10:5:10:13 | ControlFlowNode for xsltQuery | provenance | |
| xsltInjection.py:11:5:11:13 | ControlFlowNode for xslt_root | xsltInjection.py:12:28:12:36 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:11:17:11:36 | ControlFlowNode for Attribute() | xsltInjection.py:11:5:11:13 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:11:27:11:35 | ControlFlowNode for xsltQuery | xsltInjection.py:11:17:11:36 | ControlFlowNode for Attribute() | provenance | |
| xsltInjection.py:11:27:11:35 | ControlFlowNode for xsltQuery | xsltInjection.py:11:17:11:36 | ControlFlowNode for Attribute() | provenance | Config |
| xsltInjection.py:11:27:11:35 | ControlFlowNode for xsltQuery | xsltInjection.py:11:17:11:36 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:17:5:17:13 | ControlFlowNode for xsltQuery | xsltInjection.py:18:27:18:35 | ControlFlowNode for xsltQuery | provenance | |
@@ -29,6 +31,7 @@ edges
| xsltInjection.py:17:17:17:43 | ControlFlowNode for Attribute() | xsltInjection.py:17:5:17:13 | ControlFlowNode for xsltQuery | provenance | |
| xsltInjection.py:18:5:18:13 | ControlFlowNode for xslt_root | xsltInjection.py:21:29:21:37 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:18:17:18:36 | ControlFlowNode for Attribute() | xsltInjection.py:18:5:18:13 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:18:27:18:35 | ControlFlowNode for xsltQuery | xsltInjection.py:18:17:18:36 | ControlFlowNode for Attribute() | provenance | |
| xsltInjection.py:18:27:18:35 | ControlFlowNode for xsltQuery | xsltInjection.py:18:17:18:36 | ControlFlowNode for Attribute() | provenance | Config |
| xsltInjection.py:18:27:18:35 | ControlFlowNode for xsltQuery | xsltInjection.py:18:17:18:36 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:26:5:26:13 | ControlFlowNode for xsltQuery | xsltInjection.py:27:27:27:35 | ControlFlowNode for xsltQuery | provenance | |
@@ -37,6 +40,7 @@ edges
| xsltInjection.py:26:17:26:43 | ControlFlowNode for Attribute() | xsltInjection.py:26:5:26:13 | ControlFlowNode for xsltQuery | provenance | |
| xsltInjection.py:27:5:27:13 | ControlFlowNode for xslt_root | xsltInjection.py:31:24:31:32 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:27:17:27:36 | ControlFlowNode for Attribute() | xsltInjection.py:27:5:27:13 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:27:27:27:35 | ControlFlowNode for xsltQuery | xsltInjection.py:27:17:27:36 | ControlFlowNode for Attribute() | provenance | |
| xsltInjection.py:27:27:27:35 | ControlFlowNode for xsltQuery | xsltInjection.py:27:17:27:36 | ControlFlowNode for Attribute() | provenance | Config |
| xsltInjection.py:27:27:27:35 | ControlFlowNode for xsltQuery | xsltInjection.py:27:17:27:36 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:35:5:35:13 | ControlFlowNode for xsltQuery | xsltInjection.py:36:34:36:42 | ControlFlowNode for xsltQuery | provenance | |
@@ -45,17 +49,22 @@ edges
| xsltInjection.py:35:17:35:43 | ControlFlowNode for Attribute() | xsltInjection.py:35:5:35:13 | ControlFlowNode for xsltQuery | provenance | |
| xsltInjection.py:36:5:36:13 | ControlFlowNode for xslt_root | xsltInjection.py:40:24:40:32 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:36:17:36:43 | ControlFlowNode for Attribute() | xsltInjection.py:36:5:36:13 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:36:34:36:42 | ControlFlowNode for xsltQuery | xsltInjection.py:36:17:36:43 | ControlFlowNode for Attribute() | provenance | |
| xsltInjection.py:36:34:36:42 | ControlFlowNode for xsltQuery | xsltInjection.py:36:17:36:43 | ControlFlowNode for Attribute() | provenance | Config |
| xsltInjection.py:36:34:36:42 | ControlFlowNode for xsltQuery | xsltInjection.py:36:17:36:43 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:44:5:44:13 | ControlFlowNode for xsltQuery | xsltInjection.py:45:5:45:15 | ControlFlowNode for xsltStrings | provenance | |
| xsltInjection.py:44:5:44:13 | ControlFlowNode for xsltQuery | xsltInjection.py:45:20:45:28 | ControlFlowNode for xsltQuery | provenance | |
| xsltInjection.py:44:17:44:23 | ControlFlowNode for request | xsltInjection.py:44:17:44:28 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| xsltInjection.py:44:17:44:28 | ControlFlowNode for Attribute | xsltInjection.py:44:17:44:43 | ControlFlowNode for Attribute() | provenance | dict.get |
| xsltInjection.py:44:17:44:43 | ControlFlowNode for Attribute() | xsltInjection.py:44:5:44:13 | ControlFlowNode for xsltQuery | provenance | |
| xsltInjection.py:45:5:45:15 | ControlFlowNode for xsltStrings | xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings | provenance | |
| xsltInjection.py:45:5:45:15 | ControlFlowNode for xsltStrings [List element] | xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings [List element] | provenance | |
| xsltInjection.py:45:19:45:44 | ControlFlowNode for List [List element] | xsltInjection.py:45:5:45:15 | ControlFlowNode for xsltStrings [List element] | provenance | |
| xsltInjection.py:45:20:45:28 | ControlFlowNode for xsltQuery | xsltInjection.py:45:19:45:44 | ControlFlowNode for List [List element] | provenance | |
| xsltInjection.py:46:5:46:13 | ControlFlowNode for xslt_root | xsltInjection.py:50:24:50:32 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | xsltInjection.py:46:5:46:13 | ControlFlowNode for xslt_root | provenance | |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings | xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | provenance | Config |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings | xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings [List element] | xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | provenance | |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings [List element] | xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | provenance | Config |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings [List element] | xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | provenance | Decoding-XML |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings [List element] | xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | provenance | MaD:58660 |
nodes
| xslt.py:3:26:3:32 | ControlFlowNode for ImportMember | semmle.label | ControlFlowNode for ImportMember |
| xslt.py:3:26:3:32 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
@@ -105,10 +114,12 @@ nodes
| xsltInjection.py:44:17:44:23 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| xsltInjection.py:44:17:44:28 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
| xsltInjection.py:44:17:44:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| xsltInjection.py:45:5:45:15 | ControlFlowNode for xsltStrings | semmle.label | ControlFlowNode for xsltStrings |
| xsltInjection.py:45:5:45:15 | ControlFlowNode for xsltStrings [List element] | semmle.label | ControlFlowNode for xsltStrings [List element] |
| xsltInjection.py:45:19:45:44 | ControlFlowNode for List [List element] | semmle.label | ControlFlowNode for List [List element] |
| xsltInjection.py:45:20:45:28 | ControlFlowNode for xsltQuery | semmle.label | ControlFlowNode for xsltQuery |
| xsltInjection.py:46:5:46:13 | ControlFlowNode for xslt_root | semmle.label | ControlFlowNode for xslt_root |
| xsltInjection.py:46:17:46:49 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings | semmle.label | ControlFlowNode for xsltStrings |
| xsltInjection.py:46:38:46:48 | ControlFlowNode for xsltStrings [List element] | semmle.label | ControlFlowNode for xsltStrings [List element] |
| xsltInjection.py:50:24:50:32 | ControlFlowNode for xslt_root | semmle.label | ControlFlowNode for xslt_root |
subpaths
#select

View File

@@ -32,11 +32,13 @@ edges
| agent_instructions.py:7:5:7:9 | ControlFlowNode for input | agent_instructions.py:9:50:9:89 | ControlFlowNode for BinaryExpr | provenance | Sink:MaD:11 |
| agent_instructions.py:7:13:7:19 | ControlFlowNode for request | agent_instructions.py:7:13:7:24 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| agent_instructions.py:7:13:7:24 | ControlFlowNode for Attribute | agent_instructions.py:7:13:7:37 | ControlFlowNode for Attribute() | provenance | dict.get |
| agent_instructions.py:7:13:7:24 | ControlFlowNode for Attribute | agent_instructions.py:7:13:7:37 | ControlFlowNode for Attribute() | provenance | dict.get(input) |
| agent_instructions.py:7:13:7:37 | ControlFlowNode for Attribute() | agent_instructions.py:7:5:7:9 | ControlFlowNode for input | provenance | |
| agent_instructions.py:17:5:17:9 | ControlFlowNode for input | agent_instructions.py:25:28:25:32 | ControlFlowNode for input | provenance | |
| agent_instructions.py:17:5:17:9 | ControlFlowNode for input | agent_instructions.py:35:28:35:32 | ControlFlowNode for input | provenance | |
| agent_instructions.py:17:13:17:19 | ControlFlowNode for request | agent_instructions.py:17:13:17:24 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| agent_instructions.py:17:13:17:24 | ControlFlowNode for Attribute | agent_instructions.py:17:13:17:37 | ControlFlowNode for Attribute() | provenance | dict.get |
| agent_instructions.py:17:13:17:24 | ControlFlowNode for Attribute | agent_instructions.py:17:13:17:37 | ControlFlowNode for Attribute() | provenance | dict.get(input) |
| agent_instructions.py:17:13:17:37 | ControlFlowNode for Attribute() | agent_instructions.py:17:5:17:9 | ControlFlowNode for input | provenance | |
| anthropic_test.py:2:26:2:32 | ControlFlowNode for ImportMember | anthropic_test.py:2:26:2:32 | ControlFlowNode for request | provenance | |
| anthropic_test.py:2:26:2:32 | ControlFlowNode for request | anthropic_test.py:11:15:11:21 | ControlFlowNode for request | provenance | |
@@ -61,7 +63,7 @@ edges
| openai_test.py:2:26:2:32 | ControlFlowNode for request | openai_test.py:13:13:13:19 | ControlFlowNode for request | provenance | |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:17:22:17:46 | ControlFlowNode for BinaryExpr | provenance | Sink:MaD:10 |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:22:22:22:46 | ControlFlowNode for BinaryExpr | provenance | Sink:MaD:10 |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:23:15:37:9 | ControlFlowNode for List | provenance | Sink:MaD:9 |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:26:28:26:51 | ControlFlowNode for BinaryExpr | provenance | |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:26:28:26:51 | ControlFlowNode for BinaryExpr | provenance | |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:41:22:41:46 | ControlFlowNode for BinaryExpr | provenance | Sink:MaD:10 |
| openai_test.py:12:5:12:11 | ControlFlowNode for persona | openai_test.py:63:28:63:51 | ControlFlowNode for BinaryExpr | provenance | Sink:MaD:8 |
@@ -72,7 +74,7 @@ edges
| openai_test.py:12:15:12:26 | ControlFlowNode for Attribute | openai_test.py:12:15:12:41 | ControlFlowNode for Attribute() | provenance | dict.get |
| openai_test.py:12:15:12:41 | ControlFlowNode for Attribute() | openai_test.py:12:5:12:11 | ControlFlowNode for persona | provenance | |
| openai_test.py:13:5:13:9 | ControlFlowNode for query | openai_test.py:18:15:18:19 | ControlFlowNode for query | provenance | Sink:MaD:9 |
| openai_test.py:13:5:13:9 | ControlFlowNode for query | openai_test.py:23:15:37:9 | ControlFlowNode for List | provenance | Sink:MaD:9 |
| openai_test.py:13:5:13:9 | ControlFlowNode for query | openai_test.py:33:33:33:37 | ControlFlowNode for query | provenance | |
| openai_test.py:13:5:13:9 | ControlFlowNode for query | openai_test.py:33:33:33:37 | ControlFlowNode for query | provenance | |
| openai_test.py:13:5:13:9 | ControlFlowNode for query | openai_test.py:42:15:42:19 | ControlFlowNode for query | provenance | Sink:MaD:9 |
| openai_test.py:13:5:13:9 | ControlFlowNode for query | openai_test.py:53:33:53:37 | ControlFlowNode for query | provenance | |
@@ -82,6 +84,14 @@ edges
| openai_test.py:13:13:13:19 | ControlFlowNode for request | openai_test.py:13:13:13:24 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| openai_test.py:13:13:13:24 | ControlFlowNode for Attribute | openai_test.py:13:13:13:37 | ControlFlowNode for Attribute() | provenance | dict.get |
| openai_test.py:13:13:13:37 | ControlFlowNode for Attribute() | openai_test.py:13:5:13:9 | ControlFlowNode for query | provenance | |
| openai_test.py:24:13:27:13 | ControlFlowNode for Dict [Dictionary element at key content] | openai_test.py:23:15:37:9 | ControlFlowNode for List | provenance | Sink:MaD:9 Sink:MaD:9 |
| openai_test.py:26:28:26:51 | ControlFlowNode for BinaryExpr | openai_test.py:24:13:27:13 | ControlFlowNode for Dict [Dictionary element at key content] | provenance | |
| openai_test.py:28:13:36:13 | ControlFlowNode for Dict [Dictionary element at key content, List element, Dictionary element at key text] | openai_test.py:23:15:37:9 | ControlFlowNode for List | provenance | Sink:MaD:9 Sink:MaD:9 |
| openai_test.py:28:13:36:13 | ControlFlowNode for Dict [Dictionary element at key content, List element, Dictionary element at key text] | openai_test.py:23:15:37:9 | ControlFlowNode for List | provenance | Sink:MaD:9 Sink:MaD:9 Sink:MaD:9 |
| openai_test.py:28:13:36:13 | ControlFlowNode for Dict [Dictionary element at key content, List element, Dictionary element at key text] | openai_test.py:23:15:37:9 | ControlFlowNode for List | provenance | Sink:MaD:9 Sink:MaD:9 Sink:MaD:9 Sink:MaD:9 |
| openai_test.py:30:28:35:17 | ControlFlowNode for List [List element, Dictionary element at key text] | openai_test.py:28:13:36:13 | ControlFlowNode for Dict [Dictionary element at key content, List element, Dictionary element at key text] | provenance | |
| openai_test.py:31:21:34:21 | ControlFlowNode for Dict [Dictionary element at key text] | openai_test.py:30:28:35:17 | ControlFlowNode for List [List element, Dictionary element at key text] | provenance | |
| openai_test.py:33:33:33:37 | ControlFlowNode for query | openai_test.py:31:21:34:21 | ControlFlowNode for Dict [Dictionary element at key text] | provenance | |
models
| 1 | Sink: Anthropic; Member[beta].Member[messages].Member[create].Argument[messages:].ListElement.DictionaryElement[content]; prompt-injection |
| 2 | Sink: Anthropic; Member[beta].Member[messages].Member[create].Argument[system:]; prompt-injection |
@@ -140,7 +150,13 @@ nodes
| openai_test.py:18:15:18:19 | ControlFlowNode for query | semmle.label | ControlFlowNode for query |
| openai_test.py:22:22:22:46 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| openai_test.py:23:15:37:9 | ControlFlowNode for List | semmle.label | ControlFlowNode for List |
| openai_test.py:24:13:27:13 | ControlFlowNode for Dict [Dictionary element at key content] | semmle.label | ControlFlowNode for Dict [Dictionary element at key content] |
| openai_test.py:26:28:26:51 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| openai_test.py:26:28:26:51 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| openai_test.py:28:13:36:13 | ControlFlowNode for Dict [Dictionary element at key content, List element, Dictionary element at key text] | semmle.label | ControlFlowNode for Dict [Dictionary element at key content, List element, Dictionary element at key text] |
| openai_test.py:30:28:35:17 | ControlFlowNode for List [List element, Dictionary element at key text] | semmle.label | ControlFlowNode for List [List element, Dictionary element at key text] |
| openai_test.py:31:21:34:21 | ControlFlowNode for Dict [Dictionary element at key text] | semmle.label | ControlFlowNode for Dict [Dictionary element at key text] |
| openai_test.py:33:33:33:37 | ControlFlowNode for query | semmle.label | ControlFlowNode for query |
| openai_test.py:33:33:33:37 | ControlFlowNode for query | semmle.label | ControlFlowNode for query |
| openai_test.py:41:22:41:46 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| openai_test.py:42:15:42:19 | ControlFlowNode for query | semmle.label | ControlFlowNode for query |

View File

@@ -131,6 +131,5 @@ from unknown_settings import password # $ SensitiveDataSource=password
print(password) # $ SensitiveUse=password
_config = {"sleep_timer": 5, "mysql_password": password}
# since we have taint-step from store of `password`, we will consider any item in the
# dictionary to be a password :(
print(_config["sleep_timer"]) # $ SPURIOUS: SensitiveUse=password
# since we have precise dictionary content, other items of the config are not tainted
print(_config["sleep_timer"])

View File

@@ -7,13 +7,9 @@ edges
| summaries.py:36:38:36:38 | ControlFlowNode for x | summaries.py:36:41:36:45 | ControlFlowNode for BinaryExpr | provenance | |
| summaries.py:36:48:36:53 | ControlFlowNode for SOURCE | summaries.py:36:18:36:54 | ControlFlowNode for apply_lambda() | provenance | apply_lambda |
| summaries.py:36:48:36:53 | ControlFlowNode for SOURCE | summaries.py:36:38:36:38 | ControlFlowNode for x | provenance | apply_lambda |
| summaries.py:44:1:44:12 | ControlFlowNode for tainted_list | summaries.py:45:6:45:20 | ControlFlowNode for Subscript | provenance | |
| summaries.py:44:1:44:12 | ControlFlowNode for tainted_list [List element] | summaries.py:45:6:45:17 | ControlFlowNode for tainted_list [List element] | provenance | |
| summaries.py:44:16:44:33 | ControlFlowNode for reversed() | summaries.py:44:1:44:12 | ControlFlowNode for tainted_list | provenance | |
| summaries.py:44:16:44:33 | ControlFlowNode for reversed() [List element] | summaries.py:44:1:44:12 | ControlFlowNode for tainted_list [List element] | provenance | |
| summaries.py:44:25:44:32 | ControlFlowNode for List | summaries.py:44:16:44:33 | ControlFlowNode for reversed() | provenance | builtins.reversed |
| summaries.py:44:25:44:32 | ControlFlowNode for List [List element] | summaries.py:44:16:44:33 | ControlFlowNode for reversed() [List element] | provenance | builtins.reversed |
| summaries.py:44:26:44:31 | ControlFlowNode for SOURCE | summaries.py:44:25:44:32 | ControlFlowNode for List | provenance | |
| summaries.py:44:26:44:31 | ControlFlowNode for SOURCE | summaries.py:44:25:44:32 | ControlFlowNode for List [List element] | provenance | |
| summaries.py:45:6:45:17 | ControlFlowNode for tainted_list [List element] | summaries.py:45:6:45:20 | ControlFlowNode for Subscript | provenance | |
| summaries.py:48:15:48:15 | ControlFlowNode for x | summaries.py:49:12:49:18 | ControlFlowNode for BinaryExpr | provenance | |
@@ -42,6 +38,7 @@ edges
| summaries.py:67:1:67:18 | ControlFlowNode for tainted_resultlist | summaries.py:68:6:68:26 | ControlFlowNode for Subscript | provenance | |
| summaries.py:67:1:67:18 | ControlFlowNode for tainted_resultlist [List element] | summaries.py:68:6:68:23 | ControlFlowNode for tainted_resultlist [List element] | provenance | |
| summaries.py:67:22:67:39 | ControlFlowNode for json_loads() [List element] | summaries.py:67:1:67:18 | ControlFlowNode for tainted_resultlist [List element] | provenance | |
| summaries.py:67:33:67:38 | ControlFlowNode for SOURCE | summaries.py:67:1:67:18 | ControlFlowNode for tainted_resultlist | provenance | |
| summaries.py:67:33:67:38 | ControlFlowNode for SOURCE | summaries.py:67:1:67:18 | ControlFlowNode for tainted_resultlist | provenance | Decoding-JSON |
| summaries.py:67:33:67:38 | ControlFlowNode for SOURCE | summaries.py:67:22:67:39 | ControlFlowNode for json_loads() [List element] | provenance | json.loads |
| summaries.py:68:6:68:23 | ControlFlowNode for tainted_resultlist [List element] | summaries.py:68:6:68:26 | ControlFlowNode for Subscript | provenance | |
@@ -56,11 +53,8 @@ nodes
| summaries.py:36:41:36:45 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| summaries.py:36:48:36:53 | ControlFlowNode for SOURCE | semmle.label | ControlFlowNode for SOURCE |
| summaries.py:37:6:37:19 | ControlFlowNode for tainted_lambda | semmle.label | ControlFlowNode for tainted_lambda |
| summaries.py:44:1:44:12 | ControlFlowNode for tainted_list | semmle.label | ControlFlowNode for tainted_list |
| summaries.py:44:1:44:12 | ControlFlowNode for tainted_list [List element] | semmle.label | ControlFlowNode for tainted_list [List element] |
| summaries.py:44:16:44:33 | ControlFlowNode for reversed() | semmle.label | ControlFlowNode for reversed() |
| summaries.py:44:16:44:33 | ControlFlowNode for reversed() [List element] | semmle.label | ControlFlowNode for reversed() [List element] |
| summaries.py:44:25:44:32 | ControlFlowNode for List | semmle.label | ControlFlowNode for List |
| summaries.py:44:25:44:32 | ControlFlowNode for List [List element] | semmle.label | ControlFlowNode for List [List element] |
| summaries.py:44:26:44:31 | ControlFlowNode for SOURCE | semmle.label | ControlFlowNode for SOURCE |
| summaries.py:45:6:45:17 | ControlFlowNode for tainted_list [List element] | semmle.label | ControlFlowNode for tainted_list [List element] |

View File

@@ -32,7 +32,6 @@ def test_construction():
list(tainted_tuple), # $ tainted
list(tainted_set), # $ tainted
list(tainted_dict.values()), # $ tainted
list(tainted_dict.items()), # $ tainted
tuple(tainted_list), # $ tainted
set(tainted_list), # $ tainted
@@ -41,10 +40,11 @@ def test_construction():
dict(k = tainted_string)["k"], # $ tainted
dict(dict(k = tainted_string))["k"], # $ tainted
dict(["k", tainted_string]), # $ tainted
list(tainted_dict.items()), # $ tainted
)
ensure_not_tainted(
dict(k = tainted_string)["k1"]
dict(k = tainted_string)["k1"],
)
@@ -59,7 +59,7 @@ def test_access(x, y, z):
sorted(tainted_list), # $ tainted
reversed(tainted_list), # $ tainted
iter(tainted_list), # $ tainted
next(iter(tainted_list)), # $ MISSING: tainted
next(iter(tainted_list)), # $ tainted
[i for i in tainted_list], # $ tainted
[tainted_list for _i in [1,2,3]], # $ tainted
)

View File

@@ -53,7 +53,7 @@ def contrived_1():
(a, b, c), (d, e, f) = tainted_list, no_taint_list
ensure_tainted(a, b, c) # $ tainted
ensure_not_tainted(d, e, f) # $ SPURIOUS: tainted
ensure_not_tainted(d, e, f)
def contrived_2():

View File

@@ -3,10 +3,12 @@ edges
| taint_step_test.py:5:12:5:35 | ControlFlowNode for Attribute() | taint_step_test.py:5:5:5:8 | ControlFlowNode for path | provenance | |
| taint_step_test.py:6:5:6:8 | ControlFlowNode for file | taint_step_test.py:19:48:19:51 | ControlFlowNode for file | provenance | |
| taint_step_test.py:6:12:6:35 | ControlFlowNode for Attribute() | taint_step_test.py:6:5:6:8 | ControlFlowNode for file | provenance | |
| taint_step_test.py:11:18:11:21 | ControlFlowNode for path | taint_step_test.py:12:9:12:16 | ControlFlowNode for filepath | provenance | |
| taint_step_test.py:11:18:11:21 | ControlFlowNode for path | taint_step_test.py:12:9:12:16 | ControlFlowNode for filepath | provenance | AdditionalTaintStep |
| taint_step_test.py:11:18:11:21 | ControlFlowNode for path | taint_step_test.py:12:33:12:36 | ControlFlowNode for path | provenance | |
| taint_step_test.py:11:24:11:27 | ControlFlowNode for file | taint_step_test.py:12:9:12:16 | ControlFlowNode for filepath | provenance | AdditionalTaintStep |
| taint_step_test.py:12:9:12:16 | ControlFlowNode for filepath | taint_step_test.py:13:19:13:26 | ControlFlowNode for filepath | provenance | |
| taint_step_test.py:12:20:12:43 | ControlFlowNode for Attribute() | taint_step_test.py:12:9:12:16 | ControlFlowNode for filepath | provenance | |
| taint_step_test.py:12:33:12:36 | ControlFlowNode for path | taint_step_test.py:12:20:12:43 | ControlFlowNode for Attribute() | provenance | str.join |
| taint_step_test.py:19:43:19:46 | ControlFlowNode for path | taint_step_test.py:11:18:11:21 | ControlFlowNode for path | provenance | AdditionalTaintStep |
| taint_step_test.py:19:48:19:51 | ControlFlowNode for file | taint_step_test.py:11:24:11:27 | ControlFlowNode for file | provenance | AdditionalTaintStep |
nodes
@@ -17,6 +19,8 @@ nodes
| taint_step_test.py:11:18:11:21 | ControlFlowNode for path | semmle.label | ControlFlowNode for path |
| taint_step_test.py:11:24:11:27 | ControlFlowNode for file | semmle.label | ControlFlowNode for file |
| taint_step_test.py:12:9:12:16 | ControlFlowNode for filepath | semmle.label | ControlFlowNode for filepath |
| taint_step_test.py:12:20:12:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| taint_step_test.py:12:33:12:36 | ControlFlowNode for path | semmle.label | ControlFlowNode for path |
| taint_step_test.py:13:19:13:26 | ControlFlowNode for filepath | semmle.label | ControlFlowNode for filepath |
| taint_step_test.py:19:43:19:46 | ControlFlowNode for path | semmle.label | ControlFlowNode for path |
| taint_step_test.py:19:48:19:51 | ControlFlowNode for file | semmle.label | ControlFlowNode for file |

View File

@@ -6,16 +6,16 @@ pat = ... # some pattern
compiled_pat = re.compile(pat)
# see https://docs.python.org/3/library/re.html#functions
ensure_not_tainted(
# returns Match object, which is tested properly below. (note: with the flow summary
# modeling, objects containing tainted values are not themselves tainted).
re.search(pat, ts),
re.match(pat, ts),
re.fullmatch(pat, ts),
ensure_tainted(
# returns Match object, which is tested properly below. (note: the match objects contain
# tainted values but are not themselves tainted - this test relies on implicit reads at sinks).
re.search(pat, ts), # $ tainted
re.match(pat, ts), # $ tainted
re.fullmatch(pat, ts), # $ tainted
compiled_pat.search(ts),
compiled_pat.match(ts),
compiled_pat.fullmatch(ts),
compiled_pat.search(ts), # $ tainted
compiled_pat.match(ts), # $ tainted
compiled_pat.fullmatch(ts), # $ tainted
)
# Match object
@@ -80,9 +80,9 @@ ensure_tainted(
)
ensure_not_tainted(
re.subn(pat, repl="safe", string=ts),
re.subn(pat, repl="safe", string=ts)[1], # // the number of substitutions made
)
ensure_tainted(
re.subn(pat, repl="safe", string=ts), # $ tainted // implicit read at sink
re.subn(pat, repl="safe", string=ts)[0], # $ tainted // the string
)

View File

@@ -63,7 +63,8 @@ class TaintTest(tornado.web.RequestHandler):
request.headers["header-name"], # $ tainted
request.headers.get_list("header-name"), # $ tainted
request.headers.get_all(), # $ tainted
[(k, v) for (k, v) in request.headers.get_all()], # $ tainted
[(k, v) for (k, v) in request.headers.get_all()][0], # $ tainted
list([(k, v) for (k, v) in request.headers.get_all()])[0], # $ tainted
# Dict[str, http.cookies.Morsel]
request.cookies, # $ tainted
@@ -71,6 +72,11 @@ class TaintTest(tornado.web.RequestHandler):
request.cookies["cookie-name"].key, # $ tainted
request.cookies["cookie-name"].value, # $ tainted
request.cookies["cookie-name"].coded_value, # $ tainted
# The comprehension is not tainted, only the elements, but this passes due to implicit reads at sinks
[(k, v) for (k, v) in request.headers.get_all()], # $ tainted
# The list is not tainted, only the elements, but this passes due to implicit reads at sinks
list([(k, v) for (k, v) in request.headers.get_all()]), # $ tainted
)

View File

@@ -11,10 +11,13 @@
edges
| BindToAllInterfaces_test.py:5:9:5:17 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:5:9:5:24 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:9:9:9:10 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:9:9:9:16 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:16:1:16:10 | ControlFlowNode for ALL_LOCALS | BindToAllInterfaces_test.py:17:9:17:24 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:16:1:16:10 | ControlFlowNode for ALL_LOCALS | BindToAllInterfaces_test.py:20:1:20:3 | ControlFlowNode for tup | provenance | |
| BindToAllInterfaces_test.py:16:1:16:10 | ControlFlowNode for ALL_LOCALS | BindToAllInterfaces_test.py:17:9:17:18 | ControlFlowNode for ALL_LOCALS | provenance | |
| BindToAllInterfaces_test.py:16:1:16:10 | ControlFlowNode for ALL_LOCALS | BindToAllInterfaces_test.py:20:8:20:17 | ControlFlowNode for ALL_LOCALS | provenance | |
| BindToAllInterfaces_test.py:16:14:16:22 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:16:1:16:10 | ControlFlowNode for ALL_LOCALS | provenance | |
| BindToAllInterfaces_test.py:20:1:20:3 | ControlFlowNode for tup | BindToAllInterfaces_test.py:21:8:21:10 | ControlFlowNode for tup | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:17:9:17:18 | ControlFlowNode for ALL_LOCALS | BindToAllInterfaces_test.py:17:9:17:24 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:20:1:20:3 | ControlFlowNode for tup [Tuple element at index 0] | BindToAllInterfaces_test.py:21:8:21:10 | ControlFlowNode for tup | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:20:8:20:17 | ControlFlowNode for ALL_LOCALS | BindToAllInterfaces_test.py:20:8:20:23 | ControlFlowNode for Tuple [Tuple element at index 0] | provenance | |
| BindToAllInterfaces_test.py:20:8:20:23 | ControlFlowNode for Tuple [Tuple element at index 0] | BindToAllInterfaces_test.py:20:1:20:3 | ControlFlowNode for tup [Tuple element at index 0] | provenance | |
| BindToAllInterfaces_test.py:26:9:26:12 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:26:9:26:18 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:33:18:33:21 | ControlFlowNode for self [Return] [Attribute bind_addr] | BindToAllInterfaces_test.py:41:10:41:17 | ControlFlowNode for Server() [Attribute bind_addr] | provenance | |
| BindToAllInterfaces_test.py:34:9:34:12 | [post] ControlFlowNode for self [Attribute bind_addr] | BindToAllInterfaces_test.py:33:18:33:21 | ControlFlowNode for self [Return] [Attribute bind_addr] | provenance | |
@@ -25,9 +28,10 @@ edges
| BindToAllInterfaces_test.py:41:1:41:6 | ControlFlowNode for server [Attribute bind_addr] | BindToAllInterfaces_test.py:42:1:42:6 | ControlFlowNode for server [Attribute bind_addr] | provenance | |
| BindToAllInterfaces_test.py:41:10:41:17 | ControlFlowNode for Server() [Attribute bind_addr] | BindToAllInterfaces_test.py:41:1:41:6 | ControlFlowNode for server [Attribute bind_addr] | provenance | |
| BindToAllInterfaces_test.py:42:1:42:6 | ControlFlowNode for server [Attribute bind_addr] | BindToAllInterfaces_test.py:37:15:37:18 | ControlFlowNode for self [Attribute bind_addr] | provenance | |
| BindToAllInterfaces_test.py:46:1:46:4 | ControlFlowNode for host | BindToAllInterfaces_test.py:48:9:48:18 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:46:1:46:4 | ControlFlowNode for host | BindToAllInterfaces_test.py:48:9:48:12 | ControlFlowNode for host | provenance | |
| BindToAllInterfaces_test.py:46:8:46:44 | ControlFlowNode for Attribute() | BindToAllInterfaces_test.py:46:1:46:4 | ControlFlowNode for host | provenance | |
| BindToAllInterfaces_test.py:46:35:46:43 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:46:8:46:44 | ControlFlowNode for Attribute() | provenance | dict.get |
| BindToAllInterfaces_test.py:48:9:48:12 | ControlFlowNode for host | BindToAllInterfaces_test.py:48:9:48:18 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:53:10:53:18 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:53:10:53:25 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
| BindToAllInterfaces_test.py:58:10:58:18 | ControlFlowNode for StringLiteral | BindToAllInterfaces_test.py:58:10:58:25 | ControlFlowNode for Tuple | provenance | Sink:MaD:63 |
nodes
@@ -37,8 +41,11 @@ nodes
| BindToAllInterfaces_test.py:9:9:9:16 | ControlFlowNode for Tuple | semmle.label | ControlFlowNode for Tuple |
| BindToAllInterfaces_test.py:16:1:16:10 | ControlFlowNode for ALL_LOCALS | semmle.label | ControlFlowNode for ALL_LOCALS |
| BindToAllInterfaces_test.py:16:14:16:22 | ControlFlowNode for StringLiteral | semmle.label | ControlFlowNode for StringLiteral |
| BindToAllInterfaces_test.py:17:9:17:18 | ControlFlowNode for ALL_LOCALS | semmle.label | ControlFlowNode for ALL_LOCALS |
| BindToAllInterfaces_test.py:17:9:17:24 | ControlFlowNode for Tuple | semmle.label | ControlFlowNode for Tuple |
| BindToAllInterfaces_test.py:20:1:20:3 | ControlFlowNode for tup | semmle.label | ControlFlowNode for tup |
| BindToAllInterfaces_test.py:20:1:20:3 | ControlFlowNode for tup [Tuple element at index 0] | semmle.label | ControlFlowNode for tup [Tuple element at index 0] |
| BindToAllInterfaces_test.py:20:8:20:17 | ControlFlowNode for ALL_LOCALS | semmle.label | ControlFlowNode for ALL_LOCALS |
| BindToAllInterfaces_test.py:20:8:20:23 | ControlFlowNode for Tuple [Tuple element at index 0] | semmle.label | ControlFlowNode for Tuple [Tuple element at index 0] |
| BindToAllInterfaces_test.py:21:8:21:10 | ControlFlowNode for tup | semmle.label | ControlFlowNode for tup |
| BindToAllInterfaces_test.py:26:9:26:12 | ControlFlowNode for StringLiteral | semmle.label | ControlFlowNode for StringLiteral |
| BindToAllInterfaces_test.py:26:9:26:18 | ControlFlowNode for Tuple | semmle.label | ControlFlowNode for Tuple |
@@ -55,6 +62,7 @@ nodes
| BindToAllInterfaces_test.py:46:1:46:4 | ControlFlowNode for host | semmle.label | ControlFlowNode for host |
| BindToAllInterfaces_test.py:46:8:46:44 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| BindToAllInterfaces_test.py:46:35:46:43 | ControlFlowNode for StringLiteral | semmle.label | ControlFlowNode for StringLiteral |
| BindToAllInterfaces_test.py:48:9:48:12 | ControlFlowNode for host | semmle.label | ControlFlowNode for host |
| BindToAllInterfaces_test.py:48:9:48:18 | ControlFlowNode for Tuple | semmle.label | ControlFlowNode for Tuple |
| BindToAllInterfaces_test.py:53:10:53:18 | ControlFlowNode for StringLiteral | semmle.label | ControlFlowNode for StringLiteral |
| BindToAllInterfaces_test.py:53:10:53:25 | ControlFlowNode for Tuple | semmle.label | ControlFlowNode for Tuple |

View File

@@ -5,11 +5,13 @@ edges
| test.py:5:26:5:32 | ControlFlowNode for request | test.py:34:12:34:18 | ControlFlowNode for request | provenance | |
| test.py:5:26:5:32 | ControlFlowNode for request | test.py:42:12:42:18 | ControlFlowNode for request | provenance | |
| test.py:5:26:5:32 | ControlFlowNode for request | test.py:54:12:54:18 | ControlFlowNode for request | provenance | |
| test.py:13:5:13:12 | ControlFlowNode for data_raw | test.py:14:5:14:8 | ControlFlowNode for data | provenance | |
| test.py:13:5:13:12 | ControlFlowNode for data_raw | test.py:14:5:14:8 | ControlFlowNode for data | provenance | Decoding-Base64 |
| test.py:13:16:13:22 | ControlFlowNode for request | test.py:13:16:13:27 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| test.py:13:16:13:27 | ControlFlowNode for Attribute | test.py:13:16:13:39 | ControlFlowNode for Attribute() | provenance | dict.get |
| test.py:13:16:13:39 | ControlFlowNode for Attribute() | test.py:13:5:13:12 | ControlFlowNode for data_raw | provenance | |
| test.py:14:5:14:8 | ControlFlowNode for data | test.py:15:36:15:39 | ControlFlowNode for data | provenance | |
| test.py:23:5:23:12 | ControlFlowNode for data_raw | test.py:24:5:24:8 | ControlFlowNode for data | provenance | |
| test.py:23:5:23:12 | ControlFlowNode for data_raw | test.py:24:5:24:8 | ControlFlowNode for data | provenance | Decoding-Base64 |
| test.py:23:16:23:22 | ControlFlowNode for request | test.py:23:16:23:27 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| test.py:23:16:23:27 | ControlFlowNode for Attribute | test.py:23:16:23:39 | ControlFlowNode for Attribute() | provenance | dict.get |

View File

@@ -1,10 +1,13 @@
edges
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:5:25:5:28 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:8:23:8:26 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:11:25:11:38 | ControlFlowNode for Attribute() | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:14:25:14:40 | ControlFlowNode for Attribute() | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:11:34:11:37 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:14:35:14:38 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:17:32:17:35 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:4:22:4:25 | ControlFlowNode for name | src/unsafe_shell_test.py:20:27:20:30 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:11:34:11:37 | ControlFlowNode for name | src/unsafe_shell_test.py:11:25:11:38 | ControlFlowNode for Attribute() | provenance | str.join |
| src/unsafe_shell_test.py:14:34:14:39 | ControlFlowNode for List [List element] | src/unsafe_shell_test.py:14:25:14:40 | ControlFlowNode for Attribute() | provenance | str.join |
| src/unsafe_shell_test.py:14:35:14:38 | ControlFlowNode for name | src/unsafe_shell_test.py:14:34:14:39 | ControlFlowNode for List [List element] | provenance | |
| src/unsafe_shell_test.py:26:20:26:23 | ControlFlowNode for name | src/unsafe_shell_test.py:29:30:29:33 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:36:22:36:25 | ControlFlowNode for name | src/unsafe_shell_test.py:39:30:39:33 | ControlFlowNode for name | provenance | |
| src/unsafe_shell_test.py:36:22:36:25 | ControlFlowNode for name | src/unsafe_shell_test.py:44:20:44:23 | ControlFlowNode for name | provenance | |
@@ -15,7 +18,10 @@ nodes
| src/unsafe_shell_test.py:5:25:5:28 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |
| src/unsafe_shell_test.py:8:23:8:26 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |
| src/unsafe_shell_test.py:11:25:11:38 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| src/unsafe_shell_test.py:11:34:11:37 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |
| src/unsafe_shell_test.py:14:25:14:40 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| src/unsafe_shell_test.py:14:34:14:39 | ControlFlowNode for List [List element] | semmle.label | ControlFlowNode for List [List element] |
| src/unsafe_shell_test.py:14:35:14:38 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |
| src/unsafe_shell_test.py:17:32:17:35 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |
| src/unsafe_shell_test.py:20:27:20:30 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |
| src/unsafe_shell_test.py:26:20:26:23 | ControlFlowNode for name | semmle.label | ControlFlowNode for name |

View File

@@ -7,8 +7,10 @@ edges
| reflected_xss.py:9:18:9:24 | ControlFlowNode for request | reflected_xss.py:9:18:9:29 | ControlFlowNode for Attribute | provenance | AdditionalTaintStep |
| reflected_xss.py:9:18:9:29 | ControlFlowNode for Attribute | reflected_xss.py:9:18:9:45 | ControlFlowNode for Attribute() | provenance | dict.get |
| reflected_xss.py:9:18:9:45 | ControlFlowNode for Attribute() | reflected_xss.py:9:5:9:14 | ControlFlowNode for first_name | provenance | |
| reflected_xss.py:21:5:21:8 | ControlFlowNode for data | reflected_xss.py:22:26:22:41 | ControlFlowNode for Attribute() | provenance | |
| reflected_xss.py:21:5:21:8 | ControlFlowNode for data | reflected_xss.py:22:26:22:41 | ControlFlowNode for Attribute() | provenance | AdditionalTaintStep |
| reflected_xss.py:21:23:21:29 | ControlFlowNode for request | reflected_xss.py:21:5:21:8 | ControlFlowNode for data | provenance | AdditionalTaintStep |
| reflected_xss.py:27:5:27:8 | ControlFlowNode for data | reflected_xss.py:28:26:28:41 | ControlFlowNode for Attribute() | provenance | |
| reflected_xss.py:27:5:27:8 | ControlFlowNode for data | reflected_xss.py:28:26:28:41 | ControlFlowNode for Attribute() | provenance | AdditionalTaintStep |
| reflected_xss.py:27:23:27:29 | ControlFlowNode for request | reflected_xss.py:27:5:27:8 | ControlFlowNode for data | provenance | AdditionalTaintStep |
nodes

View File

@@ -7,7 +7,8 @@ edges
| test.py:50:29:50:31 | ControlFlowNode for err | test.py:50:16:50:32 | ControlFlowNode for format_error() | provenance | |
| test.py:50:29:50:31 | ControlFlowNode for err | test.py:52:18:52:20 | ControlFlowNode for msg | provenance | |
| test.py:52:18:52:20 | ControlFlowNode for msg | test.py:53:12:53:27 | ControlFlowNode for BinaryExpr | provenance | |
| test.py:65:25:65:25 | ControlFlowNode for e | test.py:66:24:66:40 | ControlFlowNode for Dict | provenance | |
| test.py:65:25:65:25 | ControlFlowNode for e | test.py:66:34:66:39 | ControlFlowNode for str() | provenance | |
| test.py:66:34:66:39 | ControlFlowNode for str() | test.py:66:24:66:40 | ControlFlowNode for Dict | provenance | |
nodes
| test.py:16:16:16:37 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| test.py:23:25:23:25 | ControlFlowNode for e | semmle.label | ControlFlowNode for e |
@@ -23,6 +24,7 @@ nodes
| test.py:53:12:53:27 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| test.py:65:25:65:25 | ControlFlowNode for e | semmle.label | ControlFlowNode for e |
| test.py:66:24:66:40 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| test.py:66:34:66:39 | ControlFlowNode for str() | semmle.label | ControlFlowNode for str() |
subpaths
| test.py:50:29:50:31 | ControlFlowNode for err | test.py:52:18:52:20 | ControlFlowNode for msg | test.py:53:12:53:27 | ControlFlowNode for BinaryExpr | test.py:50:16:50:32 | ControlFlowNode for format_error() |
#select

View File

@@ -22,8 +22,6 @@ edges
| test.py:67:38:67:48 | ControlFlowNode for bank_number | test.py:70:15:70:25 | ControlFlowNode for bank_number | provenance | |
| test.py:67:76:67:78 | ControlFlowNode for ccn | test.py:73:15:73:17 | ControlFlowNode for ccn | provenance | |
| test.py:67:81:67:88 | ControlFlowNode for user_ccn | test.py:74:15:74:22 | ControlFlowNode for user_ccn | provenance | |
| test.py:101:5:101:10 | ControlFlowNode for config | test.py:105:11:105:31 | ControlFlowNode for Subscript | provenance | |
| test.py:103:21:103:37 | ControlFlowNode for Attribute | test.py:101:5:101:10 | ControlFlowNode for config | provenance | |
nodes
| test.py:19:5:19:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:19:16:19:29 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
@@ -68,9 +66,6 @@ nodes
| test.py:70:15:70:25 | ControlFlowNode for bank_number | semmle.label | ControlFlowNode for bank_number |
| test.py:73:15:73:17 | ControlFlowNode for ccn | semmle.label | ControlFlowNode for ccn |
| test.py:74:15:74:22 | ControlFlowNode for user_ccn | semmle.label | ControlFlowNode for user_ccn |
| test.py:101:5:101:10 | ControlFlowNode for config | semmle.label | ControlFlowNode for config |
| test.py:103:21:103:37 | ControlFlowNode for Attribute | semmle.label | ControlFlowNode for Attribute |
| test.py:105:11:105:31 | ControlFlowNode for Subscript | semmle.label | ControlFlowNode for Subscript |
subpaths
#select
| test.py:20:48:20:55 | ControlFlowNode for password | test.py:19:16:19:29 | ControlFlowNode for get_password() | test.py:20:48:20:55 | ControlFlowNode for password | This expression logs $@ as clear text. | test.py:19:16:19:29 | ControlFlowNode for get_password() | sensitive data (password) |
@@ -97,4 +92,3 @@ subpaths
| test.py:70:15:70:25 | ControlFlowNode for bank_number | test.py:67:38:67:48 | ControlFlowNode for bank_number | test.py:70:15:70:25 | ControlFlowNode for bank_number | This expression logs $@ as clear text. | test.py:67:38:67:48 | ControlFlowNode for bank_number | sensitive data (private) |
| test.py:73:15:73:17 | ControlFlowNode for ccn | test.py:67:76:67:78 | ControlFlowNode for ccn | test.py:73:15:73:17 | ControlFlowNode for ccn | This expression logs $@ as clear text. | test.py:67:76:67:78 | ControlFlowNode for ccn | sensitive data (private) |
| test.py:74:15:74:22 | ControlFlowNode for user_ccn | test.py:67:81:67:88 | ControlFlowNode for user_ccn | test.py:74:15:74:22 | ControlFlowNode for user_ccn | This expression logs $@ as clear text. | test.py:67:81:67:88 | ControlFlowNode for user_ccn | sensitive data (private) |
| test.py:105:11:105:31 | ControlFlowNode for Subscript | test.py:103:21:103:37 | ControlFlowNode for Attribute | test.py:105:11:105:31 | ControlFlowNode for Subscript | This expression logs $@ as clear text. | test.py:103:21:103:37 | ControlFlowNode for Attribute | sensitive data (password) |

View File

@@ -4,9 +4,11 @@ edges
| password_in_cookie.py:14:5:14:12 | ControlFlowNode for password | password_in_cookie.py:16:33:16:40 | ControlFlowNode for password | provenance | |
| password_in_cookie.py:14:16:14:43 | ControlFlowNode for Attribute() | password_in_cookie.py:14:5:14:12 | ControlFlowNode for password | provenance | |
| test.py:15:5:15:12 | ControlFlowNode for password | test.py:17:20:17:27 | ControlFlowNode for password | provenance | |
| test.py:15:5:15:12 | ControlFlowNode for password | test.py:18:9:18:13 | ControlFlowNode for lines | provenance | |
| test.py:15:5:15:12 | ControlFlowNode for password | test.py:18:18:18:32 | ControlFlowNode for BinaryExpr | provenance | |
| test.py:15:16:15:29 | ControlFlowNode for get_password() | test.py:15:5:15:12 | ControlFlowNode for password | provenance | |
| test.py:18:9:18:13 | ControlFlowNode for lines | test.py:19:25:19:29 | ControlFlowNode for lines | provenance | |
| test.py:18:9:18:13 | ControlFlowNode for lines [List element] | test.py:19:25:19:29 | ControlFlowNode for lines | provenance | |
| test.py:18:17:18:33 | ControlFlowNode for List [List element] | test.py:18:9:18:13 | ControlFlowNode for lines [List element] | provenance | |
| test.py:18:18:18:32 | ControlFlowNode for BinaryExpr | test.py:18:17:18:33 | ControlFlowNode for List [List element] | provenance | |
nodes
| password_in_cookie.py:7:5:7:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| password_in_cookie.py:7:16:7:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
@@ -17,7 +19,9 @@ nodes
| test.py:15:5:15:12 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:15:16:15:29 | ControlFlowNode for get_password() | semmle.label | ControlFlowNode for get_password() |
| test.py:17:20:17:27 | ControlFlowNode for password | semmle.label | ControlFlowNode for password |
| test.py:18:9:18:13 | ControlFlowNode for lines | semmle.label | ControlFlowNode for lines |
| test.py:18:9:18:13 | ControlFlowNode for lines [List element] | semmle.label | ControlFlowNode for lines [List element] |
| test.py:18:17:18:33 | ControlFlowNode for List [List element] | semmle.label | ControlFlowNode for List [List element] |
| test.py:18:18:18:32 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| test.py:19:25:19:29 | ControlFlowNode for lines | semmle.label | ControlFlowNode for lines |
subpaths
#select

View File

@@ -82,14 +82,19 @@ edges
| full_partial_test.py:61:5:61:7 | ControlFlowNode for url | full_partial_test.py:63:18:63:20 | ControlFlowNode for url | provenance | |
| full_partial_test.py:66:5:66:14 | ControlFlowNode for user_input | full_partial_test.py:70:5:70:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:66:5:66:14 | ControlFlowNode for user_input | full_partial_test.py:74:5:74:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:66:5:66:14 | ControlFlowNode for user_input | full_partial_test.py:78:5:78:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:66:5:66:14 | ControlFlowNode for user_input | full_partial_test.py:78:38:78:47 | ControlFlowNode for user_input | provenance | |
| full_partial_test.py:66:18:66:24 | ControlFlowNode for request | full_partial_test.py:66:5:66:14 | ControlFlowNode for user_input | provenance | AdditionalTaintStep |
| full_partial_test.py:66:18:66:24 | ControlFlowNode for request | full_partial_test.py:67:5:67:13 | ControlFlowNode for query_val | provenance | AdditionalTaintStep |
| full_partial_test.py:67:5:67:13 | ControlFlowNode for query_val | full_partial_test.py:78:5:78:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:67:5:67:13 | ControlFlowNode for query_val | full_partial_test.py:78:50:78:58 | ControlFlowNode for query_val | provenance | |
| full_partial_test.py:67:17:67:23 | ControlFlowNode for request | full_partial_test.py:67:5:67:13 | ControlFlowNode for query_val | provenance | AdditionalTaintStep |
| full_partial_test.py:70:5:70:7 | ControlFlowNode for url | full_partial_test.py:72:18:72:20 | ControlFlowNode for url | provenance | |
| full_partial_test.py:74:5:74:7 | ControlFlowNode for url | full_partial_test.py:76:18:76:20 | ControlFlowNode for url | provenance | |
| full_partial_test.py:78:5:78:7 | ControlFlowNode for url | full_partial_test.py:80:18:80:20 | ControlFlowNode for url | provenance | |
| full_partial_test.py:78:11:78:59 | ControlFlowNode for BinaryExpr | full_partial_test.py:78:5:78:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:78:38:78:47 | ControlFlowNode for user_input | full_partial_test.py:78:38:78:58 | ControlFlowNode for Tuple [Tuple element at index 0] | provenance | |
| full_partial_test.py:78:38:78:58 | ControlFlowNode for Tuple [Tuple element at index 0] | full_partial_test.py:78:11:78:59 | ControlFlowNode for BinaryExpr | provenance | |
| full_partial_test.py:78:38:78:58 | ControlFlowNode for Tuple [Tuple element at index 1] | full_partial_test.py:78:11:78:59 | ControlFlowNode for BinaryExpr | provenance | |
| full_partial_test.py:78:50:78:58 | ControlFlowNode for query_val | full_partial_test.py:78:38:78:58 | ControlFlowNode for Tuple [Tuple element at index 1] | provenance | |
| full_partial_test.py:83:5:83:14 | ControlFlowNode for user_input | full_partial_test.py:87:5:87:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:83:5:83:14 | ControlFlowNode for user_input | full_partial_test.py:91:5:91:7 | ControlFlowNode for url | provenance | |
| full_partial_test.py:83:5:83:14 | ControlFlowNode for user_input | full_partial_test.py:95:5:95:7 | ControlFlowNode for url | provenance | |
@@ -274,6 +279,11 @@ nodes
| full_partial_test.py:74:5:74:7 | ControlFlowNode for url | semmle.label | ControlFlowNode for url |
| full_partial_test.py:76:18:76:20 | ControlFlowNode for url | semmle.label | ControlFlowNode for url |
| full_partial_test.py:78:5:78:7 | ControlFlowNode for url | semmle.label | ControlFlowNode for url |
| full_partial_test.py:78:11:78:59 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| full_partial_test.py:78:38:78:47 | ControlFlowNode for user_input | semmle.label | ControlFlowNode for user_input |
| full_partial_test.py:78:38:78:58 | ControlFlowNode for Tuple [Tuple element at index 0] | semmle.label | ControlFlowNode for Tuple [Tuple element at index 0] |
| full_partial_test.py:78:38:78:58 | ControlFlowNode for Tuple [Tuple element at index 1] | semmle.label | ControlFlowNode for Tuple [Tuple element at index 1] |
| full_partial_test.py:78:50:78:58 | ControlFlowNode for query_val | semmle.label | ControlFlowNode for query_val |
| full_partial_test.py:80:18:80:20 | ControlFlowNode for url | semmle.label | ControlFlowNode for url |
| full_partial_test.py:83:5:83:14 | ControlFlowNode for user_input | semmle.label | ControlFlowNode for user_input |
| full_partial_test.py:83:18:83:24 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |

View File

@@ -7,25 +7,34 @@ edges
| PoC/server.py:1:26:1:32 | ControlFlowNode for request | PoC/server.py:98:14:98:20 | ControlFlowNode for request | provenance | |
| PoC/server.py:26:5:26:17 | ControlFlowNode for author_string | PoC/server.py:27:25:27:37 | ControlFlowNode for author_string | provenance | |
| PoC/server.py:26:21:26:27 | ControlFlowNode for request | PoC/server.py:26:5:26:17 | ControlFlowNode for author_string | provenance | AdditionalTaintStep |
| PoC/server.py:27:5:27:10 | ControlFlowNode for author | PoC/server.py:30:27:30:44 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:27:5:27:10 | ControlFlowNode for author | PoC/server.py:31:34:31:51 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:27:5:27:10 | ControlFlowNode for author | PoC/server.py:30:38:30:43 | ControlFlowNode for author | provenance | |
| PoC/server.py:27:5:27:10 | ControlFlowNode for author | PoC/server.py:31:45:31:50 | ControlFlowNode for author | provenance | |
| PoC/server.py:27:14:27:38 | ControlFlowNode for Attribute() | PoC/server.py:27:5:27:10 | ControlFlowNode for author | provenance | |
| PoC/server.py:27:25:27:37 | ControlFlowNode for author_string | PoC/server.py:27:14:27:38 | ControlFlowNode for Attribute() | provenance | Config |
| PoC/server.py:30:38:30:43 | ControlFlowNode for author | PoC/server.py:30:27:30:44 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:31:45:31:50 | ControlFlowNode for author | PoC/server.py:31:34:31:51 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:43:5:43:10 | ControlFlowNode for author | PoC/server.py:47:38:47:67 | ControlFlowNode for BinaryExpr | provenance | |
| PoC/server.py:43:14:43:20 | ControlFlowNode for request | PoC/server.py:43:5:43:10 | ControlFlowNode for author | provenance | AdditionalTaintStep |
| PoC/server.py:47:38:47:67 | ControlFlowNode for BinaryExpr | PoC/server.py:47:27:47:68 | ControlFlowNode for Dict | provenance | Config |
| PoC/server.py:52:5:52:10 | ControlFlowNode for author | PoC/server.py:54:17:54:70 | ControlFlowNode for BinaryExpr | provenance | |
| PoC/server.py:52:14:52:20 | ControlFlowNode for request | PoC/server.py:52:5:52:10 | ControlFlowNode for author | provenance | AdditionalTaintStep |
| PoC/server.py:53:5:53:10 | ControlFlowNode for search | PoC/server.py:61:27:61:58 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:53:5:53:10 | ControlFlowNode for search | PoC/server.py:61:51:61:56 | ControlFlowNode for search | provenance | |
| PoC/server.py:53:14:57:5 | ControlFlowNode for Dict | PoC/server.py:53:5:53:10 | ControlFlowNode for search | provenance | |
| PoC/server.py:54:17:54:70 | ControlFlowNode for BinaryExpr | PoC/server.py:53:14:57:5 | ControlFlowNode for Dict | provenance | Config |
| PoC/server.py:61:37:61:57 | ControlFlowNode for Dict [Dictionary element at key $function] | PoC/server.py:61:27:61:58 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:61:51:61:56 | ControlFlowNode for search | PoC/server.py:61:37:61:57 | ControlFlowNode for Dict [Dictionary element at key $function] | provenance | |
| PoC/server.py:77:5:77:10 | ControlFlowNode for author | PoC/server.py:80:23:80:101 | ControlFlowNode for BinaryExpr | provenance | |
| PoC/server.py:77:14:77:20 | ControlFlowNode for request | PoC/server.py:77:5:77:10 | ControlFlowNode for author | provenance | AdditionalTaintStep |
| PoC/server.py:78:5:78:15 | ControlFlowNode for accumulator | PoC/server.py:84:5:84:9 | ControlFlowNode for group | provenance | |
| PoC/server.py:78:5:78:15 | ControlFlowNode for accumulator | PoC/server.py:86:37:86:47 | ControlFlowNode for accumulator | provenance | |
| PoC/server.py:78:19:83:5 | ControlFlowNode for Dict | PoC/server.py:78:5:78:15 | ControlFlowNode for accumulator | provenance | |
| PoC/server.py:80:23:80:101 | ControlFlowNode for BinaryExpr | PoC/server.py:78:19:83:5 | ControlFlowNode for Dict | provenance | Config |
| PoC/server.py:84:5:84:9 | ControlFlowNode for group | PoC/server.py:91:29:91:47 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:84:5:84:9 | ControlFlowNode for group | PoC/server.py:92:38:92:56 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:84:5:84:9 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | PoC/server.py:91:41:91:45 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | provenance | |
| PoC/server.py:84:5:84:9 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | PoC/server.py:92:50:92:54 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | provenance | |
| PoC/server.py:84:13:87:5 | ControlFlowNode for Dict [Dictionary element at key author, Dictionary element at key $accumulator] | PoC/server.py:84:5:84:9 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | provenance | |
| PoC/server.py:86:19:86:49 | ControlFlowNode for Dict [Dictionary element at key $accumulator] | PoC/server.py:84:13:87:5 | ControlFlowNode for Dict [Dictionary element at key author, Dictionary element at key $accumulator] | provenance | |
| PoC/server.py:86:37:86:47 | ControlFlowNode for accumulator | PoC/server.py:86:19:86:49 | ControlFlowNode for Dict [Dictionary element at key $accumulator] | provenance | |
| PoC/server.py:91:41:91:45 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | PoC/server.py:91:29:91:47 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:92:50:92:54 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | PoC/server.py:92:38:92:56 | ControlFlowNode for Dict | provenance | |
| PoC/server.py:98:5:98:10 | ControlFlowNode for author | PoC/server.py:99:5:99:10 | ControlFlowNode for mapper | provenance | |
| PoC/server.py:98:14:98:20 | ControlFlowNode for request | PoC/server.py:98:5:98:10 | ControlFlowNode for author | provenance | AdditionalTaintStep |
| PoC/server.py:99:5:99:10 | ControlFlowNode for mapper | PoC/server.py:102:9:102:14 | ControlFlowNode for mapper | provenance | |
@@ -39,16 +48,18 @@ edges
| flask_mongoengine_bad.py:20:30:20:42 | ControlFlowNode for unsafe_search | flask_mongoengine_bad.py:20:19:20:43 | ControlFlowNode for Attribute() | provenance | Config |
| flask_mongoengine_bad.py:26:5:26:17 | ControlFlowNode for unsafe_search | flask_mongoengine_bad.py:27:30:27:42 | ControlFlowNode for unsafe_search | provenance | |
| flask_mongoengine_bad.py:26:21:26:27 | ControlFlowNode for request | flask_mongoengine_bad.py:26:5:26:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| flask_mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | flask_mongoengine_bad.py:30:39:30:59 | ControlFlowNode for Dict | provenance | |
| flask_mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | flask_mongoengine_bad.py:30:48:30:58 | ControlFlowNode for json_search | provenance | |
| flask_mongoengine_bad.py:27:19:27:43 | ControlFlowNode for Attribute() | flask_mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | provenance | |
| flask_mongoengine_bad.py:27:30:27:42 | ControlFlowNode for unsafe_search | flask_mongoengine_bad.py:27:19:27:43 | ControlFlowNode for Attribute() | provenance | Config |
| flask_mongoengine_bad.py:30:48:30:58 | ControlFlowNode for json_search | flask_mongoengine_bad.py:30:39:30:59 | ControlFlowNode for Dict | provenance | |
| flask_pymongo_bad.py:1:26:1:32 | ControlFlowNode for ImportMember | flask_pymongo_bad.py:1:26:1:32 | ControlFlowNode for request | provenance | |
| flask_pymongo_bad.py:1:26:1:32 | ControlFlowNode for request | flask_pymongo_bad.py:11:21:11:27 | ControlFlowNode for request | provenance | |
| flask_pymongo_bad.py:11:5:11:17 | ControlFlowNode for unsafe_search | flask_pymongo_bad.py:12:30:12:42 | ControlFlowNode for unsafe_search | provenance | |
| flask_pymongo_bad.py:11:21:11:27 | ControlFlowNode for request | flask_pymongo_bad.py:11:5:11:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| flask_pymongo_bad.py:12:5:12:15 | ControlFlowNode for json_search | flask_pymongo_bad.py:14:31:14:51 | ControlFlowNode for Dict | provenance | |
| flask_pymongo_bad.py:12:5:12:15 | ControlFlowNode for json_search | flask_pymongo_bad.py:14:40:14:50 | ControlFlowNode for json_search | provenance | |
| flask_pymongo_bad.py:12:19:12:43 | ControlFlowNode for Attribute() | flask_pymongo_bad.py:12:5:12:15 | ControlFlowNode for json_search | provenance | |
| flask_pymongo_bad.py:12:30:12:42 | ControlFlowNode for unsafe_search | flask_pymongo_bad.py:12:19:12:43 | ControlFlowNode for Attribute() | provenance | Config |
| flask_pymongo_bad.py:14:40:14:50 | ControlFlowNode for json_search | flask_pymongo_bad.py:14:31:14:51 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:1:26:1:32 | ControlFlowNode for ImportMember | mongoengine_bad.py:1:26:1:32 | ControlFlowNode for request | provenance | |
| mongoengine_bad.py:1:26:1:32 | ControlFlowNode for request | mongoengine_bad.py:18:21:18:27 | ControlFlowNode for request | provenance | |
| mongoengine_bad.py:1:26:1:32 | ControlFlowNode for request | mongoengine_bad.py:26:21:26:27 | ControlFlowNode for request | provenance | |
@@ -58,24 +69,28 @@ edges
| mongoengine_bad.py:1:26:1:32 | ControlFlowNode for request | mongoengine_bad.py:57:21:57:27 | ControlFlowNode for request | provenance | |
| mongoengine_bad.py:18:5:18:17 | ControlFlowNode for unsafe_search | mongoengine_bad.py:19:30:19:42 | ControlFlowNode for unsafe_search | provenance | |
| mongoengine_bad.py:18:21:18:27 | ControlFlowNode for request | mongoengine_bad.py:18:5:18:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| mongoengine_bad.py:19:5:19:15 | ControlFlowNode for json_search | mongoengine_bad.py:22:26:22:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:19:5:19:15 | ControlFlowNode for json_search | mongoengine_bad.py:22:35:22:45 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:19:19:19:43 | ControlFlowNode for Attribute() | mongoengine_bad.py:19:5:19:15 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:19:30:19:42 | ControlFlowNode for unsafe_search | mongoengine_bad.py:19:19:19:43 | ControlFlowNode for Attribute() | provenance | Config |
| mongoengine_bad.py:22:35:22:45 | ControlFlowNode for json_search | mongoengine_bad.py:22:26:22:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:26:5:26:17 | ControlFlowNode for unsafe_search | mongoengine_bad.py:27:30:27:42 | ControlFlowNode for unsafe_search | provenance | |
| mongoengine_bad.py:26:21:26:27 | ControlFlowNode for request | mongoengine_bad.py:26:5:26:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | mongoengine_bad.py:30:26:30:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | mongoengine_bad.py:30:35:30:45 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:27:19:27:43 | ControlFlowNode for Attribute() | mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:27:30:27:42 | ControlFlowNode for unsafe_search | mongoengine_bad.py:27:19:27:43 | ControlFlowNode for Attribute() | provenance | Config |
| mongoengine_bad.py:30:35:30:45 | ControlFlowNode for json_search | mongoengine_bad.py:30:26:30:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:34:5:34:17 | ControlFlowNode for unsafe_search | mongoengine_bad.py:35:30:35:42 | ControlFlowNode for unsafe_search | provenance | |
| mongoengine_bad.py:34:21:34:27 | ControlFlowNode for request | mongoengine_bad.py:34:5:34:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| mongoengine_bad.py:35:5:35:15 | ControlFlowNode for json_search | mongoengine_bad.py:38:26:38:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:35:5:35:15 | ControlFlowNode for json_search | mongoengine_bad.py:38:35:38:45 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:35:19:35:43 | ControlFlowNode for Attribute() | mongoengine_bad.py:35:5:35:15 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:35:30:35:42 | ControlFlowNode for unsafe_search | mongoengine_bad.py:35:19:35:43 | ControlFlowNode for Attribute() | provenance | Config |
| mongoengine_bad.py:38:35:38:45 | ControlFlowNode for json_search | mongoengine_bad.py:38:26:38:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:42:5:42:17 | ControlFlowNode for unsafe_search | mongoengine_bad.py:43:30:43:42 | ControlFlowNode for unsafe_search | provenance | |
| mongoengine_bad.py:42:21:42:27 | ControlFlowNode for request | mongoengine_bad.py:42:5:42:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| mongoengine_bad.py:43:5:43:15 | ControlFlowNode for json_search | mongoengine_bad.py:46:26:46:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:43:5:43:15 | ControlFlowNode for json_search | mongoengine_bad.py:46:35:46:45 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:43:19:43:43 | ControlFlowNode for Attribute() | mongoengine_bad.py:43:5:43:15 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:43:30:43:42 | ControlFlowNode for unsafe_search | mongoengine_bad.py:43:19:43:43 | ControlFlowNode for Attribute() | provenance | Config |
| mongoengine_bad.py:46:35:46:45 | ControlFlowNode for json_search | mongoengine_bad.py:46:26:46:46 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:50:5:50:17 | ControlFlowNode for unsafe_search | mongoengine_bad.py:51:30:51:42 | ControlFlowNode for unsafe_search | provenance | |
| mongoengine_bad.py:50:21:50:27 | ControlFlowNode for request | mongoengine_bad.py:50:5:50:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| mongoengine_bad.py:51:5:51:15 | ControlFlowNode for json_search | mongoengine_bad.py:53:34:53:44 | ControlFlowNode for json_search | provenance | |
@@ -83,9 +98,10 @@ edges
| mongoengine_bad.py:51:30:51:42 | ControlFlowNode for unsafe_search | mongoengine_bad.py:51:19:51:43 | ControlFlowNode for Attribute() | provenance | Config |
| mongoengine_bad.py:57:5:57:17 | ControlFlowNode for unsafe_search | mongoengine_bad.py:58:30:58:42 | ControlFlowNode for unsafe_search | provenance | |
| mongoengine_bad.py:57:21:57:27 | ControlFlowNode for request | mongoengine_bad.py:57:5:57:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| mongoengine_bad.py:58:5:58:15 | ControlFlowNode for json_search | mongoengine_bad.py:61:29:61:49 | ControlFlowNode for Dict | provenance | |
| mongoengine_bad.py:58:5:58:15 | ControlFlowNode for json_search | mongoengine_bad.py:61:38:61:48 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:58:19:58:43 | ControlFlowNode for Attribute() | mongoengine_bad.py:58:5:58:15 | ControlFlowNode for json_search | provenance | |
| mongoengine_bad.py:58:30:58:42 | ControlFlowNode for unsafe_search | mongoengine_bad.py:58:19:58:43 | ControlFlowNode for Attribute() | provenance | Config |
| mongoengine_bad.py:61:38:61:48 | ControlFlowNode for json_search | mongoengine_bad.py:61:29:61:49 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:1:26:1:32 | ControlFlowNode for ImportMember | pymongo_test.py:1:26:1:32 | ControlFlowNode for request | provenance | |
| pymongo_test.py:1:26:1:32 | ControlFlowNode for request | pymongo_test.py:12:21:12:27 | ControlFlowNode for request | provenance | |
| pymongo_test.py:1:26:1:32 | ControlFlowNode for request | pymongo_test.py:29:27:29:33 | ControlFlowNode for request | provenance | |
@@ -93,9 +109,10 @@ edges
| pymongo_test.py:1:26:1:32 | ControlFlowNode for request | pymongo_test.py:52:26:52:32 | ControlFlowNode for request | provenance | |
| pymongo_test.py:12:5:12:17 | ControlFlowNode for unsafe_search | pymongo_test.py:13:30:13:42 | ControlFlowNode for unsafe_search | provenance | |
| pymongo_test.py:12:21:12:27 | ControlFlowNode for request | pymongo_test.py:12:5:12:17 | ControlFlowNode for unsafe_search | provenance | AdditionalTaintStep |
| pymongo_test.py:13:5:13:15 | ControlFlowNode for json_search | pymongo_test.py:15:42:15:62 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:13:5:13:15 | ControlFlowNode for json_search | pymongo_test.py:15:51:15:61 | ControlFlowNode for json_search | provenance | |
| pymongo_test.py:13:19:13:43 | ControlFlowNode for Attribute() | pymongo_test.py:13:5:13:15 | ControlFlowNode for json_search | provenance | |
| pymongo_test.py:13:30:13:42 | ControlFlowNode for unsafe_search | pymongo_test.py:13:19:13:43 | ControlFlowNode for Attribute() | provenance | Config |
| pymongo_test.py:15:51:15:61 | ControlFlowNode for json_search | pymongo_test.py:15:42:15:62 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:29:5:29:12 | ControlFlowNode for event_id | pymongo_test.py:33:45:33:72 | ControlFlowNode for Fstring | provenance | |
| pymongo_test.py:29:16:29:51 | ControlFlowNode for Attribute() | pymongo_test.py:29:5:29:12 | ControlFlowNode for event_id | provenance | |
| pymongo_test.py:29:27:29:33 | ControlFlowNode for request | pymongo_test.py:29:27:29:50 | ControlFlowNode for Subscript | provenance | AdditionalTaintStep |
@@ -112,13 +129,23 @@ edges
| pymongo_test.py:52:15:52:50 | ControlFlowNode for Attribute() | pymongo_test.py:52:5:52:11 | ControlFlowNode for decoded | provenance | |
| pymongo_test.py:52:26:52:32 | ControlFlowNode for request | pymongo_test.py:52:26:52:49 | ControlFlowNode for Subscript | provenance | AdditionalTaintStep |
| pymongo_test.py:52:26:52:49 | ControlFlowNode for Subscript | pymongo_test.py:52:15:52:50 | ControlFlowNode for Attribute() | provenance | Config |
| pymongo_test.py:54:5:54:10 | ControlFlowNode for search | pymongo_test.py:59:25:59:56 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:54:5:54:10 | ControlFlowNode for search | pymongo_test.py:59:49:59:54 | ControlFlowNode for search | provenance | |
| pymongo_test.py:54:5:54:10 | ControlFlowNode for search [Dictionary element at key body] | pymongo_test.py:59:49:59:54 | ControlFlowNode for search [Dictionary element at key body] | provenance | |
| pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict | pymongo_test.py:54:5:54:10 | ControlFlowNode for search | provenance | |
| pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict [Dictionary element at key body] | pymongo_test.py:54:5:54:10 | ControlFlowNode for search [Dictionary element at key body] | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict | provenance | Decoding-NoSQL |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:61:25:61:57 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:62:25:62:42 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict [Dictionary element at key body] | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:61:49:61:55 | ControlFlowNode for decoded | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:62:35:62:41 | ControlFlowNode for decoded | provenance | |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | pymongo_test.py:63:25:63:31 | ControlFlowNode for decoded | provenance | |
| pymongo_test.py:59:35:59:55 | ControlFlowNode for Dict [Dictionary element at key $function, Dictionary element at key body] | pymongo_test.py:59:25:59:56 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:59:35:59:55 | ControlFlowNode for Dict [Dictionary element at key $function] | pymongo_test.py:59:25:59:56 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:59:49:59:54 | ControlFlowNode for search | pymongo_test.py:59:35:59:55 | ControlFlowNode for Dict [Dictionary element at key $function] | provenance | |
| pymongo_test.py:59:49:59:54 | ControlFlowNode for search [Dictionary element at key body] | pymongo_test.py:59:35:59:55 | ControlFlowNode for Dict [Dictionary element at key $function, Dictionary element at key body] | provenance | |
| pymongo_test.py:61:35:61:56 | ControlFlowNode for Dict [Dictionary element at key $function] | pymongo_test.py:61:25:61:57 | ControlFlowNode for Dict | provenance | |
| pymongo_test.py:61:49:61:55 | ControlFlowNode for decoded | pymongo_test.py:61:35:61:56 | ControlFlowNode for Dict [Dictionary element at key $function] | provenance | |
| pymongo_test.py:62:35:62:41 | ControlFlowNode for decoded | pymongo_test.py:62:25:62:42 | ControlFlowNode for Dict | provenance | |
nodes
| PoC/server.py:1:26:1:32 | ControlFlowNode for ImportMember | semmle.label | ControlFlowNode for ImportMember |
| PoC/server.py:1:26:1:32 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
@@ -128,7 +155,9 @@ nodes
| PoC/server.py:27:14:27:38 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| PoC/server.py:27:25:27:37 | ControlFlowNode for author_string | semmle.label | ControlFlowNode for author_string |
| PoC/server.py:30:27:30:44 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:30:38:30:43 | ControlFlowNode for author | semmle.label | ControlFlowNode for author |
| PoC/server.py:31:34:31:51 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:31:45:31:50 | ControlFlowNode for author | semmle.label | ControlFlowNode for author |
| PoC/server.py:43:5:43:10 | ControlFlowNode for author | semmle.label | ControlFlowNode for author |
| PoC/server.py:43:14:43:20 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| PoC/server.py:47:27:47:68 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
@@ -139,14 +168,21 @@ nodes
| PoC/server.py:53:14:57:5 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:54:17:54:70 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| PoC/server.py:61:27:61:58 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:61:37:61:57 | ControlFlowNode for Dict [Dictionary element at key $function] | semmle.label | ControlFlowNode for Dict [Dictionary element at key $function] |
| PoC/server.py:61:51:61:56 | ControlFlowNode for search | semmle.label | ControlFlowNode for search |
| PoC/server.py:77:5:77:10 | ControlFlowNode for author | semmle.label | ControlFlowNode for author |
| PoC/server.py:77:14:77:20 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| PoC/server.py:78:5:78:15 | ControlFlowNode for accumulator | semmle.label | ControlFlowNode for accumulator |
| PoC/server.py:78:19:83:5 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:80:23:80:101 | ControlFlowNode for BinaryExpr | semmle.label | ControlFlowNode for BinaryExpr |
| PoC/server.py:84:5:84:9 | ControlFlowNode for group | semmle.label | ControlFlowNode for group |
| PoC/server.py:84:5:84:9 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | semmle.label | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] |
| PoC/server.py:84:13:87:5 | ControlFlowNode for Dict [Dictionary element at key author, Dictionary element at key $accumulator] | semmle.label | ControlFlowNode for Dict [Dictionary element at key author, Dictionary element at key $accumulator] |
| PoC/server.py:86:19:86:49 | ControlFlowNode for Dict [Dictionary element at key $accumulator] | semmle.label | ControlFlowNode for Dict [Dictionary element at key $accumulator] |
| PoC/server.py:86:37:86:47 | ControlFlowNode for accumulator | semmle.label | ControlFlowNode for accumulator |
| PoC/server.py:91:29:91:47 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:91:41:91:45 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | semmle.label | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] |
| PoC/server.py:92:38:92:56 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| PoC/server.py:92:50:92:54 | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] | semmle.label | ControlFlowNode for group [Dictionary element at key author, Dictionary element at key $accumulator] |
| PoC/server.py:98:5:98:10 | ControlFlowNode for author | semmle.label | ControlFlowNode for author |
| PoC/server.py:98:14:98:20 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| PoC/server.py:99:5:99:10 | ControlFlowNode for mapper | semmle.label | ControlFlowNode for mapper |
@@ -165,6 +201,7 @@ nodes
| flask_mongoengine_bad.py:27:19:27:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| flask_mongoengine_bad.py:27:30:27:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| flask_mongoengine_bad.py:30:39:30:59 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| flask_mongoengine_bad.py:30:48:30:58 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| flask_pymongo_bad.py:1:26:1:32 | ControlFlowNode for ImportMember | semmle.label | ControlFlowNode for ImportMember |
| flask_pymongo_bad.py:1:26:1:32 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| flask_pymongo_bad.py:11:5:11:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
@@ -173,6 +210,7 @@ nodes
| flask_pymongo_bad.py:12:19:12:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| flask_pymongo_bad.py:12:30:12:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| flask_pymongo_bad.py:14:31:14:51 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| flask_pymongo_bad.py:14:40:14:50 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:1:26:1:32 | ControlFlowNode for ImportMember | semmle.label | ControlFlowNode for ImportMember |
| mongoengine_bad.py:1:26:1:32 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| mongoengine_bad.py:18:5:18:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
@@ -181,24 +219,28 @@ nodes
| mongoengine_bad.py:19:19:19:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| mongoengine_bad.py:19:30:19:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:22:26:22:46 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| mongoengine_bad.py:22:35:22:45 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:26:5:26:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:26:21:26:27 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| mongoengine_bad.py:27:5:27:15 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:27:19:27:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| mongoengine_bad.py:27:30:27:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:30:26:30:46 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| mongoengine_bad.py:30:35:30:45 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:34:5:34:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:34:21:34:27 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| mongoengine_bad.py:35:5:35:15 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:35:19:35:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| mongoengine_bad.py:35:30:35:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:38:26:38:46 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| mongoengine_bad.py:38:35:38:45 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:42:5:42:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:42:21:42:27 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| mongoengine_bad.py:43:5:43:15 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:43:19:43:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| mongoengine_bad.py:43:30:43:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:46:26:46:46 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| mongoengine_bad.py:46:35:46:45 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| mongoengine_bad.py:50:5:50:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:50:21:50:27 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| mongoengine_bad.py:51:5:51:15 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
@@ -211,6 +253,7 @@ nodes
| mongoengine_bad.py:58:19:58:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| mongoengine_bad.py:58:30:58:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| mongoengine_bad.py:61:29:61:49 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| mongoengine_bad.py:61:38:61:48 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| pymongo_test.py:1:26:1:32 | ControlFlowNode for ImportMember | semmle.label | ControlFlowNode for ImportMember |
| pymongo_test.py:1:26:1:32 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| pymongo_test.py:12:5:12:17 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
@@ -219,6 +262,7 @@ nodes
| pymongo_test.py:13:19:13:43 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| pymongo_test.py:13:30:13:42 | ControlFlowNode for unsafe_search | semmle.label | ControlFlowNode for unsafe_search |
| pymongo_test.py:15:42:15:62 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| pymongo_test.py:15:51:15:61 | ControlFlowNode for json_search | semmle.label | ControlFlowNode for json_search |
| pymongo_test.py:29:5:29:12 | ControlFlowNode for event_id | semmle.label | ControlFlowNode for event_id |
| pymongo_test.py:29:16:29:51 | ControlFlowNode for Attribute() | semmle.label | ControlFlowNode for Attribute() |
| pymongo_test.py:29:27:29:33 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
@@ -236,11 +280,20 @@ nodes
| pymongo_test.py:52:26:52:32 | ControlFlowNode for request | semmle.label | ControlFlowNode for request |
| pymongo_test.py:52:26:52:49 | ControlFlowNode for Subscript | semmle.label | ControlFlowNode for Subscript |
| pymongo_test.py:54:5:54:10 | ControlFlowNode for search | semmle.label | ControlFlowNode for search |
| pymongo_test.py:54:5:54:10 | ControlFlowNode for search [Dictionary element at key body] | semmle.label | ControlFlowNode for search [Dictionary element at key body] |
| pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| pymongo_test.py:54:14:58:5 | ControlFlowNode for Dict [Dictionary element at key body] | semmle.label | ControlFlowNode for Dict [Dictionary element at key body] |
| pymongo_test.py:55:17:55:23 | ControlFlowNode for decoded | semmle.label | ControlFlowNode for decoded |
| pymongo_test.py:59:25:59:56 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| pymongo_test.py:59:35:59:55 | ControlFlowNode for Dict [Dictionary element at key $function, Dictionary element at key body] | semmle.label | ControlFlowNode for Dict [Dictionary element at key $function, Dictionary element at key body] |
| pymongo_test.py:59:35:59:55 | ControlFlowNode for Dict [Dictionary element at key $function] | semmle.label | ControlFlowNode for Dict [Dictionary element at key $function] |
| pymongo_test.py:59:49:59:54 | ControlFlowNode for search | semmle.label | ControlFlowNode for search |
| pymongo_test.py:59:49:59:54 | ControlFlowNode for search [Dictionary element at key body] | semmle.label | ControlFlowNode for search [Dictionary element at key body] |
| pymongo_test.py:61:25:61:57 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| pymongo_test.py:61:35:61:56 | ControlFlowNode for Dict [Dictionary element at key $function] | semmle.label | ControlFlowNode for Dict [Dictionary element at key $function] |
| pymongo_test.py:61:49:61:55 | ControlFlowNode for decoded | semmle.label | ControlFlowNode for decoded |
| pymongo_test.py:62:25:62:42 | ControlFlowNode for Dict | semmle.label | ControlFlowNode for Dict |
| pymongo_test.py:62:35:62:41 | ControlFlowNode for decoded | semmle.label | ControlFlowNode for decoded |
| pymongo_test.py:63:25:63:31 | ControlFlowNode for decoded | semmle.label | ControlFlowNode for decoded |
subpaths
#select

View File

@@ -2100,6 +2100,12 @@ module Make0<LocationSig Location, AstSig<Location> Ast> {
module Consistency {
/** Holds if the consistency query `query` has `results` results. */
query predicate consistencyOverview(string query, int results) {
query = "siblingsWithSameIndexInDefaultCfg" and
results =
strictcount(AstNode parent, AstNode child1, AstNode child2, int i |
siblingsWithSameIndexInDefaultCfg(parent, child1, child2, i)
)
or
query = "deadEnd" and results = strictcount(ControlFlowNode node | deadEnd(node))
or
query = "nonUniqueEnclosingCallable" and
@@ -2145,6 +2151,20 @@ module Make0<LocationSig Location, AstSig<Location> Ast> {
results = strictcount(ControlFlowNode node, SuccessorType t | selfLoop(node, t))
}
/**
* Holds if `parent` uses default left-to-right control flow and has
* two different children `child1` and `child2` at the same index
* `i`.
*/
query predicate siblingsWithSameIndexInDefaultCfg(
AstNode parent, AstNode child1, AstNode child2, int i
) {
defaultCfg(parent) and
getChild(parent, i) = child1 and
getChild(parent, i) = child2 and
child1 != child2
}
/**
* Holds if `node` is lacking a successor.
*

View File

@@ -280,10 +280,11 @@ pub fn location_label(writer: &mut trap::Writer, location: trap::Location) -> tr
}
/// Extracts the source file at `path`, which is assumed to be canonicalized.
/// When `yeast_runner` is `Some`, the parsed tree is first transformed
/// through the supplied yeast `Runner` before TRAP extraction. Building the
/// `Runner` (which parses YAML and constructs the schema) is the caller's
/// responsibility, allowing it to be done once and shared across files.
/// When `desugarer` is `Some`, the parsed tree is first transformed
/// through the supplied yeast desugarer before TRAP extraction. Building
/// the desugarer (which parses YAML and constructs the schema) is the
/// caller's responsibility, allowing it to be done once and shared across
/// files.
#[allow(clippy::too_many_arguments)]
pub fn extract(
language: &Language,
@@ -295,7 +296,7 @@ pub fn extract(
path: &Path,
source: &[u8],
ranges: &[Range],
yeast_runner: Option<&yeast::Runner<'_>>,
desugarer: Option<&dyn yeast::Desugarer>,
) {
let path_str = file_paths::normalize_and_transform_path(path, transformer);
let source_root = std::env::current_dir()
@@ -328,11 +329,14 @@ pub fn extract(
schema,
);
if let Some(yeast_runner) = yeast_runner {
let ast = yeast_runner
if let Some(desugarer) = desugarer {
let ast = desugarer
.run_from_tree(&tree, source)
.unwrap_or_else(|e| panic!("Desugaring failed for {path_str}: {e}"));
traverse_yeast(&ast, &mut visitor);
// Comments and other `extra` nodes are not represented in the desugared
// AST, so recover them directly from the original parse tree.
traverse_extras(&tree, &mut visitor);
} else {
traverse(&tree, &mut visitor);
}
@@ -365,6 +369,8 @@ struct Visitor<'a> {
ast_node_parent_table_name: String,
/// Language-specific name of the tokeninfo table
tokeninfo_table_name: String,
/// Language-specific name of the trivia tokeninfo table
trivia_tokeninfo_table_name: String,
/// A lookup table from type name to node types
schema: &'a NodeTypeMap,
/// A stack for gathering information from child nodes. Whenever a node is
@@ -395,11 +401,33 @@ impl<'a> Visitor<'a> {
ast_node_location_table_name: format!("{language_prefix}_ast_node_location"),
ast_node_parent_table_name: format!("{language_prefix}_ast_node_parent"),
tokeninfo_table_name: format!("{language_prefix}_tokeninfo"),
trivia_tokeninfo_table_name: format!("{language_prefix}_trivia_tokeninfo"),
schema,
stack: Vec::new(),
}
}
/// Emits a `TriviaToken` for the given `extra` node (e.g. a comment) from
/// the original parse tree. Trivia tokens carry a location and their source
/// text, but are not attached to a parent in the (possibly desugared) AST.
fn emit_trivia_token(&mut self, node: &Node) {
let id = self.trap_writer.fresh_id();
let loc = location_for(self, self.file_label, node);
let loc_label = location_label(self.trap_writer, loc);
self.trap_writer.add_tuple(
&self.ast_node_location_table_name,
vec![trap::Arg::Label(id), trap::Arg::Label(loc_label)],
);
self.trap_writer.add_tuple(
&self.trivia_tokeninfo_table_name,
vec![
trap::Arg::Label(id),
trap::Arg::Int(node.kind_id() as usize),
sliced_source_arg(self.source, node),
],
);
}
fn record_parse_error(&mut self, loc: trap::Label, mesg: &diagnostics::DiagnosticMessage) {
self.diagnostics_writer.write(mesg);
let id = self.trap_writer.fresh_id();
@@ -835,6 +863,24 @@ fn traverse(tree: &Tree, visitor: &mut Visitor) {
}
}
/// Walks the original tree-sitter tree and emits a `TriviaToken` for every
/// `extra` node (e.g. a comment). Used to preserve comments that would
/// otherwise be lost after a desugaring pass rewrites the tree.
fn traverse_extras(tree: &Tree, visitor: &mut Visitor) {
emit_extras_in(visitor, tree.root_node());
}
fn emit_extras_in(visitor: &mut Visitor, node: Node<'_>) {
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
if child.is_extra() {
visitor.emit_trivia_token(&child);
} else {
emit_extras_in(visitor, child);
}
}
}
fn traverse_yeast(tree: &yeast::Ast, visitor: &mut Visitor) {
use yeast::Cursor;
let mut cursor = tree.walk();

View File

@@ -13,11 +13,14 @@ pub struct LanguageSpec {
pub prefix: &'static str,
pub ts_language: tree_sitter::Language,
pub node_types: &'static str,
/// Optional yeast desugaring configuration. When set, the parsed
/// tree is rewritten through yeast before TRAP extraction. The
/// config's `output_node_types_yaml` (if set) provides the schema
/// used both at runtime (for the rewriter) and for TRAP validation.
pub desugar: Option<yeast::DesugaringConfig>,
/// Optional desugarer. When set, the parsed tree is rewritten through
/// the desugarer before TRAP extraction. The desugarer's
/// `output_node_types_yaml()` (if set) provides the schema used both
/// at runtime (for the rewriter) and for TRAP validation.
///
/// `Box<dyn yeast::Desugarer>` so the shared extractor is agnostic to
/// the user-defined context type the desugarer uses internally.
pub desugar: Option<Box<dyn yeast::Desugarer>>,
pub file_globs: Vec<String>,
}
@@ -91,35 +94,22 @@ impl Extractor {
.collect();
let mut schemas = vec![];
let mut yeast_runners = Vec::new();
for lang in &self.languages {
let effective_node_types: String =
match lang.desugar.as_ref().and_then(|c| c.output_node_types_yaml) {
Some(yaml) => yeast::node_types_yaml::convert(yaml).map_err(|e| {
std::io::Error::other(format!(
"Failed to convert YAML node-types to JSON for {}: {e}",
lang.prefix
))
})?,
None => lang.node_types.to_string(),
};
let schema = node_types::read_node_types_str(lang.prefix, &effective_node_types)?;
schemas.push(schema);
// Build the yeast runner once per language so the YAML schema
// isn't re-parsed for every file.
let yeast_runner = lang
let effective_node_types: String = match lang
.desugar
.as_ref()
.map(|config| yeast::Runner::from_config(lang.ts_language.clone(), config))
.transpose()
.map_err(|e| {
.and_then(|d| d.output_node_types_yaml())
{
Some(yaml) => yeast::node_types_yaml::convert(yaml).map_err(|e| {
std::io::Error::other(format!(
"Failed to build desugaring runner for {}: {e}",
"Failed to convert YAML node-types to JSON for {}: {e}",
lang.prefix
))
})?;
yeast_runners.push(yeast_runner);
})?,
None => lang.node_types.to_string(),
};
let schema = node_types::read_node_types_str(lang.prefix, &effective_node_types)?;
schemas.push(schema);
}
// Construct a single globset containing all language globs,
@@ -194,7 +184,7 @@ impl Extractor {
&path,
&source,
&[],
yeast_runners[i].as_ref(),
lang.desugar.as_deref(),
);
std::fs::create_dir_all(src_archive_file.parent().unwrap())?;
std::fs::copy(&path, &src_archive_file)?;

View File

@@ -68,7 +68,12 @@ pub fn generate(
let node_parent_table_name = format!("{}_ast_node_parent", &prefix);
let token_name = format!("{}_token", &prefix);
let tokeninfo_name = format!("{}_tokeninfo", &prefix);
let trivia_token_name = format!("{}_trivia_token", &prefix);
let trivia_tokeninfo_name = format!("{}_trivia_tokeninfo", &prefix);
let reserved_word_name = format!("{}_reserved_word", &prefix);
// When a desugaring is configured, comments and other `extra` nodes are
// preserved from the original parse tree as `TriviaToken`s.
let has_trivia_tokens = language.desugar.is_some();
let effective_node_types: String = match language
.desugar
.as_ref()
@@ -85,28 +90,35 @@ pub fn generate(
let nodes = node_types::read_node_types_str(&prefix, &effective_node_types)?;
let (dbscheme_entries, mut ast_node_members, token_kinds) = convert_nodes(&nodes);
ast_node_members.insert(&token_name);
if has_trivia_tokens {
ast_node_members.insert(&trivia_token_name);
}
writeln!(&mut dbscheme_writer, "/*- {} dbscheme -*/", language.name)?;
dbscheme::write(&mut dbscheme_writer, &dbscheme_entries)?;
let token_case = create_token_case(&token_name, token_kinds);
dbscheme::write(
&mut dbscheme_writer,
&[
dbscheme::Entry::Table(create_tokeninfo(&tokeninfo_name, &token_name)),
dbscheme::Entry::Case(token_case),
dbscheme::Entry::Union(dbscheme::Union {
name: &ast_node_name,
members: ast_node_members,
}),
dbscheme::Entry::Table(create_ast_node_location_table(
&node_location_table_name,
&ast_node_name,
)),
dbscheme::Entry::Table(create_ast_node_parent_table(
&node_parent_table_name,
&ast_node_name,
)),
],
)?;
let mut dbscheme_tail = vec![
dbscheme::Entry::Table(create_tokeninfo(&tokeninfo_name, &token_name)),
dbscheme::Entry::Case(token_case),
];
if has_trivia_tokens {
dbscheme_tail.push(dbscheme::Entry::Table(create_tokeninfo(
&trivia_tokeninfo_name,
&trivia_token_name,
)));
}
dbscheme_tail.push(dbscheme::Entry::Union(dbscheme::Union {
name: &ast_node_name,
members: ast_node_members,
}));
dbscheme_tail.push(dbscheme::Entry::Table(create_ast_node_location_table(
&node_location_table_name,
&ast_node_name,
)));
dbscheme_tail.push(dbscheme::Entry::Table(create_ast_node_parent_table(
&node_parent_table_name,
&ast_node_name,
)));
dbscheme::write(&mut dbscheme_writer, &dbscheme_tail)?;
let mut body = vec![
ql::TopLevel::Class(ql_gen::create_ast_node_class(
@@ -116,6 +128,12 @@ pub fn generate(
)),
ql::TopLevel::Class(ql_gen::create_token_class(&token_name, &tokeninfo_name)),
];
if has_trivia_tokens {
body.push(ql::TopLevel::Class(ql_gen::create_trivia_token_class(
&trivia_token_name,
&trivia_tokeninfo_name,
)));
}
// Only emit the ReservedWord class when there are actually unnamed token
// types in the schema (i.e., @{prefix}_reserved_word exists in the dbscheme).
// When converting from a YEAST YAML schema that has no unnamed tokens, this

View File

@@ -199,6 +199,70 @@ pub fn create_token_class<'a>(token_type: &'a str, tokeninfo: &'a str) -> ql::Cl
}
}
/// Creates the `TriviaToken` class. Trivia tokens (e.g. comments) are
/// `extra` nodes preserved from the original parse tree even when the tree has
/// been rewritten by a desugaring pass. They are not part of the regular
/// `Token` hierarchy because they do not appear in the (possibly desugared)
/// output schema.
pub fn create_trivia_token_class<'a>(
trivia_token_type: &'a str,
trivia_tokeninfo: &'a str,
) -> ql::Class<'a> {
let trivia_tokeninfo_arity = 3; // id, kind, value
let get_value = ql::Predicate {
qldoc: Some(String::from("Gets the source text of this trivia token.")),
name: "getValue",
overridden: false,
is_private: false,
is_final: true,
return_type: Some(ql::Type::String),
formal_parameters: vec![],
body: create_get_field_expr_for_column_storage(
"result",
trivia_tokeninfo,
1,
trivia_tokeninfo_arity,
),
overlay: None,
};
let to_string = ql::Predicate {
qldoc: Some(String::from(
"Gets a string representation of this element.",
)),
name: "toString",
overridden: true,
is_private: false,
is_final: true,
return_type: Some(ql::Type::String),
formal_parameters: vec![],
body: ql::Expression::Equals(
Box::new(ql::Expression::Var("result")),
Box::new(ql::Expression::Dot(
Box::new(ql::Expression::Var("this")),
"getValue",
vec![],
)),
),
overlay: None,
};
ql::Class {
qldoc: Some(String::from(
"A trivia token, such as a comment, preserved from the original parse tree.",
)),
name: "TriviaToken",
is_abstract: false,
supertypes: vec![ql::Type::At(trivia_token_type), ql::Type::Normal("AstNode")]
.into_iter()
.collect(),
characteristic_predicate: None,
predicates: vec![
get_value,
to_string,
create_get_a_primary_ql_class("TriviaToken", false),
],
}
}
// Creates the `ReservedWord` class.
pub fn create_reserved_word_class(db_name: &str) -> ql::Class<'_> {
let class_name = "ReservedWord";

View File

@@ -44,8 +44,19 @@ pub fn query(input: TokenStream) -> TokenStream {
/// {expr} - embed a Rust expression returning Id
/// {..expr} - splice an iterable of Id (in child/field position)
/// field: {..expr} - splice into a named field
/// {expr}.map(p -> tpl) - apply tpl to each element; splice result
/// {expr}.reduce_left(f -> init, acc, e -> fold)
/// - fold with per-element init; splice 0 or 1 result
/// ```
///
/// Chain syntax after `{expr}` or `{..expr}`:
/// - `.map(param -> template)` — one output node per input element.
/// - `.reduce_left(first -> init, acc, elem -> fold)` — fold left; the first
/// element is converted by `init`, subsequent elements are folded by `fold`
/// with the accumulator bound to `acc`. An empty iterable yields nothing.
/// - Chains always splice (the result is iterable).
/// - Multiple chains can be chained, e.g. `.map(...).reduce_left(...)`.
///
/// Can be called with an explicit context or using the implicit context
/// from an enclosing `rule!`:
///
@@ -110,3 +121,37 @@ pub fn rule(input: TokenStream) -> TokenStream {
Err(err) => err.to_compile_error().into(),
}
}
/// Define a desugaring rule whose transform is a hand-written Rust block.
///
/// Use `manual_rule!` when the transform needs control over capture
/// translation timing — for example, when an outer rule needs to set
/// state in `ctx` (the `BuildCtx`'s user context) before recursive
/// translation reaches inner rules that read that state.
///
/// ```text
/// manual_rule!(
/// (query_pattern field: (_) @name)
/// {
/// // `ctx` is a `&mut BuildCtx<'_, C>`; capture variables
/// // (`name: NodeRef`, etc.) are bound from the query.
/// let translated = ctx.translate(name)?;
/// Ok(translated)
/// }
/// )
/// ```
///
/// Differences from [`rule!`]:
/// - Captures are **not** auto-translated before the body runs; they
/// refer to raw input-schema nodes. Use [`BuildCtx::translate`] (or
/// [`BuildCtx::translate_opt`]) to translate them when you choose.
/// - The body is plain Rust returning `Result<Vec<Id>, String>` — no
/// tree template, no `Ok(...)` wrap.
#[proc_macro]
pub fn manual_rule(input: TokenStream) -> TokenStream {
let input2: TokenStream2 = input.into();
match parse::parse_manual_rule_top(input2) {
Ok(output) => output.into(),
Err(err) => err.to_compile_error().into(),
}
}

View File

@@ -121,9 +121,9 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
std::collections::HashMap::new();
let mut bare_children: Vec<TokenStream> = Vec::new();
let push_field_elem = |order: &mut Vec<String>,
map: &mut std::collections::HashMap<String, Vec<TokenStream>>,
name: String,
elem: TokenStream| {
map: &mut std::collections::HashMap<String, Vec<TokenStream>>,
name: String,
elem: TokenStream| {
if !map.contains_key(&name) {
order.push(name.clone());
map.insert(name, vec![elem]);
@@ -141,7 +141,12 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
// Parse the field's pattern. To support repetition like
// `field: (kind)* @cap`, parse the atom first, then check for
// a quantifier, and lastly handle a trailing `@capture`.
let atom = parse_query_atom(tokens)?;
// `field: @cap` is sugar for `field: _ @cap`.
let atom = if peek_is_at(tokens) {
quote! { yeast::query::QueryNode::Any { match_unnamed: true } }
} else {
parse_query_atom(tokens)?
};
if peek_is_repetition(tokens) {
let rep = expect_repetition(tokens)?;
let elem = quote! {
@@ -155,8 +160,7 @@ fn parse_query_fields(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
} else {
let child = if peek_is_at(tokens) {
tokens.next();
let capture_name =
expect_ident(tokens, "expected capture name after @")?;
let capture_name = expect_ident(tokens, "expected capture name after @")?;
let name_str = capture_name.to_string();
quote! {
yeast::query::QueryNode::Capture {
@@ -259,6 +263,7 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
yeast::query::QueryListElem::SingleNode(#node)
},
)?;
let elem = maybe_wrap_list_capture(tokens, elem)?;
elems.push(elem);
continue;
}
@@ -276,6 +281,7 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
yeast::query::QueryListElem::SingleNode(#node)
},
)?;
let elem = maybe_wrap_list_capture(tokens, elem)?;
elems.push(elem);
continue;
}
@@ -289,10 +295,10 @@ fn parse_query_list(tokens: &mut Tokens) -> Result<Vec<TokenStream>> {
// tree! / trees! parsing — direct code generation against BuildCtx
// ---------------------------------------------------------------------------
const IMPLICIT_CTX: &str = "__yeast_ctx";
const IMPLICIT_CTX: &str = "ctx";
/// Determine the context identifier: either explicit `ctx,` or the implicit
/// `__yeast_ctx` from an enclosing `rule!`.
/// `ctx` from an enclosing `rule!`.
fn parse_ctx_or_implicit(tokens: &mut Tokens) -> Ident {
// Check if first token is an ident followed by a comma
let mut lookahead = tokens.clone();
@@ -352,7 +358,7 @@ fn parse_direct_node(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStream> {
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => {
let group = expect_group(tokens, Delimiter::Brace)?;
let expr = group.stream();
Ok(quote! { ::std::convert::Into::<usize>::into(#expr) })
Ok(quote! { ::std::convert::Into::<usize>::into({ #expr }) })
}
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Parenthesis => {
let group = expect_group(tokens, Delimiter::Parenthesis)?;
@@ -389,8 +395,10 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
let expr = group.stream();
return Ok(quote! {
{
let __value = yeast::YeastDisplay::yeast_to_string(&(#expr), &*#ctx.ast);
#ctx.literal(#kind_str, &__value)
let __expr = { #expr };
let __value = yeast::YeastDisplay::yeast_to_string(&__expr, &*#ctx.ast);
let __source_range = yeast::YeastSourceRange::yeast_source_range(&__expr, &*#ctx.ast);
#ctx.literal_with_source_range(#kind_str, &__value, __source_range)
}
});
}
@@ -411,7 +419,11 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
// Named fields — compute each value into a temp, then reference it
while peek_is_field(tokens) {
let field_name = expect_ident(tokens, "expected field name")?;
let field_str = field_name.to_string().strip_prefix("r#").unwrap_or(&field_name.to_string()).to_string();
let field_str = field_name
.to_string()
.strip_prefix("r#")
.unwrap_or(&field_name.to_string())
.to_string();
expect_punct(tokens, ':', "expected `:` after field name")?;
let temp = Ident::new(
&format!("__field_{field_str}_{field_counter}"),
@@ -419,23 +431,36 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
);
field_counter += 1;
// Check for field: {..expr} — splice a Vec<Id> into the field
// Check for field: {..expr}.chain or field: {expr}.chain — splice a Vec<Id> into the field
if peek_is_group(tokens, Delimiter::Brace) {
let group_clone = tokens.clone().next().unwrap();
if let TokenTree::Group(g) = &group_clone {
let mut inner_check = g.stream().into_iter();
let is_splice = matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.')
&& matches!(inner_check.next(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
if is_splice {
// Determine if a chain (.map(..)) follows the `{}` group.
let mut after = tokens.clone();
after.next(); // skip the brace group
let has_chain =
matches!(after.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
if is_splice || has_chain {
let group = expect_group(tokens, Delimiter::Brace)?;
let mut inner = group.stream().into_iter().peekable();
inner.next(); // consume first .
inner.next(); // consume second .
let expr: proc_macro2::TokenStream = inner.collect();
let base: TokenStream = if is_splice {
let mut inner = group.stream().into_iter().peekable();
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
quote! {
{ #expr }.into_iter().map(::std::convert::Into::<usize>::into)
}
} else {
let expr = group.stream();
quote! { { #expr }.into_iter() }
};
let chained = parse_chain_suffix(tokens, ctx, base)?;
stmts.push(quote! {
let #temp: Vec<usize> = (#expr).into_iter()
.map(::std::convert::Into::<usize>::into)
.collect();
let #temp: Vec<usize> = #chained.collect();
});
// An empty splice means the field is absent — skip it
// entirely rather than emitting an empty named field.
@@ -472,6 +497,94 @@ fn parse_direct_node_inner(tokens: &mut Tokens, ctx: &Ident) -> Result<TokenStre
})
}
/// Parse a chain of `.method(args)` suffixes after a `{expr}` or `{..expr}`
/// placeholder in tree templates. Currently supports:
///
/// ```text
/// .map(param -> template) -- iterator map: produces Vec<usize>
/// ```
///
/// The chain may be empty (returns `base` unchanged). Multiple chained calls
/// are supported, e.g. `.map(p -> ...).map(q -> ...)`.
///
/// Each call expects the receiver to be an iterator. The `base` argument
/// should therefore already be an iterator (use `.into_iter()` on it before
/// calling this function).
fn parse_chain_suffix(tokens: &mut Tokens, ctx: &Ident, base: TokenStream) -> Result<TokenStream> {
let mut current = base;
while matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.') {
tokens.next(); // consume .
let method = expect_ident(tokens, "expected method name after `.`")?;
let method_str = method.to_string();
let args_group = expect_group(tokens, Delimiter::Parenthesis)?;
match method_str.as_str() {
"map" => {
let mut inner = args_group.stream().into_iter().peekable();
let param = expect_ident(&mut inner, "expected lambda parameter name")?;
expect_punct(&mut inner, '-', "expected `->` after lambda parameter")?;
expect_punct(&mut inner, '>', "expected `->` after lambda parameter")?;
let body = parse_direct_node(&mut inner, ctx)?;
if let Some(tok) = inner.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after lambda body",
));
}
current = quote! {
#current.map(|#param| #body)
};
}
"reduce_left" => {
// Syntax: reduce_left(first -> init_tpl, acc, elem -> fold_tpl)
// - first -> init_tpl : converts the first element to the initial accumulator
// - acc, elem -> fold_tpl : fold step (acc = current accumulator, elem = next element)
// Empty iterator produces an empty iterator; non-empty produces a single-element iterator.
let mut inner = args_group.stream().into_iter().peekable();
let init_param = expect_ident(&mut inner, "expected initial lambda parameter")?;
expect_punct(&mut inner, '-', "expected `->` after init parameter")?;
expect_punct(&mut inner, '>', "expected `->` after init parameter")?;
let init_body = parse_direct_node(&mut inner, ctx)?;
expect_punct(&mut inner, ',', "expected `,` after init template")?;
let acc_param = expect_ident(&mut inner, "expected accumulator parameter")?;
expect_punct(&mut inner, ',', "expected `,` after accumulator parameter")?;
let elem_param = expect_ident(&mut inner, "expected element parameter")?;
expect_punct(&mut inner, '-', "expected `->` after element parameter")?;
expect_punct(&mut inner, '>', "expected `->` after element parameter")?;
let fold_body = parse_direct_node(&mut inner, ctx)?;
if let Some(tok) = inner.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after fold template",
));
}
current = quote! {
{
let mut __iter = #current;
let __result: Option<usize> = if let Some(#init_param) = __iter.next() {
let mut __acc: usize = #init_body;
for #elem_param in __iter {
let #acc_param: usize = __acc;
__acc = #fold_body;
}
Some(__acc)
} else {
None
};
__result.into_iter()
}
};
}
_ => {
return Err(syn::Error::new_spanned(
method,
format!("unknown builtin method `.{method_str}()`"),
));
}
}
}
Ok(current)
}
/// Parse the top-level list of a `trees!` template.
/// Each item is a node template or `{expr}` splice.
fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream>> {
@@ -492,23 +605,33 @@ fn parse_direct_list(tokens: &mut Tokens, ctx: &Ident) -> Result<Vec<TokenStream
continue;
}
// {expr} or {..expr} — single node or splice
// {expr} or {..expr} (with optional .chain) — single node or splice
if peek_is_group(tokens, Delimiter::Brace) {
let group = expect_group(tokens, Delimiter::Brace)?;
let has_chain =
matches!(tokens.peek(), Some(TokenTree::Punct(p)) if p.as_char() == '.');
let mut inner = group.stream().into_iter().peekable();
if peek_is_dotdot(&inner) {
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
let is_splice = peek_is_dotdot(&inner);
if is_splice || has_chain {
let base: TokenStream = if is_splice {
inner.next(); // consume first .
inner.next(); // consume second .
let expr: TokenStream = inner.collect();
quote! {
{ #expr }.into_iter().map(::std::convert::Into::<usize>::into)
}
} else {
let expr = group.stream();
quote! { { #expr }.into_iter() }
};
let chained = parse_chain_suffix(tokens, ctx, base)?;
items.push(quote! {
__nodes.extend(
(#expr).into_iter().map(::std::convert::Into::<usize>::into)
);
__nodes.extend(#chained);
});
} else {
let expr = group.stream();
items.push(quote! {
__nodes.push(::std::convert::Into::<usize>::into(#expr));
__nodes.push(::std::convert::Into::<usize>::into({ #expr }));
});
}
continue;
@@ -604,8 +727,11 @@ fn extract_captures_inner(
}
last_mult = CaptureMultiplicity::Single;
}
TokenTree::Punct(p) if matches!(p.as_char(), '*' | '+' | '?') => {
// Keep last_mult — the @capture follows
TokenTree::Punct(p) if p.as_char() == '*' || p.as_char() == '+' => {
last_mult = CaptureMultiplicity::Repeated;
}
TokenTree::Punct(p) if p.as_char() == '?' => {
last_mult = CaptureMultiplicity::Optional;
}
_ => {
last_mult = CaptureMultiplicity::Single;
@@ -763,10 +889,117 @@ pub fn parse_rule_top(input: TokenStream) -> Result<TokenStream> {
Ok(quote! {
{
let __query = #query_code;
yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>| {
yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, mut __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>, __user_ctx: &mut _, __translator: yeast::TranslatorHandle<'_, _>| {
// Auto-translation prefix: recursively translate every
// captured node before invoking the user's transform body.
// For OneShot rules this preserves the legacy behaviour
// (input-schema captures translated to output-schema
// nodes); for Repeating rules it is a no-op.
__translator.auto_translate_captures(&mut __captures, __ast, __user_ctx)?;
#(#bindings)*
let mut #ctx_ident = yeast::build::BuildCtx::with_source_range(__ast, &__captures, __fresh, __source_range);
#transform_body
let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
let __result: Vec<usize> = { #transform_body };
Ok(__result)
}))
}
})
}
/// Parse `manual_rule!( query { body } )`.
///
/// Like [`parse_rule_top`] but:
/// - Expects a Rust block `{ ... }` after the query (no `=>` arrow).
/// - Generates code that does NOT auto-translate captures before
/// running the body. Capture variables refer to raw (input-schema)
/// nodes; the body is responsible for explicit translation via
/// `ctx.translate(...)`.
/// - The body is included verbatim and must evaluate to
/// `Result<Vec<usize>, String>`.
pub fn parse_manual_rule_top(input: TokenStream) -> Result<TokenStream> {
let mut tokens = input.into_iter().peekable();
// Collect query tokens up to the body block `{ ... }`.
let mut query_tokens = Vec::new();
loop {
match tokens.peek() {
None => {
return Err(syn::Error::new(
Span::call_site(),
"expected a Rust block `{ ... }` after the query in manual_rule!",
))
}
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => break,
_ => {
query_tokens.push(tokens.next().unwrap());
}
}
}
let query_stream: TokenStream = query_tokens.into_iter().collect();
// Extract captures from the query (same as in `rule!`).
let captures = extract_captures(&query_stream);
// Parse the query into the QueryNode-building expression.
let query_code = parse_query_top(query_stream)?;
// Generate capture bindings (same as in `rule!`).
let ctx_ident = Ident::new(IMPLICIT_CTX, Span::call_site());
let bindings: Vec<TokenStream> = captures
.iter()
.map(|cap| {
let name = Ident::new(&cap.name, Span::call_site());
let name_str = &cap.name;
match cap.multiplicity {
CaptureMultiplicity::Repeated => quote! {
let #name: Vec<yeast::NodeRef> = __captures.get_all(#name_str)
.into_iter()
.map(yeast::NodeRef)
.collect();
},
CaptureMultiplicity::Optional => quote! {
let #name: Option<yeast::NodeRef> =
__captures.get_opt(#name_str).map(yeast::NodeRef);
},
CaptureMultiplicity::Single => quote! {
let #name: yeast::NodeRef =
yeast::NodeRef(__captures.get_var(#name_str).unwrap());
},
}
})
.collect();
// Consume the body block.
let body_group = match tokens.next() {
Some(TokenTree::Group(g)) if g.delimiter() == Delimiter::Brace => g,
other => {
return Err(syn::Error::new(
Span::call_site(),
format!(
"expected a Rust block `{{ ... }}` after the query in manual_rule!, found: {other:?}"
),
))
}
};
let body_stream = body_group.stream();
// No tokens should follow the body.
if let Some(tok) = tokens.next() {
return Err(syn::Error::new_spanned(
tok,
"unexpected token after manual_rule! body",
));
}
Ok(quote! {
{
let __query = #query_code;
yeast::Rule::new(__query, Box::new(|__ast: &mut yeast::Ast, __captures: yeast::captures::Captures, __fresh: &yeast::tree_builder::FreshScope, __source_range: Option<tree_sitter::Range>, __user_ctx: &mut _, __translator: yeast::TranslatorHandle<'_, _>| {
// No auto-translate prefix for manual rules — the body
// is responsible for translating captures explicitly.
#(#bindings)*
let mut #ctx_ident = yeast::build::BuildCtx::with_translator(__ast, &__captures, __fresh, __source_range, __user_ctx, __translator);
#body_stream
}))
}
})

View File

@@ -265,7 +265,21 @@ occurrences of the same `$name` within one `BuildCtx` share the same value:
)
```
`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`):
The contents of `{…}` are treated as a Rust block, so multi-statement
expressions (with `let` bindings) work too:
```rust
(assignment
left: {tmp}
right: {
let lit = ctx.literal("integer", "0");
tree!((binary_expr op: (operator "+") left: {tmp} right: {lit}))
})
```
`{..expr}` splices a `Vec<Id>` (or any iterable of `Id`); the contents
are likewise a Rust block, so the splice can be the result of arbitrary
computation:
```rust
yeast::trees!(ctx,

View File

@@ -20,7 +20,7 @@ fn main() {
let args = Cli::parse();
let language = get_language(&args.language);
let source = std::fs::read_to_string(&args.file).unwrap();
let runner = yeast::Runner::new(language, &[]);
let runner: yeast::Runner = yeast::Runner::new(language, &[]);
let ast = runner.run(&source).unwrap();
println!("{}", ast.print(&source, ast.get_root()));
}

View File

@@ -2,28 +2,60 @@ use std::collections::BTreeMap;
use crate::captures::Captures;
use crate::tree_builder::FreshScope;
use crate::{Ast, FieldId, Id, NodeContent};
use crate::{Ast, FieldId, Id, NodeContent, TranslatorHandle};
/// Context for building new AST nodes during a transformation.
///
/// Used by the `tree!` and `trees!` macros. Holds a mutable reference to the
/// AST, a reference to the captures from a query match, and a `FreshScope` for
/// generating unique identifiers.
pub struct BuildCtx<'a> {
/// AST, a reference to the captures from a query match, a `FreshScope` for
/// generating unique identifiers, and a mutable reference to a user-defined
/// context of type `C`.
///
/// The user context `C` is shared across rules via the framework's driver:
/// outer rules can write to it before recursive translation, and inner rules
/// can read (or further mutate) it during their transforms. The framework
/// snapshots and restores the user context around each rule application, so
/// mutations made by a rule are visible to its descendants (via recursive
/// translation) but not to its parent's siblings.
///
/// `BuildCtx` implements [`Deref`] and [`DerefMut`] targeting `C`, so user
/// context fields are accessible as `ctx.my_field` directly (provided they
/// don't collide with `BuildCtx`'s own fields like `ast`, `captures`, etc.).
///
/// The default `C = ()` means rules that don't need any user context don't
/// pay any cost.
///
/// When constructed by the framework (via the rule! macro), `BuildCtx` also
/// carries a [`TranslatorHandle`] that the [`translate`] method delegates
/// to. When constructed by hand (e.g. in tests), the translator is `None`
/// and [`translate`] returns an error.
pub struct BuildCtx<'a, C: 'a = ()> {
pub ast: &'a mut Ast,
pub captures: &'a Captures,
pub fresh: &'a FreshScope,
/// Source range of the matched node, inherited by synthetic nodes.
pub source_range: Option<tree_sitter::Range>,
/// User-supplied context, accessible directly via `ctx.field` (via Deref).
pub user_ctx: &'a mut C,
/// Optional translator handle, populated when the context is built by
/// the framework's rule driver. None when the context is built by hand.
pub(crate) translator: Option<TranslatorHandle<'a, C>>,
}
impl<'a> BuildCtx<'a> {
pub fn new(ast: &'a mut Ast, captures: &'a Captures, fresh: &'a FreshScope) -> Self {
impl<'a, C> BuildCtx<'a, C> {
pub fn new(
ast: &'a mut Ast,
captures: &'a Captures,
fresh: &'a FreshScope,
user_ctx: &'a mut C,
) -> Self {
Self {
ast,
captures,
fresh,
source_range: None,
user_ctx,
translator: None,
}
}
@@ -32,12 +64,35 @@ impl<'a> BuildCtx<'a> {
captures: &'a Captures,
fresh: &'a FreshScope,
source_range: Option<tree_sitter::Range>,
user_ctx: &'a mut C,
) -> Self {
Self {
ast,
captures,
fresh,
source_range,
user_ctx,
translator: None,
}
}
/// Construct a `BuildCtx` carrying a translator handle. Used by the
/// `rule!` macro to enable [`translate`] inside rule transforms.
pub fn with_translator(
ast: &'a mut Ast,
captures: &'a Captures,
fresh: &'a FreshScope,
source_range: Option<tree_sitter::Range>,
user_ctx: &'a mut C,
translator: TranslatorHandle<'a, C>,
) -> Self {
Self {
ast,
captures,
fresh,
source_range,
user_ctx,
translator: Some(translator),
}
}
@@ -82,10 +137,83 @@ impl<'a> BuildCtx<'a> {
.create_named_token_with_range(kind, value.to_string(), self.source_range)
}
/// Create a leaf node with fixed content and an optional preferred source range.
/// If `source_range` is `None`, falls back to this context's inherited range.
pub fn literal_with_source_range(
&mut self,
kind: &'static str,
value: &str,
source_range: Option<tree_sitter::Range>,
) -> Id {
self.ast.create_named_token_with_range(
kind,
value.to_string(),
source_range.or(self.source_range),
)
}
/// Create a leaf node with an auto-generated unique name.
pub fn fresh(&mut self, kind: &'static str, name: &str) -> Id {
let generated = self.fresh.resolve(name);
self.ast
.create_named_token_with_range(kind, generated, self.source_range)
}
/// Prepend a value to a field of an existing node.
pub fn prepend_field(&mut self, node_id: Id, field_name: &str, value_id: Id) {
let field_id = self
.ast
.field_id_for_name(field_name)
.unwrap_or_else(|| panic!("build: field '{field_name}' not found"));
self.ast.prepend_field_child(node_id, field_id, value_id);
}
}
impl<C: Clone> BuildCtx<'_, C> {
/// Recursively translate a node via the framework's rule machinery.
/// In a OneShot phase, applies OneShot rules to the given node and
/// returns the resulting node ids. In a Repeating phase, errors
/// (translation is not meaningful when input and output share a
/// schema).
///
/// Accepts any value convertible to [`Id`] (including [`crate::NodeRef`]),
/// so manual rules can pass capture bindings directly without unwrapping.
///
/// Errors if this `BuildCtx` was constructed by hand (without a
/// translator handle) — for example, in unit tests that don't go
/// through the rule driver.
pub fn translate<I: Into<Id>>(&mut self, id: I) -> Result<Vec<Id>, String> {
let id = id.into();
match &self.translator {
Some(t) => t.translate(self.ast, self.user_ctx, id),
None => Err("translate() called on a BuildCtx without a translator handle".into()),
}
}
/// Translate an optional capture, returning the first translated id or
/// `None`. Convenience for `?`-quantifier captures (`Option<NodeRef>`).
///
/// If the underlying translation produces multiple ids for a single
/// input, only the first is returned. For most use cases (e.g.
/// translating a single type annotation) this is what you want; if
/// you need all ids, use [`translate`] directly.
pub fn translate_opt<I: Into<Id>>(&mut self, id: Option<I>) -> Result<Option<Id>, String> {
match id {
Some(id) => Ok(self.translate(id)?.into_iter().next()),
None => Ok(None),
}
}
}
impl<C> std::ops::Deref for BuildCtx<'_, C> {
type Target = C;
fn deref(&self) -> &C {
&*self.user_ctx
}
}
impl<C> std::ops::DerefMut for BuildCtx<'_, C> {
fn deref_mut(&mut self) -> &mut C {
&mut *self.user_ctx
}
}

View File

@@ -53,12 +53,7 @@ pub fn dump_ast_with_options(
///
/// Any node that does not match the expected type set for its parent field is
/// rendered with a trailing `" <-- ERROR: ..."` annotation on the same line.
pub fn dump_ast_with_type_errors(
ast: &Ast,
root: usize,
source: &str,
schema: &Schema,
) -> String {
pub fn dump_ast_with_type_errors(ast: &Ast, root: usize, source: &str, schema: &Schema) -> String {
dump_ast_with_type_errors_and_options(ast, root, source, schema, &DumpOptions::default())
}
@@ -74,7 +69,15 @@ pub fn dump_ast_with_type_errors_and_options(
options: &DumpOptions,
) -> String {
let mut out = String::new();
dump_node(ast, root, source, options, 0, Some((schema, None, None)), &mut out);
dump_node(
ast,
root,
source,
options,
0,
Some((schema, None, None)),
&mut out,
);
out
}
@@ -232,8 +235,8 @@ fn dump_node(
}
let field_name = ast.field_name_for_id(field_id).unwrap_or("?");
let child_type_check = type_check.map(|(schema, _, _)| {
let expected = expected_for_field(schema, node.kind_name(), field_id)
.or(Some(EMPTY_NODE_TYPES));
let expected =
expected_for_field(schema, node.kind_name(), field_id).or(Some(EMPTY_NODE_TYPES));
let parent_field = Some((node.kind_name(), field_name));
(schema, expected, parent_field)
});

View File

@@ -16,7 +16,7 @@ pub mod schema;
pub mod tree_builder;
mod visitor;
pub use yeast_macros::{query, rule, tree, trees};
pub use yeast_macros::{manual_rule, query, rule, tree, trees};
use captures::Captures;
pub use cursor::Cursor;
@@ -58,12 +58,30 @@ pub trait YeastDisplay {
fn yeast_to_string(&self, ast: &Ast) -> String;
}
/// Optional source range for values used in `#{expr}` interpolations.
///
/// By default this returns `None`, so synthesized leaves inherit the matched
/// rule's source range. `NodeRef` returns the referenced node's range, letting
/// `(kind #{capture})` carry the captured node's location.
pub trait YeastSourceRange {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range>;
}
impl YeastDisplay for NodeRef {
fn yeast_to_string(&self, ast: &Ast) -> String {
ast.source_text(self.0)
}
}
impl YeastSourceRange for NodeRef {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
ast.get_node(self.0).and_then(|n| match &n.content {
NodeContent::Range(r) => Some(r.clone()),
_ => n.source_range,
})
}
}
macro_rules! impl_yeast_display_via_display {
($($t:ty),* $(,)?) => {
$(
@@ -72,6 +90,12 @@ macro_rules! impl_yeast_display_via_display {
::std::string::ToString::to_string(self)
}
}
impl YeastSourceRange for $t {
fn yeast_source_range(&self, _ast: &Ast) -> Option<tree_sitter::Range> {
None
}
}
)*
};
}
@@ -90,6 +114,12 @@ impl<T: YeastDisplay + ?Sized> YeastDisplay for &T {
}
}
impl<T: YeastSourceRange + ?Sized> YeastSourceRange for &T {
fn yeast_source_range(&self, ast: &Ast) -> Option<tree_sitter::Range> {
(**self).yeast_source_range(ast)
}
}
pub const CHILD_FIELD: u16 = u16::MAX;
#[derive(Debug)]
@@ -267,7 +297,9 @@ impl Ast {
/// Returns the source text for `id`, resolving `NodeContent::Range`
/// against the stored source bytes when available.
pub fn source_text(&self, id: Id) -> String {
let Some(node) = self.get_node(id) else { return String::new(); };
let Some(node) = self.get_node(id) else {
return String::new();
};
let read_range = |range: &tree_sitter::Range| {
let start = range.start_byte;
let end = range.end_byte;
@@ -368,6 +400,15 @@ impl Ast {
is_named: bool,
source_range: Option<tree_sitter::Range>,
) -> Id {
let source_range = match &content {
// Parsed nodes already carry an exact source range in their content.
NodeContent::Range(_) => source_range,
// Synthesized nodes derive location from children when possible,
// and fall back to the inherited rule-match range otherwise.
_ => self
.union_source_range_of_children(&fields)
.or(source_range),
};
let id = self.nodes.len();
self.nodes.push(Node {
kind,
@@ -383,10 +424,79 @@ impl Ast {
id
}
fn union_source_range_of_children(
&self,
fields: &BTreeMap<FieldId, Vec<Id>>,
) -> Option<tree_sitter::Range> {
let mut start_byte: Option<usize> = None;
let mut end_byte: Option<usize> = None;
let mut start_point = tree_sitter::Point { row: 0, column: 0 };
let mut end_point = tree_sitter::Point { row: 0, column: 0 };
for child_ids in fields.values() {
for &child_id in child_ids {
let Some(child) = self.get_node(child_id) else {
continue;
};
let child_start_byte = child.start_byte();
let child_end_byte = child.end_byte();
// Skip children that carry no usable location.
if child_start_byte == 0 && child_end_byte == 0 {
continue;
}
match start_byte {
None => {
start_byte = Some(child_start_byte);
start_point = child.start_position();
}
Some(current_start) if child_start_byte < current_start => {
start_byte = Some(child_start_byte);
start_point = child.start_position();
}
_ => {}
}
match end_byte {
None => {
end_byte = Some(child_end_byte);
end_point = child.end_position();
}
Some(current_end) if child_end_byte > current_end => {
end_byte = Some(child_end_byte);
end_point = child.end_position();
}
_ => {}
}
}
}
match (start_byte, end_byte) {
(Some(start_byte), Some(end_byte)) => Some(tree_sitter::Range {
start_byte,
end_byte,
start_point,
end_point,
}),
_ => None,
}
}
pub fn create_named_token(&mut self, kind: &'static str, content: String) -> Id {
self.create_named_token_with_range(kind, content, None)
}
/// Prepend a child id to the given field of the given node.
pub fn prepend_field_child(&mut self, node_id: Id, field_id: FieldId, value_id: Id) {
let node = self
.nodes
.get_mut(node_id)
.expect("prepend_field_child: invalid node id");
node.fields.entry(field_id).or_default().insert(0, value_id);
}
pub fn create_named_token_with_range(
&mut self,
kind: &'static str,
@@ -595,18 +705,118 @@ impl From<tree_sitter::Range> for NodeContent {
}
}
/// The transform function for a rule: takes the AST, captured variables, a
/// fresh-name scope, and the source range of the matched node, and returns
/// the IDs of the replacement nodes.
pub type Transform = Box<
dyn Fn(&mut Ast, Captures, &tree_builder::FreshScope, Option<tree_sitter::Range>) -> Vec<Id>
/// A handle that lets a rule transform recursively translate AST nodes via
/// the framework's rule machinery. Constructed by the driver and passed as
/// the last argument of every [`Transform`] invocation.
///
/// The `rule!` macro uses [`TranslatorHandle::auto_translate_captures`] in
/// its generated prefix to translate captures before running the user's
/// transform body. Manually-written transforms (using [`Rule::new`]
/// directly) can call [`TranslatorHandle::translate`] selectively on
/// specific node ids to control when translation happens.
pub struct TranslatorHandle<'a, C> {
inner: TranslatorImpl<'a, C>,
}
/// Internal phase-specific translation state. Kept private — callers
/// interact with [`TranslatorHandle`] only.
enum TranslatorImpl<'a, C> {
/// OneShot phase translator: recursively applies OneShot rules.
OneShot {
index: &'a RuleIndex<'a, C>,
fresh: &'a tree_builder::FreshScope,
rewrite_depth: usize,
/// The id of the node the current rule is matching. Used by
/// [`auto_translate_captures`] to avoid infinite recursion when a
/// rule captures its own match root (e.g. via `(_) @_`).
matched_root: Id,
},
/// Repeating phase translator: translation is not meaningful here
/// (input and output schemas are the same). [`translate`] errors;
/// [`auto_translate_captures`] is a no-op so the macro's auto-prefix
/// works unchanged for Repeating rules.
Repeating,
}
impl<'a, C: Clone> TranslatorHandle<'a, C> {
/// Recursively apply OneShot rules to `id` and return the resulting
/// node ids. Errors in a Repeating phase (where translation is not
/// meaningful).
pub fn translate(&self, ast: &mut Ast, user_ctx: &mut C, id: Id) -> Result<Vec<Id>, String> {
match &self.inner {
TranslatorImpl::OneShot {
index,
fresh,
rewrite_depth,
..
} => apply_one_shot_rules_inner(index, ast, user_ctx, id, fresh, rewrite_depth + 1),
TranslatorImpl::Repeating => {
Err("translate() is not available in a Repeating phase".into())
}
}
}
/// Translate every captured node in `captures` in place (OneShot phase
/// only). In a Repeating phase this is a no-op — Repeating rules
/// receive raw captures.
///
/// Used by the `rule!` macro's generated prefix to preserve the
/// pre-existing "auto-translate captures before running the transform
/// body" behavior. Manually-written transforms typically translate
/// captures selectively via [`translate`] instead.
///
/// To avoid infinite recursion, a capture whose id matches the rule's
/// matched root (e.g. from a `(_) @_` pattern) is left unchanged.
pub fn auto_translate_captures(
&self,
captures: &mut Captures,
ast: &mut Ast,
user_ctx: &mut C,
) -> Result<(), String> {
match &self.inner {
TranslatorImpl::OneShot { matched_root, .. } => {
let root = *matched_root;
captures.try_map_all_captures(|cid| {
if cid == root {
Ok(vec![cid])
} else {
self.translate(ast, user_ctx, cid)
}
})
}
TranslatorImpl::Repeating => Ok(()),
}
}
}
/// The transform function for a rule.
///
/// Takes the AST, the (raw, untranslated) captured variables, a fresh-name
/// scope, the source range of the matched node, a mutable reference to the
/// user context of type `C`, and a [`TranslatorHandle`] for recursively
/// translating nodes. Returns the IDs of the replacement nodes, or an
/// error message if the transform could not be completed.
///
/// Transforms produced by [`Rule::new`] receive **raw** captures and must
/// translate them themselves (via the handle). Transforms produced by the
/// `rule!` macro have an auto-translation prefix injected for backward
/// compatibility.
pub type Transform<C = ()> = Box<
dyn Fn(
&mut Ast,
Captures,
&tree_builder::FreshScope,
Option<tree_sitter::Range>,
&mut C,
TranslatorHandle<'_, C>,
) -> Result<Vec<Id>, String>
+ Send
+ Sync,
>;
pub struct Rule {
pub struct Rule<C = ()> {
query: QueryNode,
transform: Transform,
transform: Transform<C>,
/// If true, after this rule fires on a node the engine will try to
/// re-apply this same rule on the result root. Defaults to false:
/// each rule fires at most once on a given node, which prevents
@@ -614,8 +824,8 @@ pub struct Rule {
repeated: bool,
}
impl Rule {
pub fn new(query: QueryNode, transform: Transform) -> Self {
impl<C> Rule<C> {
pub fn new(query: QueryNode, transform: Transform<C>) -> Self {
Self {
query,
transform,
@@ -637,9 +847,13 @@ impl Rule {
ast: &mut Ast,
node: Id,
fresh: &tree_builder::FreshScope,
user_ctx: &mut C,
translator: TranslatorHandle<'_, C>,
) -> Result<Option<Vec<Id>>, String> {
match self.try_match(ast, node)? {
Some(captures) => Ok(Some(self.run_transform(ast, captures, node, fresh))),
Some(captures) => Ok(Some(
self.run_transform(ast, captures, node, fresh, user_ctx, translator)?,
)),
None => Ok(None),
}
}
@@ -663,29 +877,31 @@ impl Rule {
captures: Captures,
node: Id,
fresh: &tree_builder::FreshScope,
) -> Vec<Id> {
user_ctx: &mut C,
translator: TranslatorHandle<'_, C>,
) -> Result<Vec<Id>, String> {
fresh.next_scope();
let source_range = ast.get_node(node).and_then(|n| match n.content {
NodeContent::Range(r) => Some(r),
_ => n.source_range,
});
(self.transform)(ast, captures, fresh, source_range)
(self.transform)(ast, captures, fresh, source_range, user_ctx, translator)
}
}
const MAX_REWRITE_DEPTH: usize = 100;
/// Index of rules by their root query kind for fast lookup.
struct RuleIndex<'a> {
struct RuleIndex<'a, C> {
/// Rules indexed by root node kind name.
by_kind: BTreeMap<&'static str, Vec<&'a Rule>>,
by_kind: BTreeMap<&'static str, Vec<&'a Rule<C>>>,
/// Rules with wildcard queries (Any) that apply to all nodes.
wildcard: Vec<&'a Rule>,
wildcard: Vec<&'a Rule<C>>,
}
impl<'a> RuleIndex<'a> {
fn new(rules: &'a [Rule]) -> Self {
let mut by_kind: BTreeMap<&'static str, Vec<&'a Rule>> = BTreeMap::new();
impl<'a, C> RuleIndex<'a, C> {
fn new(rules: &'a [Rule<C>]) -> Self {
let mut by_kind: BTreeMap<&'static str, Vec<&'a Rule<C>>> = BTreeMap::new();
let mut wildcard = Vec::new();
for rule in rules {
match rule.query.root_kind() {
@@ -696,7 +912,7 @@ impl<'a> RuleIndex<'a> {
Self { by_kind, wildcard }
}
fn rules_for_kind(&self, kind: &str) -> impl Iterator<Item = &&'a Rule> {
fn rules_for_kind(&self, kind: &str) -> impl Iterator<Item = &&'a Rule<C>> {
self.by_kind
.get(kind)
.into_iter()
@@ -705,23 +921,25 @@ impl<'a> RuleIndex<'a> {
}
}
fn apply_repeating_rules(
rules: &[Rule],
fn apply_repeating_rules<C: Clone>(
rules: &[Rule<C>],
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
) -> Result<Vec<Id>, String> {
let index = RuleIndex::new(rules);
apply_repeating_rules_inner(&index, ast, id, fresh, 0, None)
apply_repeating_rules_inner(&index, ast, user_ctx, id, fresh, 0, None)
}
fn apply_repeating_rules_inner(
index: &RuleIndex,
fn apply_repeating_rules_inner<C: Clone>(
index: &RuleIndex<C>,
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
rewrite_depth: usize,
skip_rule: Option<*const Rule>,
skip_rule: Option<*const Rule<C>>,
) -> Result<Vec<Id>, String> {
if rewrite_depth > MAX_REWRITE_DEPTH {
return Err(format!(
@@ -732,11 +950,23 @@ fn apply_repeating_rules_inner(
let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
for rule in index.rules_for_kind(node_kind) {
let rule_ptr = *rule as *const Rule;
let rule_ptr = *rule as *const Rule<C>;
if Some(rule_ptr) == skip_rule {
continue;
}
if let Some(result_node) = rule.try_rule(ast, id, fresh)? {
// Snapshot the user context before invoking the rule so that any
// mutations the rule makes are visible during recursive translation
// of its result, but not leaked to the parent's siblings.
let snapshot = user_ctx.clone();
// Repeating rules don't need a real translator: their captures
// aren't auto-translated (Repeating preserves the input schema),
// and `ctx.translate(id)` errors if invoked from a Repeating
// transform.
let translator = TranslatorHandle {
inner: TranslatorImpl::Repeating,
};
let try_result = rule.try_rule(ast, id, fresh, user_ctx, translator)?;
if let Some(result_node) = try_result {
// For non-repeated rules, suppress further application of *this*
// rule on the result root, so a rule whose output matches its own
// query doesn't loop. Other rules and child traversal are
@@ -747,14 +977,19 @@ fn apply_repeating_rules_inner(
results.extend(apply_repeating_rules_inner(
index,
ast,
user_ctx,
node,
fresh,
rewrite_depth + 1,
next_skip,
)?);
}
*user_ctx = snapshot;
return Ok(results);
}
// Rule didn't match; restore any speculative changes (none expected
// since try_rule only mutates on match, but be defensive).
*user_ctx = snapshot;
}
// Take the parent's fields by ownership: the recursion will rewrite
@@ -769,7 +1004,15 @@ fn apply_repeating_rules_inner(
for children in fields.values_mut() {
let mut new_children: Option<Vec<Id>> = None;
for (i, &child_id) in children.iter().enumerate() {
let result = apply_repeating_rules_inner(index, ast, child_id, fresh, rewrite_depth, None)?;
let result = apply_repeating_rules_inner(
index,
ast,
user_ctx,
child_id,
fresh,
rewrite_depth,
None,
)?;
let unchanged = result.len() == 1 && result[0] == child_id;
match (&mut new_children, unchanged) {
(None, true) => {} // unchanged so far, no allocation needed
@@ -798,24 +1041,25 @@ fn apply_repeating_rules_inner(
/// each visited node, recursion proceeds only through captured nodes (not
/// through the input node's children directly), and an error is returned if
/// no rule matches a visited node.
fn apply_one_shot_rules(
rules: &[Rule],
fn apply_one_shot_rules<C: Clone>(
rules: &[Rule<C>],
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
) -> Result<Vec<Id>, String> {
let index = RuleIndex::new(rules);
apply_one_shot_rules_inner(&index, ast, id, fresh, 0)
apply_one_shot_rules_inner(&index, ast, user_ctx, id, fresh, 0)
}
fn apply_one_shot_rules_inner(
index: &RuleIndex,
fn apply_one_shot_rules_inner<C: Clone>(
index: &RuleIndex<C>,
ast: &mut Ast,
user_ctx: &mut C,
id: Id,
fresh: &tree_builder::FreshScope,
rewrite_depth: usize,
) -> Result<Vec<Id>, String> {
if rewrite_depth > MAX_REWRITE_DEPTH {
return Err(format!(
"Desugaring exceeded maximum rewrite depth ({MAX_REWRITE_DEPTH}). \
@@ -825,31 +1069,28 @@ fn apply_one_shot_rules_inner(
let node_kind = ast.get_node(id).map(|n| n.kind()).unwrap_or("");
// Don't rewrite unnamed nodes (punctuation, keywords, etc.); leave them
// as-is. Rules target named nodes only.
if let Some(node) = ast.get_node(id) {
if !node.is_named() {
return Ok(vec![id]);
}
}
for rule in index.rules_for_kind(node_kind) {
if let Some(mut captures) = rule.try_match(ast, id)? {
// Recursively translate every captured node before invoking the
// transform. The transform's output uses output-schema kinds, so
// we must translate captured input-schema nodes to their
// output-schema equivalents first.
captures.try_map_all_captures(|captured_id| {
// Avoid infinite recursion when a capture refers to the root
// node of the matched tree (e.g. an `@_` capture on the
// pattern root): re-analyzing it would match the same rule
// again indefinitely.
if captured_id == id {
return Ok(vec![captured_id]);
}
apply_one_shot_rules_inner(index, ast, captured_id, fresh, rewrite_depth + 1)
})?;
return Ok(rule.run_transform(ast, captures, id, fresh));
if let Some(captures) = rule.try_match(ast, id)? {
// Snapshot the user context before invoking the rule so that any
// mutations the rule (or its transitively-translated captures)
// make are visible during this rule's transform, but not leaked
// to the parent's siblings.
let snapshot = user_ctx.clone();
// Build the translator handle the transform will use to
// recursively translate captures (or, for macro-generated
// rules, the auto-translate prefix uses it to translate every
// capture up front, preserving the legacy behavior).
let translator = TranslatorHandle {
inner: TranslatorImpl::OneShot {
index,
fresh,
rewrite_depth,
matched_root: id,
},
};
let result = rule.run_transform(ast, captures, id, fresh, user_ctx, translator)?;
*user_ctx = snapshot;
return Ok(result);
}
}
@@ -877,15 +1118,15 @@ pub enum PhaseKind {
/// starts. Rules within a phase compete for matches as usual; rules in
/// different phases never compete because each traversal only considers the
/// current phase's rules.
pub struct Phase {
pub struct Phase<C = ()> {
/// Name used in error messages.
pub name: String,
pub rules: Vec<Rule>,
pub rules: Vec<Rule<C>>,
pub kind: PhaseKind,
}
impl Phase {
pub fn new(name: impl Into<String>, kind: PhaseKind, rules: Vec<Rule>) -> Self {
impl<C> Phase<C> {
pub fn new(name: impl Into<String>, kind: PhaseKind, rules: Vec<Rule<C>>) -> Self {
Self {
name: name.into(),
rules,
@@ -911,17 +1152,30 @@ impl Phase {
/// .add_phase("desugar", PhaseKind::Repeating, desugar_rules)
/// .with_output_node_types_yaml(yaml);
/// ```
#[derive(Default)]
pub struct DesugaringConfig {
///
/// The optional type parameter `C` is the user context type threaded through
/// rule transforms. Defaults to `()` (no user context).
pub struct DesugaringConfig<C = ()> {
/// Phases of rule application, applied in order.
pub phases: Vec<Phase>,
pub phases: Vec<Phase<C>>,
/// Output node-types in YAML format. If `None`, the input grammar's
/// node types are used (i.e. the desugared AST has the same node types
/// as the tree-sitter grammar).
pub output_node_types_yaml: Option<&'static str>,
}
impl DesugaringConfig {
// Manual `Default` impl so users with a custom `C` that doesn't implement
// `Default` can still construct an empty config.
impl<C> Default for DesugaringConfig<C> {
fn default() -> Self {
Self {
phases: Vec::new(),
output_node_types_yaml: None,
}
}
}
impl<C> DesugaringConfig<C> {
/// Create an empty configuration. Add phases via [`add_phase`] and an
/// optional output schema via [`with_output_node_types_yaml`].
pub fn new() -> Self {
@@ -933,7 +1187,7 @@ impl DesugaringConfig {
mut self,
name: impl Into<String>,
kind: PhaseKind,
rules: Vec<Rule>,
rules: Vec<Rule<C>>,
) -> Self {
self.phases.push(Phase::new(name, kind, rules));
self
@@ -955,15 +1209,15 @@ impl DesugaringConfig {
}
}
pub struct Runner<'a> {
pub struct Runner<'a, C = ()> {
language: tree_sitter::Language,
schema: schema::Schema,
phases: &'a [Phase],
phases: &'a [Phase<C>],
}
impl<'a> Runner<'a> {
impl<'a, C> Runner<'a, C> {
/// Create a runner using the input grammar's schema for output.
pub fn new(language: tree_sitter::Language, phases: &'a [Phase]) -> Self {
pub fn new(language: tree_sitter::Language, phases: &'a [Phase<C>]) -> Self {
let schema = schema::Schema::from_language(&language);
Self {
language,
@@ -976,7 +1230,7 @@ impl<'a> Runner<'a> {
pub fn with_schema(
language: tree_sitter::Language,
schema: &schema::Schema,
phases: &'a [Phase],
phases: &'a [Phase<C>],
) -> Self {
Self {
language,
@@ -988,7 +1242,7 @@ impl<'a> Runner<'a> {
/// Create a runner from a [`DesugaringConfig`].
pub fn from_config(
language: tree_sitter::Language,
config: &'a DesugaringConfig,
config: &'a DesugaringConfig<C>,
) -> Result<Self, String> {
let schema = config.build_schema(&language)?;
Ok(Self {
@@ -997,11 +1251,17 @@ impl<'a> Runner<'a> {
phases: &config.phases,
})
}
}
pub fn run_from_tree(
impl<'a, C: Clone> Runner<'a, C> {
/// Parse `tree` against `source` and run all phases, threading
/// `user_ctx` through every rule transform. The caller owns the
/// initial context state.
pub fn run_from_tree_with_ctx(
&self,
tree: &tree_sitter::Tree,
source: &[u8],
user_ctx: &mut C,
) -> Result<Ast, String> {
let mut ast = Ast::from_tree_with_schema_and_source(
self.schema.clone(),
@@ -1009,11 +1269,13 @@ impl<'a> Runner<'a> {
&self.language,
source.to_vec(),
);
self.run_phases(&mut ast)?;
self.run_phases(&mut ast, user_ctx)?;
Ok(ast)
}
pub fn run(&self, input: &str) -> Result<Ast, String> {
/// Parse `input` and run all phases, threading `user_ctx` through
/// every rule transform. The caller owns the initial context state.
pub fn run_with_ctx(&self, input: &str, user_ctx: &mut C) -> Result<Ast, String> {
let mut parser = tree_sitter::Parser::new();
parser
.set_language(&self.language)
@@ -1027,20 +1289,24 @@ impl<'a> Runner<'a> {
&self.language,
input.as_bytes().to_vec(),
);
self.run_phases(&mut ast)?;
self.run_phases(&mut ast, user_ctx)?;
Ok(ast)
}
/// Apply each phase in turn to the AST, threading the root through.
/// A single `FreshScope` is shared across phases so that fresh
/// identifiers generated in different phases don't collide.
fn run_phases(&self, ast: &mut Ast) -> Result<(), String> {
fn run_phases(&self, ast: &mut Ast, user_ctx: &mut C) -> Result<(), String> {
let fresh = tree_builder::FreshScope::new();
let mut root = ast.get_root();
for phase in self.phases {
let res = match phase.kind {
PhaseKind::Repeating => apply_repeating_rules(&phase.rules, ast, root, &fresh),
PhaseKind::OneShot => apply_one_shot_rules(&phase.rules, ast, root, &fresh),
PhaseKind::Repeating => {
apply_repeating_rules(&phase.rules, ast, user_ctx, root, &fresh)
}
PhaseKind::OneShot => {
apply_one_shot_rules(&phase.rules, ast, user_ctx, root, &fresh)
}
}
.map_err(|e| format!("Phase `{}`: {e}", phase.name))?;
if res.len() != 1 {
@@ -1056,3 +1322,78 @@ impl<'a> Runner<'a> {
Ok(())
}
}
impl<'a, C: Clone + Default> Runner<'a, C> {
/// Parse `tree` against `source` and run all phases, using the
/// default context (`C::default()`) as the initial context state.
pub fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String> {
let mut user_ctx = C::default();
self.run_from_tree_with_ctx(tree, source, &mut user_ctx)
}
/// Parse `input` and run all phases, using the default context
/// (`C::default()`) as the initial context state.
pub fn run(&self, input: &str) -> Result<Ast, String> {
let mut user_ctx = C::default();
self.run_with_ctx(input, &mut user_ctx)
}
}
// ---------------------------------------------------------------------------
// Desugarer: type-erased view of a DesugaringConfig + Runner
// ---------------------------------------------------------------------------
/// Type-erased interface to a desugaring pipeline for a single language.
///
/// Consumers (e.g. a generic tree-sitter extractor) hold
/// `Box<dyn Desugarer>` so they can dispatch through the trait without
/// knowing the user context type `C` that's internal to yeast.
///
/// Construct one via [`ConcreteDesugarer::new`] from a
/// [`DesugaringConfig<C>`] and a [`tree_sitter::Language`].
pub trait Desugarer: Send + Sync {
/// The output AST schema (in YAML format), or `None` if the input
/// grammar's schema should be used.
fn output_node_types_yaml(&self) -> Option<&'static str>;
/// Parse `tree` against `source` and run the desugaring pipeline.
/// Each call constructs a fresh default user context internally.
fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String>;
}
/// A concrete [`Desugarer`] backed by a [`DesugaringConfig<C>`] for a
/// specific user context type `C`. Stores the language and a pre-built
/// schema so that per-call cost is bounded to constructing a transient
/// [`Runner`] and cloning the schema (no YAML re-parsing).
pub struct ConcreteDesugarer<C: Default + Clone + Send + Sync + 'static> {
language: tree_sitter::Language,
schema: schema::Schema,
config: DesugaringConfig<C>,
}
impl<C: Default + Clone + Send + Sync + 'static> ConcreteDesugarer<C> {
/// Build a desugarer for `language` from `config`. Parses the output
/// schema YAML once (if set) and stores it for reuse across files.
pub fn new(
language: tree_sitter::Language,
config: DesugaringConfig<C>,
) -> Result<Self, String> {
let schema = config.build_schema(&language)?;
Ok(Self {
language,
schema,
config,
})
}
}
impl<C: Default + Clone + Send + Sync + 'static> Desugarer for ConcreteDesugarer<C> {
fn output_node_types_yaml(&self) -> Option<&'static str> {
self.config.output_node_types_yaml
}
fn run_from_tree(&self, tree: &tree_sitter::Tree, source: &[u8]) -> Result<Ast, String> {
let runner = Runner::with_schema(self.language.clone(), &self.schema, &self.config.phases);
runner.run_from_tree(tree, source)
}
}

View File

@@ -242,10 +242,7 @@ pub fn convert(yaml_input: &str) -> Result<String, String> {
/// Apply YAML node-type definitions to a mutable Schema.
/// Registers all types, fields, and allowed types from the YAML into the schema.
fn apply_yaml_to_schema(
yaml: &YamlNodeTypes,
schema: &mut crate::schema::Schema,
) {
fn apply_yaml_to_schema(yaml: &YamlNodeTypes, schema: &mut crate::schema::Schema) {
// Register all supertypes as node kinds
for name in yaml.supertypes.keys() {
schema.register_kind(name);
@@ -307,7 +304,8 @@ fn apply_yaml_to_schema(
.into_vec()
.into_iter()
.map(|type_ref| {
let (kind, named) = resolve_type_ref_pair(&type_ref, &named_types, &unnamed_types);
let (kind, named) =
resolve_type_ref_pair(&type_ref, &named_types, &unnamed_types);
crate::schema::NodeType { kind, named }
})
.collect::<Vec<_>>();

View File

@@ -198,13 +198,8 @@ impl Schema {
.insert((parent_kind.to_string(), field_id), node_types);
}
pub fn field_types(
&self,
parent_kind: &str,
field_id: FieldId,
) -> Option<&Vec<NodeType>> {
self.field_types
.get(&(parent_kind.to_string(), field_id))
pub fn field_types(&self, parent_kind: &str, field_id: FieldId) -> Option<&Vec<NodeType>> {
self.field_types.get(&(parent_kind.to_string(), field_id))
}
pub fn set_field_cardinality(

View File

@@ -7,7 +7,7 @@ const OUTPUT_SCHEMA_YAML: &str = include_str!("node-types.yml");
/// Helper: parse Ruby source with no rules, return dump.
fn parse_and_dump(input: &str) -> String {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run(input).unwrap();
dump_ast(&ast, ast.get_root(), input)
}
@@ -18,13 +18,23 @@ fn run_and_dump(input: &str, rules: Vec<Rule>) -> String {
run_phased_and_dump(input, vec![Phase::new("test", PhaseKind::Repeating, rules)])
}
/// Helper: parse Ruby source with custom rules and return the transformed AST.
fn run_and_ast(input: &str, rules: Vec<Rule>) -> Ast {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
runner.run(input).unwrap()
}
/// Helper: parse Ruby source with a custom output schema and multiple
/// rule phases, return dump.
fn run_phased_and_dump(input: &str, phases: Vec<Phase>) -> String {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let ast = runner.run(input).unwrap();
dump_ast(&ast, ast.get_root(), input)
}
@@ -36,7 +46,7 @@ fn run_and_get_error(input: &str, rules: Vec<Rule>) -> String {
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
runner
.run(input)
.expect_err("expected runner to return an error")
@@ -44,7 +54,7 @@ fn run_and_get_error(input: &str, rules: Vec<Rule>) -> String {
/// Helper: parse Ruby source with no rules and dump with schema type errors.
fn parse_and_dump_typed(input: &str, schema_yaml: &str) -> String {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run(input).unwrap();
let schema = yeast::node_types_yaml::schema_from_yaml(schema_yaml).unwrap();
dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
@@ -54,10 +64,10 @@ fn parse_and_dump_typed(input: &str, schema_yaml: &str) -> String {
/// building schema with language IDs so field checks align with parser fields.
fn parse_and_dump_typed_with_language(input: &str, schema_yaml: &str) -> String {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let runner = Runner::new(lang.clone(), &[]);
let runner: Runner = Runner::new(lang.clone(), &[]);
let ast = runner.run(input).unwrap();
let schema = yeast::node_types_yaml::schema_from_yaml_with_language(schema_yaml, &lang)
.unwrap();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(schema_yaml, &lang).unwrap();
dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
}
@@ -66,7 +76,7 @@ fn run_and_dump_typed(input: &str, rules: Vec<Rule>, schema_yaml: &str) -> Strin
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema = yeast::node_types_yaml::schema_from_yaml(schema_yaml).unwrap();
let phases = vec![Phase::new("test", PhaseKind::Repeating, rules)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let ast = runner.run(input).unwrap();
dump_ast_with_type_errors(&ast, ast.get_root(), input, &schema)
}
@@ -156,7 +166,7 @@ fn test_parse_for_loop() {
#[test]
fn test_dump_highlights_type_errors_inline() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -166,13 +176,13 @@ named:
identifier:
"#;
let dump = parse_and_dump_typed("x = 1", schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
let dump = parse_and_dump_typed("x = 1", schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
}
#[test]
fn test_dump_reports_preserved_unknown_kind_after_transformation() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -182,25 +192,25 @@ named:
identifier:
"#;
// This rewrite runs and preserves the RHS node kind via capture.
// With schema above, preserving `integer` should be reported inline.
let rules = vec![yeast::rule!(
(assignment left: (_) @left right: (_) @right)
=>
(assignment
left: {left}
right: {right}
)
)];
// This rewrite runs and preserves the RHS node kind via capture.
// With schema above, preserving `integer` should be reported inline.
let rules: Vec<Rule> = vec![yeast::rule!(
(assignment left: (_) @left right: (_) @right)
=>
(assignment
left: {left}
right: {right}
)
)];
let dump = run_and_dump_typed("x = 1", rules, schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
assert!(dump.contains("node kind 'integer' not in schema"));
let dump = run_and_dump_typed("x = 1", rules, schema_yaml);
assert!(dump.contains("integer \"1\" <-- ERROR:"));
assert!(dump.contains("node kind 'integer' not in schema"));
}
#[test]
fn test_dump_reports_undeclared_field_on_node() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -209,14 +219,14 @@ named:
identifier:
"#;
let dump = parse_and_dump_typed_with_language("x = y", schema_yaml);
assert!(dump.contains("right: identifier \"y\" <-- ERROR:"));
assert!(dump.contains("the node 'assignment' has no field 'right'"));
let dump = parse_and_dump_typed_with_language("x = y", schema_yaml);
assert!(dump.contains("right: identifier \"y\" <-- ERROR:"));
assert!(dump.contains("the node 'assignment' has no field 'right'"));
}
#[test]
fn test_dump_reports_disallowed_kind_in_field_type() {
let schema_yaml = r#"
let schema_yaml = r#"
named:
program:
$children*: assignment
@@ -227,17 +237,17 @@ named:
integer:
"#;
let dump = parse_and_dump_typed_with_language("x = 1", schema_yaml);
assert!(dump.contains("right: integer \"1\" <-- ERROR:"));
assert!(dump.contains("should contain"));
assert!(dump.contains("but got integer"));
let dump = parse_and_dump_typed_with_language("x = 1", schema_yaml);
assert!(dump.contains("right: integer \"1\" <-- ERROR:"));
assert!(dump.contains("should contain"));
assert!(dump.contains("but got integer"));
}
// ---- Query tests ----
#[test]
fn test_query_match() {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -258,7 +268,7 @@ fn test_query_match() {
#[test]
fn test_query_no_match() {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -283,7 +293,7 @@ fn test_query_skips_extras_in_positional_match() {
// captured comment to nothing (a common idiom, e.g.
// `(comment) => ()` in Swift) leaves the capture's match-list empty
// and causes the transform to fail with "Variable X has 0 matches".
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("[1, # comment\n2]").unwrap();
// Navigate to the `array` node: program -> array.
@@ -299,15 +309,11 @@ fn test_query_skips_extras_in_positional_match() {
let matched = query.do_match(&ast, array_id, &mut captures).unwrap();
assert!(matched);
assert_eq!(
ast.get_node(captures.get_var("a").unwrap())
.unwrap()
.kind(),
ast.get_node(captures.get_var("a").unwrap()).unwrap().kind(),
"integer"
);
assert_eq!(
ast.get_node(captures.get_var("b").unwrap())
.unwrap()
.kind(),
ast.get_node(captures.get_var("b").unwrap()).unwrap().kind(),
"integer"
);
}
@@ -315,14 +321,14 @@ fn test_query_skips_extras_in_positional_match() {
#[test]
fn test_reachable_nodes_excludes_orphaned_rewrite_nodes() {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema = yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang)
.unwrap();
let phases = vec![Phase::new(
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let phases: Vec<Phase> = vec![Phase::new(
"test",
PhaseKind::Repeating,
vec![yeast::rule!((integer) => (identifier "replaced"))],
)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -340,7 +346,7 @@ fn test_reachable_nodes_excludes_orphaned_rewrite_nodes() {
#[test]
fn test_query_repeated_capture() {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x, y, z = 1").unwrap();
let query = yeast::query!(
@@ -365,7 +371,7 @@ fn test_query_repeated_capture() {
#[test]
fn test_capture_unnamed_node_parenthesized() {
// `("=") @op` captures the unnamed `=` token between left and right.
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -389,10 +395,33 @@ fn test_capture_unnamed_node_parenthesized() {
assert!(!op_node.is_named());
}
#[test]
fn test_capture_bare_underscore_repeated() {
// `_` matches named and unnamed nodes in bare-child position. On this
// assignment shape, bare children correspond to unnamed tokens (the `=`).
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!((assignment _* @all));
let mut cursor = AstCursor::new(&ast);
cursor.goto_first_child();
let assignment_id = cursor.node_id();
let mut captures = yeast::captures::Captures::new();
let matched = query.do_match(&ast, assignment_id, &mut captures).unwrap();
assert!(matched);
let all = captures.get_all("all");
assert_eq!(all.len(), 1);
assert_eq!(ast.get_node(all[0]).unwrap().kind(), "=");
assert!(!ast.get_node(all[0]).unwrap().is_named());
}
#[test]
fn test_capture_unnamed_node_bare_literal() {
// `"=" @op` (without surrounding parens) is the same as `("=") @op`.
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let query = yeast::query!(
@@ -421,7 +450,7 @@ fn test_bare_underscore_matches_unnamed() {
// Bare `_` matches any node, including unnamed tokens, while `(_)`
// matches only named nodes. Demonstrate by matching the unnamed `=`
// token in the implicit `child` field of an `assignment`.
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -460,7 +489,7 @@ fn test_bare_forms_in_field_position() {
// field's value, not just in the bare-children position. This is
// syntactic sugar for `(_)` / `("…")` and goes through the same
// code paths.
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -499,7 +528,7 @@ fn test_forward_scan_finds_unnamed_token_late() {
// query for `("end")` skip past the first two and match the third.
// Without forward-scan, the matcher took the first child unconditionally
// and failed.
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("for x in list do\n y\nend").unwrap();
// Navigate: program > for > do (the body wrapper).
@@ -526,7 +555,7 @@ fn test_forward_scan_preserves_order() {
// order. A query for ("end") then ("do") should fail because `do`
// appears before `end` in the source order; once forward-scan has
// consumed `end`, the iterator is exhausted.
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("for x in list do\n y\nend").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -547,7 +576,7 @@ fn test_forward_scan_preserves_order() {
#[test]
fn test_tree_builder() {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let mut ast = runner.run("x = 1").unwrap();
let input = "x = 1";
@@ -565,7 +594,8 @@ fn test_tree_builder() {
// Swap left and right
let fresh = yeast::tree_builder::FreshScope::new();
let mut ctx = yeast::build::BuildCtx::new(&mut ast, &captures, &fresh);
let mut user_ctx = ();
let mut ctx = yeast::build::BuildCtx::new(&mut ast, &captures, &fresh, &mut user_ctx);
let new_id = yeast::tree!(ctx,
(program
child: (assignment
@@ -593,7 +623,7 @@ fn test_tree_builder() {
// tree-sitter-ruby grammar with named fields for nodes that only have
// unnamed children in tree-sitter (e.g. block_body.stmt, block_parameters.parameter).
fn ruby_rules() -> Vec<Rule> {
let assign_rule = yeast::rule!(
let assign_rule: Rule = yeast::rule!(
(assignment
left: (left_assignment_list
(identifier)* @left
@@ -618,7 +648,7 @@ fn ruby_rules() -> Vec<Rule> {
)}
);
let for_rule = yeast::rule!(
let for_rule: Rule = yeast::rule!(
(for
pattern: (_) @pat
value: (in (_) @val)
@@ -700,7 +730,7 @@ fn test_desugar_for_loop() {
#[test]
fn test_shorthand_rule() {
let rule = yeast::rule!(
let rule: Rule = yeast::rule!(
(assignment
left: (_) @method
right: (_) @receiver
@@ -852,7 +882,7 @@ fn test_phase_error_includes_phase_name() {
PhaseKind::Repeating,
vec![swap_assignment_rule().repeated()],
)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let err = runner
.run("x = 1")
.expect_err("expected runner to return an error");
@@ -895,7 +925,7 @@ fn test_one_shot_phase() {
PhaseKind::OneShot,
one_shot_xeq1_rules(),
)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -921,7 +951,7 @@ fn test_one_shot_phase_errors_when_no_rule_matches() {
let mut rules = one_shot_xeq1_rules();
rules.pop();
let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let err = runner
.run("x = 1")
@@ -945,7 +975,7 @@ fn test_one_shot_recurses_into_returned_capture() {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let rules = vec![
let rules: Vec<Rule> = vec![
yeast::rule!(
(program (_)* @stmts)
=>
@@ -961,7 +991,7 @@ fn test_one_shot_recurses_into_returned_capture() {
yeast::rule!((integer) => (integer "INT")),
];
let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -987,7 +1017,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
let lang: tree_sitter::Language = tree_sitter_ruby::LANGUAGE.into();
let schema =
yeast::node_types_yaml::schema_from_yaml_with_language(OUTPUT_SCHEMA_YAML, &lang).unwrap();
let rules = vec![
let rules: Vec<Rule> = vec![
yeast::rule!(
(program (_)* @stmts)
=>
@@ -1008,7 +1038,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
yeast::rule!((integer) => (integer "INT")),
];
let phases = vec![Phase::new("translate", PhaseKind::OneShot, rules)];
let runner = Runner::with_schema(lang, &schema, &phases);
let runner: Runner = Runner::with_schema(lang, &schema, &phases);
let input = "x = 1";
let ast = runner.run(input).unwrap();
@@ -1032,7 +1062,7 @@ fn test_one_shot_does_not_recurse_into_wrapper_output() {
#[test]
fn test_cursor_navigation() {
let runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let runner: Runner = Runner::new(tree_sitter_ruby::LANGUAGE.into(), &[]);
let ast = runner.run("x = 1").unwrap();
let mut cursor = AstCursor::new(&ast);
@@ -1106,7 +1136,7 @@ fn test_desugar_for_with_multiple_assignment() {
/// resolves to the captured node's source text via `YeastDisplay`.
#[test]
fn test_hash_brace_renders_capture_source_text() {
let rule = rule!(
let rule: Rule = rule!(
(call
method: (identifier) @name
receiver: (identifier) @recv
@@ -1135,7 +1165,7 @@ fn test_hash_brace_renders_capture_source_text() {
/// `Display` impl (covered by `YeastDisplay`'s blanket impls for primitives).
#[test]
fn test_hash_brace_renders_integer_expression() {
let rule = rule!(
let rule: Rule = rule!(
(identifier) @_
=>
(identifier #{1 + 2})
@@ -1149,3 +1179,39 @@ fn test_hash_brace_renders_integer_expression() {
"#,
);
}
/// Regression test: `(kind #{capture})` should inherit the captured node's
/// source location, not the full source range of the matched rule root.
#[test]
fn test_hash_brace_uses_capture_location_for_leaf() {
let rule: Rule = rule!(
(call
method: (identifier) @name
receiver: (identifier) @recv
)
=>
(call
method: (identifier #{name})
receiver: (identifier #{recv})
arguments: (argument_list)
)
);
let ast = run_and_ast("foo.bar()", vec![rule]);
let mut bar_ids: Vec<usize> = Vec::new();
for id in ast.reachable_node_ids() {
let Some(node) = ast.get_node(id) else {
continue;
};
if node.kind() == "identifier" && ast.source_text(id) == "bar" {
bar_ids.push(id);
}
}
assert_eq!(bar_ids.len(), 1, "expected exactly one identifier 'bar'");
let bar = ast.get_node(bar_ids[0]).unwrap();
assert_eq!(bar.start_byte(), 4);
assert_eq!(bar.end_byte(), 7);
}

View File

@@ -3,25 +3,21 @@
This is a CodeQL extractor based on tree-sitter.
## Building
To build the extractor, run `scripts/create-extractor-pack.sh`
- To build the extractor, run `scripts/create-extractor-pack.sh`
## Editing the Swift grammar
The vendored tree-sitter-swift grammar lives at
`extractor/tree-sitter-swift/`. After editing `grammar.js` (or any other
grammar source), run `scripts/regenerate-grammar.sh` to:
- regenerate `extractor/tree-sitter-swift/src/{parser.c, grammar.json,
node-types.json}` (and the `src/tree_sitter/*.h` headers) via
`tree-sitter generate`; and
- refresh `extractor/tree-sitter-swift/node-types.yml`, the
human-readable companion to `src/node-types.json` produced by yeast's
`node_types_yaml` binary.
## Swift Parser
- The Swift parser is defined by `extractor/tree-sitter-swift/grammar.js` and can be edited if needed.
`node-types.yml` is the recommended review surface for grammar changes —
it shows the impact of a grammar tweak on the named node kinds, fields,
and child types in a form much easier to read than the raw JSON.
- After editing the grammar, always run `scripts/regenerate-grammar.sh`.
## Extractor Testing
- To run extractor tests, run `cargo test` in the `extractor` directory.
- The raw parse tree is described by `extractor/tree-sitter-swift/node-types.yml` and should be reviewed after grammar changes.
## AST Mapping
- The target AST shape is described by `extractor/ast_types.yml`.
- The mapping from the parse tree to the target AST is found in `extractor/src/languages/swift/swift.rs`
- To run tests for the parser and mapping, run `cargo test` in the `extractor` directory.
- Do not edit the printed ASTs in `extractor/test/corpus` directly. To regenerate the ASTs, run `scripts/update-corpus.sh`.

View File

@@ -2,36 +2,103 @@ supertypes:
expr:
- name_expr
- int_literal
- float_literal
- boolean_literal
- string_literal
- regex_literal
- builtin_expr
- binary_expr
- unary_expr
- call_expr
- member_access_expr
- lambda_expr
- unsupported_node
stmt:
- empty_stmt
- block_stmt
- expr_stmt
- if_stmt
- variable_declaration_stmt
- guard_if_stmt
- unsupported_node
condition:
- expr_condition
- let_pattern_condition
- sequence_condition
- super_expr
- function_expr
- array_literal
- map_literal
- key_value_pair
- tuple_expr
- type_cast_expr
- type_test_expr
- if_expr
- assign_expr
- compound_assign_expr
- pattern_guard_expr
- empty_expr
- block
- break_expr
- continue_expr
- return_expr
- throw_expr
- try_expr
- switch_expr
- unsupported_node
expr_or_pattern:
- expr
- pattern
expr_or_type:
- expr
- type_expr
pattern:
- var_pattern
- apply_pattern
- name_pattern
- tuple_pattern
- constructor_pattern
- ignore_pattern
- expr_equality_pattern
- bulk_importing_pattern
- unsupported_node
# A statement is anything that can appear in a block.
# This type contains all of 'expr' and has partial overlap with 'member'.
# For example, type_alias_declaration can appear either as a stmt or member.
# constructor_declaration and destructor_declaration appear here because
# tree-sitter-swift's error recovery for #if/#endif in class bodies can place
# init/deinit declarations at the wrong (statement) level.
stmt:
- expr
- variable_declaration
- type_alias_declaration
- function_declaration
- import_declaration
- operator_syntax_declaration
- class_like_declaration
- accessor_declaration
- constructor_declaration
- destructor_declaration
- guard_if_stmt
- for_each_stmt
- while_stmt
- do_while_stmt
- labeled_stmt
# A member is anything that can appear in the body of a class-like declaration
member:
- constructor_declaration
- destructor_declaration
- function_declaration
- variable_declaration
- accessor_declaration
- initializer_declaration
- class_like_declaration
- type_alias_declaration
- associated_type_declaration
- unsupported_node
type_expr:
- named_type_expr
- generic_type_expr
- tuple_type_expr
- function_type_expr
- inferred_type_expr
- unsupported_node
type_constraint:
- equality_type_constraint
- bound_type_constraint
operator:
- infix_operator
- prefix_operator
- postfix_operator
named:
# Top-level is the root node, currently containing a list of expressions
# Top-level is the root node, containing a single block of statements
# (which are themselves expressions or declarations).
top_level:
body*: [expr, stmt]
body: block
# An identifier used in the context of an expression
name_expr:
@@ -40,13 +107,28 @@ named:
# An integer literal
int_literal:
# A floating-point literal
float_literal:
# A boolean literal
boolean_literal:
# A literal backed by a keyword such as `nil`, `null`, or `nullptr`.
#
# Although nil/null are keyword literals in many languages there should be
# no attempt to normalize "null-like" named entities, like Python's `None`.
builtin_expr:
# A string literal
string_literal:
# A regex literal
regex_literal:
# Application of a binary operator, such as `a + b`
binary_expr:
left: expr
operator: operator
operator: infix_operator
right: expr
# Application of a unary operator, such as `!x`
@@ -54,86 +136,310 @@ named:
operand: expr
operator: operator
# A function or method call, such as `f(x)` or `obj.m(x)`. Method calls
# are represented as a call whose `function` is a `member_access_expr`.
# Plain assignment
assign_expr:
target: expr_or_pattern
value: expr
# Compound assignment
compound_assign_expr:
target: expr
operator: infix_operator
value: expr
# A function or method call, such as `f(x)` or `obj.m(x)`.
#
# Method calls are represented as a call whose `function` is a `member_access_expr`.
#
# Constructor calls are marked by a language-specific modifier, and the target may be
# a `type_expr` if the parser can deduce that the target is a type.
call_expr:
function: expr
argument*: expr
modifier*: modifier
callee: expr_or_type
argument*: argument
argument:
modifier*: modifier
name?: identifier
value: expr
# Member access, such as `obj.member`.
#
# The base may be a type expression when it is a static member access like `Array<Int>.method`.
# In ambiguous cases where the parser cannot distinguish static and instance member access, the base
# will be typically be an expression.
#
# For `super.x` the base will be an instance of `super_expr`.
member_access_expr:
target: expr
base: expr_or_type
member: identifier
lambda_expr:
# A type expression that refers to a type inferred from the contextual type.
# This is used to translate Swift's leading-dot syntax, `.foo`, which means `T.foo` where
# `T` is the contextual type of some enclosing expression. This is translated to a member_access
# with an inferred_type_expr as the base.
inferred_type_expr:
# A `super` token, which can usually only appear as the base of member access.
super_expr:
function_expr:
modifier*: modifier
capture_declaration*: variable_declaration
parameter*: parameter
body: [expr, stmt]
return_type?: type_expr
body: block
# A parameter
array_literal:
element*: expr
map_literal:
element*: expr
# A key-value pair, usually appearing as a named argument or as part of a map literal.
#
# For some languages, the key-value pair is a first class value and this type of expression
# may thus appear anywhere in the general case.
key_value_pair:
key: expr
value: expr
# A tuple expression, such as `(a, b, c)`.
tuple_expr:
element*: expr
# A parameter.
#
# `type` is its declared type annotation (if any)
#
# `pattern` binds the parameter's internal name(s). For a simple parameter this is a
# `name_pattern`, but may be an arbitrary pattern for languages where patterns may appear
# in the parameter list.
#
# `external_name` is the name by which to call sites refer to the parameter, if the parameter
# can be passed as a named parameter. For example, the Swift function `func greet(person id: String)`
# would have `person` as the external name and a `name_pattern` wrapping `id` is the parameter's pattern.
parameter:
modifier*: modifier
external_name?: identifier
type?: type_expr
pattern?: pattern
default?: expr
# An expression that does nothing. Used where the grammar permits an
# empty statement (e.g. a stray `;`).
empty_expr:
# A brace-delimited sequence of statements (`{ ... }`). Blocks are the
# only nodes that can directly contain statements; every other body-like
# field holds a single `block`.
block:
stmt*: stmt
if_expr:
condition: expr
then?: expr
else?: expr
# A variable declaration or destructuring assignment that introduces new variables.
#
# Any occurrence of `var_patterns` in 'pattern' result in fresh bindings that are
# in scope for the rest of the enclosing block.
#
# The initializer is optional (but typically cannot be omitted if combined with a non-trivial pattern).
#
# Modifiers should include 'var', 'let', 'const', etc, if they are significant.
# A grouped declaration like `let x = 1, y = 2` is emitted as a sequence of
# `variable_declaration`s directly into the enclosing stmt/member slot; every
# declaration after the first in such a group is tagged with a synthetic
# `chained_declaration` modifier so the grouping can be recovered downstream.
variable_declaration:
modifier*: modifier
pattern: pattern
empty_stmt:
block_stmt:
body*: stmt
expr_stmt:
expr: expr
if_stmt:
condition: condition
then?: stmt
else?: stmt
variable_declaration_stmt:
variable_declarator+: variable_declarator
# A variable declaration, or assignment to a pattern.
# The initializer is optional (but typically only possible in combination with a simple variable pattern).
variable_declarator:
pattern: pattern
type?: type_expr
value?: expr
# Evaluate 'condition', and if false, execute 'else' which must break from the enclosing block scope (return, break, etc).
# Any variables bound by 'condition' will be in scope for the remainder of the enclosing block scope
# (which differs from how if_stmt works).
# (which differs from how if_expr works).
guard_if_stmt:
condition: condition
else: stmt
condition: expr
else: block
# Evaluates the given condition and interprets it as a boolean (by language conventions)
expr_condition:
expr: expr
# `break` (with optional label)
break_expr:
label?: identifier
# A series of statements that are executed before evaluating the trailing condition.
# Useful for languages where a conditional clause may be preceded by side-effecting
# syntactic elements (e.g. binding clauses) that don't themselves form a condition.
sequence_condition:
stmt*: stmt
condition: condition
# `continue` (with optional label)
continue_expr:
label?: identifier
# A labeled statement, such as `outer: for ... { ... }`. The labeled
# statement appears as the `stmt` field; `break`/`continue` may target
# the label.
labeled_stmt:
label: identifier
stmt: stmt
# `return value` or bare `return`
return_expr:
value?: expr
# `throw value`
throw_expr:
value?: expr
# An import declaration.
#
# The semantics of an import are generally:
# - Evaluate the 'imported_expr' to a value (possibly a compile-time value, such as namespace)
# - Filter away possible values based on modifiers (e.g. type-only imports only accept types)
# - Assign the value to the pattern, binding variables and/or type names in scope
#
import_declaration:
modifier*: modifier
imported_expr: expr # Qualified names are encoded as a chain of member_access_expr ending with a name_expr
pattern?: pattern # Binds local names in scope (possibly via bulk_importing_pattern)
# `typealias Name = Type`
type_alias_declaration:
modifier*: modifier
name: identifier
type_parameter*: type_parameter
type_constraint*: type_constraint
type: type_expr
# A top-level function declaration.
function_declaration:
modifier*: modifier
name: identifier
type_parameter*: type_parameter
type_constraint*: type_constraint
parameter*: parameter
return_type?: type_expr
body?: block
# `for pattern in iterable [where guard] { body }`.
for_each_stmt:
modifier*: modifier
pattern: pattern
iterable: expr
guard?: expr
body?: block
# `while condition { body }`.
while_stmt:
modifier*: modifier
condition: expr
body?: block
# `repeat { body } while condition`.
do_while_stmt:
modifier*: modifier
body?: block
condition: expr
# `do { body } catch pattern { ... } catch ...`. Swift uses `do`/`catch`
# for error handling; for languages with `try`/`catch`, this is the same shape.
try_expr:
modifier*: modifier
body: block
catch_clause*: catch_clause
catch_clause:
modifier*: modifier
pattern?: pattern
guard?: expr
body: block
# `switch value { case pattern: body case ...: default: body }`
switch_expr:
modifier*: modifier
value: expr
case*: switch_case
# A single `case ...:` (or `default:`) entry in a switch.
# An entry with multiple `case p1, p2:` patterns has multiple `pattern`s.
# A `default:` entry has no patterns.
# An optional `guard` corresponds to a `where`-clause on the case.
switch_case:
modifier*: modifier
pattern*: pattern
guard?: expr
body: block
# Evaluate 'expr' and match its result against 'pattern', and return true if it matches.
# Variables bound by the pattern will be in scope within the 'true' branch controlled by this condition.
let_pattern_condition:
# Variables bound by the pattern will be in scope within the 'true' branch controlled by this expression.
#
# In Swift, `if case let PATTERN = EXPR` maps to this node
#
# Java: 'if (x instanceof Foo y && w ...) { ... }'
pattern_guard_expr:
pattern: pattern
value: expr
# A pattern matching anything, binding its value to the given variable
var_pattern:
# A type cast expression, such as `x as T`, `x as? T`, or `x as! T`. The
# operator distinguishes between the variants.
type_cast_expr:
expr: expr
operator: infix_operator
type: type_expr
# A type-test expression, such as `x is T`. Yields a boolean indicating
# whether `expr` is an instance of `type`.
type_test_expr:
expr: expr
operator: infix_operator
type: type_expr
# An identifier that introduces a variable.
#
# When used as a pattern, the pattern matches anything and binds its incoming value to the variable
name_pattern:
modifier*: modifier
identifier: identifier
# A pattern matching anything, binding no variables, usually using the syntax "_"
ignore_pattern:
# A pattern such as `Some(x)` where `Some` is the constructor and `x` is an argument
apply_pattern:
constructor: expr
argument*: pattern
# A pattern that matches if the incoming value is equal to the value of the given expression.
# Used for literal patterns in switch (e.g. `case 1:`).
expr_equality_pattern:
expr: expr
# A tuple pattern such as `(a, b)` in `let (a, b) = pair`.
#
# Elements of the tuple pattern can have names, such as Swift's `let (foo: x, bar: y) = tuple`.
tuple_pattern:
element*: pattern
modifier*: modifier
element*: pattern_element
# A pattern such as `Some(x)` where `Some` is the constructor and `x` is an element.
# The element names are interpreted as argument labels and/or field names.
constructor_pattern:
modifier*: modifier
constructor: expr_or_type
element*: pattern_element
# A pattern with an optional associated name.
pattern_element:
modifier*: modifier
key?: identifier
pattern: pattern
# A pattern that checks if the incoming value has the given type, and if so, the
# value is matched against the given nested pattern (and succeeds iff the nested match succeeds).
#
# In Swift: `if let y = x as? Foo` is a pattern_guard_expr containing a type_test_pattern
# In Java: `x instanceof Foo y` is a type_test_pattern wrapping a name_pattern
type_test_pattern:
pattern: pattern
type: type_expr
# A '*' pattern that imports all members of the incoming value into the local scope
# Currently this can only appear in import declarations.
bulk_importing_pattern:
modifier*: modifier
# An simple unqualified identifier token
identifier:
@@ -141,4 +447,129 @@ named:
# A node that we don't yet translate
unsupported_node:
operator:
infix_operator:
prefix_operator:
postfix_operator:
# The fixity of a custom operator declaration (e.g. "prefix", "infix",
# "postfix"). The value is the keyword string.
fixity:
type_parameter:
modifier*: modifier
name: identifier
bound?: type_expr
# A generic constraint of the form `T == U`, requiring two types to be
# equal. Appears in `where` clauses on generic declarations
# (e.g. Swift `func foo<T, U>() where T == U`).
equality_type_constraint:
left: type_expr
right: type_expr
# A generic constraint of the form `T: Bound`, requiring a type parameter
# to conform to (or inherit from) some other type. Appears in `where`
# clauses on generic declarations (e.g. Swift `where T: Equatable`).
bound_type_constraint:
type: type_expr
bound: type_expr
# `infix operator +++` (and the like) — a declaration of a custom operator.
operator_syntax_declaration:
modifier*: modifier
name: identifier
# The fixity specifier (`prefix`, `infix`, `postfix`), when applicable.
fixity?: fixity
# The declared precedence level, when present (e.g. Swift's
# `infix operator +++ : AdditionPrecedence`).
precedence?: expr
# A class-like declaration: class, struct, interface (protocol), enum (or actor).
# The syntactic kind is carried as a `modifier` (e.g. "class", "struct",
# "interface", "enum", "extension"). The `"enum_case"` modifier additionally
# marks a declaration as an enum case with associated values. Extensions are
# represented as a class-like declaration with the `"extension"` modifier and
# no `name`; the extended type appears as a `base_type`.
class_like_declaration:
modifier*: modifier
name?: identifier
type_parameter*: type_parameter
type_constraint*: type_constraint
base_type*: base_type
member*: member
# One of the base types of a class declaration.
#
# If the language has multiple kinds of base classes (e.g. extends/implements) the
# kind should be included as a modifier on this node.
base_type:
modifier*: modifier
type: type_expr
constructor_declaration:
modifier*: modifier
name?: identifier
parameter*: parameter
body: block
# A destructor / finalizer (Swift `deinit`, C++ `~T()`, etc.).
destructor_declaration:
modifier*: modifier
body: block
# Declaration of a single accessor for a property (such as a getter, setter,
# or observer like Swift's `willSet`/`didSet`).
#
# Multiple accessors for the same property are emitted as a sequence of
# accessor_declaration nodes; every accessor after the first is tagged with
# a synthetic `chained_declaration` modifier so the grouping can be recovered
# downstream. Stored properties with observers are emitted as a
# variable_declaration followed by one accessor_declaration per observer
# (each observer also tagged with `chained_declaration`).
accessor_declaration:
modifier*: modifier
name: identifier
accessor_kind: accessor_kind
parameter*: parameter
type?: type_expr
body?: block
# "get", "set", or a language-specific kind like "didSet"
accessor_kind:
# Static or instance initializer block. That is, code that runs at initialization time of either the class or an instance.
initializer_declaration:
modifier*: modifier
body: block
associated_type_declaration:
modifier*: modifier
name: identifier
bound?: type_expr
named_type_expr:
qualifier?: type_expr
name: identifier
generic_type_expr:
base: type_expr
type_argument*: type_expr
# A tuple type such as `(Int, String)` or `(a: A, b: B)`.
tuple_type_expr:
element*: tuple_type_element
# An element of a `tuple_type_expr`, optionally carrying a label.
tuple_type_element:
name?: identifier
type: type_expr
# A function type such as `(Int, String) -> Bool` or `(x: Int) -> Bool`.
function_type_expr:
parameter*: parameter
return_type: type_expr
# A modifier such as 'static', 'public', or 'async'. For now this is just a leaf node with a string value.
modifier:

View File

@@ -1,9 +1,9 @@
use clap::Args;
use std::path::PathBuf;
use crate::languages;
use codeql_extractor::extractor::simple;
use codeql_extractor::trap;
use crate::languages;
#[derive(Args)]
pub struct Options {
@@ -35,7 +35,9 @@ pub fn run(options: Options) -> std::io::Result<()> {
prefix: "unified".to_string(),
languages,
trap_dir: options.output_dir,
trap_compression: trap::Compression::from_env("CODEQL_EXTRACTOR_UNIFIED_OPTION_TRAP_COMPRESSION"),
trap_compression: trap::Compression::from_env(
"CODEQL_EXTRACTOR_UNIFIED_OPTION_TRAP_COMPRESSION",
),
source_archive_dir: options.source_archive_dir,
file_lists: vec![options.file_list],
};

View File

@@ -22,14 +22,19 @@ pub fn run(options: Options) -> std::io::Result<()> {
// The QL-visible schema is the unified output AST, not the per-language
// input grammars. Pass it via `desugar.output_node_types_yaml` so the
// generator converts the YAML to JSON node-types.
let desugar = yeast::DesugaringConfig::new()
.with_output_node_types_yaml(languages::OUTPUT_AST_SCHEMA);
let desugar =
yeast::DesugaringConfig::new().with_output_node_types_yaml(languages::OUTPUT_AST_SCHEMA);
let languages = vec![Language {
name: "Unified".to_owned(),
node_types: "", // unused: generator picks up output_node_types_yaml above
node_types: "", // unused: generator picks up output_node_types_yaml above
desugar: Some(desugar),
}];
generate(languages, options.dbscheme, options.library, "run unified/scripts/create-extractor-pack.sh")
generate(
languages,
options.dbscheme,
options.library,
"run unified/scripts/create-extractor-pack.sh",
)
}

File diff suppressed because it is too large Load Diff

View File

@@ -50,6 +50,35 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "x"
right: int_literal "2"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
return_type:
named_type_expr
name: identifier "Int"
===
Closure with shorthand parameters
@@ -82,6 +111,26 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "$0"
right:
name_expr
identifier: identifier "$1"
===
Trailing closure
@@ -114,6 +163,28 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
value:
function_expr
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "$0"
right: int_literal "2"
callee:
member_access_expr
base:
name_expr
identifier: identifier "xs"
member: identifier "map"
===
Closure with capture list
@@ -163,6 +234,31 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
call_expr
callee:
member_access_expr
base:
name_expr
identifier: identifier "self"
member: identifier "doThing"
capture_declaration:
variable_declaration
modifier: modifier "weak"
pattern:
name_pattern
identifier: identifier "self"
===
Multi-statement closure
@@ -236,3 +332,46 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "f"
value:
function_expr
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
value:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "x"
right: int_literal "1"
return_expr
value:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "y"
right: int_literal "2"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
return_type:
named_type_expr
name: identifier "Int"

View File

@@ -28,6 +28,19 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "xs"
value:
array_literal
element:
int_literal "1"
int_literal "2"
int_literal "3"
===
Empty array literal with type
@@ -68,6 +81,22 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "xs"
type:
generic_type_expr
base:
named_type_expr
name: identifier "Array"
type_argument:
named_type_expr
name: identifier "Int"
value: array_literal "[]"
===
Dictionary literal
@@ -106,6 +135,14 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "d"
value: map_literal "[\"a\": 1, \"b\": 2]"
===
Set literal
@@ -155,6 +192,22 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "s"
type:
named_type_expr
name: identifier "Set<Int>"
value:
array_literal
element:
int_literal "1"
int_literal "2"
int_literal "3"
===
Tuple literal
@@ -191,6 +244,14 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "t"
value: tuple_expr "(1, \"two\", 3.0)"
===
Subscript access
@@ -232,9 +293,21 @@ source_file
top_level
body:
unsupported_node "// TODO: tree-sitter-swift parses `xs[0]` as a call_expression (same shape"
unsupported_node "// as `xs(0)`), so the mapping currently produces a call_expr. Update the"
unsupported_node "// parser / add a separate subscript_expr node and remap when fixed."
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "first"
value:
call_expr
argument:
argument
value: int_literal "0"
callee:
name_expr
identifier: identifier "xs"
===
Dictionary subscript
@@ -276,8 +349,21 @@ source_file
top_level
body:
unsupported_node "// TODO: same parser issue as the array subscript case above —"
unsupported_node "// `d[\"key\"]` is parsed as `call_expression(d, (\"key\"))`."
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "v"
value:
call_expr
argument:
argument
value: string_literal "\"key\""
callee:
name_expr
identifier: identifier "d"
===
Tuple member access
@@ -309,3 +395,16 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
member_access_expr
base:
name_expr
identifier: identifier "t"
member: identifier "0"

View File

@@ -35,6 +35,28 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
then:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
===
If-else
@@ -90,6 +112,43 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
block
stmt:
call_expr
argument:
argument
value:
unary_expr
operand:
name_expr
identifier: identifier "x"
operator: prefix_operator "-"
callee:
name_expr
identifier: identifier "print"
then:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
===
If-else-if chain
@@ -165,6 +224,55 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
if_expr
condition:
binary_expr
operator: infix_operator "<"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
block
stmt:
call_expr
argument:
argument
value: int_literal "3"
callee:
name_expr
identifier: identifier "print"
then:
block
stmt:
call_expr
argument:
argument
value: int_literal "2"
callee:
name_expr
identifier: identifier "print"
then:
block
stmt:
call_expr
argument:
argument
value: int_literal "1"
callee:
name_expr
identifier: identifier "print"
===
If-let optional binding
@@ -207,6 +315,39 @@ source_file
top_level
body:
block
stmt:
if_expr
condition:
pattern_guard_expr
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "value"
constructor:
member_access_expr
base:
named_type_expr
name: identifier "Optional"
member: identifier "some"
value:
name_expr
identifier: identifier "optional"
then:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "value"
callee:
name_expr
identifier: identifier "print"
===
Guard let
@@ -240,6 +381,30 @@ source_file
top_level
body:
block
stmt:
guard_if_stmt
condition:
pattern_guard_expr
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "value"
constructor:
member_access_expr
base:
named_type_expr
name: identifier "Optional"
member: identifier "some"
value:
name_expr
identifier: identifier "optional"
else:
block
stmt: return_expr "return"
===
Ternary expression
@@ -277,6 +442,27 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
value:
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
else:
unary_expr
operand: int_literal "1"
operator: prefix_operator "-"
then: int_literal "1"
===
Switch statement
@@ -357,6 +543,54 @@ source_file
top_level
body:
block
stmt:
switch_expr
case:
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"one\""
callee:
name_expr
identifier: identifier "print"
pattern:
expr_equality_pattern
expr: int_literal "1"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"two or three\""
callee:
name_expr
identifier: identifier "print"
pattern:
expr_equality_pattern
expr: int_literal "2"
expr_equality_pattern
expr: int_literal "3"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"other\""
callee:
name_expr
identifier: identifier "print"
value:
name_expr
identifier: identifier "x"
===
Switch with binding pattern
@@ -396,6 +630,7 @@ source_file
pattern:
pattern
bound_identifier: simple_identifier "r"
dot: .
name: simple_identifier "circle"
statement:
call_expression
@@ -428,6 +663,7 @@ source_file
pattern:
pattern
bound_identifier: simple_identifier "s"
dot: .
name: simple_identifier "square"
statement:
call_expression
@@ -445,3 +681,207 @@ source_file
top_level
body:
block
stmt:
switch_expr
case:
switch_case
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "r"
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "r"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "circle"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "s"
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
pattern:
name_pattern
identifier: identifier "s"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "square"
value:
name_expr
identifier: identifier "shape"
===
Switch with labeled case pattern arguments
===
switch x {
case .implicit(isAcknowledged: false):
print("yes")
case .thread(threadRowId: _, let rowId):
print(rowId)
}
---
source_file
statement:
switch_statement
entry:
switch_entry
pattern:
switch_pattern
pattern:
pattern
kind:
case_pattern
arguments:
tuple_pattern
item:
tuple_pattern_item
name: simple_identifier "isAcknowledged"
pattern:
pattern
kind:
boolean_literal
dot: .
name: simple_identifier "implicit"
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value:
line_string_literal
text: line_str_text "yes"
switch_entry
pattern:
switch_pattern
pattern:
pattern
kind:
case_pattern
arguments:
tuple_pattern
item:
tuple_pattern_item
name: simple_identifier "threadRowId"
pattern:
pattern
kind: wildcard_pattern "_"
tuple_pattern_item
pattern:
pattern
kind:
binding_pattern
binding:
value_binding_pattern
mutability: let
pattern:
pattern
bound_identifier: simple_identifier "rowId"
dot: .
name: simple_identifier "thread"
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: simple_identifier "rowId"
expr: simple_identifier "x"
---
top_level
body:
block
stmt:
switch_expr
case:
switch_case
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"yes\""
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
key: identifier "isAcknowledged"
pattern:
expr_equality_pattern
expr: boolean_literal "false"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "implicit"
switch_case
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "rowId"
callee:
name_expr
identifier: identifier "print"
pattern:
constructor_pattern
element:
pattern_element
key: identifier "threadRowId"
pattern: ignore_pattern "_"
pattern_element
pattern:
name_pattern
identifier: identifier "rowId"
constructor:
member_access_expr
base: inferred_type_expr "."
member: identifier "thread"
value:
name_expr
identifier: identifier "x"

View File

@@ -17,6 +17,12 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left: int_literal "1"
right: int_literal "2"
===
Another additive expression is desugared
@@ -37,3 +43,144 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "foo"
right:
name_expr
identifier: identifier "bar"
===
Simple import with single name
===
import Foundation
---
source_file
statement:
import_declaration
name:
identifier
part: simple_identifier "Foundation"
---
top_level
body:
block
stmt:
import_declaration
pattern: bulk_importing_pattern "import Foundation"
imported_expr:
name_expr
identifier: identifier "Foundation"
===
Import with dotted path (two parts)
===
import Foundation.Networking
---
source_file
statement:
import_declaration
name:
identifier
part:
simple_identifier "Foundation"
simple_identifier "Networking"
---
top_level
body:
block
stmt:
import_declaration
pattern: bulk_importing_pattern "import Foundation.Networking"
imported_expr:
member_access_expr
base:
name_expr
identifier: identifier "Foundation"
member: identifier "Networking"
===
Import with deeply nested path (three parts)
===
import Foundation.Networking.URLSession
---
source_file
statement:
import_declaration
name:
identifier
part:
simple_identifier "Foundation"
simple_identifier "Networking"
simple_identifier "URLSession"
---
top_level
body:
block
stmt:
import_declaration
pattern: bulk_importing_pattern "import Foundation.Networking.URLSession"
imported_expr:
member_access_expr
base:
member_access_expr
base:
name_expr
identifier: identifier "Foundation"
member: identifier "Networking"
member: identifier "URLSession"
===
Scoped import uses name_pattern
===
import struct Foundation.Date
---
source_file
statement:
import_declaration
name:
identifier
part:
simple_identifier "Foundation"
simple_identifier "Date"
scoped_import_kind: struct
---
top_level
body:
block
stmt:
import_declaration
modifier: modifier "struct"
pattern:
name_pattern
identifier: identifier "Date"
imported_expr:
member_access_expr
base:
name_expr
identifier: identifier "Foundation"
member: identifier "Date"

View File

@@ -31,6 +31,20 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
call_expr
argument:
argument
value: string_literal "\"hello\""
callee:
name_expr
identifier: identifier "print"
name: identifier "greet"
===
Function with parameters and return type
@@ -93,6 +107,37 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
name: identifier "add"
parameter:
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "a"
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "b"
return_type:
named_type_expr
name: identifier "Int"
===
Function with named parameters
@@ -138,6 +183,28 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "name"
callee:
name_expr
identifier: identifier "print"
name: identifier "greet"
parameter:
parameter
external_name: identifier "person"
pattern:
name_pattern
identifier: identifier "name"
===
Function with default parameter value
@@ -185,6 +252,28 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "name"
callee:
name_expr
identifier: identifier "print"
name: identifier "greet"
parameter:
parameter
default: string_literal "\"world\""
pattern:
name_pattern
identifier: identifier "name"
===
Variadic function
@@ -249,6 +338,38 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value:
call_expr
argument:
argument
value: int_literal "0"
argument
value:
name_expr
identifier: identifier "+"
callee:
member_access_expr
base:
name_expr
identifier: identifier "values"
member: identifier "reduce"
name: identifier "sum"
parameter:
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "values"
return_type:
named_type_expr
name: identifier "Int"
===
Function call
@@ -276,6 +397,17 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
value: int_literal "1"
argument
value: int_literal "2"
callee:
name_expr
identifier: identifier "foo"
===
Function call with labelled arguments
@@ -306,6 +438,16 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
name: identifier "person"
value: string_literal "\"Bob\""
callee:
name_expr
identifier: identifier "greet"
===
Method call
@@ -336,6 +478,18 @@ source_file
top_level
body:
block
stmt:
call_expr
argument:
argument
value: int_literal "1"
callee:
member_access_expr
base:
name_expr
identifier: identifier "list"
member: identifier "append"
===
Generic function
@@ -387,3 +541,117 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value:
name_expr
identifier: identifier "x"
name: identifier "identity"
parameter:
parameter
external_name: identifier "_"
pattern:
name_pattern
identifier: identifier "x"
return_type:
named_type_expr
name: identifier "T"
===
Leading-dot expression value
===
let x = .foo
---
source_file
statement:
property_declaration
binding:
value_binding_pattern
mutability: let
declarator:
property_binding
name:
pattern
bound_identifier: simple_identifier "x"
value:
prefix_expression
operation: .
target: simple_identifier "foo"
---
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
value:
member_access_expr
base: inferred_type_expr ".foo"
member: identifier "foo"
===
Leading-dot expression call
===
let y = .some(1)
---
source_file
statement:
property_declaration
binding:
value_binding_pattern
mutability: let
declarator:
property_binding
name:
pattern
bound_identifier: simple_identifier "y"
value:
call_expression
function:
prefix_expression
operation: .
target: simple_identifier "some"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: integer_literal "1"
---
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
value:
call_expr
argument:
argument
value: int_literal "1"
callee:
member_access_expr
base: inferred_type_expr ".some"
member: identifier "some"

View File

@@ -13,6 +13,8 @@ source_file
top_level
body:
block
stmt: int_literal "42"
===
Negative integer literal
@@ -32,6 +34,11 @@ source_file
top_level
body:
block
stmt:
unary_expr
operand: int_literal "7"
operator: prefix_operator "-"
===
Floating-point literal
@@ -48,6 +55,8 @@ source_file
top_level
body:
block
stmt: float_literal "3.14"
===
Boolean literals
@@ -67,6 +76,10 @@ source_file
top_level
body:
block
stmt:
boolean_literal "true"
boolean_literal "false"
===
Nil literal
@@ -83,6 +96,8 @@ source_file
top_level
body:
block
stmt: builtin_expr "nil"
===
String literal
@@ -101,6 +116,8 @@ source_file
top_level
body:
block
stmt: string_literal "\"hello\""
===
String with interpolation
@@ -122,3 +139,5 @@ source_file
top_level
body:
block
stmt: string_literal "\"hello \\(name)\""

View File

@@ -37,6 +37,30 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "x"
iterable:
array_literal
element:
int_literal "1"
int_literal "2"
int_literal "3"
===
For-in over range
@@ -76,6 +100,29 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "i"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "i"
iterable:
binary_expr
operator: infix_operator "..<"
left: int_literal "0"
right: int_literal "10"
===
For-in with where clause
@@ -119,6 +166,34 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "x"
guard:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
iterable:
name_expr
identifier: identifier "xs"
===
While loop
@@ -154,6 +229,25 @@ source_file
top_level
body:
block
stmt:
while_stmt
body:
block
stmt:
compound_assign_expr
operator: infix_operator "-="
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
===
Repeat-while loop
@@ -189,6 +283,25 @@ source_file
top_level
body:
block
stmt:
do_while_stmt
body:
block
stmt:
compound_assign_expr
operator: infix_operator "-="
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
===
Break and continue
@@ -252,3 +365,46 @@ source_file
top_level
body:
block
stmt:
for_each_stmt
body:
block
stmt:
if_expr
condition:
binary_expr
operator: infix_operator "<"
left:
name_expr
identifier: identifier "x"
right: int_literal "0"
then:
block
stmt: continue_expr "continue"
if_expr
condition:
binary_expr
operator: infix_operator ">"
left:
name_expr
identifier: identifier "x"
right: int_literal "100"
then:
block
stmt: break_expr "break"
call_expr
argument:
argument
value:
name_expr
identifier: identifier "x"
callee:
name_expr
identifier: identifier "print"
pattern:
name_pattern
identifier: identifier "x"
iterable:
name_expr
identifier: identifier "xs"

View File

@@ -17,6 +17,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Subtraction
@@ -37,6 +47,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "-"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Multiplication
@@ -57,6 +77,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Division
@@ -77,6 +107,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "/"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Operator precedence: addition and multiplication
@@ -101,6 +141,22 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "+"
left:
name_expr
identifier: identifier "a"
right:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "b"
right:
name_expr
identifier: identifier "c"
===
Parenthesised expression
@@ -129,6 +185,14 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "*"
left: tuple_expr "(a + b)"
right:
name_expr
identifier: identifier "c"
===
Comparison
@@ -149,6 +213,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "<"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Equality
@@ -169,6 +243,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "=="
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Logical and
@@ -189,6 +273,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "&&"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Logical or
@@ -209,6 +303,16 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "||"
left:
name_expr
identifier: identifier "a"
right:
name_expr
identifier: identifier "b"
===
Logical not
@@ -228,6 +332,13 @@ source_file
top_level
body:
block
stmt:
unary_expr
operand:
name_expr
identifier: identifier "a"
operator: prefix_operator "!"
===
Range operator
@@ -248,3 +359,9 @@ source_file
top_level
body:
block
stmt:
binary_expr
operator: infix_operator "..."
left: int_literal "1"
right: int_literal "10"

View File

@@ -34,6 +34,22 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
type:
generic_type_expr
base:
named_type_expr
name: identifier "Optional"
type_argument:
named_type_expr
name: identifier "Int"
value: builtin_expr "nil"
===
Optional chaining
@@ -74,6 +90,22 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
member_access_expr
base:
member_access_expr
base:
name_expr
identifier: identifier "obj"
member: identifier "foo"
member: identifier "bar"
===
Force unwrap
@@ -103,6 +135,19 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
unary_expr
operand:
name_expr
identifier: identifier "opt"
operator: postfix_operator "!"
===
Nil-coalescing
@@ -132,6 +177,20 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "n"
value:
binary_expr
operator: infix_operator "??"
left:
name_expr
identifier: identifier "opt"
right: int_literal "0"
===
Throwing function
@@ -167,6 +226,18 @@ source_file
top_level
body:
block
stmt:
function_declaration
body:
block
stmt:
return_expr
value: string_literal "\"\""
name: identifier "read"
return_type:
named_type_expr
name: identifier "String"
===
Do-catch
@@ -216,6 +287,33 @@ source_file
top_level
body:
block
stmt:
try_expr
body:
block
stmt:
unary_expr
operand:
call_expr
callee:
name_expr
identifier: identifier "foo"
operator: prefix_operator "try"
catch_clause:
catch_clause
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "error"
callee:
name_expr
identifier: identifier "print"
===
Try? expression
@@ -252,6 +350,21 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "result"
value:
unary_expr
operand:
call_expr
callee:
name_expr
identifier: identifier "foo"
operator: prefix_operator "try?"
===
Try! expression
@@ -288,3 +401,18 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "result"
value:
unary_expr
operand:
call_expr
callee:
name_expr
identifier: identifier "foo"
operator: prefix_operator "try!"

View File

@@ -18,6 +18,11 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
modifier: modifier "class"
name: identifier "Foo"
===
Class with stored properties
@@ -79,6 +84,28 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "y"
type:
named_type_expr
name: identifier "Int"
modifier: modifier "class"
name: identifier "Point"
===
Class with initializer
@@ -152,6 +179,34 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
constructor_declaration
body:
block
stmt:
assign_expr
target:
member_access_expr
base:
name_expr
identifier: identifier "self"
member: identifier "x"
value:
name_expr
identifier: identifier "x"
modifier: modifier "class"
name: identifier "Point"
===
Class with method
@@ -200,6 +255,29 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "n"
value: int_literal "0"
function_declaration
body:
block
stmt:
compound_assign_expr
operator: infix_operator "+="
target:
name_expr
identifier: identifier "n"
value: int_literal "1"
name: identifier "bump"
modifier: modifier "class"
name: identifier "Counter"
===
Class inheritance
@@ -228,6 +306,11 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
modifier: modifier "class"
name: identifier "Dog"
===
Struct
@@ -289,6 +372,28 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "y"
type:
named_type_expr
name: identifier "Int"
modifier: modifier "struct"
name: identifier "Point"
===
Enum with cases
@@ -332,6 +437,32 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "north"
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "south"
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "east"
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "west"
modifier: modifier "enum"
name: identifier "Direction"
===
Enum with associated values
@@ -389,6 +520,40 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
class_like_declaration
member:
constructor_declaration
body: block "circle(radius: Double)"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "radius"
type:
named_type_expr
name: identifier "Double"
modifier: modifier "enum_case"
name: identifier "circle"
class_like_declaration
member:
constructor_declaration
body: block "square(side: Double)"
parameter:
parameter
pattern:
name_pattern
identifier: identifier "side"
type:
named_type_expr
name: identifier "Double"
modifier: modifier "enum_case"
name: identifier "square"
modifier: modifier "enum"
name: identifier "Shape"
===
Protocol declaration
@@ -414,6 +579,15 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
function_declaration
body: block "func draw()"
name: identifier "draw"
modifier: modifier "protocol"
name: identifier "Drawable"
===
Extension
@@ -463,6 +637,30 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
function_declaration
body:
block
stmt:
return_expr
value:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "self"
right:
name_expr
identifier: identifier "self"
name: identifier "squared"
return_type:
named_type_expr
name: identifier "Int"
modifier: modifier "extension"
name: identifier "Int"
===
Computed property
@@ -555,6 +753,48 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "w"
type:
named_type_expr
name: identifier "Double"
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "h"
type:
named_type_expr
name: identifier "Double"
accessor_declaration
body:
block
stmt:
return_expr
value:
binary_expr
operator: infix_operator "*"
left:
name_expr
identifier: identifier "w"
right:
name_expr
identifier: identifier "h"
modifier: modifier "var"
name: identifier "area"
type:
named_type_expr
name: identifier "Double"
accessor_kind: accessor_kind "get"
modifier: modifier "class"
name: identifier "Rect"
===
Property with getter and setter
@@ -639,3 +879,204 @@ source_file
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "_v"
value: int_literal "0"
accessor_declaration
body:
block
stmt:
return_expr
value:
name_expr
identifier: identifier "_v"
modifier: modifier "var"
name: identifier "v"
type:
named_type_expr
name: identifier "Int"
accessor_kind: accessor_kind "get"
accessor_declaration
body:
block
stmt:
assign_expr
target:
name_expr
identifier: identifier "_v"
value:
name_expr
identifier: identifier "newValue"
modifier:
modifier "var"
modifier "chained_declaration"
name: identifier "v"
type:
named_type_expr
name: identifier "Int"
accessor_kind: accessor_kind "set"
modifier: modifier "class"
name: identifier "Box"
===
Protocol with read-only and read-write property requirements
===
protocol P {
var foo: Int { get }
var bar: String { get set }
}
---
source_file
statement:
protocol_declaration
body:
protocol_body
member:
protocol_property_declaration
name:
pattern
binding:
value_binding_pattern
mutability: var
bound_identifier: simple_identifier "foo"
requirements:
protocol_property_requirements
accessor:
getter_specifier
type:
type_annotation
type:
type
name:
user_type
part:
simple_user_type
name: type_identifier "Int"
protocol_property_declaration
name:
pattern
binding:
value_binding_pattern
mutability: var
bound_identifier: simple_identifier "bar"
requirements:
protocol_property_requirements
accessor:
getter_specifier
setter_specifier
type:
type_annotation
type:
type
name:
user_type
part:
simple_user_type
name: type_identifier "String"
name: type_identifier "P"
---
top_level
body:
block
stmt:
class_like_declaration
member:
accessor_declaration
name: identifier "foo"
type:
named_type_expr
name: identifier "Int"
accessor_kind: accessor_kind "get"
accessor_declaration
name: identifier "bar"
type:
named_type_expr
name: identifier "String"
accessor_kind: accessor_kind "get"
accessor_declaration
modifier: modifier "chained_declaration"
name: identifier "bar"
type:
named_type_expr
name: identifier "String"
accessor_kind: accessor_kind "set"
modifier: modifier "protocol"
name: identifier "P"
===
Enum with comma-separated cases (chained_declaration)
===
enum Suit {
case clubs, diamonds, hearts, spades
}
---
source_file
statement:
class_declaration
body:
enum_class_body
member:
enum_entry
case:
enum_case_entry
name: simple_identifier "clubs"
enum_case_entry
name: simple_identifier "diamonds"
enum_case_entry
name: simple_identifier "hearts"
enum_case_entry
name: simple_identifier "spades"
declaration_kind: enum
name: type_identifier "Suit"
---
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "enum_case"
pattern:
name_pattern
identifier: identifier "clubs"
variable_declaration
modifier:
modifier "chained_declaration"
modifier "enum_case"
pattern:
name_pattern
identifier: identifier "diamonds"
variable_declaration
modifier:
modifier "chained_declaration"
modifier "enum_case"
pattern:
name_pattern
identifier: identifier "hearts"
variable_declaration
modifier:
modifier "chained_declaration"
modifier "enum_case"
pattern:
name_pattern
identifier: identifier "spades"
modifier: modifier "enum"
name: identifier "Suit"

View File

@@ -23,6 +23,14 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
value: int_literal "1"
===
Var binding
@@ -49,6 +57,14 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
value: int_literal "1"
===
Let with type annotation
@@ -84,6 +100,17 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
value: int_literal "1"
===
Var without initialiser
@@ -118,6 +145,16 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
===
Tuple destructuring binding
@@ -154,6 +191,28 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
tuple_pattern
element:
pattern_element
pattern:
expr_equality_pattern
expr:
name_expr
identifier: identifier "a"
pattern_element
pattern:
expr_equality_pattern
expr:
name_expr
identifier: identifier "b"
value:
name_expr
identifier: identifier "pair"
===
Multiple bindings on one line
@@ -185,6 +244,22 @@ source_file
top_level
body:
block
stmt:
variable_declaration
modifier: modifier "let"
pattern:
name_pattern
identifier: identifier "x"
value: int_literal "1"
variable_declaration
modifier:
modifier "let"
modifier "chained_declaration"
pattern:
name_pattern
identifier: identifier "y"
value: int_literal "2"
===
Assignment
@@ -207,6 +282,13 @@ source_file
top_level
body:
block
stmt:
assign_expr
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
===
Compound assignment
@@ -229,3 +311,138 @@ source_file
top_level
body:
block
stmt:
compound_assign_expr
operator: infix_operator "+="
target:
name_expr
identifier: identifier "x"
value: int_literal "1"
===
Property with willSet and didSet observers
===
class C {
var x: Int = 0 {
willSet { print(newValue) }
didSet { print(oldValue) }
}
}
---
source_file
statement:
class_declaration
body:
class_body
member:
property_declaration
binding:
value_binding_pattern
mutability: var
declarator:
property_binding
name:
pattern
bound_identifier: simple_identifier "x"
observers:
willset_didset_block
didset:
didset_clause
body:
block
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: simple_identifier "oldValue"
willset:
willset_clause
body:
block
statement:
call_expression
function: simple_identifier "print"
suffix:
call_suffix
arguments:
value_arguments
argument:
value_argument
value: simple_identifier "newValue"
type:
type_annotation
type:
type
name:
user_type
part:
simple_user_type
name: type_identifier "Int"
value: integer_literal "0"
declaration_kind: class
name: type_identifier "C"
---
top_level
body:
block
stmt:
class_like_declaration
member:
variable_declaration
modifier: modifier "var"
pattern:
name_pattern
identifier: identifier "x"
type:
named_type_expr
name: identifier "Int"
value: int_literal "0"
accessor_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "newValue"
callee:
name_expr
identifier: identifier "print"
modifier:
modifier "var"
modifier "chained_declaration"
name: identifier "x"
accessor_kind: accessor_kind "willSet"
accessor_declaration
body:
block
stmt:
call_expr
argument:
argument
value:
name_expr
identifier: identifier "oldValue"
callee:
name_expr
identifier: identifier "print"
modifier:
modifier "var"
modifier "chained_declaration"
name: identifier "x"
accessor_kind: accessor_kind "didSet"
modifier: modifier "class"
name: identifier "C"

View File

@@ -2,7 +2,7 @@ use std::fs;
use std::path::Path;
use codeql_extractor::extractor::simple;
use yeast::{dump::dump_ast, dump::dump_ast_with_type_errors, Runner};
use yeast::{Runner, dump::dump_ast, dump::dump_ast_with_type_errors};
#[path = "../src/languages/mod.rs"]
mod languages;
@@ -146,29 +146,36 @@ fn render_corpus(cases: &[CorpusCase]) -> String {
out
}
fn run_desugaring(
lang: &simple::LanguageSpec,
input: &str,
) -> Result<yeast::Ast, String> {
let runner = match lang.desugar.as_ref() {
Some(config) => Runner::from_config(lang.ts_language.clone(), config)
.map_err(|e| format!("Failed to create yeast runner: {e}"))?,
None => Runner::new(lang.ts_language.clone(), &[]),
};
runner
.run(input)
.map_err(|e| format!("Failed to parse input: {e}"))
fn run_desugaring(lang: &simple::LanguageSpec, input: &str) -> Result<yeast::Ast, String> {
match lang.desugar.as_deref() {
Some(desugarer) => {
// Parse the input ourselves so we don't depend on the desugarer
// knowing about the language.
let mut parser = tree_sitter::Parser::new();
parser
.set_language(&lang.ts_language)
.map_err(|e| format!("Failed to set language: {e}"))?;
let tree = parser
.parse(input, None)
.ok_or_else(|| "Failed to parse input".to_string())?;
desugarer
.run_from_tree(&tree, input.as_bytes())
.map_err(|e| format!("Desugaring failed: {e}"))
}
None => {
let runner: Runner = Runner::new(lang.ts_language.clone(), &[]);
runner
.run(input)
.map_err(|e| format!("Failed to parse input: {e}"))
}
}
}
/// Produce the raw tree-sitter parse tree dump for `input`, with no
/// desugaring rules applied. Uses a `Runner` with an empty phase list and
/// the input grammar's own schema.
fn dump_raw_parse(
lang: &simple::LanguageSpec,
input: &str,
) -> Result<String, String> {
let runner = Runner::new(lang.ts_language.clone(), &[]);
fn dump_raw_parse(lang: &simple::LanguageSpec, input: &str) -> Result<String, String> {
let runner: Runner = Runner::new(lang.ts_language.clone(), &[]);
let ast = runner
.run(input)
.map_err(|e| format!("Failed to parse input: {e}"))?;
@@ -272,11 +279,7 @@ fn test_corpus() {
}
}
assert!(
failures.is_empty(),
"{}",
failures.join("\n\n") + "\n\n"
);
assert!(failures.is_empty(), "{}", failures.join("\n\n") + "\n\n");
if update_mode {
let updated = render_corpus(&cases);
@@ -285,7 +288,9 @@ fn test_corpus() {
write_result.is_ok(),
"Failed to update corpus file {}: {}",
corpus_path.display(),
write_result.err().map_or_else(String::new, |e| e.to_string())
write_result
.err()
.map_or_else(String::new, |e| e.to_string())
);
}
}

View File

@@ -1368,7 +1368,7 @@ module.exports = grammar({
seq(
field("modifiers", optional($.modifiers)),
"import",
optional($._import_kind),
optional(field("scoped_import_kind", $._import_kind)),
field("name", $.identifier)
),
_import_kind: ($) =>
@@ -1930,7 +1930,7 @@ module.exports = grammar({
seq(
optional("case"),
optional(field("type", $.user_type)), // XXX this should just be _type but that creates ambiguity
$._dot,
field("dot", $._dot),
field("name", $.simple_identifier),
optional(field("arguments", $.tuple_pattern))
),

View File

@@ -173,6 +173,7 @@ named:
value?: expression
case_pattern:
arguments?: tuple_pattern
dot: "."
name: simple_identifier
type?: user_type
catch_block:
@@ -351,6 +352,7 @@ named:
import_declaration:
modifiers?: modifiers
name: identifier
scoped_import_kind?: ["class", "enum", "func", "let", "protocol", "struct", "typealias", "var"]
infix_expression:
lhs: expression
op: custom_operator

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,18 @@
/** Provides classes for working with comments. */
private import unified
/**
* A comment appearing in the source code.
*/
class Comment extends TriviaToken {
// At the moment, comments are the only type trivia token we extract
/**
* Gets the text inside this comment, not counting the delimeters.
*/
string getCommentText() {
result = this.getValue().regexpCapture("//(.*)", 1)
or
result = this.getValue().regexpCapture("(?s)/\\*(.*)\\*/", 1)
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,8 @@
/**
* Provides classes for working with the AST, as well as files and locations.
*/
import codeql.Locations
import codeql.files.FileSystem
import codeql.unified.Ast::Unified
import codeql.unified.Comments

View File

@@ -1,9 +1,37 @@
nameExpr
| name_expr.swift:1:9:1:9 | NameExpr | y |
| test.swift:1:8:1:17 | NameExpr | Foundation |
| test.swift:8:9:8:13 | NameExpr | items |
| test.swift:8:22:8:25 | NameExpr | item |
| test.swift:12:16:12:20 | NameExpr | items |
| test.swift:12:31:12:34 | NameExpr | item |
| test.swift:25:18:25:22 | NameExpr | Array |
| test.swift:25:24:25:28 | NameExpr | first |
| test.swift:26:17:26:22 | NameExpr | second |
| test.swift:27:13:27:18 | NameExpr | result |
| test.swift:27:29:27:32 | NameExpr | item |
| test.swift:28:13:28:18 | NameExpr | result |
| test.swift:28:27:28:30 | NameExpr | item |
| test.swift:31:12:31:17 | NameExpr | result |
| test.swift:40:16:40:19 | NameExpr | data |
| test.swift:44:9:44:12 | NameExpr | data |
| test.swift:48:15:48:19 | NameExpr | index |
| test.swift:48:29:48:33 | NameExpr | index |
| test.swift:48:37:48:40 | NameExpr | data |
| test.swift:49:16:49:19 | NameExpr | data |
| test.swift:49:21:49:25 | NameExpr | index |
| test.swift:53:9:53:12 | NameExpr | data |
| test.swift:53:21:53:24 | NameExpr | item |
| test.swift:63:16:63:19 | NameExpr | self |
| test.swift:65:29:65:37 | NameExpr | transform |
| test.swift:65:39:65:43 | NameExpr | value |
| test.swift:67:29:67:33 | NameExpr | error |
| test.swift:76:16:76:19 | NameExpr | self |
| test.swift:76:21:76:21 | NameExpr | i |
| test.swift:76:26:76:29 | NameExpr | self |
| test.swift:76:31:76:31 | NameExpr | i |
| test.swift:86:12:86:17 | NameExpr | values |
| test.swift:87:12:87:17 | NameExpr | values |
| test.swift:87:38:87:43 | NameExpr | values |
| test.swift:87:49:87:57 | NameExpr | transform |
unsupported
| test.swift:3:1:3:38 | | |
| test.swift:16:1:16:32 | | |
| test.swift:23:1:23:37 | | |
| test.swift:34:1:34:49 | | |
| test.swift:57:1:57:30 | | |
| test.swift:72:1:72:37 | | |
| test.swift:84:1:84:24 | | |

View File

@@ -0,0 +1,3 @@
| comments.swift:1:1:1:22 | // Hello this is swift | Hello this is swift |
| comments.swift:3:1:6:3 | /*\n * This is a multi-line comment\n * It should be ignored by the parser\n */ | \n * This is a multi-line comment\n * It should be ignored by the parser\n |
| comments.swift:9:5:9:36 | // This is a single-line comment | This is a single-line comment |

View File

@@ -0,0 +1,3 @@
import unified
query predicate comments(Comment c, string text) { text = c.getCommentText() }

View File

@@ -0,0 +1,11 @@
// Hello this is swift
/*
* This is a multi-line comment
* It should be ignored by the parser
*/
func hello() {
// This is a single-line comment
print("Hello, world!")
}