codeql

mirror of https://github.com/github/codeql.git synced 2026-05-14 11:19:27 +02:00

Author	SHA1	Message	Date
Taus	15936a5f8d	yeast: Take fields by ownership in apply_rules_inner Previously, apply_rules_inner snapshotted a node's fields by cloning the BTreeMap into a Vec<(FieldId, Vec<Id>)>, then built a fresh BTreeMap of new_fields for the rewritten Ids. For a node with N fields, this allocated 2N+1 things per visit (the snapshot Vec, N cloned children Vecs, the new BTreeMap entries) — even when nothing in the subtree was rewritten. Use std::mem::take to swap the parent's fields out by ownership: the recursion can mutate the AST (including pushing new nodes from rule firings) without any conflict, since we hold the owned BTreeMap locally. Iterate values_mut() and only allocate a fresh children Vec on the first divergence (lazy alloc): unchanged children stay in the existing slot. When done, swap the fields back. For a subtree with no rewrites, this is now zero allocations per node (modulo the recursion itself). For nodes with rewrites, it's one Vec allocation per field that contains a rewritten child, instead of two plus the BTreeMap rebuild.	2026-05-08 12:48:10 +00:00
Taus	7bd27b83e0	yeast: Mutate parent fields in place; remove redundant Node::id apply_rules_inner used to handle the "child was rewritten, so the parent needs new field IDs" case by cloning the parent node, swapping in the new fields, pushing the clone onto the arena, and returning the new Id. Every ancestor on the path from the rewrite up to the root was duplicated this way, with the originals retained as garbage in the arena. Switch to in-place mutation: assign `ast.nodes[id].fields = new_fields` and return the same Id. Rule firings still produce genuinely new nodes via BuildCtx (their structure differs from the input), but the ancestor-rebuild spine no longer copies anything. This is safe because apply_rules_inner already works entirely by Id: the field snapshot is cloned out before recursing, no &Node references are held across mutations of the arena, and captures are scoped to a single rule firing so the now-stable Ids do not break anything. Memory effect: a desugaring pass that rewrites R leaves of a tree of average depth d previously appended R*d ancestor clones to the arena. Now appends 0. With Ids stable for the lifetime of an Ast, the Node::id field becomes truly redundant and is removed (along with the Node::id() accessor). AstCursor switches from caching `node: &Node` to tracking `node_id: Id` and looking the node up via the arena on each access; ChildrenIter now yields Ids directly. A new AstCursor::node_id() method gives callers access to the cursor position by Id.	2026-05-08 12:47:22 +00:00
Owen Mansel-Chan	36554d160c	Merge pull request #21741 from MarkLee131/fix/path-injection-read-subkind Fix/path injection read subkind	2026-05-08 12:38:16 +01:00
Taus	5a4dee50f7	Merge pull request #21810 from github/tausbn/yeast-forward-scan-queries yeast: Align query semantics more closely with tree-sitter	2026-05-08 13:30:43 +02:00
Anders Schack-Mulligen	81e1ab7aab	Merge pull request #21808 from aschackmull/cfg/switch-pattern-eval Cfg: Rework CFG for switch case patterns.	2026-05-08 12:48:44 +02:00
Anders Schack-Mulligen	048411e168	Apply suggestions from code review Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>	2026-05-08 08:11:32 +02:00
Taus	b027ac3658	Merge pull request #21809 from github/tausbn/yeast-add-support-for-desugaring-phases Yeast: Two small improvements	2026-05-07 19:00:44 +02:00
MarkLee131	26af52897d	Merge branch 'main' into fix/path-injection-read-subkind	2026-05-07 23:48:42 +08:00
Taus	af6e921da5	yeast: Forward-scan bare child patterns instead of strict positional Previously, a bare child pattern in a query took whatever the next child of the iterator was and either matched or failed: it would not scan ahead to find a match. So `(foo ("baz"))` against a `foo` whose implicit `child` field was `["bar", "baz"]` would fail (the pattern took "bar" first). Switch to forward-scan semantics: a SingleNode matcher advances through the iterator until it finds a child that matches its sub-query. Patterns that are named-only continue to skip past unnamed children for free. Order is preserved across multiple bare patterns at the same level — each pattern advances the shared iterator past whatever it consumed — so a query cannot match children out of source order. Captures from a failed match attempt are rolled back via a snapshot, so partial captures from a complex sub-query do not leak across attempts. Add two regression tests against the `do` body wrapper in a Ruby for-loop, whose implicit `child` field contains [do, identifier, end]: - a query for ("end") matches by skipping past `do` and the identifier - a query for ("end") then ("do") fails, demonstrating order preservation	2026-05-07 15:08:22 +00:00
Taus	6f643a3604	yeast: Use canonical ID when registering unnamed kinds in Schema Schema::from_language registered unnamed kinds via or_insert(id), where `id` came from iterating 0..node_kind_count. For names with multiple unnamed IDs (notably "end" in tree-sitter-ruby has IDs 0 and 13, where ID 0 is the reserved error token), this picked the first encountered ID — typically the wrong one. The visitor sets node.kind via language.id_for_node_kind(name, false), which returns the canonical ID. So a query for ("end") would compare node.kind=13 against schema=0 and silently fail to match, with no diagnostic. Use language.id_for_node_kind(name, false) to obtain the canonical ID when registering, mirroring the named-kind path that already does the same with id_for_node_kind(name, true).	2026-05-07 15:08:21 +00:00
Taus	a4df96aad6	yeast: Support capturing unnamed nodes in queries Three improvements to the query parser, all aimed at allowing query patterns to refer to unnamed tokens: 1. Bare-literal capture: `"=" @op` now captures the unnamed `=` token, matching the parenthesized form `("=") @op`. Previously the literal branch in parse_query_list skipped the maybe_wrap_capture call, so the `@op` was a leftover token and would error. 2. Bare `_` matches any node, named or unnamed. Previously bare `_` and `(_)` both produced QueryNode::Any with the same matches_named_only behaviour, so bare `_` would skip unnamed children. Now Any carries a match_unnamed flag: false for `(_)` (named-only, tree-sitter default) and true for bare `_` (any node). 3. Named fields and bare child patterns may be intermixed in any order. Previously, once parse_query_fields saw a bare pattern it would stop accepting named fields. The fix accumulates bare patterns into the implicit `child` field and keeps parsing. Each named field independently selects its target field for matching, so the source-order of fields in the query is purely cosmetic and intermixing is safe. Add tests covering parenthesized capture, bare-literal capture, and the named-vs-any distinction between `(_)` and bare `_`. Update query-syntax docs to reflect all three.	2026-05-07 15:08:21 +00:00
Paolo Tranquilli	f9e42ac443	Merge pull request #21794 from github/post-release-prep/codeql-cli-2.25.4 Post-release preparation for codeql-cli-2.25.4	2026-05-07 14:43:24 +02:00
copilot-swe-agent[bot]	e0d663f79b	yeast: address review wording in phase docs Agent-Logs-Url: https://github.com/github/codeql/sessions/6d23db05-a6e9-4de4-8951-b465980fd0ef Co-authored-by: tausbn <1104778+tausbn@users.noreply.github.com>	2026-05-07 12:35:46 +00:00
Anders Schack-Mulligen	48785a0a76	Cfg: Rework CFG for switch case patterns.	2026-05-07 13:07:07 +02:00
MarkLee131	e8553c7449	Merge branch 'main' into fix/path-injection-read-subkind	2026-05-07 18:11:45 +08:00
Taus	957c89b478	yeast: Support multi-phase desugaring via DesugaringConfig::add_phase Extend the desugaring config from a single flat list of rules to an ordered sequence of named Phases. Each phase runs to completion (a full traversal applying its rules) before the next phase starts. Rules in different phases never compete for matches. The config is built via the new chainable API: `DesugaringConfig::new().add_phase("cleanup", cleanup_rules).add_phase("desugar", desugar_rules).with_output_node_types_yaml(yaml);` Single-phase configs are just .add_phase(...) called once. A single FreshScope is shared across phases so generated identifier names (e.g. $tmp-N) are unique throughout the run. Phase names appear in error messages, e.g. "Phase `desugar`: exceeded maximum rewrite depth". Add two regression tests: one verifying basic two-phase chained desugaring, and one verifying that errors include the failing phase name.	2026-05-06 21:17:31 +00:00
Taus	9a94836974	yeast: Add per-rule .repeated() flag to opt into iterative matching Previously, after a rule fired the engine would always re-try that same rule on the result root. A rule whose output matched its own query (intentionally or by accident) would loop until the global MAX_REWRITE_DEPTH safety net kicked in. Make the default behavior fire-once-per-node: after a rule fires on node N, the engine no longer tries that same rule on the result root. Other rules and child traversal are unaffected. Rules that intentionally rewrite iteratively can opt into the old behavior via the new Rule::repeated() builder method. Add two regression tests using a self-swapping assignment rule: - with .repeated(), the swap loops and trips the depth limit - without it (default), the swap fires once and terminates	2026-05-06 12:33:18 +00:00
Taus	a0a0e9e9a7	yeast: Add test for chained rules with output-only kinds Adds a regression test verifying that desugaring rules can chain across output-only node kinds: a first rule rewrites an input kind to an output-only kind, and a second rule then rewrites that output-only kind into another output-only kind. This exercises the schema lookup for query patterns whose root kind is not present in the input tree-sitter grammar.	2026-05-06 11:45:53 +00:00
Taus	60dcf88b50	yeast: Add Bazel build rules for yeast crates Add BUILD.bazel files for the yeast and yeast-macros crates, register them as dependencies of the shared tree-sitter extractor, and refresh the vendored crate dependencies via update_tree_sitter_extractors_deps.sh.	2026-05-06 11:34:09 +00:00
Taus	82bbdee832	yeast: Support separate output node types in extractor generator Language and LanguageSpec gain optional output_node_types field. When set, the generator produces dbscheme/QL from the output types and the extractor validates TRAP against them. All existing extractors pass None (no behavior change). Ruby extract() calls gain vec![] for the new rules parameter. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	9ad431dea1	yeast: Integrate yeast with shared tree-sitter extractor extract() gains a rules parameter. When empty, uses tree-sitter native traversal (no behavior change). When non-empty, runs yeast desugaring and extracts via traverse_yeast. Adds AstNode trait abstracting over tree_sitter::Node and yeast::Node, with minimal changes to existing Visitor methods (Node -> &N in 6 signatures). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	cc28ff9a48	yeast: Add yeast documentation Covers architecture, query language, template language (tree!/trees!/rule!), capture semantics, fresh identifiers, and extractor integration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	6e580446fd	yeast: Add yeast test suite 12 tests covering parsing, queries, tree building, desugaring rules, cursor navigation, and the shorthand rule! syntax. Tests use a custom output node-types.yml with named fields for all children (parameter, stmt, index), loaded via schema_from_yaml_with_language. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	4c5548363c	yeast: Add AST dumper for human-readable tree output Produces indented text showing node kinds, named fields, and leaf content. Unnamed tokens are hidden unless inside a named field. Used by tests for readable assertions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	8a9e53cc58	yeast: Add YAML node-types format and converter Human-friendly YAML alternative to tree-sitter node-types.json with three sections: supertypes, named, unnamed. Supports bidirectional conversion and building Schema objects from YAML. Includes CLI binary (node_types_yaml) and documentation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Taus	04f587190e	yeast: AST desugaring framework with proc-macro DSL YEAST (YEAST Elaborates Abstract Syntax Trees) is a framework for transforming tree-sitter parse trees before CodeQL extraction. Core components: - shared/yeast/ — Ast, Node, Schema, query matching engine, captures, FreshScope, BuildCtx - shared/yeast-macros/ — proc macros: query!, tree!, trees!, rule! The query language is inspired by tree-sitter queries: (assignment left: (_) @lhs right: (_) @rhs) Templates support embedded Rust ({expr}), splicing ({..expr}), computed literals (#{expr}), and fresh identifiers ($name). The rule! macro combines query and transform: rule!((for pattern: (_) @pat ...) => (call receiver: {val} ...)) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-06 11:34:09 +00:00
Tom Hvitved	00fb11b028	Merge pull request #21778 from hvitved/rust/type-inference-verbose-type-path-expectations Rust: Use verbose type paths in inline expectation comments	2026-05-05 20:23:25 +02:00
github-actions[bot]	7610277199	Post-release preparation for codeql-cli-2.25.4	2026-05-05 10:10:06 +00:00
github-actions[bot]	88e1d86c27	Release preparation for version 2.25.4	2026-05-05 09:34:30 +00:00
Tom Hvitved	4c1461ad5b	Merge pull request #21786 from hvitved/inline-test-ignore-tags Inline test expectations: Rename `tagIsOptional` to `tagIsIgnored`	2026-05-05 09:01:58 +02:00
Tom Hvitved	80ccdcc696	Inline test expectations: Rename `tagIsOptional` to `tagIsIgnored`	2026-05-04 11:21:33 +02:00
Anders Schack-Mulligen	e0421dbf53	C#: Reinstate toString for SSA data flow nodes.	2026-04-30 13:56:16 +02:00
Tom Hvitved	e1cd708c75	Rust: Use verbose type paths in inline expectation comments	2026-04-30 13:54:09 +02:00
MarkLee131	936f0c650c	Address review comments on path-injection[read] sub-kind - shared/mad/codeql/mad/ModelValidation.qll: shorten the comment for `path-injection[%]` to `// Java-only currently`, matching the style of other language-scoped entries and dropping API examples and the java/zipslip reference. - java/ql/lib/semmle/code/java/security/ZipSlipQuery.qll: replace the `File.exists` example in the QLDoc with `FileReader`, since `File.exists` is still labelled plain `path-injection`, not `path-injection[read]`.	2026-04-30 19:06:04 +08:00
MarkLee131	90741b15e2	Merge branch 'main' into fix/path-injection-read-subkind	2026-04-30 18:37:12 +08:00
Tom Hvitved	e14b654e8a	Update shared/controlflow/codeql/controlflow/ControlFlowGraph.qll Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>	2026-04-29 14:57:35 +02:00
Tom Hvitved	cbe207ab65	C#: Include parameters and their defaults in the CFG	2026-04-29 14:01:09 +02:00
Tom Hvitved	cbc12324bb	Merge pull request #21703 from hvitved/rust/type-inference-sibling Rust: Refine `implSiblings`	2026-04-24 12:36:51 +02:00
Tom Hvitved	c64223ae56	Merge pull request #21748 from hvitved/shared/remove-deprecated Shared: Remove deprecated code	2026-04-23 14:44:17 +02:00
Tom Hvitved	90ae086822	Shared: Remove deprecated code	2026-04-23 11:24:14 +02:00
Tom Hvitved	1a84b2b555	CFG: Use dense ranking	2026-04-23 11:22:38 +02:00
Tom Hvitved	71fa2166ee	Apply suggestions from code review Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>	2026-04-22 17:06:31 +02:00
Tom Hvitved	39cd86a48e	C#: Move handling of callables into shared control flow library	2026-04-22 14:11:57 +02:00
Tom Hvitved	e60275c4de	Rust: Refine `implSiblings` Consider two implementations of the same trait to be siblings when the type being implemented by one is an instantiation of the type being implemented by the other.	2026-04-22 13:32:56 +02:00
Anders Schack-Mulligen	f912731cd4	Merge pull request #21565 from aschackmull/csharp/cfg2 C#: Replace CFG with the shared implementation	2026-04-21 15:50:38 +02:00
Kaixuan Li	07e97e20d8	Merge branch 'github:main' into fix/path-injection-read-subkind	2026-04-21 22:59:53 +10:00
Anders Schack-Mulligen	67c0515d3c	Cfg: Undo consistency check change.	2026-04-21 13:10:03 +02:00
Anders Schack-Mulligen	9de02b7ae6	Cfg: Use consistent casing in additional node tags.	2026-04-21 10:56:10 +02:00
MarkLee131	c336a1595d	Java: split read-only path sinks into path-injection[read] Introduce a new Models-as-Data sink sub-kind path-injection[read] for models that only read from or inspect a path. The general java/path-injection query and its PathInjectionSanitizer barrier continue to consider both path-injection and path-injection[read] sinks, so no alerts are lost. The java/zipslip query deliberately selects only path-injection sinks, since read-only accesses such as ClassLoader.getResource or FileInputStream are outside the archive extraction threat model. Addresses https://github.com/github/codeql/issues/21606 along the lines proposed on the issue thread: prefer path-injection[read] over a [create] sub-kind so that miscategorizing a sink causes a false positive (easy to spot) rather than a false negative. - shared/mad/codeql/mad/ModelValidation.qll: allow path-injection[...] as a valid sink kind. - java/ql/lib/ext/*.model.yml: relabel the models that PR #12916 migrated from the historical read-file kind (plus the newer ClassLoader resource-lookup variants that share the same read-only semantics). - java/ql/lib/semmle/code/java/security/TaintedPathQuery.qll and PathSanitizer.qll: select both path-injection and path-injection[read] sinks/barriers. - java/ql/lib/semmle/code/java/security/ZipSlipQuery.qll: keep only path-injection, with a comment explaining why path-injection[read] is excluded. - java/ql/test/query-tests/security/CWE-022/semmle/tests/ZipTest.java: add m7 regression covering the Dubbo-style classpath lookup from issue #21606 and assert no alert is produced. - Update TaintedPath.expected for the renamed kinds in the models list. - Add change-notes under java/ql/lib/change-notes and java/ql/src/change-notes.	2026-04-21 09:17:36 +10:00
Anders Schack-Mulligen	e928c224ae	C#/Cfg: Some simple review fixes.	2026-04-20 14:43:27 +02:00
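
The ownership-taking traversal described in commit 15936a5f8d can be sketched roughly as follows. This is a minimal illustration and not the yeast implementation: the `Ast`, `Node`, `Id`, and `FieldId` types are simplified stand-ins, and the `rewrite` function with its hard-coded "old" to "new" rule is hypothetical. It shows the two ideas from the commit message: swapping a node's fields out with `std::mem::take` so the recursion can push onto the arena freely, and allocating a replacement children `Vec` only when a child's Id first diverges.

```rust
use std::collections::BTreeMap;
use std::mem;

// Simplified stand-ins for the arena types; names are assumptions.
type Id = usize;
type FieldId = u16;

#[derive(Debug, Default)]
struct Node {
    kind: &'static str,
    fields: BTreeMap<FieldId, Vec<Id>>,
}

#[derive(Debug, Default)]
struct Ast {
    nodes: Vec<Node>,
}

impl Ast {
    fn push(&mut self, kind: &'static str, fields: BTreeMap<FieldId, Vec<Id>>) -> Id {
        self.nodes.push(Node { kind, fields });
        self.nodes.len() - 1
    }
}

/// Hypothetical rewrite pass: every node of kind "old" is replaced by a
/// freshly pushed node of kind "new"; all other nodes keep their Id.
fn rewrite(ast: &mut Ast, id: Id) -> Id {
    if ast.nodes[id].kind == "old" {
        // A "rule firing" produces a new node in the arena.
        return ast.push("new", BTreeMap::new());
    }

    // Take the fields by ownership. The node temporarily holds an empty map,
    // so recursive calls may mutate the arena without borrow conflicts.
    let mut fields = mem::take(&mut ast.nodes[id].fields);

    for children in fields.values_mut() {
        // Stays `None` until the first child whose Id changes.
        let mut rewritten: Option<Vec<Id>> = None;
        for (i, &child) in children.iter().enumerate() {
            let new_child = rewrite(ast, child);
            if let Some(v) = rewritten.as_mut() {
                v.push(new_child);
            } else if new_child != child {
                // First divergence: copy the unchanged prefix, then continue.
                let mut v = Vec::with_capacity(children.len());
                v.extend_from_slice(&children[..i]);
                v.push(new_child);
                rewritten = Some(v);
            }
        }
        if let Some(v) = rewritten {
            *children = v;
        }
    }

    // Swap the (possibly updated) fields back in; the parent's Id is stable.
    ast.nodes[id].fields = fields;
    id
}
```

In this sketch a subtree without any "old" node goes through `rewrite` without allocating a new `Vec`, and the parent is mutated in place rather than cloned, which mirrors the in-place approach from commit 7bd27b83e0.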


1847 Commits
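
The forward-scan semantics described in commit af6e921da5 can be illustrated with a small sketch. It is not the yeast matcher: `match_in_order` is a hypothetical helper that compares node kinds as plain strings, and it leaves out captures and their rollback. It only shows how each bare pattern advances a shared iterator until it finds a match, so patterns cannot match children out of source order.

```rust
/// Hypothetical helper: returns the child positions matched by `patterns`,
/// or `None` if some pattern finds no match in the remaining children.
fn match_in_order(children: &[&str], patterns: &[&str]) -> Option<Vec<usize>> {
    // One iterator is shared by all patterns, so each pattern starts
    // scanning after whatever the previous pattern consumed.
    let mut iter = children.iter().enumerate();
    let mut positions = Vec::new();
    for pat in patterns {
        // Scan forward to the next child of the requested kind.
        let (i, _) = iter.find(|&(_, &kind)| kind == *pat)?;
        positions.push(i);
    }
    Some(positions)
}
```

With children `["do", "identifier", "end"]`, as in the Ruby for-loop body from the commit message, a query for `"end"` alone succeeds by skipping the first two children, while `"end"` followed by `"do"` fails because the iterator is already exhausted when `"do"` is tried.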