codeql

mirror of https://github.com/github/codeql.git synced 2026-07-29 06:46:46 +02:00

Author	SHA1	Message	Date
Taus	fc5b3562c3	Python: Add parser test for comprehensions with unpacking	2026-04-14 13:27:31 +02:00
Taus	ad4018f399	Python: Add parser support for lazy imports As defined in PEP-810. We implement this in much the same way as how we handle `async` annotations currently. The relevant nodes get an `is_lazy` field that defaults to being false.	2026-04-10 13:50:43 +00:00
Taus	12ee93042b	Python: Add tests	2026-02-05 13:47:24 +00:00
Taus	2c83b296a4	Python: Add parser test Note in particular that the `exceptions.py` test is unaffected.	2026-01-06 13:40:38 +00:00
Taus	28e733e335	Python: Support template strings in rest of extractor Adds three new AST nodes to the mix: - `TemplateString` represents a t-string in Python 3.14 - `TemplateStringPart` represents one of the string constituents of a t-string. (The interpolated expressions are represented as `Expr` nodes, just like f-strings.) - `JoinedTemplateString` represents an implicit concatenation of template strings. Importantly, we _completely avoid_ the complicated construction we currently do for format strings (as well as the confusing nomenclature). No extra injection of empty strings (so that a template string is a strict alternation of strings and expressions). A `JoinedTemplateString` simply has a list of template string children, and a `TemplateString` has a list of "values" which may be either `Expr` or `TemplateStringPart` nodes. If we ever find that we actually want the more complicated interface for these strings, then I would much rather we reconstruct this inside of QL rather than in the parser.	2025-12-16 23:57:58 +01:00
Nora Dimitrijević	29b1a7403b	Support CODEQL_PATH_TRANSFORMER env var in python path renamer The new name is required by overlay support.	2025-10-06 11:37:02 +02:00
Taus	9802ad77dc	Python: Update `types_new.py` and test output	2025-09-02 12:41:57 +00:00
Taus	b108d47b26	Python: Update parser test output It seems that with a newer version of tree-sitter, we no longer parse the (not actually valid!) syntax `Spam[**P2]` as if the `**` is an exponentiation operation (with a missing left operand).	2025-09-02 12:41:55 +00:00
Taus	e04821e9e3	Python: Allow use of `match` as an identifier This previously only worked in certain circumstances. In particular, assignments such as `match[1] = ...` or even just `match[1]` would fail to parse correctly. Fixing this turned out to be less trivial than anticipated. Consider the fact that ``` match [1]: case (...) ``` can either look like the start of a `match` statement, or it could be a type ascription, ascribing the value of `case(...)` (a call) to the item at index 1 of `match`. To fix this, then, we give `match` the identifier and `match` the statement the same precedence in the grammar, and additionally mark a conflict between `match_statement` and `primary_expression`. This causes the conflict to be resolved dynamically, and seems to do the right thing in all cases.	2025-06-26 15:33:00 +00:00
Taus	c5be2a3e2d	Python: Allow comments in subscripts Once again, the interaction between anchors and extras (specifically comments) was causing trouble. The root of the problem was the fact that in `a[b]`, we put `b` in the `index` field of the subscript node, whereas in `a[b,c]`, we additionally synthesize a `Tuple` node for `b,c` (which matches the Python AST). To fix this, we refactored the grammar slightly so as to make that tuple explicit, such that a subscript node either contains a single expression or the newly added tuple node. This greatly simplifies the logic.	2025-02-06 14:04:57 +00:00
Taus	2892f0ff48	Merge pull request #17873 from github/tausbn/python-fix-generator-expression-locations Python: Even more parser fixes	2024-11-01 12:47:19 +01:00
Taus	f75615b913	Merge pull request #17822 from github/tausbn/python-more-parser-fixes Python: A few more parser fixes	2024-10-30 13:47:10 +01:00
Taus	5d6600e61f	Python: Fix generator expression locations Our logic for detecting the first and last item in a generator expression was faulty, sometimes matching comments as well. Because attributes (like `_location_start`) can only be written once, this caused `tree-sitter-graph` to get unhappy. To fix this, we now require the first item to be an `expression`, and the last one to be either a `for_in_clause` or an `if_clause`. Crucially, `comment` is neither of these, and this prevents the unfortunate overlap.	2024-10-28 14:53:09 +00:00
Taus	ef60b730ea	Python: Fix parenthesized tuple parser bug We were writing the `parenthesised` attribute twice on tuples, once because of the explicit parenthesisation, and once because all non-empty tuples are parenthesised. This made `tree-sitter-graph` unhappy. To fix this, we now explicitly check whether a tuple is already parenthesised, and do nothing if that is the case.	2024-10-28 14:49:45 +00:00
Taus	b4ecc7937d	Python: Fix some more `async` parsing problems Turns out we were not setting the `is_async` field on anything except `async for` statements. This commit makes it so that we also do this for `async def` and `async with`, and adds a test that this produces the same behaviour as the old parser.	2024-10-28 14:44:02 +00:00
Taus	1e51703ce9	Python: Allow escaped quotes/backslashes in raw strings Quoting the Python documentation (last paragraph of https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences): "Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes)." We did not handle this correctly in the scanner, as we only consumed the backslash but not the following single or double quote, resulting in that character getting interpreted as the end of the string. To fix this, we do a second lookahead after consuming the backslash, and if the next character is the end character for the string, we advance the lexer across it as well. Similarly, backslashes in raw strings can escape other backslashes. Thus, for a string like '\\' we must consume the second backslash, otherwise we'll interpret it as escaping the end quote.	2024-10-28 14:40:24 +00:00
Taus	5db601af3c	Python: Allow comments in comprehensions A somewhat complicated solution that necessitated adding a new custom function to `tsg-python`. See the comments in `python.tsg` for why this was necessary.	2024-10-23 14:24:47 +00:00
Taus	4f60494019	Python: Support assignments of the form `[x,y,z] = w` Surprisingly, the new parser did not support these constructs (and the relevant test was missing this case), so on files that required the new parser we were unable to parse this construct. To fix it, we add `list_pattern` (not to be confused with `pattern_list`) as a `tree-sitter-python` node that results in a `List` node in the AST.	2024-10-22 16:06:35 +00:00
Taus	9c913902c5	Python: Allow `except*` to be written as `except *` Turns out, `except*` is actually not a token on its own according to the Python grammar. This means it's legal to write `except *foo: ...`, which we previously would consider a syntax error. To fix it, we simply break up the `except*` into two separate tokens.	2024-10-22 15:39:29 +00:00
Taus	8053e0ed44	Python: Allow `list_splat`s as type annotations That is, the `*T` in `def foo(*args : *T): ...`. This is apparently a piece of syntax we did not support correctly until now. In terms of the grammar, we simply add `list_splat` as a possible alternative for `type` (which could previously only be an `expression`). We also update `python.tsg` to not specify `expression` in those places (as the relevant stanzas will then not work for `list_splat`s). This syntax is not supported by the old parser, hence we only add a new parser test for it.	2024-10-22 15:17:12 +00:00
Taus	fcec8e0256	Python: Fail tests when errors/warnings are logged This is primarily useful for ensuring that errors where a node does not have an appropriate context set in `python.tsg` actually have an effect on the pass/fail status of the parser tests. Previously, these would just be logged to stdout, but tests could still succeed when there were errors present. Also fixes one of the logging lines in `tsg_parser.py` to be more consistent with the others.	2024-10-22 15:11:51 +00:00
Taus	9803bbdc4b	Python: Update class parser test	2024-10-21 15:35:48 +00:00
Taus	819b3d77ab	Python: Update test expectations Note that this still includes the somewhat puzzling parsing of `Spam[**P2]` as an exponentiation with an empty left hand side. When we fix that bug, we should also update this test to contain actually valid syntax.	2024-10-15 11:22:33 +00:00
Taus	1ced5b44d7	Python: Add test for type parameter defaults	2024-10-15 11:22:30 +00:00
yoff	0b0e8a4bf5	Update python/extractor/tests/parser/.gitignore As suggested by @tausbn	2024-10-09 12:22:17 +02:00
Rasmus Lerchedahl Petersen	ad630bc6ff	Python: ignore some extractor test output If you test the extractor locally, you want to ignore these files.	2024-10-09 11:34:58 +02:00
Taus	04c9ed37a7	Python: Fix reference in unit test The referenced file lives in the internal repo, so this is perhaps a bit of a hack, but I think it should be fine in the short run.	2024-03-19 17:11:40 +00:00
Taus	6dec323cfc	Python: Copy Python extractor to `codeql` repo	2024-03-07 13:59:16 +00:00

28 Commits