Commit Graph

11 Commits

Author SHA1 Message Date
Taus
537d85c27d Python: Regenerate parser files 2025-12-16 23:58:40 +01:00
Taus
1e6adf5445 Python: Regenerate parser files 2025-12-16 23:58:40 +01:00
Taus
e9876aa0e6 Python: Add parser support for template strings
- Extends the scanner with a new token kind representing the start of a
template string. This is used to distinguish template strings from
regular strings (because only a template string will start with a
`_template_string_start` external token).

- Cleans up the logic surrounding interpolations (and the method names)
so that format strings and template strings behave the same in this
case.

Finally, we add two new node types in the tree-sitter grammar:

- `template_string` behaves like format strings, but is a distinct type
(mainly so that an implicit concatenation between template strings and
regular strings becomes a syntax error).
- `concatenated_template_string` is the counterpart of
`concatenated_string`.

However, internally, the string parts of a template strings are just the
same `string_content` nodes that are used in regular format strings. We
will disambiguate these inside `tsg-python`.
2025-12-16 23:58:40 +01:00
Taus
ad53518644 Python: Regenerate parser files 2025-06-26 15:34:44 +00:00
Taus
7124e80f28 Python: Regenerate parser files 2025-02-06 14:05:40 +00:00
Taus
e710c0a6bf Python: Regenerate parser files 2024-10-28 14:44:01 +00:00
Taus
1e51703ce9 Python: Allow escaped quotes/backslashes in raw strings
Quoting the Python documentation (last paragraph of
https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences):

"Even in a raw literal, quotes can be escaped with a backslash, but the
backslash remains in the result; for example, r"\"" is a valid string
literal consisting of two characters: a backslash and a double quote;
r"\" is not a valid string literal (even a raw string cannot end in an
odd number of backslashes)."

We did not handle this correctly in the scanner, as we only consumed the
backslash but not the following single or double quote, resulting in
that character getting interpreted as the end of the string.

To fix this, we do a second lookahead after consuming the backslash, and
if the next character is the end character for the string, we advance
the lexer across it as well.

Similarly, backslashes in raw strings can escape other backslashes.
Thus, for a string like '\\' we must consume the second backslash,
otherwise we'll interpret it as escaping the end quote.
2024-10-28 14:40:24 +00:00
Taus
89ea4b8200 Python: Regenerate parser files 2024-10-22 15:39:41 +00:00
Taus
7ceefb509b Python: Regenerate parser files 2024-10-22 15:17:34 +00:00
Taus
6545bfffa7 Python: Regenerate parser files
Two new files -- alloc.h and array.h -- suddenly appeared. Presumably
they are used by the somewhat newer version of tree-sitter. To be safe,
I included them in this commit.
2024-10-15 11:22:31 +00:00
Taus
38169a981d Python: Shorten tree-sitter-python directory name
The current name results in a path that is more than 260 characters long,
and this causes issues for the build on Windows.
2024-03-19 17:11:40 +00:00