Commit Graph

9673 Commits

Author SHA1 Message Date
Taus
2bc1c348c9 Python: Add support for PEP-758 exception syntax
See https://peps.python.org/pep-0758/ for more details.

We implement this by extending the syntax for exceptions and exception
groups so that the `type` field can now contain either an expression
(which matches the old behaviour), or a comma-separated list of at least
two elements (representing the new behaviour).

We model the latter case using a new node type `exception_list`, which
in `tsg-python` is simply mapped to a tuple. This means it matches the
existing behaviour (when the tuple is surrounded by parentheses)
exactly, hence we don't need to change any other code.

As a consequence of this, however, we cannot directly parse the Python
2.7 syntax `except Foo, e: ...` as `except Foo as e: ...`, as this would
introduce an ambiguity in the grammar. Thus, we have removed support for
the (deprecated) 2.7-style syntax, and only allow `as` to indicate
binding of the exception. The syntax `except Foo, e: ...` continues to
be parsed (in particular, it's not suddenly a syntax error), but it will
be parsed as if it were `except (Foo, e): ...`, which may not give the
correct results.

In principle we could extend the QL libraries to account for this case
(specifically when analysing Python 2 code). In practice, however, I
expect this to have a minor impact on results, and not worth the
additional investment at this time.
2025-12-16 23:58:40 +01:00
Taus
714fe13c70 Python: Add change note 2025-12-16 23:58:40 +01:00
Taus
7b4c65e82c Python: Add stats
Not actually based on any measurements, just the usual 100/1000 stuff.
2025-12-16 23:58:40 +01:00
Taus
7bc4a4d764 Python: Add up-/downgrade scripts for template literals
We do the usual thing. Downgrade scripts remove the relevant relations;
upgrade scripts do nothing.
2025-12-16 23:58:40 +01:00
Taus
defcadc359 Python: Bump extractor version 2025-12-16 23:58:40 +01:00
Taus
e497328d55 Python: Add AST node wrappers 2025-12-16 23:58:40 +01:00
Taus
73f560bd66 Python: Regenerate AST and dbscheme files 2025-12-16 23:58:40 +01:00
Taus
175250b505 Python: Support template strings in rest of extractor
Adds three new AST nodes to the mix:

- `TemplateString` represents a t-string in Python 3.14
- `TemplateStringPart` represents one of the string constituents of a
t-string. (The interpolated expressions are represented as `Expr` nodes,
just like f-strings.)
- `JoinedTemplateString` represents an implicit concatenation of
template strings.

Importantly, we _completely avoid_ the complicated construction we
currently do for format strings (as well as the confusing nomenclature).
No extra injection of empty strings (so that a template string is a
strict alternation of strings and expressions). A `JoinedTemplateString`
simply has a list of template string children, and a `TemplateString`
has a list of "values" which may be either `Expr` or
`TemplateStringPart` nodes.

If we ever find that we actually want the more complicated interface for
these strings, then I would much rather we reconstruct this inside of QL
rather than in the parser.
2025-12-16 23:58:40 +01:00
Taus
1e6adf5445 Python: Regenerate parser files 2025-12-16 23:58:40 +01:00
Taus
e9876aa0e6 Python: Add parser support for template strings
- Extends the scanner with a new token kind representing the start of a
template string. This is used to distinguish template strings from
regular strings (because only a template string will start with a
`_template_string_start` external token).

- Cleans up the logic surrounding interpolations (and the method names)
so that format strings and template strings behave the same in this
case.

Finally, we add two new node types in the tree-sitter grammar:

- `template_string` behaves like format strings, but is a distinct type
(mainly so that an implicit concatenation between template strings and
regular strings becomes a syntax error).
- `concatenated_template_string` is the counterpart of
`concatenated_string`.

However, internally, the string parts of a template strings are just the
same `string_content` nodes that are used in regular format strings. We
will disambiguate these inside `tsg-python`.
2025-12-16 23:58:40 +01:00
Óscar San José
d972af9ef8 Merge branch 'main' of https://github.com/github/codeql into oscarsj/mergeback-rc-3-20-into-main 2025-12-12 13:22:08 +01:00
yoff
5c6d83ed65 Merge pull request #20877 from joefarebrother/python-tornado-websocket
Python: Add models for websocket handlers for Tornado
2025-12-09 10:08:59 +01:00
github-actions[bot]
2854330759 Post-release preparation for codeql-cli-2.23.8 2025-12-08 15:49:10 +00:00
github-actions[bot]
66c51e979e Release preparation for version 2.23.8 2025-12-08 14:38:23 +00:00
Óscar San José
bc6133de5c Merge branch 'main' of https://github.com/github/codeql into oscarsj/merge-back-rc-3.20 2025-12-05 19:31:47 +01:00
Taus
1b519384d7 Merge pull request #20739 from github/tausbn/python-remove-top-level-points-to-imports
Python: Hide points-to imports in `python.qll`
2025-12-05 14:24:41 +01:00
Joe Farebrother
d70c596c86 Merge pull request #20914 from joefarebrother/python-socketio
Python: Add models for socketio
2025-12-04 23:14:58 +00:00
Anders Schack-Mulligen
607ad1f886 Merge pull request #20961 from aschackmull/dataflow/flowfrom
Dataflow: Add flowFrom predicates to mirror flowTo.
2025-12-04 10:09:29 +01:00
yoff
7fd4755e93 Merge pull request #20919 from yoff/python/header-splitting-experiments
Python: detecting header splitting in synthetic app
2025-12-03 15:48:54 +01:00
Anders Schack-Mulligen
78e1879c9e Use more flowTo. 2025-12-03 14:12:08 +01:00
github-actions[bot]
085faa2bdb Post-release preparation for codeql-cli-2.23.7 2025-12-02 16:39:43 +00:00
github-actions[bot]
a045b317ac Release preparation for version 2.23.7 2025-12-02 15:31:27 +00:00
Joe Farebrother
7cf3964e44 Update expectations 2025-12-01 20:27:48 +00:00
github-actions[bot]
19a13467e0 Release preparation for version 2.23.7 2025-12-01 16:07:37 +00:00
Asger F
b8cff77cab Merge pull request #20873 from github/shared-xml-discard
Share XML discard predicates
2025-12-01 10:06:02 +01:00
Asger F
6257bed089 Sync OverlayXml.qll 2025-11-28 09:23:49 +01:00
Taus
ec336a0334 Python: Fix list bullets in change note
Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
2025-11-27 17:49:13 +01:00
Taus
bc8ed286ac Python: Make some more points-to imports private
This makes things a bit cleaner.

After this, the only non-private (and non-`LegacyPointsTo`) imports of
`semmle.python.{types,objects,pointsto}.*` are in
`semmle.python.objects.ObjectInternal`, which is reasonable, as that is
the entry point for the entire internal object API.
2025-11-27 16:47:53 +00:00
Taus
0c358acc24 Merge pull request #20908 from akoeplinger/patch-1
Fix KeyError: 'name' in python/extractor/imp.py on Python 3.14
2025-11-27 15:29:54 +01:00
Taus
f55ff96674 Python: Bump extractor version and add change note 2025-11-27 13:52:37 +00:00
Taus
a7458df0a4 Python: Appease the QLDoc checker 2025-11-26 22:13:21 +00:00
Taus
c6ad438bfc Python: Add change note 2025-11-26 21:58:26 +00:00
Taus
24a29f46be Python: Fix all metrics-related compilation failures
In hindsight, having a `.getMetrics()` method that just returns `this`
is somewhat weird. It's possible that it predates the existence of the
inline cast, however.
2025-11-26 21:28:51 +00:00
Taus
c75329d7b7 Python: Move metrics-related API to LegacyPointsTo module
Gets rid of the `getMetrics` methods on the `Function`, `Class`, and
`Module` classes. To access the metrics, one must first import the
`LegacyPointsTo` module, and then either change the type to
`{Function,Class,Module}Metrics` or cast to the appropriate type.
2025-11-26 17:06:55 +00:00
Taus
cd1619b43e Python: Fix queries and tests 2025-11-26 17:06:55 +00:00
Taus
b9a5b3b628 Python: Remove points-to from SSA.ql
Happily, this was not as deeply entwined as it looked at first glance.
2025-11-26 17:06:55 +00:00
Joe Farebrother
16018e91a2 Minor test fix 2025-11-26 15:47:56 +00:00
Felicity Chapman
caf6b950ac Remove trailing periods from @name metadata in query files
Fixed 73 .ql query files where the @name metadata contained an ending period.
This ensures consistency with the CodeQL query metadata style guidelines.
2025-11-26 14:29:51 +00:00
yoff
2c835dc33c python: add changenote 2025-11-26 14:03:15 +01:00
yoff
24e55c0691 python: update MAD expectations 2025-11-26 14:00:22 +01:00
yoff
ebe29dd143 python: model urllib.ParseResult 2025-11-26 13:36:05 +01:00
yoff
a878bc61e1 python: add model for urllib.urlparse 2025-11-26 13:32:54 +01:00
yoff
d59f721341 python: add test for header injection 2025-11-26 13:32:54 +01:00
Taus
5b47fcbfa4 Python: Remove dependence on Builtins from attribute module
The `Builtins` module is deeply entwined with points-to, so it would be
nice to not have this dependence. Happily, the only thing we used
`Builtin` for was to get the names of known builtins, and for this we
already maintain such a set of names in
`dataflow.new.internal.Builtins`.
2025-11-26 12:30:31 +00:00
Taus
9dc774aaa3 Python: Remove points-to dependency from parts of SSA
For whatever reason, the CFG node for exceptions and exception groups
was placed with the points-to code. (Probably because a lot of the
predicates depended on points-to.)

However, as it turned out, two of the SSA modules only depended on
non-points-to properties of these nodes, and so it was fairly
straightforward to remove the imports of `LegacyPointsTo` for those
modules.

In the process, I moved the aforementioned CFG node types into
`Flow.qll`, and changed the classes in the `Exceptions` module to the
`...WithPointsTo` form that we introduced elsewhere.
2025-11-26 12:30:31 +00:00
Taus
e09840426c Python: Get rid of points-to from Definitions.qll
Turns out the `ImportTime` module (despite living in
`semmle.python.types` does not actually depend on points-to, so some of
the `LegacyPointsTo` imports could be replaced or removed.
2025-11-26 12:30:31 +00:00
Taus
7328f26311 Python: Fix reachability-related test failures 2025-11-26 12:30:31 +00:00
Taus
21e74a3f01 Python: Fully remove points-to from Flow.qll
Gets rid of a bunch of predicates relating to reachability (which
depended on the modelling of exceptions, which uses points-to), moving
them to `LegacyPointsTo`. In the process, we gained a new class
`BasicBlockWithPointsTo`.
2025-11-26 12:30:31 +00:00
Taus
7176898503 Python: Fix library tests 2025-11-26 12:30:31 +00:00
Taus
b3b87c968b Python: Fix extractor/experimental tests 2025-11-26 12:30:31 +00:00