Commit Graph

26 Commits

Author SHA1 Message Date
Taus
fd7b123ee3 Python: Add overlay annotations to AST classes
... and everything else that it depends on.
2026-02-16 13:48:32 +00:00
Taus
a7458df0a4 Python: Appease the QLDoc checker 2025-11-26 22:13:21 +00:00
Taus
f0465f441f Python: Get rid of some get...Object methods
This frees `Class.qll`, `Exprs.qll`, and `Function.qll` from the
clutches of points-to. For the somewhat complicated setup with
`getLiteralObject` (an abstract method), I opted for a slightly ugly but
workable solution of just defining a predicate on `ImmutableLiteral`
that inlines each predicate body, special-cased to the specific instance
to which it applies.
2025-11-26 12:30:30 +00:00
Taus
b93ce98612 Python: Remove points-to from Expr 2025-10-30 13:58:59 +00:00
Taus
b434ce460e Python: Get rid of getLiteralValue
This had only two uses in our libraries, so I simply inlined the
predicate body in both places.
2025-10-30 13:30:04 +00:00
Taus
fef08afff9 Python: Remove points-to to from ControlFlowNode
Moves the existing points-to predicates to the newly added class
`ControlFlowNodeWithPointsTo` which resides in the `LegacyPointsTo`
module.

(Existing code that uses these predicates should import this module, and
references to `ControlFlowNode` should be changed to
`ControlFlowNodeWithPointsTo`.)

Also updates all existing points-to based code to do just this.
2025-10-30 13:30:04 +00:00
Taus
c6c6a857df Python: Add tests
Also fixes an issue with the return type annotations that caused these
to not work properly.

Currently, annotated assignments don't work properly, due to the fact
that our flow relation doesn't consider flow going to the "type" part of
an annotated assignment. This means that in `x : Foo`, we do correctly
note that `x` is annotated with `Foo`, but we have no idea what `Foo`
is, since it has no incoming flow.

To fix this we should probably just extend the flow relation, but this
may need to be done with some care, so I have left it as future work.
2025-07-11 12:03:14 +00:00
Taus
d1cf7f0624 Python: Support type annotations in call graph
Adds support for tracking instances via type annotations. Also adds a
convenience method to the newly added `Annotation` class,
`getAnnotatedExpression`, that returns the expression that is annotated
with the given type. For return annotations this is any value returned
from the annotated function in question.

Co-authored-by: Napalys Klicius <napalys@github.com>
2025-07-11 12:03:14 +00:00
Taus
50a01b1244 Python: Remove superfluous reference to FunctionExpr
This way we also get annotations that appear in `Lambda`s
2025-03-04 15:53:34 +00:00
Taus
88615f427b Python: Add support for forward declarations in unused var query
Fixes the false positive reported in
https://github.com/github/codeql/issues/18910

Adds a new `Annotation` class (subclass of `Expr`) which encompasses all
possible kinds of annotations in Python.

Using this, we look for string literals which are part of an annotation,
and which have the same content as the name of a (potentially) unused
global variable, and in that case we do not produce an alert.

In future, we may want to support inspecting such string literals more
deeply (e.g. to support stuff like "list[unused_var]"), but I think for
now this level of support is sufficient.
2025-03-04 14:41:45 +00:00
Taus
81246cd41a Python: Add missing QLDoc for isUnicode 2024-04-22 12:08:53 +00:00
Taus
f6487d7b13 Python: Rename StrConst to StringLiteral
Does a few things:
- Renames `StrConst` to `StringLiteral`, and deprecates the former.
- Also deprecates `Str`.
- Adds an override of `StringLiteral::toString` making it output
`"StringLiteral"` rather than the inherited `"Str"`. This ensures that
the AST viewer shows these nodes as the former type, not the latter.

There are a large number of uses of `StrConst` in the codebase. These
will be fixed in a later commit.
2024-04-22 12:00:09 +00:00
Rasmus Wriedt Larsen
41ce1c2016 Python: getStarArg gives first *args argument
I couldn't see any reason that we should give up altogether if there are
multiple `*args` arguments. Including the first one looks like a win to
me!
2022-09-12 17:02:31 +02:00
Taus
1c15fc5600 Python: Define Str as an alias of StrConst 2022-08-17 13:36:32 +00:00
Taus
bde47836d0 Python: Add Str class
This makes the AST viewer (which annotates string constant nodes as
`Str`) a bit more consistent.
2022-07-19 12:25:10 +00:00
Taus
95d235416c Python: Fix bad antijoin in getAKeyword
Before:

```
Tuple counts for Exprs::Call::getAKeyword_dispred#ff#antijoin_rhs/3@7bc202ij after 9s:
1        ~0%     {1} r1 = CONSTANT(unique int)[2]
4244385  ~2%     {1} r2 = JOIN r1 WITH py_dict_items_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0'
4244352  ~3%     {3} r3 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg1', Lhs.0 'arg0', Rhs.2 'arg2'
66618690 ~3%     {5} r4 = JOIN r3 WITH AstGenerated::Call_::getNamedArg_dispred#ffb ON FIRST 1 OUTPUT Lhs.1 'arg0', Lhs.0 'arg1', Lhs.2 'arg2', Rhs.1, Rhs.2
31187133 ~0%     {5} r5 = SELECT r4 ON In.3 < In.2 'arg2'
31187133 ~1%     {5} r6 = SCAN r5 OUTPUT In.4, 0, In.0 'arg0', In.1 'arg1', In.2 'arg2'
0        ~0%     {3} r7 = JOIN r6 WITH py_dict_items ON FIRST 2 OUTPUT Lhs.2 'arg0', Lhs.3 'arg1', Lhs.4 'arg2'
                 return r7

Tuple counts for Exprs::Call::getAKeyword_dispred#ff/2@1dc9468b after 421ms:
1       ~0%     {1} r1 = CONSTANT(unique int)[2]
4244385 ~2%     {1} r2 = JOIN r1 WITH py_dict_items_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result'
4244352 ~0%     {3} r3 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Lhs.0 'result', Rhs.1 'this', Rhs.2
4244352 ~0%     {3} r4 = r3 AND NOT Exprs::Call::getAKeyword_dispred#ff#antijoin_rhs(Lhs.0 'result', Lhs.1 'this', Lhs.2)
4244352 ~6%     {2} r5 = SCAN r4 OUTPUT In.1 'this', In.0 'result'
                return r5
```

Oof. All that work to produce zero tuples. Luckily we can improve
matters somewhat.

Basically, there's no reason to test _all_ dictionary unpackings, since
we're only interested in a lower bound. Thus, we can use `min` instead
which is much more efficient. For convenience I factored this into its
own (private) helper predicate.

Now the tuple counts look as follows:

```
Tuple counts for Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_range/2@39b0e9sm after 1ms:
246 ~0%     {2} r1 = JOIN Keywords::DictUnpackingOrKeyword#class#f#shared WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0', Rhs.2 'arg1'
            return r1
Registering Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_range/2@39b0e9sm +  with content 9ea2f123k8necpu015v6tpsc2t1
 >>> Created relation Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_range/2@39b0e9sm with 246 rows.
Starting to evaluate predicate Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_term/3@9f4ca5g8
Tuple counts for Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_term/3@9f4ca5g8 after 0ms:
246 ~2%     {3} r1 = JOIN Keywords::DictUnpackingOrKeyword#class#f#shared WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0', Rhs.2 'arg2', Rhs.2 'arg2'
            return r1

Tuple counts for Exprs::Call::getAKeyword_dispred#ff/2@000a0alb after 906ms:
1       ~0%     {1} r1 = CONSTANT(unique int)[2]
4244385 ~2%     {1} r2 = JOIN r1 WITH py_dict_items_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result'

4244352 ~0%     {3} r3 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Lhs.0 'result', Rhs.1 'this', Rhs.2
4244280 ~0%     {3} r4 = r3 AND NOT Exprs::Call::getMinimumUnpackingIndex_dispred#ff_0#antijoin_rhs(Lhs.1 'this')
4244280 ~6%     {2} r5 = SCAN r4 OUTPUT In.1 'this', In.0 'result'

4244352 ~3%     {3} r6 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'this', Lhs.0 'result', Rhs.2
72      ~4%     {4} r7 = JOIN r6 WITH Exprs::Call::getMinimumUnpackingIndex_dispred#ff ON FIRST 1 OUTPUT Lhs.1 'result', Lhs.0 'this', Lhs.2, Rhs.1
72      ~4%     {4} r8 = SELECT r7 ON In.2 <= In.3
72      ~0%     {2} r9 = SCAN r8 OUTPUT In.1 'this', In.0 'result'

4244352 ~6%     {2} r10 = r5 UNION r9
                return r10
```

This is not the perfect join order (note the similarity between `r3`
and `r6`) but overall it's a win.
2022-04-28 11:11:37 +00:00
Erik Krogh Kristensen
0da80f90d3 rename the SSA stages to AST 2022-03-30 22:54:00 +02:00
Erik Krogh Kristensen
60b5af215f cached stages iteration 2 2022-03-30 22:53:59 +02:00
Erik Krogh Kristensen
a86f0afb3c delete all deprecations that are over 14 months old 2022-03-09 18:28:07 +01:00
Taus
d2603884ca Python: Fix a bunch of class QLDoc 2022-03-07 18:59:49 +00:00
Taus
af7f532212 Python: Fix up a bunch of function QLDoc 2022-03-07 18:59:49 +00:00
Taus
095f27f294 Python: Remove deprecated annotations 2022-03-04 12:30:26 +00:00
Rasmus Lerchedahl Petersen
de8ecb214f python: Wrappers for database classes
- new syntactic category `Pattern` (in `Patterns.qll`)
- subpatterns available on statments
- new statements `MatchStmt` and `Case`
  (`Match` would conflict with the shared ReDoS library)
- new expression `Guard`
- support for pattern lists
2022-01-19 14:29:58 +01:00
Anders Schack-Mulligen
310eec07c1 Java/Python: Fix some potential performance problems due to transitive deltas. 2021-10-14 16:10:00 +02:00
Taus
a9c8163ab3 Python: Fix uses of implicit this
Quoting the style guide:

"14. _Always_ qualify _calls_ to predicates of the same class with
`this`."
2021-10-13 13:43:36 +00:00
Andrew Eisenberg
3660c64328 Packaging: Rafactor Python core libraries
Extract the external facing `qll` files into the codeql/python-all
query pack.
2021-08-24 13:23:45 -07:00