This one is a bit more involved. Of note is the fact that it at present
only uses local flow when determining the origin of some value (whereas
the points-to version used global flow). It may be desirable to rewrite
this query to use global data-flow, but this should be done with some
care (as using "all unhashable objects" as the set of sources is
somewhat iffy with respect to performance). For that reason, I'm
sticking to mostly local flow (except for well behaved things like types
and built-ins).
Uses the new `DuckTyping` module to handle recognising whether a class
is a container or not. Only trivial test changes (one version uses
"class", the other "Class").
Note that the ported query has no understanding of built-in classes. At
some point we'll likely want to replace `hasUnresolvedBase` (which will
hold for any class that extends a built-in) with something that's aware
of the built-in classes.
Removes the use of points-to for accessing various built-ins from three
of the queries. In order for this to work I had to extend the lists of
known built-ins slightly.
For now, these have just been made into `private` imports. After doing
this, I went through all of the (now not compiling) files and added in
private imports to the modules that they actually depended on.
I also added an explicit import of `LegacyPointsTo` (even though it may
be unnecessary) in cases where the points-to dependency was somewhat
surprising (and one we want to get rid of). This was primarily inside
the various SSA layers.
For modules inside `semmle.python.{types, objects, pointsto}` I did not
bother, as these are fairly clearly related to points-to.
Moves the existing points-to predicates to the newly added class
`ControlFlowNodeWithPointsTo` which resides in the `LegacyPointsTo`
module.
(Existing code that uses these predicates should import this module, and
references to `ControlFlowNode` should be changed to
`ControlFlowNodeWithPointsTo`.)
Also updates all existing points-to based code to do just this.
Excluded for now: unnecassary-delete; since the pattern is often intentional to break reference cycles, which the query doesn't account for; so uncertain about its claim of high precision
So it better matches what is in `py/code-injection`. I had my doubts
about CWE-95, but after reading
https://owasp.org/www-community/attacks/Direct_Dynamic_Code_Evaluation_Eval%20Injection
I think it's fine to add CWE-95 as well 👍
Definitions are:
CWE-78: Improper Neutralization of Special Elements used in an OS
Command ('OS Command Injection')
CWE-94: Improper Control of Generation of Code ('Code Injection')
CWE-95: Improper Neutralization of Directives in Dynamically Evaluated
Code ('Eval Injection')