Going all the way to the AST layer seemed excessive to me, so I rewrote
it to do most of the logic at the data-flow layer. In principle this
_could_ result in more names being computed (due to splitting), but in
practice I don't expect this make a big difference.
According to the official documentation, the purpose of `__main__.py`
files is that their presence in a package (say, `foo`) means one can
execute the package directly using `python -m foo` (which will run the
aforementioned `foo/__main__.py` file).
In principle this means that adding `if __name__ == "__main__"` in these
files is superfluous, as they are only intended to be executed (and not
imported by some other file).
However, in practice people often _do_ include the above construct.
Here are some instances of this on LGTM.com:
https://lgtm.com/query/7521266095072095777/
In particular, 10 out of 33 files in `cpython` have this construct.
This causes some confusion in our module naming, as we usually see the
presence of `__name__ == "__main__"` as an indication that a file may
be run directly (and hence with "absolute import" semantics). However,
when run with `python -m`, the interpreter uses the usual package
semantics, and this leads to modules getting multiple names.
For this reason, I think it makes sense to simply exclude `__main__.py`
files from consideration. Note that if there is a `#!` line mentioning
the Python interpreter, then they will still be included as entry
points.
Just like our support for the `PyYAML` PyPI package that you import with
`import yaml` is in `Yaml.qll`.
Since this file does not provide any public predicates/modules, it
should be safe to rename it.
So it better matches what is in `py/code-injection`. I had my doubts
about CWE-95, but after reading
https://owasp.org/www-community/attacks/Direct_Dynamic_Code_Evaluation_Eval%20Injection
I think it's fine to add CWE-95 as well 👍
Definitions are:
CWE-78: Improper Neutralization of Special Elements used in an OS
Command ('OS Command Injection')
CWE-94: Improper Control of Generation of Code ('Code Injection')
CWE-95: Improper Neutralization of Directives in Dynamically Evaluated
Code ('Eval Injection')
This seems like something we have been missing for a while now, so I
figured it might be useful to add. It is roughly based on the JavaScript
equivalent, with one major difference: in the JavaScript libraries,
`getAMethodCall` is reserved for syntactic method calls (`obj.m(...)`)
whereas `getAMemberInvocation` is used for both this and the case where
the bound method `obj.m` is stored in a temporary variable and then
subsequently invoked in the same local scope.
It seems to me that the more general predicate is more useful, and hence
should have the simpler name. (And also we don't really work with a
notion of "invocation" in the Python libraries, so we would need a
better name for it anyway.)
I think as long as the documentation makes the behaviour clear, it
should be okay.