This turned into several changes that arguably could have been split into
separate PRs 🤷
* Rename `ClickHouseDriver` => `ClickhouseDriver`, so the `.qll` name
better follows the import name
* Rewrote modeling to use API graphs
* Split modeling of `aioch` into a separate `.qll` file, which re-uses
the `getExecuteMethodName` predicate. Sharing code between the two
models like this seemed like the best approach, so I stuck the
`INTERNAL: Do not use.` labels on both modules.
* I also added handling of keyword arguments (see changes in the `.py` files)
This required quite a few changes in the expected output. I think it's much
clearer what the selected nodes are now 👍 (but fixing this up was a bit
tedious)
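To illustrate what the keyword-argument handling is about, here is a minimal sketch. `FakeClient` is a hypothetical stand-in, not the real library, and the parameter name `query` is an assumption made for illustration. The point is only that the same SQL string can reach the `execute` call either positionally or by keyword, and the modeling should select it in both cases.

```python
class FakeClient:
    """Hypothetical stand-in for a ClickHouse client; not the real library."""

    def __init__(self):
        self.executed = []

    def execute(self, query, params=None):
        # Record the SQL instead of sending it to a server.
        self.executed.append(query)


client = FakeClient()
sql = "SELECT 1"

client.execute(sql)        # SQL passed as the first positional argument
client.execute(query=sql)  # same SQL passed as a keyword argument
```

Both calls hand the same string to `execute`, so a model that only looks at the first positional argument would miss the second one.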
This was an unwanted interaction between two unrelated tests, so I
switched to a different built-in in the second test. I also added a test
case that shows an unfortunate side effect of this more restricted
handling of built-ins.
After researching SQLAlchemy and its various query methods, I discovered several types of SQL injection possibilities.
The SQLExecution.py file contains these examples, which fall into two types: injections requiring the text() taint step and injections NOT requiring the text() taint step.
Which I had done locally. The problem is the same one: there is no PostUpdateNode
when points-to is not able to resolve the call, so I'm happy to just make CI
happy right now, and hopefully we'll get a fix to the underlying problem soon 😊
The problem with `tainted_filelike` not having taint is that in the call
`ujson.dump(tainted_obj, tainted_filelike)`
there is no PostUpdateNode for `tainted_filelike` :( The reason is that
points-to is not able to resolve the call, so none of the clauses in
`argumentPreUpdateNode` match
See 08731fc6cf/python/ql/src/semmle/python/dataflow/new/internal/DataFlowPrivate.qll (L101-L111)
Let's deal with that issue in another PR though
I noticed that we don't handle PostUpdateNode very well in the concept tests,
for example for `json.dump(...)` there _should_ have been an `encodeOutput` as
part of the inline expectations.
I'll work on fixing that up in a separate PR, to keep things clean.
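For context, a minimal sketch of why the file-like argument matters here: `json.dump` returns `None` and writes the encoded data into its second argument, so the output can only be observed through that argument after the call. The variable names below are illustrative, and `io.StringIO` is just a convenient in-memory file-like object.

```python
import io
import json

tainted_obj = {"user": "alice"}  # stands in for user-controlled data
filelike = io.StringIO()

# json.dump returns None; the encoded output exists only as a side effect
# on the second argument, so taint has to flow to `filelike` after the call.
result = json.dump(tainted_obj, filelike)

encoded = filelike.getvalue()
print(result)   # None
print(encoded)  # {"user": "alice"}
```

This is the value an `encodeOutput` expectation would need to point at: the state of `filelike` after the call, not the call's return value.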
To ensure that this query works against numerous usages of libraries such as PyMongo, Flask PyMongo, Mongoengine, and Flask Mongoengine, I've added a variety of query tests. These tests cover scenarios such as:
- Subscript expressions
- Mongoengine instances and Document subclasses
- Mongoengine connection usage
- And more...