To ensure that this query works against numerous usages of libraries such as PyMongo, Flask PyMongo, Mongoengine, and Flask Mongoengine, I've added a variety of query tests to test against. These tests deal with scenarious such as:
- Subscript expressions
- Mongoengine instances and Document subclasses
- Mongoengine connection usage
- And more...
After further research, it was discovered that Flask-Mongoengine has multiple ways of allowing a developer to call the Document class. One way is by directly importing the Document class from the module. Another approach is to get the Document class via a mongoengine instance.
The update to this query checks for cases where the developer gets the Document class via the MongoEngine instance.
Other misc changes include setting the various predicates to private.
A couple of notes with these changes:
- Added TypeTracker pattern to handle subscript expressions. We've found that pymongo supports subscripts expressions when calling databases and collections. To resolve this, we implemented the TypeTracker pattern to catch those subscripts since CodeQL Python API modeling doesn't support subscript expressions.
- After some research, we've discovered that MongoEngine and Flask-MongoEngine utilize MongoClient under-the-hood. This requires us to rewrite the query so that instead of querying these libraries with specific queries, we are instead going to query for usages of MongoClient since all of the libraries we are targeting utilizes MongoClient under-the-hood.
ObjectId is a sanitizer used to sanitize strings into valid MongoDB ids. During research we've found that this method is used.
ObjectId returns a string representing an id. If at any time ObjectId can't parse it's input (like when a tainted dict in passed in), then ObjectId will throw an error preventing the query from running.
After some research, we discovered that any keyword argument passed to the objects method will result in NoSQL injection. This includes scenarios where we have the following:
objects(name_of_model_attribute=unsanitized_user_input)