This required quite some changes in the expected output. I think it's much more
clear what the selected nodes are now 👍 (but it was a bit boring work to fix
this up)
- Removes the version check on the set of built-in names.
- Renames the predicate used to represent said set.
- Documents how these lists of names were obtained.
- Gets rid of a superfluous import.
This is a proposed method for advertising which queries are measuring
the lines of code in a project in a more robust manner than inspecting
the rule id.
Note that the python "LinesOfUserCode" query should _not_ have this
property, as otherwise the results of the two queries will be summed.
This was an unwanted interaction between two unrelated tests, so I
switched to a different built-in in the second test. I also added a test
case that shows an unfortunate side effect of this more restricted
handling of built-ins.
This is _slightly_ wrong, since `exec` isn't a built-in function in
Python 2. It should be harmless, however, since `exec` is a keyword,
and so cannot be redefined anyway.
I am very tempted to leave out the constants, or at the very least
`False`, `True`, and `None`, as these have _many_ occurrences in the
average codebase, and are not terribly useful at the API-graph level.
If we really do want to capture "nodes that refer to such and such
constant", then I think a better solution would be to create classes
extending `DataFlow::Node` to facilitate this.
After researching SqlAlchemy and it's various query methods, I discovered several types of SQL injection possibilities.
The SQLExecution.py file contains these examples and can be broken up into two types of injections. Injections requiring the text() taint-step and injections NOT requiring the text() taint step.
Which I had done locally. Problem is the same about not having PostUpdateNode
when points-to is not able to resolve the call, so I'm happy to just make CI
happy right now, and hopefully we'll get a fix to the underlying problem soon 😊
The problem with `tainted_filelike` not having taint, is that in the call
`ujson.dump(tainted_obj, tainted_filelike)`
there is no PostUpdateNote for `tainted_filelike` :( The reason is that
points-to is not able to resolve the call, so none of the clauses in
`argumentPreUpdateNode` matches
See 08731fc6cf/python/ql/src/semmle/python/dataflow/new/internal/DataFlowPrivate.qll (L101-L111)
Let's deal with that issue in an other PR though
I noticed that we don't handle PostUpdateNote very well in the concept tests,
for exmaple for `json.dump(...)` there _should_ have been an `encodeOutput` as
part of the inline expectations.
I'll work on fixing that up in a separate PR, to keep things clean.