A slightly complicated test setup. I wanted to both make sure I captured
the semantics of Python and also the fact that the kinds of global flow
we expect to see are indeed present.
The code is executable, and prints out both when the execution reaches
certain files, and also what values are assigned to the various
attributes that are referenced throughout the program. These values are
validated in the test as well.
My original version used introspection to avoid referencing attributes
directly (thus enabling better error diagnostics), but unfortunately
that made it so that the model couldn't follow what was going on.
The current setup is a bit clunky (and Python's scoping rules makes it
especially so -- cf. the explicit calls to `globals` and `locals`), but
I think it does the job okay.
Note: I kept the modeling using the old approach with type-trackers
instead of `DataFlow::MethodCallNode`.
I would like a meta query for DCA to show sinks before doing this, so I
can be absolutely sure we don't loose out on any important sinks on
this... so will postpone this work to a small one-off task (added to my
todo list).
I realized the modeling was done in a non-recommended way, so I changed
the modeling. It was very nice that I could use API graphs for the flask
part, and a little sad when I couldn't for Django/Tornado.
It's really hard to audit that this is all good.. I tried my best with
`icdiff` though -- and there is a problem with
ql/src/experimental/Security/CWE-348/ClientSuppliedIpUsedInSecurityCheck.ql
that needs to be fixed in the next commit
The other callables return control flow nodes,
so it is slightly inconsistent for this to return a
data flow node, but it does make models based
on API graphs nicer.
to ensure that the call graph seen by type tracking
does not include summary calls resolved by type tracking.
(I tried inserting a similar test into the Ruby codebase,
and it still compiled)
To get this to compile, I had to move the resolution of summary calls
out of the data flow nodes and into the `viableCallable` predicate.
This means that we now have a potential summary call for each
cfg call node. (I tried using the base class, `DataFlowCall`, for this
but calls to `map` got identified as class calls and would no longer
be associated with a summary.)
It is possible that the "NonLIbrary"-layers the were inserted into the
hierarchy can be removed again.