The `SimpleScalarSanitizer` class represents common scalar types which
cannot realistically carry taint (e.g. primitives/numbers, and
eventually UUIDs and Dates)
1. Introduces the `SourceNode` class which allows dataflow nodes
representing sources to indicate the threat model they are associated
with.
2. Introduces the `ThreatModelFlowSource` class which represents a
source node which respects the threat model configuration
Or, more generally, any copy step, as these presumably do not preserve
object identity.
(Arguably, `copy` could still be susceptible to interior mutability, but
I think that's outside the scope of this query anyway.)
There are two issues with `deepcopy` here. Firstly, the `deepcopy` function itself
has a mutable default value in its parameter `_nil` (set to the empty list by default).
Now, this value is never actually returned from `deepcopy`, as it is only used as a
sentinel, but our analysis is not clever enough to see this. Thus, it thinks that this
mutable default is returned, and hence the result of any call to `deepcopy` is a
potential source.
To remedy this, I opted to simply exclude all sources that originate from within the
standard library. It is very unlikely for any of the sources in the standard library
to be legit.
Secondly, `deepcopy` -- by virtue of being a function that we model as preserving
values -- admits data-flow through its calls, but this is not correct for the mutable
default query, as it is here the _identity_ of the default value in question that is
important. Thus, we get spurious flow through `deepcopy` for this specific query.
Moves the existing tests into the `ModificationOfParameterWithDefault` subdirectory
which already contained a bunch more tests. In the process, I also removed some
duplicated test cases.
Since some sanitisers don't handle backslashes correctly, I updated the data-flow configuration to incorporate a flow state tracking whether or not backslashes have been eliminated or converted to forward slashes.