Unsurprisingly, the only thing affected by this was the `import-helper`
tests. These have lost all of the results relating to `ImportMember`s,
but apart from that the underlying behaviour should be the same.
I also limited the test to only `CfgNode`s, as a bunch of `EssaNode`s
suddenly appeared when I switched to API graphs.
Finally, I used `API::moduleImport` with a dotted name in the type
tracking tests. This goes against the API graphs interface, but I think
it's more correct for this use case, as these type trackers are doing
the "module attribute lookup" bit manually.
Moves the current test out of `test.py`, as otherwise any unknown global
(like, say, `sink`) would _also_ be considered to be something
potentially defined in `unknown`.
Splits `ModuleVariableNode` away from `LocalSourceNode`, instead
creating a class `TypeTrackingNode` that encapsulates both of these.
This means we no longer have module variable nodes as part of
`LocalSourceNode` (which is good, since they have no "local" aspect to
them), and hence we can have `LocalSourceNode` inherit directly from
`ExprNode` (which makes the API a bit nicer).
Unfortunately these are breaking changes, so we can't actually fulfil
the above two desiderata until the `track` and `backtrack` methods on
`LocalSourceNode` have been fully deprecated. For this reason, we
preserve the present implementation of `LocalSourceNode`, and instead
lay the foundation for switching over in the future, by deprecating
`track` and `backtrack` on `LocalSourceNode`.
along with tests, but no implementations (to ease reviewing).
---
I've put quite some thinking into what to call our concept for this.
[JS has `CookieDefinition`](581f4ed757/javascript/ql/src/semmle/javascript/frameworks/HTTP.qll (L148-L187)), but I couldn't find a matching concept in any other languages.
We used to call this [`CookieSet`](f07a7bf8cf/python/ql/src/semmle/python/web/Http.qll (L76)) (and had a corresponding `CookieGet`).
But for headers, [Go calls this `HeaderWrite`](cd1e14ed09/ql/src/semmle/go/concepts/HTTP.qll (L97-L131)) and [JS calls this `HeaderDefinition`](581f4ed757/javascript/ql/src/semmle/javascript/frameworks/HTTP.qll (L23-L46))
I think it would be really cool if we have a naming scheme that means the name for getting the value of a header on a incoming request is obvious. I think `HeaderWrite`/`HeaderRead` fulfils this best. We could go with `HeaderSet`/`HeaderGet`, but they feel a bit too vague to me. For me, I'm so used to talking about def-use, that I would immediately go for `HeaderDefinition` and `HeaderUse`, which could work, but is kinda strange.
So in the end that means I went with `CookieWrite`, since that allows using a consistent naming scheme for the future :)
I made `FileSystemWriteAccess` be a subclass of `FileSystemAccess` (like in [JS](64001cc02c/javascript/ql/src/semmle/javascript/Concepts.qll (L68-L74))), but then I started wondering about how I could give a good result for `getAPathArgument`, and what would a good result even be? The argument to the `open` call, or the object that the `write` method is called on? I can't see how doing either of these enables us to do anything useful...
So I looked closer at how JS uses `FileSystemWriteAccess`:
1. as sink for zip-slip: 7c51dff0f7/javascript/ql/src/semmle/javascript/security/dataflow/ZipSlipCustomizations.qll (L121)
2. as sink for downloading unsafe files (identified through their extension) through non-secure connections: 89ef6ea4eb/javascript/ql/src/semmle/javascript/security/dataflow/InsecureDownloadCustomizations.qll (L134-L150)
3. as sink for writing untrusted data to a local file 93b1e59d62/javascript/ql/src/semmle/javascript/security/dataflow/HttpToFileAccessCustomizations.qll (L43-L46)
for the 2 first sinks, it's important that `getAPathArgument` has a proper result... so that solves the problem, and highlights that it _can_ be important to give proper results for `getAPathArgument` (if possible).
So I'm trying to do best effort for `f = open(...); f.write(...)`, but with this current code we won't always be able to give a result (as highlighted by the tests). It will also be the case that there are multiple `FileSystemAccess` with the same path-argument, which could be a little strange.
overall, I'm not super confident about the way this new concept and implementation turned out, but it also seems like the best I could come up with right now...
The obvious alternative solution is to NOT make `FileSystemWriteAccess` a subclass of `FileSystemAccess`, but I'm not very tempted to go down this path, given the examples of this being useful above, and just the general notion that we should be able to model writes as being a specialized kind of `FileSystemAccess`.