The example code is just copied from command injection tests, that is not too
important. The important part is that `jumpStep` says there is flow from the
import of `os` to `app.route()` :O
\## TaintFlow sources
The class `RemoteFlowSource` is very similarly defined as the other languages [C++](ac22e7950c/cpp/ql/src/semmle/code/cpp/security/FlowSources.qll), [Java](6de612a566/java/ql/src/semmle/code/java/dataflow/FlowSources.qll), [C#](fddbce0b7b/csharp/ql/src/semmle/code/csharp/security/dataflow/flowsources/Remote.qll), [JS](78334af354/javascript/ql/src/semmle/javascript/security/dataflow/RemoteFlowSources.qll), and [Go](24b3133e0c/ql/src/semmle/go/security/FlowSources.qll). There are some minor differences:
- Java/C++ defines the class in `FlowSources.qll`
- C# uses `csharp/ql/src/semmle/code/csharp/security/dataflow/flowsources/Remote.qll`, and provide `StoredFlowSource` and `LocalFlowSource` in separate classes.
- JS uses `RemoteFlowSources.qll`.
- JS defines additional predicate `RemoteFlowSource.isUserControlledObject`
- Go uses the class name `UntrustedFlowSource`, but still defined in `ql/src/semmle/go/security/FlowSources.qll`
- Go uses the `::Range` pattern to allow both extensibility and refinement
The big difference is how a RemoteFlowSource is specified:
- Java and C# have all subclasses of `RemoteFlowSource` defined in the same file
- Go and JS defines subclasses for frameworks in the actual framework `.qll` file, and all frameworks are transitively imported by `import go` or `import javascript` (so subclasses are always in scope).
- C++ uses class `RemoteFlowFunction` to do all the heavy lifting (and its subclasses are transitively imported).
\### What we will do
Use file `RemoteFlowSource.qll`, define subclasses in framework library classes.
_Why? Personally I really like it, Go/JS is already doing it, and Tom expressed a preference for doing the same for C# (although that is not what they are doing today)._
Jonas gave this advice:
> Whether you split the definitions between multiple files or keep them all in one file, the property you want is that all definitions are included when the abstract class is included. Otherwise you can get unexpected results via transitive includes.
We will make imports of all frameworks in the same file that defines `RemoteFlowSource`, as it seems to be the least intrusive change. If that turns out to be a problem, we can also move them to `python.qll` (the other way is not so easy).
\## TaintFlow sinks
[JS](473787a426/javascript/ql/src/semmle/javascript/Concepts.qll) and [Go](ecff1e6a16/ql/src/semmle/go/Concepts.qll) defines abstract base classes for interesting sinks in `Concepts.qll` (and all uses the `::Range` pattern in Go).
I really like this idea, since it allows multiple queries to reuse the same sink definitions, and it makes it _easy_ to discover what default sinks are available.
Personally I'm not 100% on board with the naming, but I don't have any good reason to change the naming convention.
\## Framework modeling
Following the model from Go ([example](https://github.com/github/codeql-go/blob/main/ql/src/semmle/go/frameworks/Gin.qll)), I propose that we make every definition in a framework modeling `private`. This allows some greater flexibility in changing our modeling, since we don't need to think about keeping deprecated versions around for a whole year.
It _does_ have the downside that someone writing a query can't reuse the classes/predicates for a framework, but it didn't seem to be too big of a concern. If we need to provide access, we can always make the definitions non-private (the other way is not so easy).
\## Customizations
Also introduced `Customizations.qll` like in JS/Java/Go (to replace `site.qll`)
Nodes to which we track type tracking flow from the source (any
identifier named `tracked`) are indicated with a `$tracked` tag, and
`$tracked=attr_name` if the attribute is for the specified attribute
of the given node.
For nodes that do have flow from `tracked`, I indicate this in one of
two ways:
- If it's expected due to the design of type tracking, I omit the
`$tracked tag.
- If it's flow that _ought_ to be there, I indicate it as a false
negative: `$f-:tracked`
Currently, only an instance of global flow is in the latter category.