Such that we can reuse the existing modeling, but have it globally
applied as a threat-model as well.
I Basically just moved the modeling. One important aspect is that this
changes is that the previously query-specific `argsParseStep` is now a
globally applied taint-step. This seems reasonable, if someone applied
the argument parsing to any user-controlled string, it seems correct to
propagate that taint for _any_ query.
Integration with RemoteFlowSource is not straightforward, so postponing
that for later
Naming in other languages:
- `SourceNode` (for QL only modeling)
- `ThreatModelFlowSource` (for active sources from QL or data-extensions)
However, since we use `LocalSourceNode` in Python, and `SourceNode` in
JS (for local source nodes), it seems a bit confusing to follow the same
naming convention as other languages, and instead I came up with new names.
A number of CPP library tests contain `// query-type: graph`
annotations that make the test driver compare the output
from the test query in a special mode. (This feature is
not used by other languages).
It's somewhat awkward in the implementation of `codeql test run`
that this annotation is not an ordinary item of query metadata --
essentially it means that _every_ test query has to be opened
and read an extra time to look for this annotation. I'd like
to move towards using ordinary query metadata for this, since
the QL compiler already parses it anyway.
For the time being, give the annotation in both old and new
syntaxes, until a CLI that recognizes both has been released.
On Windows, make's path resolution algorithm is incorrect.
It picks up a bazel.exe in PATH that's _after_ a bazel binary.
In particular, on actions, the non-exe binary is a bazelisk
instance, whereas bazel.exe is a bazel (at the current time 7.3.2)
installation.
This means we pick up the wrong bazel version, and
if the differences between the bazel we want and that we actually
get are too big, the build fails.
This caused a dataset check error on the `python/cpython` database, as
we had a `DictUnpacking` node whose parent was not a `dict_item_list`,
but rather an `expr_list`.
Investigating a bit further revealed that this was because in a
construction like
```python
class C[T](base, foo=bar, **kwargs): ...
```
we were mistakenly adding `**kwargs` to the same list as `base` (which
is just a list of expressions), rather than the same list as `foo=bar`
(which is a list of dictionary items)
The ultimate cause of this was the use of `! name` in `python.tsg` to
distinguish between bases and keyword arguments (only the latter of
which have the `name` field). Because `dictionary_splat` doesn't have a
`name` field either, these were mistakenly put in the wrong list,
leading to the error.
Also, because our previous test of `class` statements did not include a
`**kwargs` construction, we were not checking that the new parser
behaved correctly in this case. For the most part this was not a
problem, but on files that use syntax not supported by the old parser
(like type parameters on classes), this became an issue. This is also
why we did not see this error previously.
To fix this, we added `! value` (which is a field present on
`dictionary_splat` nodes) as a secondary filter, and added a third
stanza to handle `dictionary_splat` nodes.