This is a developer QoL improvement, where running codegen will skip
writing (and especially formatting) any files that were not changed.
**Why?** While code generation in itself was pretty much instant, QL
formatting of generated code was starting to take a long time. This made
unconditionally running codegen quite annoying, for example before each
test run as part of an IDE workflow or as part of the pre-commit hook.
**How?** This was not completely straightforward as we could not work
with the contents of the file prior to code generation as that was
already post-processed by the QL formatting, so we had no chance of
comparing the output of template rendering with that. We therefore store
the hashes of the files _prior_ to QL formatting in a checked-in file
(`swift/ql/.generated.list`). We can therefore load those hashes at
the beginning of code generation, use them to compare the template
rendering output and update them in this special registry file.
**What else?** We also extend this mechanism to detect accidental
modification of generated files in a more robust way. Before this patch,
we were doing it with a rough regexp based heuristic. Now, we just store
the hashes of the files _after_ QL formatting in the same checked file,
so we can check that and stop generation if a generated file was
modified, or a stub was modified without removing the `// generated`
header.
The information that was contained in `schema.yml` is now in
`swift/schema.py`, which allows a more integrated IDE experience
for writing and navigating it.
Another minor change is that `schema.Class` now has a `str` `group`
field instead of a `pathlib.Path` `dir` one.
This adds:
* a base `README.md` file to `codegen`
* module docstrings for the modules in `generators`
* help strings on all command line flags
Moreover some unneeded command line flags (`--namespace`,
`--include-dir` and `--trap-affix`) have been dropped.
Python code was simplified, and now a `--generate` option can be used
to drive what can be generated.
The extractor pack creation now will use an internally generated
dbscheme. This should be the same as the checked in one, but doing so
allows `bazel run create-extractor-pack` and `bazel run codegen` to be
run independently from one another, while previously the former had to
follow the latter in case of a schema change. This is the change that
triggered the above simplification, as in order for the two dbscheme
files to be identical, the first `// generated` line had to state the
same generator script.
This is solving a papercut, where the C++ build was relying on the
local dbscheme file to be up-to-date, even if all the information for
building is actually in `schema.yml`. This made a pure C++ development
cycle with changes to `schema.yml` clumsy, as it required a further
dbscheme generation step.
Now for C++ the dbscheme is generated internally in the build files, and
thus a change in `schema.yml` is reflected immediately in the C++ build.
A `swift/codegen` step for checked in generated code (including the
dbscheme) is still required, but a developer can do it just before
running QL tests or committing, instead of during each C++
recompilation.
Some directory reorganization was also carried out, moving specific
generator modules to a new `generators` python package, and only leaving
the two drivers at the top level.
This checks in the trapgen script generating trap entries in C++.
The codegen suite has been slightly reorganized, moving the templates
directory up one level and chopping everything into smaller bazel
packages. Running tests is now done via
```
bazel run //swift/codegen/test
```
With respect to the PoC, the nested `codeql::trap` namespace has been
dropped in favour of a `Trap` prefix (or suffix in case of entries)
within the `codeql` namespace. Also, generated C++ code is not checked
in in git any more, and generated during build. Finally, labels get
printed in hex in the trap file.
`TrapLabel` is for the moment only default-constructible, so only one
single label is possible. `TrapArena`, that is responsible for creating
disjoint labels will come in a later commit.
This patch introduces the basic infrastructure of the code generation
suite and the `dbscheme` generator.
Notice that the checked in `schema.yml` should reflect swift 5.6 but
might need some tweaking.
Closes https://github.com/github/codeql-c-team/issues/979