More in general, the managed renderer flow does things more sensibly
in case an exception is thrown:
* it will not remove any file
* it will drop already written files from the registry, so that codegen
won't be skipped for those files during the next run
This was causing hardly debuggable errors because names are transformed
to underscored lowercase names in the dbscheme and back to camelcase
for trap emission classes, which is not a noop in case uppercase
acronyms (like SIL or ABI) are in the name.
This makes the error be surfaced early with a helpful message.
This is a developer QoL improvement, where running codegen will skip
writing (and especially formatting) any files that were not changed.
**Why?** While code generation in itself was pretty much instant, QL
formatting of generated code was starting to take a long time. This made
unconditionally running codegen quite annoying, for example before each
test run as part of an IDE workflow or as part of the pre-commit hook.
**How?** This was not completely straightforward as we could not work
with the contents of the file prior to code generation as that was
already post-processed by the QL formatting, so we had no chance of
comparing the output of template rendering with that. We therefore store
the hashes of the files _prior_ to QL formatting in a checked-in file
(`swift/ql/.generated.list`). We can therefore load those hashes at
the beginning of code generation, use them to compare the template
rendering output and update them in this special registry file.
**What else?** We also extend this mechanism to detect accidental
modification of generated files in a more robust way. Before this patch,
we were doing it with a rough regexp based heuristic. Now, we just store
the hashes of the files _after_ QL formatting in the same checked file,
so we can check that and stop generation if a generated file was
modified, or a stub was modified without removing the `// generated`
header.
In those kinds of tests the results may have different final classes
that are not necessarily visible (or tested) solely through the string
representation. For better testing and reading of expected results,
`getQlPrimaryClasses` is added in these cases.
The information that was contained in `schema.yml` is now in
`swift/schema.py`, which allows a more integrated IDE experience
for writing and navigating it.
Another minor change is that `schema.Class` now has a `str` `group`
field instead of a `pathlib.Path` `dir` one.
This makes the result of code generation independent of the order
in which classes are defined in the schema, and makes additional
topological sorting not required.
Being independent from schema order will be important for reviewing the
move to a pure python schema, as generated code will be left untouched.
There was an issue in case multiple inheritance from classes with
children was involved, where indexes would overlap.
The generated code structure has been reshuffled a bit, with
`Impl::getImmediateChildOf<Class>` predicates giving 0-based children
for a given class, including those coming from bases, and the final
`Impl::getImmediateChild` disjuncting the above on final classes only.
This removes the need of `getMaximumChildrenIndex<Class>`, and also
removes the code scanning alerts.
Also, comments were fixed addressing the review.
This replaces numeric tag-based prefixes with the actual tag name.
While this means in general slightly larger trap files, it aids
debugging them for a human.
In the future we can make this conditional on some kind of trap debug
option, but for the moment it does not seem detrimental.
Some fields of base classes pose some problems with diamond hierarchies,
and we don't use them any way as we are emitting them using directly
trap entries instead of structured C++ classes.
This introduces a `cpp_skip` pragma to skip generation of those fields
in structured generated C++ classes, and applies it to `is_unknown` and
`location`.
It turns out the threshold of 5 lines for stub modification detection
was too strict: in case of a long class name the QL formatter will put
the closing brace of the empty class definition on a new line, leading
to codegen fail with an error thinking the stub was modified.
On the other side of things, also adding a base to a stub class was not
being detected as a modification.
Now the modification test is slightly smarter. If the stub still marked
as generated and
* has more than 6 lines, or
* the contents does not match a regexp aproximation of a plain stub
then codegen will abort. The test will still avoid reading the whole
contents of all the stubs.
If one modifies a QL stub but forgets to remove the `// generated`
header comment, codegen will now abort with an error rather than
silently reverting the change.
This is based on the rough heuristic of just counting the lines. If any
change is done to the stub class, the number of lines is bound to be
5 or more.