* visiting now happens in a later stage than fetching labels. While
fetching a list of entities to be visited is created, and then acted
upon in actual extraction. This partially flattens the recursive
nature of `fetchLabel` into a loop inside `SwiftVisitor::extract`.
Recursion in `fetchLabel` will only happen on labels fetched while
naming an entity (calling into `SwiftMangler`).
* The choice whether to name a declaration or type has been moved from
the translators to `SwiftMangler`. Acting on this choice is contained
in `SwiftDispatcher::createLabel`.
* The choice whether to emit a body of a declaration has been moved from
`DeclTranslator` to the dispatcher. This choice is also contained in
`SwiftDispatcher::createLabel`.
* The simple functionality of the `LabelStore` has been moved to the
`SwiftDispatcher` as well.
It's not much cleaner due to arithmetic to convert truncating division to a ceiling, but has two advantages:
1. It doesn't suffer from rounding issues with large TRAP labels. This is largely theoretical, but does let us handle `undefined` uniformly.
2. It should be much faster (using LZCNT/BSR instead of floating point arithmetic). This is probably not a performance bottleneck, so *shrug*.
The information that was contained in `schema.yml` is now in
`swift/schema.py`, which allows a more integrated IDE experience
for writing and navigating it.
Another minor change is that `schema.Class` now has a `str` `group`
field instead of a `pathlib.Path` `dir` one.
Firstly, this change reworks how inter-process races are resolved.
Moreover some responsability reorganization has led to merging
`TrapArena` and `TrapOutput` again into a `TrapDomain` class.
A `TargetFile` class is introduced, that is successfully created
only for the first process that starts processing a given trap output
file. From then on `TargetFile` simply wraps around `<<` stream
operations, dumping them to a temporary file. When `TargetFile::commit`
is called, the temporary file is moved on to the actual target trap
file.
Processes that lose the race can now just ignore the unneeded
extraction and go on, while previously all processes would carry out
all extractions overwriting each other at the end.
Some of the file system logic contained in `SwiftExtractor.cpp` has been
moved to this class, and two TODOs are solved:
* introducing a better inter process file collision avoidance strategy
* better error handling for trap output operations: if unable to write
to the trap file (or carry out other basic file operations), we just
abort.
The changes to `ExprVisitor` and `StmtVisitor` are due to wanting to
hide the raw `TrapDomain::createLabel` from them, and bring more
funcionality under the generic caching/dispatching mechanism.
Now `TypeRepr` is a final class in the AST, which is more or less just
a type with a location in code.
As the frontend does not provide a direct way to get a type from a
type representation, this information must be provided when fetching
the label of a type repr.
This meant:
* removing the type repr field from `EnumIsCaseExpr`: this is a virtual
AST node introduced in place of some kinds of `IsEpxr`. The type
repr is still available from the `ConditionalCheckedCastExpr` wrapped
by this virtual node, and we will rebuild the original `IsExpr` with
the IPA layer.
* some logic to get the type of keypath roots has been added to
`KeyPathExpr`. This was done to keep the `TypeRepr` to `Type` relation
total in the DB, but goes against the design of a dumb extractor. The
logic could be moved to QL in the future
* in the control flow library, `TypeRepr` children are now ignored. As
far as I can tell, there is no runtime evaluation going on in
`TypeRepr`s, so it does not make much sense to have control flow
through them.
As `ASTMangler` crashes when called on `ModuleDecl`, we simply use
its name.
This might probably not work reliably in a scenario where multiple
modules are compiled with the same name (like `main`), but this is left
for future work. At the moment this cannot create DB inconsistencies.
Visitor code has been split between header and sources to speed up
incremental build. Moreover the code was reorganized using a new `infra`
bazel package (and `visitors` got promoted to a bazel package as well).
This adds:
* a base `README.md` file to `codegen`
* module docstrings for the modules in `generators`
* help strings on all command line flags
Moreover some unneeded command line flags (`--namespace`,
`--include-dir` and `--trap-affix`) have been dropped.
Python code was simplified, and now a `--generate` option can be used
to drive what can be generated.
The extractor pack creation now will use an internally generated
dbscheme. This should be the same as the checked in one, but doing so
allows `bazel run create-extractor-pack` and `bazel run codegen` to be
run independently from one another, while previously the former had to
follow the latter in case of a schema change. This is the change that
triggered the above simplification, as in order for the two dbscheme
files to be identical, the first `// generated` line had to state the
same generator script.
That class was meant to allow aggregate initialization of generated
C++ entries having the label `id` as first argument.
As aggregate initialization turned out to be undesirable (names of
fields are not explicit, and `{}` must be inserted for empty
superclasses), this commit removes it and disallows aggregate
initialization altogether by defining empty constructors for generated
classes.
This allows to avoid bypassing label type correcness in the extractor,
and allows to independently resolve TBD extractions, as with this
approach TBD nodes do have the correctly typed trap label. The TBD
status is now a predicate on the QL side.
This requires:
* a default visit using the correct type, which is achieved via macro
metaprogramming in `VisitorBase.h`, following the way
`swift::ASTVisitor` is programmed
* a mapping from labels to corresponding binding trap entries. The
functor is defined in `TrapTagTraits.h` and instantiated in generated
`TrapEntries.h`
* Binding trap entries for TBD unknown entities must not have any other
field than the `id` (after all, we are supposed to not extract them
yet). This is why all unextracted fields in `schema.yml` have been
commented out, and will be uncommentend when visitors are added
This is solving a papercut, where the C++ build was relying on the
local dbscheme file to be up-to-date, even if all the information for
building is actually in `schema.yml`. This made a pure C++ development
cycle with changes to `schema.yml` clumsy, as it required a further
dbscheme generation step.
Now for C++ the dbscheme is generated internally in the build files, and
thus a change in `schema.yml` is reflected immediately in the C++ build.
A `swift/codegen` step for checked in generated code (including the
dbscheme) is still required, but a developer can do it just before
running QL tests or committing, instead of during each C++
recompilation.
Some directory reorganization was also carried out, moving specific
generator modules to a new `generators` python package, and only leaving
the two drivers at the top level.
This allows to avoid bypassing label type correcness in the extractor,
and allows to independently resolve TBD extractions, as with this
approach TBD nodes do have the correctly typed trap label. The TBD
status is now a predicate on the QL side.
This requires:
* a default visit using the correct type, which is achieved via macro
metaprogramming in `VisitorBase.h`, following the way
`swift::ASTVisitor` is programmed
* a mapping from labels to corresponding binding trap entries. The
functor is defined in `TrapTagTraits.h` and instantiated in generated
`TrapEntries.h`
* Binding trap entries for TBD unknown entities must not have any other
field than the `id` (after all, we are supposed to not extract them
yet). This is why all unextracted fields in `schema.yml` have been
commented out, and will be uncommentend when visitors are added
This turned out easier than expected previously. `llvm::PointerUnion`
was also considered, which would have less memory footprint, but it
would require more effort as it is lacking the same implicit conversions
and operators that `std::variant` provides.
Also renamed `ToTag<E>` to `TrapTagOf<E>` and introduced a derived
convenience functor `TrapLabelOf<E>`.