Skip to content

Merge clang abstraction serialization into the stable branch #483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

rjmccall
Copy link

No description provided.

rjmccall and others added 30 commits December 17, 2019 00:53
redeclaration checking.  NFC.

llvm-svn: 373406
The primary goal here is to make the type node hierarchy available to
other tblgen backends, although it should also make it easier to generate
more selective x-macros in the future.

Because tblgen doesn't seem to allow backends to preserve the source
order of defs, this is not NFC because it significantly re-orders IDs.
I've fixed the one (fortunately obvious) place where we relied on
the old order.  Unfortunately, I wasn't able to share code with the
existing AST-node x-macro generators because the x-macro schema we use
for types is different in a number of ways.  The main loss is that
subclasses aren't ordered together, which doesn't seem important for
types because the hierarchy is generally very shallow with little
clustering.

llvm-svn: 373407
We currently just look for files named in the modulemap in its
associated source directory.  This means that we can't name
generated files, like TypeNodes.def now is, which means we can't
explicitly mark it as textual.  But fortunately that's okay
because (as I understand it) the most important purpose of naming
the header in the modulemap is to ensure that it's not treated as
public, and the search for public headers also only considers
files in the associated source directory.  This isn't an elegant
solution, since among other things it means that a build which
wrote the generated files directly into the source directory would
result in something that wouldn't build as a module, but that's
a problem for all our other generated files as well.

llvm-svn: 373416
our autogenerated files.  NFC.

As requested by Nico Weber.

llvm-svn: 373425
Using `?` as an optional marker is very useful in Clang's AST-node
emitters because otherwise we need a separate class just to encode
the presence or absence of a base node reference.
This is useful for the property databases we want to add for abstract
serialization, since root classes can have interesting properties.
the tblgen AST node hierarchies.

Not totally NFC because both of the emitters now emit in a different
order.  The type-nodes emitter now visits nodes in hierarchy order,
which means we could use range checks in classof if we had any types
that would benefit from that; currently we do not.  The AST-nodes
emitter now uses a multimap keyed by the name of the record; previously
it was using `Record*`, which of couse isn't stable across processes
and may have led to non-reproducible builds in some circumstances.
There are three significant changes here:

- Most of the methods to read various embedded structures (`APInt`,
  `NestedNameSpecifier`, `DeclarationName`, etc.) have been moved
  from `ASTReader` to `ASTRecordReader`.  This cleans up quite a
  bit of code which was passing around `(F, Record, Idx)` arguments
  everywhere or doing explicit indexing, and it nicely parallels
  how it works on the writer side.  It also sets us up to then move
  most of these methods into the `BasicReader`s that I'm introducing
  as part of abstract serialization.

  As part of this, several of the top-level reader methods (e.g.
  `readTypeRecord`) have been converted to use `ASTRecordReader`
  internally, which is a nice readability improvement.

- I've standardized most of these method names on `readFoo` rather
  than `ReadFoo` (used in some of the helper structures) or `GetFoo`
  (used for some specific types for no apparent reason).

- I've changed a few of these methods to return their result instead
  of reading into an argument passed by reference.  This is partly
  for general consistency and partly because it will make the
  metaprogramming easier with abstract serialization.
The basic technical design here is that we have three levels
of readers and writers:

- At the lowest level, there's a `Basic{Reader,Writer}` that knows
  how to emit the basic structures of the AST.  CRTP allows this to
  be metaprogrammed so that the client only needs to support a handful
  of primitive types (e.g. `uint64_t` and `IdentifierInfo*`) and more
  complicated "inline" structures such as `DeclarationName` can just
  be emitted in terms of those primitives.

  In Clang's binary-serialization code, these are
  `ASTRecord{Reader,Writer}`.  For now, a large number of basic
  structures are still emitted explicitly by code on those classes
  rather than by either TableGen or CRTP metaprogramming, but I
  expect to move more of these over.

- In the middle, there's a `Property{Reader,Writer}` which is
  responsible for processing the properties of a larger object.  The
  object-level reader/writer asks the property-level reader/writer to
  project out a particular property, yielding a basic reader/writer
  which will be used to read/write the property's value, like so:

  ```
    propertyWriter.find("count").writeUInt32(node->getCount());
  ```

  Clang's binary-serialization code ignores this level (it uses
  the basic reader/writer as the property reader/writer and has the
  projection methods just return `*this`) and simply relies on the
  roperties being read/written in a stable order.

- At the highest level, there's an object reader/writer (e.g.
  `Type{Reader,Writer}` which emits a logical object with properties.
  Think of this as writing something like a JSON dictionary literal.

I haven't introduced support for bitcode abbreviations yet --- it
turns out that there aren't any operative abbreviations for types
besides the QualType one --- but I do have some ideas of how they
should work.  At any rate, they'll be necessary in order to handle
statements.

I'm sorry for not disentangling the patches that added basic and type
reader/writers; I made some effort to, but I ran out of energy after
disentangling a number of other patches from the work.

Negligible impact on module size, time to build a set of about 20
fairly large modules, or time to read a few declarations out of them.
On MSVC, friend declarations are (incorrectly) visible even if
not otherwise declared, which causes them to interfere with
lookup.  ASTTypeWriter is actually in an anonymous namespace
and cannot be ASTWriter's friend.  The others simply don't need
to be anymore.
In revision 139006c the Serialization
folder got its first def file 'TypeBitCodes.def'. This broke the
modules build as this .def file was not textually included but implicitly
converted into a module due to our umbrella directive.

This patch fixes this by explicitly marking the .def file as textual.
AbstractBasicReader.h has quite a few dependencies already,
and that's only likely to increase.  Meanwhile, ASTRecordReader
is really an implementation detail of the ASTReader that is only
used in a small number of places.

I've kept it in a public header for the use of projects like Swift
that might want to plug in to Clang's serialization framework.

I've also moved OMPClauseReader into an implementation file,
although it can't be made private because of friendship.
Similar motivations to the movement of ASTRecordReader:
AbstractBasicWriter.h already has quite a few dependencies,
and it's going to get pretty large as we generate more and more
into it.  Meanwhile, most clients don't depend on this detail of
the implementation and shouldn't need to be recompiled.

I've also made OMPClauseWriter private, like it belongs.
This error was originally added a while(7 years) ago when
including multiple files was basically always an error. Tablegen
now has preprocessor support, which allows for building nice
c/c++ style include guards. With the current error being
reported, we unfortunately need to double guard when including
files:

* In user of MyFile.td

 #ifndef MYFILE_TD
 include MyFile.td
 #endif

* In MyFile.td

 #ifndef MYFILE_TD
 #define MYFILE_TD
 ...
 #endif

Differential Revision: https://reviews.llvm.org/D70410
I'm going to introduce some uses of the property read/write methods.
This patch doesn't actually use this serialization for anything,
but follow-ups will move the current handling of various standard
types over to this.
properties from a value.

This is useful when the properties of a case are actually
read out of a specific structure, as with TemplateName.
directly to {read,write}UInt32.

This will be useful for textual formats.  NFC.
dependent template names.

Apparently we didn't test this in the test suite because we have
a lot of redundant ways of representing this situation that kick
in in the more common situations.  For example, DependentTST stores
a qualifier + identifier pair rather than a TemplateName.
@rjmccall
Copy link
Author

@swift-ci Please test.

@rjmccall rjmccall merged commit e1ad4d1 into swiftlang:apple/stable/20190619 Dec 17, 2019
@rjmccall rjmccall deleted the abstraction-serialization-apple-stable branch December 17, 2019 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants