SymbolGraph ExtractAPI support for C and Objective-C in clang #4442
This is the initial commit for the clang-extract-api RFC <https://lists.llvm.org/pipermail/cfe-dev/2021-September/068768.html>. Add a new driver option `-extract-api` and associate it with a dummy (for now) frontend action to set up the initial structure for incremental work. Differential Revision: https://reviews.llvm.org/D117809
Fix a build failure where an unused private field in ExtractAPIVisitor triggered a warning that was turned into an error.
Add facilities for extract-api:
- Structs/classes to hold collected API information: `APIRecord`, `API`
- Structs/classes for API information:
  - `AvailabilityInfo`: aggregated availability information
  - `DeclarationFragments`: declaration fragments
  - `DeclarationFragmentsBuilder`: helper class to build declaration fragments for various types/declarations
  - `FunctionSignature`: function signature
- Serialization: `Serializer`
- Add output file for `ExtractAPIAction`
- Refactor `clang::RawComment::getFormattedText` to provide an additional `getFormattedLines` for a more detailed view of comment lines used for the SymbolGraph format

Add support for global records (global variables and functions):
- Add `GlobalRecord` based on `APIRecord` to store global records' information
- Implement `VisitVarDecl` and `VisitFunctionDecl` in `ExtractAPIVisitor` to collect information
- Implement serialization for global records
- Add test case for global records

Differential Revision: https://reviews.llvm.org/D119479
The clang/SymbolGraph/global_record.c test case explicitly diffs the clang version in use, which causes failures. Fix the issue by normalizing the `generator` field before checking the output.
Implements an APISet specific unique ptr type that has a custom deleter that just calls the underlying APIRecord subclass destructor.
clang -extract-api should accept multiple headers and forward them to a single CC1 instance. This change introduces a new ExtractAPIJobAction. Currently API Extraction is done during the Precompile phase as this is the current phase that matches the requirements the most. Adding a new phase would need to change some logic in how phases are scheduled. If the headers scheduled for API extraction are of different types the driver emits a diagnostic. Differential Revision: https://reviews.llvm.org/D121936
- The name SymbolGraph is inappropriate and confusing for the new library for clang-extract-api. Refactor and rename things to make it clear that ExtractAPI is the core functionality and SymbolGraph is one serializer for the API information.
- Add documentation comments to ExtractAPI classes and methods to improve readability and clarity of the ExtractAPI work.

Differential Revision: https://reviews.llvm.org/D122160
…uage Change the Symbol Graph serializer for ExtractAPI to use `objective-c` for the language name string for Objective-C, to align with clang frontend standards.
Adds a `--product-name=` flag to the clang driver. It is forwarded to cc1 only when performing an ExtractAPI action, and is used to populate the `name` field of the module object in the generated SymbolGraph. Differential Revision: https://reviews.llvm.org/D122141
Add support for enum records:
- Add `EnumConstantRecord` and `EnumRecord` to store API information for enums
- Implement `VisitEnumDecl` in `ExtractAPIVisitor`
- Implement serialization for enum records and the `MemberOf` relationship
- Add test case for enum records
- A few other improvements

Depends on D122160
Differential Revision: https://reviews.llvm.org/D121873
Add support for struct records:
- Add `StructFieldRecord` and `StructRecord` to store API information for structs
- Implement `VisitRecordDecl` in `ExtractAPIVisitor`
- Implement Symbol Graph serialization for struct records
- Add test case for struct records

Depends on D121873
Differential Revision: https://reviews.llvm.org/D122202
Before actually executing the ExtractAPIAction, clear the CompilationInstance's input list and replace it with a single synthesized file that just includes (or imports in ObjC) all the inputs. Depends on D122141 Differential Revision: https://reviews.llvm.org/D122175
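A minimal sketch of what such a synthesized input could look like. The helper name `synthesizeInputBuffer` and its parameters are hypothetical and are not taken from D122175; the only assumption is that each input header becomes one `#include` line, or one `#import` line for Objective-C.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not the code from D122175: builds the text of a single
// synthesized input that pulls in every header passed on the command line.
std::string synthesizeInputBuffer(const std::vector<std::string> &Headers,
                                  bool IsObjC) {
  std::string Buffer;
  for (const std::string &Header : Headers) {
    // Objective-C inputs are imported; C inputs are included.
    Buffer += IsObjC ? "#import \"" : "#include \"";
    Buffer += Header;
    Buffer += "\"\n";
  }
  return Buffer;
}
```

Feeding one buffer to the frontend instead of N separate inputs means the headers are parsed once, in command-line order, inside a single translation unit.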
Using a BumpPtrAllocator introduced memory leaks for APIRecords as they contain a std::vector. This meant that we needed to always keep a reference to the records in APISet and arrange for their destructor to get called appropriately. This was further complicated by the need for records to own sub-records, as these sub-records would still need to be allocated via the BumpPtrAllocator and the owning record would now need to arrange for the destructor of its sub-records to be called appropriately. Since APIRecords contain a std::vector, there is an associated heap allocation whenever elements get added to it, regardless of how the record itself is allocated. Since performance isn't currently our main priority, it makes sense to use a regular unique_ptr to keep track of APIRecords; this way we don't need to arrange for destructors to get called. The BumpPtrAllocator is still used for strings such as USRs so that we can easily de-duplicate them as necessary. Differential Revision: https://reviews.llvm.org/D122331
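The ownership model described here can be illustrated with a small sketch. `Record`, `RecordSet` and `liveRecords` are hypothetical stand-ins, not the real `APIRecord` and `APISet` classes; the sketch only shows that holding records by `std::unique_ptr` runs their destructors (and so frees the `std::vector` members) without any extra bookkeeping.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Counts live records so the effect of ownership is observable.
inline int &liveRecords() {
  static int Count = 0;
  return Count;
}

// Hypothetical stand-in for an API record with a heap-backed member.
struct Record {
  std::string Name;
  std::vector<std::string> Fragments; // needs a destructor call to be freed
  explicit Record(std::string N) : Name(std::move(N)) { ++liveRecords(); }
  Record(const Record &) = delete;
  Record &operator=(const Record &) = delete;
  virtual ~Record() { --liveRecords(); }
};

// Hypothetical stand-in for the set of records: it owns each record through a
// unique_ptr, so destroying the set destroys every record.
struct RecordSet {
  std::vector<std::unique_ptr<Record>> Records;
  Record &add(std::string Name) {
    Records.push_back(std::make_unique<Record>(std::move(Name)));
    return *Records.back();
  }
};
```

When a `RecordSet` goes out of scope, `liveRecords()` drops back to zero, which is the property the bump-allocated version had to arrange by hand.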
Add missing virtual method anchors for structs in ExtractAPI/API.h
Rename a local variable name to avoid potential ambiguity/conflict for some compilers.
The current way of getting the `clang::Language` from `LangOptions` does not handle Objective-C correctly because `clang::Language::ObjC` does not correspond to any `LangStandard`. This patch passes the correct `Language` from the frontend input information. Differential Revision: https://reviews.llvm.org/D122495
Add support for Objective-C interface declarations in ExtractAPI. Depends on D122495 Differential Revision: https://reviews.llvm.org/D122446
Add support for Objective-C protocol declarations in ExtractAPI. Depends on D122446 Differential Revision: https://reviews.llvm.org/D122511
Make the API records a property of the action instead of the ASTVisitor so that it can be accessed outside the AST visitation and push back serialization to the end of the frontend action. This will allow accessing and modifying the API records outside of the ASTVisitor, which is a prerequisite for supporting macros.
To achieve this we hook into the preprocessor during the ExtractAPIAction and record definitions for macros that don't get undefined during preprocessing.
…POpts This was triggering some build failures so removing this change for now.
Add struct level documentation for MacroDefinitionRecord. Differential Revision: https://reviews.llvm.org/D122798
This fixes a crash that occurred when undefining a macro that had not previously been defined. Before trying to remove a definition from PendingMacros, we first check whether the macro did indeed have a previous definition. Differential Revision: https://reviews.llvm.org/D123056
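The guard described above can be sketched as follows. `MacroTracker` is a hypothetical class built on a plain `std::unordered_map`; it is not the preprocessor callback from the patch, and only the name `PendingMacros` comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical tracker: remembers macro definitions that are still live.
class MacroTracker {
  std::unordered_map<std::string, std::string> PendingMacros;

public:
  void define(const std::string &Name, const std::string &Body) {
    PendingMacros[Name] = Body;
  }

  // Returns true if a pending definition was removed. An #undef of a name
  // that was never defined is ignored instead of being treated as an error.
  bool undefine(const std::string &Name) {
    auto It = PendingMacros.find(Name);
    if (It == PendingMacros.end())
      return false; // no previous definition: nothing to remove
    PendingMacros.erase(It);
    return true;
  }

  std::size_t size() const { return PendingMacros.size(); }
};
```

Whatever remains in the map at the end of preprocessing corresponds to the macros that were never undefined, which are the ones worth recording.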
This includes:
- replacing "relationhips" with "relationships"
- emitting the "pathComponents" property on symbols
- emitting the "accessLevel" property on symbols

Differential Revision: https://reviews.llvm.org/D123045
Typedef records consist of the symbol associated with the underlying TypedefDecl and a SymbolReference to the underlying type. Additionally, typedefs for anonymous TagTypes use the typedef'd name as the symbol name in their respective records and USRs. As a result, the declaration fragments for the anonymous TagType are those for the associated typedef. This means that when the user defines a typedef to a typedef to an anonymous type, we use a reference to the anonymous TagType itself and do not emit the typedef to the anonymous type in the generated symbol graph, including in the type destination of further typedef symbol records. Differential Revision: https://reviews.llvm.org/D123019
Add (partial) support for Objective-C category records in ExtractAPI. ExtractAPI currently collects everything for an Objective-C category, but the SymbolGraphSerializer does not fully serialize it. Categories extending external interfaces are discarded during serialization, and categories extending known interfaces are merged (all members surfaced) into the interfaces. Differential Revision: https://reviews.llvm.org/D122774
There is a bug in `DeclarationFragments::appendSpace` where the space character is added to a local copy of the last fragment. Differential Revision: https://reviews.llvm.org/D123259
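This class of bug is easy to reproduce in isolation. The sketch below uses hypothetical `Fragment` and `Fragments` types rather than the real `DeclarationFragments` class, and the buggy variant is an assumption about the shape of the mistake based on the description above: the last element is copied, so the appended space never reaches the stored fragment.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for a declaration fragment and its container.
struct Fragment {
  std::string Spelling;
};

struct Fragments {
  std::vector<Fragment> Parts;

  // Buggy shape: `Last` is a copy, so the stored fragment is unchanged.
  void appendSpaceBuggy() {
    if (Parts.empty())
      return;
    Fragment Last = Parts.back();
    Last.Spelling.push_back(' ');
  }

  // Fixed shape: bind a reference so the edit lands in the container.
  void appendSpace() {
    if (Parts.empty())
      return;
    Fragment &Last = Parts.back();
    Last.Spelling.push_back(' ');
  }
};
```

The only difference between the two member functions is the `&` on the local variable.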
- Split GlobalRecord into two distinct types to be able to introduce a has_function_signature type trait.
- Add the has_function_signature type trait.
- Serialize function signatures as part of serializeAPIRecord for records that are known to have a function signature.

Differential Revision: https://reviews.llvm.org/D123304
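A minimal sketch of how such a trait can drive serialization at compile time. The record types, the `serialize` function and the output format are hypothetical and much simpler than the real ExtractAPI classes; only the trait name `has_function_signature` comes from the commit message, and defining it by explicit specialization is an assumption.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical, simplified record types.
struct FunctionSignature {
  std::string ReturnType;
};
struct GlobalVariableRecord {
  std::string Name;
};
struct GlobalFunctionRecord {
  std::string Name;
  FunctionSignature Signature;
};

// Trait: false by default, true only for records that carry a signature.
template <typename RecordTy>
struct has_function_signature : std::false_type {};
template <>
struct has_function_signature<GlobalFunctionRecord> : std::true_type {};

// Tag dispatch: the Signature member is only touched in the true_type overload.
template <typename RecordTy>
std::string serializeSignature(const RecordTy &Record, std::true_type) {
  return Record.Signature.ReturnType;
}
template <typename RecordTy>
std::string serializeSignature(const RecordTy &, std::false_type) {
  return "";
}

// Shared serialization path for every record type.
template <typename RecordTy>
std::string serialize(const RecordTy &Record) {
  std::string Out = Record.Name;
  std::string Sig =
      serializeSignature(Record, has_function_signature<RecordTy>{});
  if (!Sig.empty())
    Out += " -> " + Sig;
  return Out;
}
```

Splitting the record into two types is what makes this possible: a single type with an optional signature could only be checked at run time.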
Fix path replacement in sed (properly this time) using lit regex_replacement. Differential Revision: https://reviews.llvm.org/D123526 Co-authored-by: Michele Scandale <[email protected]> Co-authored-by: Zixu Wang <[email protected]>
Anonymous enums without a typedef should have a "(anonymous)" identifier. Differential Revision: https://reviews.llvm.org/D123533
Fix one test (enum.c) in ExtractAPI to use %clang_cc1 and -verify instead of calling the full driver and FileCheck. This is an example for my comment from https://reviews.llvm.org/D121873. Differential Revision: https://reviews.llvm.org/D124634
This patch transforms the given input headers to relative include names using header search entries and some heuristics. For example: `/Path/To/Header.h` will be included as `<Header.h>` with a search path of `-I /Path/To/`; and `/Path/To/Framework.framework/Headers/Header.h` will be included as `<Framework/Header.h>`, given a search path of `-F /Path/To`. Headermaps will also be queried in reverse to find a spelled name to include headers. Differential Revision: https://reviews.llvm.org/D123831
4c262fe accidentally added a local unfinished test case, clang/test/Index/annotate-comments-enum-constant.c. This patch removes it.
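The two examples above can be reproduced with a small sketch. `includeNameFor` is a hypothetical function that handles a single search directory with plain string matching; it does not model header search entries, headermaps, or the other heuristics in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: maps an absolute header path to an angled include name
// relative to one search directory. Returns an empty string if no name can be
// derived. IsFramework selects -F style (framework) instead of -I style lookup.
std::string includeNameFor(const std::string &File, std::string Dir,
                           bool IsFramework) {
  // Treat "/Path/To/" and "/Path/To" the same.
  while (Dir.size() > 1 && Dir.back() == '/')
    Dir.pop_back();
  const std::string Prefix = Dir + "/";
  if (File.compare(0, Prefix.size(), Prefix) != 0)
    return "";
  const std::string Rest = File.substr(Prefix.size());
  if (Rest.empty())
    return "";
  if (!IsFramework)
    return "<" + Rest + ">";

  // Framework layout: <Name>.framework/Headers/<Sub> becomes <Name/Sub>.
  const std::string Marker = ".framework/Headers/";
  const std::size_t Pos = Rest.find(Marker);
  if (Pos == std::string::npos || Pos == 0)
    return "";
  const std::string Name = Rest.substr(0, Pos);
  const std::string Sub = Rest.substr(Pos + Marker.size());
  if (Name.find('/') != std::string::npos || Sub.empty())
    return "";
  return "<" + Name + "/" + Sub + ">";
}
```

A fuller version would try each search directory in order and fall back to the absolute path when none matches.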
This reverts commit 4c262fe. Revert to fix Msan and Asan errors.
Reapply the change after fixing sanitizer errors. The original problem was that `StringRef`s in `Matches` are pointing to temporary local `std::string`s created by `path::convert_to_slash` in the regex match call. This patch does the conversion up front in container `FilePath`. This reverts commit 2966f0f. Differential Revision: https://reviews.llvm.org/D124964
Differential Revision: https://reviews.llvm.org/D124995
@swift-ci please test
@swift-ci Please Build Toolchain macOS Platform
Failed test: https://github.com/apple/swift/blob/main/test/SourceKit/CursorInfo/cursor_symbol_graph_objc.swift#L243-L278
Pull request swiftlang/llvm-project#4442 brings in a change to `RawComment::getFormattedText` that removes spurious new lines and whitespaces at the end of block comments. It breaks the `cursor_symbol_graph_objc` test which is assuming the old behavior. Temporarily disable the relevant check lines in the test to merge the llvm change, and then fix the test properly and switch to the new `getFormattedLines` in SymbolGraphGen.
You'll want to grab zixu-w/swift@1ed123e on swift
Pull request swiftlang/llvm-project#4442 brings in a change to `RawComment::getFormattedText` that removes spurious new lines and whitespaces at the end of block comments. It breaks the `cursor_symbol_graph_objc` test which is assuming the old behavior. Temporarily disable the relevant check lines in the test to merge the llvm change, and then fix the test properly and switch to the new `getFormattedLines` in SymbolGraphGen. (cherry picked from commit 1ed123e)
@swift-ci please test
@swift-ci please test
This is an isolated change, so I am good with taking it.
Cherry-pick commits for llvm.org's main branch that implement support for Symbol Graph generation using clang for C and Objective-C headers, as mentioned in https://forums.swift.org/t/extending-swift-docc-to-support-objective-c-documentation/53243. Bringing in these changes would benefit the Swift community and facilitate support in SwiftPM and similar tooling.
These changes don’t affect regular compilation and instead introduce a new action to the clang frontend and driver. Most of the changes are self contained in the ExtractAPI library. An example clang invocation for generating a symbol graph would be:
Changes that affect the core compiler infrastructure are constrained to the driver, the frontend actions infrastructure and the AST infrastructure for comment processing.
Changes to the driver consist of creating a new JobAction for the `-extract-api` option. The aim of this is to create a single CC1 invocation that takes all the header files provided on the command line as inputs for further processing by the frontend. These changes can be found in the following files:
In CC1 itself the only new changes that aren’t self-contained are to construct the appropriate new frontend action for the provided headers instead of performing a regular build. These changes can be found in the following files:
In the AST comment processing infrastructure we introduced a new API for getting a list of comment lines with associated source locations to enable correct symbol graph generation. This new interface is only exercised by the ExtractAPI library. This new interface is defined and implemented in the following files:
The rest of the changes are in the ExtractAPI library and implement the new frontend action for Symbol Graph generation. These are fully self-contained and will not affect the rest of the compiler.