[clang-doc] add a JSON generator #142483

Merged 7 commits on Jun 10, 2025

Conversation

evelez7 (Member) commented Jun 2, 2025

Adds a JSON generator backend to emit mapped information as JSON. This will enable a better testing format for upcoming changes. It can also potentially serve to feed our other backend generators in the future, like Mustache which already serializes information to JSON before emitting as HTML.

This patch contains functionality to emit classes and provides most of the basis of the generator.

evelez7 (Member Author) commented Jun 2, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@evelez7 evelez7 requested review from petrhosek and ilovepi June 2, 2025 20:51
evelez7 (Member Author) commented Jun 2, 2025

I think this patch is mostly ready in terms of functionality, but we should decide if the emitted files should follow the HTML or YAML style of creation. Right now, since I just ripped the Mustache generator code, it creates folders for each namespace and emits the nested entities there. This actually represents a problem for template specializations because the code tries to write another JSON object (the specialization) to the same file as the base template, which is invalid.

With YAML, each entity is emitted in a file that uses the entity's USR as a name. All files are emitted to the output directory without any folders. That conveniently solves the above template problem, but also results in ugly file names that are potentially bad for multi-file testing (right now, we use regex to find a USR file name to test YAML).

I'd say the HTML layout is nice, but then we'd need to decide what to call the specializations' files or whether to just include them as an object inside the base template's file.
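
To make the trade-off between the two layouts concrete, here is a minimal sketch using only the standard library. `EntityInfo`, `nestedPath`, and `flatPath` are hypothetical names chosen for illustration, not clang-doc's API, and the USR is assumed to be hex-encoded already.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for the data a generator needs to pick a file name.
struct EntityInfo {
  std::string Name;                    // e.g. "MyClass"
  std::vector<std::string> Namespaces; // outermost first
  std::string HexUSR;                  // assumed to be hex-encoded already
};

// HTML/Mustache-style layout: one folder per namespace, file named after the
// entity. A base template and its specialization share a Name, so they collide.
fs::path nestedPath(const fs::path &Root, const EntityInfo &I) {
  assert(!I.Name.empty() && "entity must have a name");
  fs::path P = Root;
  for (const std::string &NS : I.Namespaces)
    P /= NS;
  return P / (I.Name + ".json");
}

// YAML-style layout: flat directory, file named after the USR. Unique per
// entity, but the name is opaque to a human reader or a test.
fs::path flatPath(const fs::path &Root, const EntityInfo &I) {
  assert(!I.HexUSR.empty() && "entity must have a USR");
  return Root / (I.HexUSR + ".json");
}
```

With two entities that share a name but differ in USR, `nestedPath` returns the same path for both (the template-specialization problem described above), while `flatPath` keeps them apart.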

@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch 2 times, most recently from cf68952 to fa8b80f Compare June 3, 2025 16:43
ilovepi (Contributor) left a comment

This is a good start, but I think we'll want some more tests, and probably some unit test coverage. Unit testing is especially nice, since I believe this backend doesn't need any of the asset files, right?

As for whether this should follow the pattern of YAML or HTML ... I'm conflicted for the same reasons you are. But maybe there's a better way for us to generate the filename that doesn't need the USR? like could we get the mangled name and use that? I'd hope those would be different enough to not conflict. Alternatively, maybe we should look at a way we can merge all the related docinfo together at an earlier step.

return LocationObj;
SmallString<128> FileURL(*RepositoryUrl);
sys::path::append(FileURL, sys::path::Style::posix, Loc.Filename);
FileURL += "#" + std::to_string(Loc.StartLineNumber);
Contributor:

I think this is a mistake in the mustache impl, since I believe we normally allow the user to set the line numbering schema in the other existing generators.
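
As a rough illustration of what a configurable line-numbering schema could mean for the snippet above, the sketch below takes the line anchor prefix as a parameter instead of hard-coding `"#"`. `buildFileURL` and `LinePrefix` are hypothetical names, not taken from the patch or from clang-doc's options.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: joins a repository URL, a POSIX-style relative file
// path, and a line anchor. LinePrefix is an assumed, caller-supplied setting
// (for example "#" or "#L"), not a name taken from the patch.
std::string buildFileURL(std::string RepositoryUrl, const std::string &Filename,
                         unsigned Line, const std::string &LinePrefix) {
  assert(!RepositoryUrl.empty() && "repository URL must not be empty");
  // Add exactly one separator between the repository root and the file path.
  if (RepositoryUrl.back() != '/')
    RepositoryUrl += '/';
  return RepositoryUrl + Filename + LinePrefix + std::to_string(Line);
}
```

Passing the prefix in lets one code path serve hosts that expect different anchor styles.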

evelez7 (Member Author) commented Jun 3, 2025

This is a good start, but I think we'll want some more tests, and probably some unit test coverage. Unit testing is especially nice, since I believe this backend doesn't need any of the asset files, right?

Yeah, this doesn't need assets, so I'll be able to call things directly for unit tests.

As for whether this should follow the pattern of YAML or HTML ... I'm conflicted for the same reasons you are. But maybe there's a better way for us to generate the filename that doesn't need the USR? like could we get the mangled name and use that? I'd hope those would be different enough to not conflict. Alternatively, maybe we should look at a way we can merge all the related docinfo together at an earlier step.

I'll investigate this. Although right off the bat, the names in the Infos aren't mangled. Name and FullName are the same for both records, which is unfortunate.

ilovepi (Contributor) commented Jun 3, 2025

I'll investigate this. Although right off the bat, the names in the Infos aren't mangled. Name and FullName are the same for both records, which is unfortunate.

OK, well, let's give it a try, and if it's too hard let's ... IDK, go w/ the YAML thing, so it's correct? Testing will be kind of hellish so I'm loath to go that path, but if one is obviously broken, then we should probably avoid it.

ilovepi (Contributor) commented Jun 3, 2025

Right now, since I just ripped the Mustache generator code, it creates folders for each namespace and emits the nested entities there. This actually represents a problem for template specializations because the code tries to write another JSON object (the specialization) to the same file as the base template, which is invalid.

After re-reading this, I have a vague memory that this used to happen for everything, and it somehow got fixed for YAML. Probably back when Brett was working on this.

evelez7 (Member Author) commented Jun 4, 2025

If we're open to adding a flag in the base Info like IsClassSpecialization, then we can probably easily deal with these by trying to reconstruct the specialization's arguments (will be something like "Foo<T, int>.json"). Functions don't produce their own files so function specializations won't be a problem. Then we can keep this layout. @ilovepi

That or we manually change the name when serializing during mapping. I'm not so sure how well merging the infos would work ergonomically. Specializations can have all of their own unique members and methods which need documentation.

I also found a way to get a nice ugly mangled name :)
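
The file-naming idea above (a flag plus reconstructed arguments) could look roughly like the sketch below. `IsClassSpecialization` mirrors the flag proposed in the comment; `RecordSummary`, `SpecializationArgs`, and `jsonFileName` are hypothetical and only illustrate the naming scheme, not the code that was merged.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record summary; IsClassSpecialization mirrors the flag proposed
// in the discussion, the other fields are assumptions for illustration.
struct RecordSummary {
  std::string Name;
  bool IsClassSpecialization = false;
  std::vector<std::string> SpecializationArgs; // e.g. {"T", "int"}
};

// Returns the file name for a record: "Foo.json" for the base template,
// "Foo<T, int>.json" for a specialization.
std::string jsonFileName(const RecordSummary &R) {
  assert(!R.Name.empty() && "record must have a name");
  std::string Base = R.Name;
  if (R.IsClassSpecialization) {
    Base += '<';
    for (std::size_t I = 0; I < R.SpecializationArgs.size(); ++I) {
      if (I != 0)
        Base += ", ";
      Base += R.SpecializationArgs[I];
    }
    Base += '>';
  }
  return Base + ".json";
}
```

Because the base template and each specialization now map to different names, they no longer compete for the same output file in the per-namespace folder layout.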

ilovepi (Contributor) commented Jun 4, 2025

If we're open to adding a flag in the base Info like IsClassSpecialization, then we can probably easily deal with these by trying to reconstruct the specialization's arguments (will be something like "Foo<T, int>.json"). Functions don't produce their own files so function specializations won't be a problem. Then we can keep this layout. @ilovepi

That or we manually change the name when serializing during mapping. I'm not so sure how well merging the infos would work ergonomically. Specializations can have all of their own unique members and methods which need documentation.

This sounds promising. I'm fine w/ adding a field to track this. BTW, what does clang do? I'm wondering if we should track more than 1-bit of info here.

I also found a way to get a nice ugly mangled name :)

Nice! IIRC that's what I've seen a lot of document generators use. If it's unique enough for ODR, it's probably good enough for docs ... right?

evelez7 (Member Author) commented Jun 5, 2025

This sounds promising. I'm fine w/ adding a field to track this. BTW, what does clang do? I'm wondering if we should track more than 1-bit of info here.

As far as I can tell, inside the AST Clang makes use of the isa<> mechanisms (which we could also leverage) or pointer unions like

llvm::PointerUnion<ClassTemplateDecl *, MemberSpecializationInfo *>
TemplateOrInstantiation;

ilovepi (Contributor) commented Jun 5, 2025

This sounds promising. I'm fine w/ adding a field to track this. BTW, what does clang do? I'm wondering if we should track more than 1-bit of info here.

As far as I can tell, inside the AST Clang makes use of the isa<> mechanisms (which we could also leverage) or pointer unions like

llvm::PointerUnion<ClassTemplateDecl *, MemberSpecializationInfo *>
TemplateOrInstantiation;

Let's keep it simple for now. Clang-Doc doesn't have LLVM style RTTI yet, and I'm cautiously optimistic that we can keep it that way. So let's just add a bool or some bitfields or whatever.

@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch from b794eed to 9995fe0 Compare June 6, 2025 19:11
ilovepi (Contributor) left a comment

I think this is a good start, modulo a couple fixes on reserving space in json::Arrays. I'm fine if we keep incrementally adding testing rather than doing it all up front. When we're done I imagine most of the YAML tests can be converted, and we can just drop those bits.
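
For readers unfamiliar with the "reserving space" remark, the sketch below shows the general pattern with `std::vector` standing in for `json::Array` (an assumption made so the example compiles without LLVM). `collectNames` is a hypothetical function, not part of the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// std::vector stands in for json::Array here (an assumption for the sake of a
// standalone example). Reserving up front means the loop below never has to
// reallocate and move the elements already stored.
std::vector<std::string> collectNames(const std::vector<std::string> &Members) {
  std::vector<std::string> Out;
  Out.reserve(Members.size());
  for (const std::string &M : Members) {
    // Holds on every iteration because capacity() >= Members.size().
    assert(Out.size() < Out.capacity() && "reserve() should prevent regrowth");
    Out.push_back(M);
  }
  return Out;
}
```

The same shape appears in the patch wherever the element count is known before the loop, for example when serializing enum members or template parameters.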

github-actions bot commented Jun 6, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch from 5c82dcc to 8becb23 Compare June 6, 2025 21:21
@evelez7 evelez7 marked this pull request as ready for review June 6, 2025 21:24
llvmbot (Member) commented Jun 6, 2025

@llvm/pr-subscribers-clang-tools-extra

Author: Erick Velez (evelez7)

Changes

Adds a JSON generator backend to emit mapped information as JSON. This will enable a better testing format for upcoming changes. It can also potentially serve to feed our other backend generators in the future, like Mustache which already serializes information to JSON before emitting as HTML.

This patch contains functionality to emit classes and provides most of the basis of the generator.


Patch is 34.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/142483.diff

10 Files Affected:

  • (modified) clang-tools-extra/clang-doc/CMakeLists.txt (+1)
  • (modified) clang-tools-extra/clang-doc/Generators.cpp (+2)
  • (modified) clang-tools-extra/clang-doc/Generators.h (+1)
  • (added) clang-tools-extra/clang-doc/JSONGenerator.cpp (+440)
  • (modified) clang-tools-extra/clang-doc/tool/ClangDocMain.cpp (+6-2)
  • (added) clang-tools-extra/test/clang-doc/json/class-template.cpp (+29)
  • (added) clang-tools-extra/test/clang-doc/json/class.cpp (+194)
  • (added) clang-tools-extra/test/clang-doc/json/method-template.cpp (+40)
  • (modified) clang-tools-extra/unittests/clang-doc/CMakeLists.txt (+1)
  • (added) clang-tools-extra/unittests/clang-doc/JSONGeneratorTest.cpp (+175)
diff --git a/clang-tools-extra/clang-doc/CMakeLists.txt b/clang-tools-extra/clang-doc/CMakeLists.txt
index 79563c41435eb..5989e5fe60cf3 100644
--- a/clang-tools-extra/clang-doc/CMakeLists.txt
+++ b/clang-tools-extra/clang-doc/CMakeLists.txt
@@ -17,6 +17,7 @@ add_clang_library(clangDoc STATIC
   Serialize.cpp
   YAMLGenerator.cpp
   HTMLMustacheGenerator.cpp
+  JSONGenerator.cpp
 
   DEPENDS
   omp_gen
diff --git a/clang-tools-extra/clang-doc/Generators.cpp b/clang-tools-extra/clang-doc/Generators.cpp
index a3c2773412cff..3fb5b63c403a7 100644
--- a/clang-tools-extra/clang-doc/Generators.cpp
+++ b/clang-tools-extra/clang-doc/Generators.cpp
@@ -105,5 +105,7 @@ static int LLVM_ATTRIBUTE_UNUSED HTMLGeneratorAnchorDest =
     HTMLGeneratorAnchorSource;
 static int LLVM_ATTRIBUTE_UNUSED MHTMLGeneratorAnchorDest =
     MHTMLGeneratorAnchorSource;
+static int LLVM_ATTRIBUTE_UNUSED JSONGeneratorAnchorDest =
+    JSONGeneratorAnchorSource;
 } // namespace doc
 } // namespace clang
diff --git a/clang-tools-extra/clang-doc/Generators.h b/clang-tools-extra/clang-doc/Generators.h
index aee04b9d58d9d..92d3006e6002d 100644
--- a/clang-tools-extra/clang-doc/Generators.h
+++ b/clang-tools-extra/clang-doc/Generators.h
@@ -58,6 +58,7 @@ extern volatile int YAMLGeneratorAnchorSource;
 extern volatile int MDGeneratorAnchorSource;
 extern volatile int HTMLGeneratorAnchorSource;
 extern volatile int MHTMLGeneratorAnchorSource;
+extern volatile int JSONGeneratorAnchorSource;
 
 } // namespace doc
 } // namespace clang
diff --git a/clang-tools-extra/clang-doc/JSONGenerator.cpp b/clang-tools-extra/clang-doc/JSONGenerator.cpp
new file mode 100644
index 0000000000000..0459d061cd138
--- /dev/null
+++ b/clang-tools-extra/clang-doc/JSONGenerator.cpp
@@ -0,0 +1,440 @@
+#include "Generators.h"
+#include "clang/Basic/Specifiers.h"
+#include "llvm/Support/JSON.h"
+
+using namespace llvm;
+using namespace llvm::json;
+
+namespace clang {
+namespace doc {
+
+class JSONGenerator : public Generator {
+public:
+  static const char *Format;
+
+  Error generateDocs(StringRef RootDir,
+                     llvm::StringMap<std::unique_ptr<doc::Info>> Infos,
+                     const ClangDocContext &CDCtx) override;
+  Error createResources(ClangDocContext &CDCtx) override;
+  Error generateDocForInfo(Info *I, llvm::raw_ostream &OS,
+                           const ClangDocContext &CDCtx) override;
+};
+
+const char *JSONGenerator::Format = "json";
+
+static void serializeInfo(const TypedefInfo &I, json::Object &Obj,
+                          std::optional<StringRef> RepositoryUrl);
+static void serializeInfo(const EnumInfo &I, json::Object &Obj,
+                          std::optional<StringRef> RepositoryUrl);
+
+static json::Object serializeLocation(const Location &Loc,
+                                      std::optional<StringRef> RepositoryUrl) {
+  Object LocationObj = Object();
+  LocationObj["LineNumber"] = Loc.StartLineNumber;
+  LocationObj["Filename"] = Loc.Filename;
+
+  if (!Loc.IsFileInRootDir || !RepositoryUrl)
+    return LocationObj;
+  SmallString<128> FileURL(*RepositoryUrl);
+  sys::path::append(FileURL, sys::path::Style::posix, Loc.Filename);
+  FileURL += "#" + std::to_string(Loc.StartLineNumber);
+  LocationObj["FileURL"] = FileURL;
+  return LocationObj;
+}
+
+static json::Value serializeComment(const CommentInfo &Comment) {
+  assert((Comment.Kind == "BlockCommandComment" ||
+          Comment.Kind == "FullComment" || Comment.Kind == "ParagraphComment" ||
+          Comment.Kind == "TextComment") &&
+         "Unknown Comment type in CommentInfo.");
+
+  Object Obj = Object();
+  json::Value Child = Object();
+
+  // TextComment has no children, so return it.
+  if (Comment.Kind == "TextComment") {
+    Obj["TextComment"] = Comment.Text;
+    return Obj;
+  }
+
+  // BlockCommandComment needs to generate a Command key.
+  if (Comment.Kind == "BlockCommandComment")
+    Child.getAsObject()->insert({"Command", Comment.Name});
+
+  // Use the same handling for everything else.
+  // Only valid for:
+  //  - BlockCommandComment
+  //  - FullComment
+  //  - ParagraphComment
+  json::Value ChildArr = Array();
+  auto &CARef = *ChildArr.getAsArray();
+  CARef.reserve(Comment.Children.size());
+  for (const auto &C : Comment.Children)
+    CARef.emplace_back(serializeComment(*C));
+  Child.getAsObject()->insert({"Children", ChildArr});
+  Obj.insert({Comment.Kind, Child});
+  return Obj;
+}
+
+static void serializeCommonAttributes(const Info &I, json::Object &Obj,
+                                      std::optional<StringRef> RepositoryUrl) {
+  Obj["Name"] = I.Name;
+  Obj["USR"] = toHex(toStringRef(I.USR));
+
+  if (!I.Path.empty())
+    Obj["Path"] = I.Path;
+
+  if (!I.Namespace.empty()) {
+    Obj["Namespace"] = json::Array();
+    for (const auto &NS : I.Namespace)
+      Obj["Namespace"].getAsArray()->push_back(NS.Name);
+  }
+
+  if (!I.Description.empty()) {
+    json::Value DescArray = json::Array();
+    auto &DescArrayRef = *DescArray.getAsArray();
+    DescArrayRef.reserve(I.Description.size());
+    for (const auto &Comment : I.Description)
+      DescArrayRef.push_back(serializeComment(Comment));
+    Obj["Description"] = DescArray;
+  }
+
+  // Namespaces aren't SymbolInfos, so they dont have a DefLoc
+  if (I.IT != InfoType::IT_namespace) {
+    const auto *Symbol = static_cast<const SymbolInfo *>(&I);
+    if (Symbol->DefLoc)
+      Obj["Location"] =
+          serializeLocation(Symbol->DefLoc.value(), RepositoryUrl);
+  }
+}
+
+static void serializeReference(const Reference &Ref, Object &ReferenceObj,
+                               SmallString<64> CurrentDirectory) {
+  SmallString<64> Path = Ref.getRelativeFilePath(CurrentDirectory);
+  sys::path::append(Path, Ref.getFileBaseName() + ".json");
+  sys::path::native(Path, sys::path::Style::posix);
+  ReferenceObj["Link"] = Path;
+  ReferenceObj["Name"] = Ref.Name;
+  ReferenceObj["QualName"] = Ref.QualName;
+  ReferenceObj["USR"] = toHex(toStringRef(Ref.USR));
+}
+
+static void serializeReference(const SmallVector<Reference, 4> &References,
+                               Object &Obj, std::string Key) {
+  json::Value ReferencesArray = Array();
+  json::Array &ReferencesArrayRef = *ReferencesArray.getAsArray();
+  ReferencesArrayRef.reserve(References.size());
+  for (const auto &Reference : References) {
+    json::Value ReferenceVal = Object();
+    auto &ReferenceObj = *ReferenceVal.getAsObject();
+    auto BasePath = Reference.getRelativeFilePath("");
+    serializeReference(Reference, ReferenceObj, BasePath);
+    ReferencesArrayRef.push_back(ReferenceVal);
+  }
+  Obj[Key] = ReferencesArray;
+}
+
+// Although namespaces and records both have ScopeChildren, they serialize them
+// differently. Only enums, records, and typedefs are handled here.
+static void serializeCommonChildren(const ScopeChildren &Children,
+                                    json::Object &Obj,
+                                    std::optional<StringRef> RepositoryUrl) {
+  if (!Children.Enums.empty()) {
+    json::Value EnumsArray = Array();
+    auto &EnumsArrayRef = *EnumsArray.getAsArray();
+    EnumsArrayRef.reserve(Children.Enums.size());
+    for (const auto &Enum : Children.Enums) {
+      json::Value EnumVal = Object();
+      auto &EnumObj = *EnumVal.getAsObject();
+      serializeInfo(Enum, EnumObj, RepositoryUrl);
+      EnumsArrayRef.push_back(EnumVal);
+    }
+    Obj["Enums"] = EnumsArray;
+  }
+
+  if (!Children.Typedefs.empty()) {
+    json::Value TypedefsArray = Array();
+    auto &TypedefsArrayRef = *TypedefsArray.getAsArray();
+    TypedefsArrayRef.reserve(Children.Typedefs.size());
+    for (const auto &Typedef : Children.Typedefs) {
+      json::Value TypedefVal = Object();
+      auto &TypedefObj = *TypedefVal.getAsObject();
+      serializeInfo(Typedef, TypedefObj, RepositoryUrl);
+      TypedefsArrayRef.push_back(TypedefVal);
+    }
+    Obj["Typedefs"] = TypedefsArray;
+  }
+
+  if (!Children.Records.empty()) {
+    json::Value RecordsArray = Array();
+    auto &RecordsArrayRef = *RecordsArray.getAsArray();
+    RecordsArrayRef.reserve(Children.Records.size());
+    for (const auto &Record : Children.Records) {
+      json::Value RecordVal = Object();
+      auto &RecordObj = *RecordVal.getAsObject();
+      SmallString<64> BasePath = Record.getRelativeFilePath("");
+      serializeReference(Record, RecordObj, BasePath);
+      RecordsArrayRef.push_back(RecordVal);
+    }
+    Obj["Records"] = RecordsArray;
+  }
+}
+
+static void serializeInfo(const TemplateInfo &Template, Object &Obj) {
+  json::Value TemplateVal = Object();
+  auto &TemplateObj = *TemplateVal.getAsObject();
+
+  if (Template.Specialization) {
+    json::Value TemplateSpecializationVal = Object();
+    auto &TemplateSpecializationObj = *TemplateSpecializationVal.getAsObject();
+    TemplateSpecializationObj["SpecializationOf"] =
+        toHex(toStringRef(Template.Specialization->SpecializationOf));
+    if (!Template.Specialization->Params.empty()) {
+      json::Value ParamsArray = Array();
+      auto &ParamsArrayRef = *ParamsArray.getAsArray();
+      ParamsArrayRef.reserve(Template.Specialization->Params.size());
+      for (const auto &Param : Template.Specialization->Params)
+        ParamsArrayRef.push_back(Param.Contents);
+      TemplateSpecializationObj["Parameters"] = ParamsArray;
+    }
+    TemplateObj["Specialization"] = TemplateSpecializationVal;
+  }
+
+  if (!Template.Params.empty()) {
+    json::Value ParamsArray = Array();
+    auto &ParamsArrayRef = *ParamsArray.getAsArray();
+    ParamsArrayRef.reserve(Template.Params.size());
+    for (const auto &Param : Template.Params)
+      ParamsArrayRef.push_back(Param.Contents);
+    TemplateObj["Parameters"] = ParamsArray;
+  }
+
+  Obj["Template"] = TemplateVal;
+}
+
+static void serializeInfo(const TypeInfo &I, Object &Obj) {
+  Obj["Name"] = I.Type.Name;
+  Obj["QualName"] = I.Type.QualName;
+  Obj["USR"] = toHex(toStringRef(I.Type.USR));
+  Obj["IsTemplate"] = I.IsTemplate;
+  Obj["IsBuiltIn"] = I.IsBuiltIn;
+}
+
+static void serializeInfo(const FunctionInfo &F, json::Object &Obj,
+                          std::optional<StringRef> RepositoryURL) {
+  serializeCommonAttributes(F, Obj, RepositoryURL);
+  Obj["IsStatic"] = F.IsStatic;
+
+  auto ReturnTypeObj = Object();
+  serializeInfo(F.ReturnType, ReturnTypeObj);
+  Obj["ReturnType"] = std::move(ReturnTypeObj);
+
+  if (!F.Params.empty()) {
+    json::Value ParamsArray = json::Array();
+    auto &ParamsArrayRef = *ParamsArray.getAsArray();
+    ParamsArrayRef.reserve(F.Params.size());
+    for (const auto &Param : F.Params) {
+      json::Value ParamVal = Object();
+      auto &ParamObj = *ParamVal.getAsObject();
+      ParamObj["Name"] = Param.Name;
+      ParamObj["Type"] = Param.Type.Name;
+      ParamsArrayRef.push_back(ParamVal);
+    }
+    Obj["Params"] = ParamsArray;
+  }
+
+  if (F.Template)
+    serializeInfo(F.Template.value(), Obj);
+}
+
+static void serializeInfo(const EnumInfo &I, json::Object &Obj,
+                          std::optional<StringRef> RepositoryUrl) {
+  serializeCommonAttributes(I, Obj, RepositoryUrl);
+  Obj["Scoped"] = I.Scoped;
+
+  if (I.BaseType) {
+    json::Value BaseTypeVal = Object();
+    auto &BaseTypeObj = *BaseTypeVal.getAsObject();
+    BaseTypeObj["Name"] = I.BaseType->Type.Name;
+    BaseTypeObj["QualName"] = I.BaseType->Type.QualName;
+    BaseTypeObj["USR"] = toHex(toStringRef(I.BaseType->Type.USR));
+    Obj["BaseType"] = BaseTypeVal;
+  }
+
+  if (!I.Members.empty()) {
+    json::Value MembersArray = Array();
+    auto &MembersArrayRef = *MembersArray.getAsArray();
+    MembersArrayRef.reserve(I.Members.size());
+    for (const auto &Member : I.Members) {
+      json::Value MemberVal = Object();
+      auto &MemberObj = *MemberVal.getAsObject();
+      MemberObj["Name"] = Member.Name;
+      if (!Member.ValueExpr.empty())
+        MemberObj["ValueExpr"] = Member.ValueExpr;
+      else
+        MemberObj["Value"] = Member.Value;
+      MembersArrayRef.push_back(MemberVal);
+    }
+    Obj["Members"] = MembersArray;
+  }
+}
+
+static void serializeInfo(const TypedefInfo &I, json::Object &Obj,
+                          std::optional<StringRef> RepositoryUrl) {
+  serializeCommonAttributes(I, Obj, RepositoryUrl);
+  Obj["TypeDeclaration"] = I.TypeDeclaration;
+  Obj["IsUsing"] = I.IsUsing;
+  json::Value TypeVal = Object();
+  auto &TypeObj = *TypeVal.getAsObject();
+  serializeInfo(I.Underlying, TypeObj);
+  Obj["Underlying"] = TypeVal;
+}
+
+static void serializeInfo(const RecordInfo &I, json::Object &Obj,
+                          std::optional<StringRef> RepositoryUrl) {
+  serializeCommonAttributes(I, Obj, RepositoryUrl);
+  Obj["FullName"] = I.FullName;
+  Obj["TagType"] = getTagType(I.TagType);
+  Obj["IsTypedef"] = I.IsTypeDef;
+
+  if (!I.Children.Functions.empty()) {
+    json::Value PubFunctionsArray = Array();
+    json::Array &PubFunctionsArrayRef = *PubFunctionsArray.getAsArray();
+    json::Value ProtFunctionsArray = Array();
+    json::Array &ProtFunctionsArrayRef = *ProtFunctionsArray.getAsArray();
+
+    for (const auto &Function : I.Children.Functions) {
+      json::Value FunctionVal = Object();
+      auto &FunctionObj = *FunctionVal.getAsObject();
+      serializeInfo(Function, FunctionObj, RepositoryUrl);
+      AccessSpecifier Access = Function.Access;
+      if (Access == AccessSpecifier::AS_public)
+        PubFunctionsArrayRef.push_back(FunctionVal);
+      else if (Access == AccessSpecifier::AS_protected)
+        ProtFunctionsArrayRef.push_back(FunctionVal);
+    }
+
+    if (!PubFunctionsArrayRef.empty())
+      Obj["PublicFunctions"] = PubFunctionsArray;
+    if (!ProtFunctionsArrayRef.empty())
+      Obj["ProtectedFunctions"] = ProtFunctionsArray;
+  }
+
+  if (!I.Members.empty()) {
+    json::Value PublicMembersArray = Array();
+    json::Array &PubMembersArrayRef = *PublicMembersArray.getAsArray();
+    json::Value ProtectedMembersArray = Array();
+    json::Array &ProtMembersArrayRef = *ProtectedMembersArray.getAsArray();
+
+    for (const MemberTypeInfo &Member : I.Members) {
+      json::Value MemberVal = Object();
+      auto &MemberObj = *MemberVal.getAsObject();
+      MemberObj["Name"] = Member.Name;
+      MemberObj["Type"] = Member.Type.Name;
+
+      if (Member.Access == AccessSpecifier::AS_public)
+        PubMembersArrayRef.push_back(MemberVal);
+      else if (Member.Access == AccessSpecifier::AS_protected)
+        ProtMembersArrayRef.push_back(MemberVal);
+    }
+
+    if (!PubMembersArrayRef.empty())
+      Obj["PublicMembers"] = PublicMembersArray;
+    if (!ProtMembersArrayRef.empty())
+      Obj["ProtectedMembers"] = ProtectedMembersArray;
+  }
+
+  if (!I.Bases.empty()) {
+    json::Value BasesArray = Array();
+    json::Array &BasesArrayRef = *BasesArray.getAsArray();
+    BasesArrayRef.reserve(I.Bases.size());
+    for (const auto &BaseInfo : I.Bases) {
+      json::Value BaseInfoVal = Object();
+      auto &BaseInfoObj = *BaseInfoVal.getAsObject();
+      serializeInfo(BaseInfo, BaseInfoObj, RepositoryUrl);
+      BaseInfoObj["IsVirtual"] = BaseInfo.IsVirtual;
+      BaseInfoObj["Access"] = getAccessSpelling(BaseInfo.Access);
+      BaseInfoObj["IsParent"] = BaseInfo.IsParent;
+      BasesArrayRef.push_back(BaseInfoVal);
+    }
+    Obj["Bases"] = BasesArray;
+  }
+
+  if (!I.Parents.empty())
+    serializeReference(I.Parents, Obj, "Parents");
+
+  if (!I.VirtualParents.empty())
+    serializeReference(I.VirtualParents, Obj, "VirtualParents");
+
+  if (I.Template)
+    serializeInfo(I.Template.value(), Obj);
+
+  serializeCommonChildren(I.Children, Obj, RepositoryUrl);
+}
+
+Error JSONGenerator::generateDocs(
+    StringRef RootDir, llvm::StringMap<std::unique_ptr<doc::Info>> Infos,
+    const ClangDocContext &CDCtx) {
+  StringSet<> CreatedDirs;
+  StringMap<std::vector<doc::Info *>> FileToInfos;
+  for (const auto &Group : Infos) {
+    Info *Info = Group.getValue().get();
+
+    SmallString<128> Path;
+    sys::path::native(RootDir, Path);
+    sys::path::append(Path, Info->getRelativeFilePath(""));
+    if (!CreatedDirs.contains(Path)) {
+      if (std::error_code Err = sys::fs::create_directories(Path);
+          Err != std::error_code())
+        return createFileError(Twine(Path), Err);
+      CreatedDirs.insert(Path);
+    }
+
+    sys::path::append(Path, Info->getFileBaseName() + ".json");
+    FileToInfos[Path].push_back(Info);
+  }
+
+  for (const auto &Group : FileToInfos) {
+    std::error_code FileErr;
+    raw_fd_ostream InfoOS(Group.getKey(), FileErr, sys::fs::OF_Text);
+    if (FileErr)
+      return createFileError("cannot open file " + Group.getKey(), FileErr);
+
+    for (const auto &Info : Group.getValue())
+      if (Error Err = generateDocForInfo(Info, InfoOS, CDCtx))
+        return Err;
+  }
+
+  return Error::success();
+}
+
+Error JSONGenerator::generateDocForInfo(Info *I, raw_ostream &OS,
+                                        const ClangDocContext &CDCtx) {
+  json::Object Obj = Object();
+
+  switch (I->IT) {
+  case InfoType::IT_namespace:
+    break;
+  case InfoType::IT_record:
+    serializeInfo(*static_cast<RecordInfo *>(I), Obj, CDCtx.RepositoryUrl);
+    break;
+  case InfoType::IT_enum:
+  case InfoType::IT_function:
+  case InfoType::IT_typedef:
+    break;
+  case InfoType::IT_default:
+    return createStringError(inconvertibleErrorCode(), "unexpected info type");
+  }
+  OS << llvm::formatv("{0:2}", llvm::json::Value(std::move(Obj)));
+  return Error::success();
+}
+
+Error JSONGenerator::createResources(ClangDocContext &CDCtx) {
+  return Error::success();
+}
+
+static GeneratorRegistry::Add<JSONGenerator> JSON(JSONGenerator::Format,
+                                                  "Generator for JSON output.");
+volatile int JSONGeneratorAnchorSource = 0;
+} // namespace doc
+} // namespace clang
diff --git a/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp b/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
index 15de031aa6091..3bb67baf65739 100644
--- a/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
+++ b/clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
@@ -110,7 +110,7 @@ Turn on time profiler. Generates clang-doc-tracing.json)"),
                                       llvm::cl::init(false),
                                       llvm::cl::cat(ClangDocCategory));
 
-enum OutputFormatTy { md, yaml, html, mustache };
+enum OutputFormatTy { md, yaml, html, mustache, json };
 
 static llvm::cl::opt<OutputFormatTy> FormatEnum(
     "format", llvm::cl::desc("Format for outputted docs."),
@@ -121,7 +121,9 @@ static llvm::cl::opt<OutputFormatTy> FormatEnum(
                      clEnumValN(OutputFormatTy::html, "html",
                                 "Documentation in HTML format."),
                      clEnumValN(OutputFormatTy::mustache, "mustache",
-                                "Documentation in mustache HTML format")),
+                                "Documentation in mustache HTML format"),
+                     clEnumValN(OutputFormatTy::json, "json",
+                                "Documentation in JSON format")),
     llvm::cl::init(OutputFormatTy::yaml), llvm::cl::cat(ClangDocCategory));
 
 static llvm::ExitOnError ExitOnErr;
@@ -136,6 +138,8 @@ static std::string getFormatString() {
     return "html";
   case OutputFormatTy::mustache:
     return "mustache";
+  case OutputFormatTy::json:
+    return "json";
   }
   llvm_unreachable("Unknown OutputFormatTy");
 }
diff --git a/clang-tools-extra/test/clang-doc/json/class-template.cpp b/clang-tools-extra/test/clang-doc/json/class-template.cpp
new file mode 100644
index 0000000000000..e3ca086d1d9a4
--- /dev/null
+++ b/clang-tools-extra/test/clang-doc/json/class-template.cpp
@@ -0,0 +1,29 @@
+// RUN: rm -rf %t && mkdir -p %t
+// RUN: clang-doc --output=%t --format=json --executor=standalone %s
+// RUN: FileCheck %s < %t/GlobalNamespace/MyClass.json
+
+template<typename T> struct MyClass {
+  T MemberTemplate;
+  T method(T Param); 
+};
+
+// CHECK:         "Name": "MyClass",
+// CHECK:         "Name": "method",
+// CHECK:         "Params": [
+// CHECK-NEXT:      {
+// CHECK-NEXT:        "Name": "Param",
+// CHECK-NEXT:        "Type": "T"
+// CHECK-NEXT:      } 
+// CHECK-NEXT:    ], 
+// CHECK-NEXT:    "ReturnType": {
+// CHECK-NEXT:      "IsBuiltIn": false,
+// CHECK-NEXT:      "IsTemplate": false,
+// CHECK-NEXT:      "Name": "T...
[truncated]

@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch 2 times, most recently from 54287df to b567c6c Compare June 9, 2025 06:15
// CHECK-NEXT: "Records": [
// CHECK-NEXT: {
// CHECK-NEXT: "Name": "NestedClass",
// CHECK-NEXT: "Path": "GlobalNamespace/MyClass",
Contributor:

Is the path separator an issue for Windows?

evelez7 (Member Author), Jun 10, 2025

Okay so apparently this is a thing in JSON?

// JSON-INDEX-NEXT: "Path": "PrimaryNamespace{{[\/]+}}NestedNamespace",

We have to handle multiple backslashes for some reason. The markdown in the same file doesn't need to handle multiple slashes.

// MD-GLOBAL-INDEX: * [PrimaryNamespace](..{{[\/]}}PrimaryNamespace{{[\/]}}index.md)

evelez7 (Member Author), Jun 10, 2025

Ah, JSON needs to escape special characters so it leaves us with two backslashes...
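
A small sketch of why the doubled backslash shows up: JSON string encoding escapes `\` as `\\`, so a Windows-style path separator occupies two characters in the emitted file, and a FileCheck pattern such as `{{[\/]+}}` has to accept one or more separator characters. `escapeJSON` below is a hypothetical, deliberately incomplete escaper written for this illustration; it is not the routine `llvm::json` uses.

```cpp
#include <cassert>
#include <string>

// Minimal JSON string escaper covering only quotes and backslashes; control
// characters and Unicode are deliberately left out of this sketch.
std::string escapeJSON(const std::string &In) {
  std::string Out = "\"";
  for (char C : In) {
    if (C == '"' || C == '\\')
      Out += '\\'; // prefix the character with an escaping backslash
    Out += C;
  }
  Out += '"';
  // The result always contains the input plus the two surrounding quotes.
  assert(Out.size() >= In.size() + 2);
  return Out;
}
```

A forward slash passes through unchanged, which is why the POSIX form of the path needs only a single separator in the expected output.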

Contributor:

If CI is happy, I think you're good.

@evelez7 evelez7 force-pushed the users/evelez7/clang-doc-json-generator branch from b567c6c to 84173b4 Compare June 10, 2025 02:27
evelez7 (Member Author) commented Jun 10, 2025

Merge activity

  • Jun 10, 3:37 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Jun 10, 3:39 PM UTC: @evelez7 merged this pull request with Graphite.

@evelez7 evelez7 merged commit 1c3320c into main Jun 10, 2025
7 checks passed
@evelez7 evelez7 deleted the users/evelez7/clang-doc-json-generator branch June 10, 2025 15:39
rorth pushed a commit to rorth/llvm-project that referenced this pull request Jun 11, 2025
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025