
Commit f38b2cb

ChuanqiXu9 authored and AlexisPerry committed
[C++20] [Modules] [Serialization] Don't reuse type ID and identifier ID from imported modules
To support the no-transitive-change model for named modules, the AST writer can't arbitrarily reuse type IDs and identifier IDs from imported modules. The premise of the model is that a user of a named module can only access indirectly imported decls via the directly imported module, which makes it possible to control what matters to users when writing the module. That control is lost if users can reuse type IDs and identifier IDs straight from indirectly imported modules rather than via the directly imported module. So this patch stops the AST writer from reusing type IDs and identifier IDs, which avoids the problematic case.
1 parent 3f5c8cf commit f38b2cb
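The effect described in the commit message can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not Clang's implementation: `ToyWriter`, `identifierRead`, `getIdentifierRef`, and `emitFooInB` are hypothetical names, and the ID values are made up. It shows that when a writer adopts an ID announced by an imported file, the ID it emits for `foo` depends on whether the imported module also mentions `foo`; when it ignores such IDs, the emitted ID is the same either way. It needs C++17 or later.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a 64-bit identifier ID; not Clang's type.
using ToyID = std::uint64_t;

struct ToyWriter {
  bool WritingNamedModule = false;
  std::unordered_map<std::string, ToyID> IdentifierIDs;
  ToyID NextLocalID = 1;

  // Called when an imported file announces the ID it uses for Name.
  void identifierRead(ToyID ImportedID, const std::string &Name) {
    if (WritingNamedModule)
      return; // Same shape as the early return this commit adds.
    IdentifierIDs[Name] = ImportedID;
  }

  // Returns the ID this writer would emit for Name, assigning a fresh
  // local ID the first time a name is seen.
  ToyID getIdentifierRef(const std::string &Name) {
    auto [It, Inserted] = IdentifierIDs.try_emplace(Name, 0);
    if (Inserted)
      It->second = NextLocalID++;
    return It->second;
  }
};

// ID emitted for `foo` while writing module b, depending on whether the
// imported module a also contains an identifier `foo`.
ToyID emitFooInB(bool NamedModule, bool AHasFoo) {
  ToyWriter W;
  W.WritingNamedModule = NamedModule;
  if (AHasFoo)
    W.identifierRead((ToyID{1} << 32) | 7, "foo"); // made-up imported ID
  return W.getIdentifierRef("foo");
}

// Usage: the emitted ID is stable only when imported IDs are ignored.
inline void checkToyModel() {
  assert(emitFooInB(true, true) == emitFooInB(true, false));
  assert(emitFooInB(false, true) != emitFooInB(false, false));
}
```

In this sketch the stable case corresponds to what the identifier test in this commit checks: removing `a()` from module `a` should leave the BMI of `b` byte-for-byte unchanged.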

3 files changed: +93 -0 lines changed

clang/lib/Serialization/ASTWriter.cpp

Lines changed: 24 additions & 0 deletions
@@ -5355,6 +5355,20 @@ ASTFileSignature ASTWriter::WriteASTCore(Sema &SemaRef, StringRef isysroot,
 
   writeUnhashedControlBlock(PP, Context);
 
+  // Don't reuse type ID and Identifier ID from readers for C++ standard named
+  // modules since we want to support no-transitive-change model for named
+  // modules. The theory for no-transitive-change model is,
+  // for a user of a named module, the user can only access the indirectly
+  // imported decls via the directly imported module. So that it is possible to
+  // control what matters to the users when writing the module. It would be
+  // problematic if the users can reuse the type IDs and identifier IDs from
+  // indirectly imported modules arbitrarily. So we choose to clear these ID
+  // here.
+  if (isWritingStdCXXNamedModules()) {
+    TypeIdxs.clear();
+    IdentifierIDs.clear();
+  }
+
   // Look for any identifiers that were named while processing the
   // headers, but are otherwise not needed. We add these to the hash
   // table to enable checking of the predefines buffer in the case
@@ -6686,6 +6700,11 @@ void ASTWriter::ReaderInitialized(ASTReader *Reader) {
 }
 
 void ASTWriter::IdentifierRead(IdentifierID ID, IdentifierInfo *II) {
+  // Don't reuse Type ID from external modules for named modules. See the
+  // comments in WriteASTCore for details.
+  if (isWritingStdCXXNamedModules())
+    return;
+
   IdentifierID &StoredID = IdentifierIDs[II];
   unsigned OriginalModuleFileIndex = StoredID >> 32;
 
@@ -6708,6 +6727,11 @@ void ASTWriter::MacroRead(serialization::MacroID ID, MacroInfo *MI) {
 }
 
 void ASTWriter::TypeRead(TypeIdx Idx, QualType T) {
+  // Don't reuse Type ID from external modules for named modules. See the
+  // comments in WriteASTCore for details.
+  if (isWritingStdCXXNamedModules())
+    return;
+
   // Always take the type index that comes in later module files.
   // This copes with an interesting
   // case for chained AST writing where we schedule writing the type and then,
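The context line `unsigned OriginalModuleFileIndex = StoredID >> 32;` in the `IdentifierRead` hunk indicates that the upper 32 bits of a stored identifier ID carry a module file index. The sketch below shows one way such a packed ID could be built and taken apart. It is an illustration only: `ToyIdentifierID`, `packID`, `moduleFileIndexOf`, and `localIDOf` are hypothetical names, and treating the lower 32 bits as a per-file local ID is an assumption that the diff does not confirm.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit ID; Clang's real IdentifierID is not defined here.
using ToyIdentifierID = std::uint64_t;

// Put the module file index in the upper 32 bits and a local ID
// (an assumed layout) in the lower 32 bits.
constexpr ToyIdentifierID packID(std::uint32_t ModuleFileIndex,
                                 std::uint32_t LocalID) {
  return (static_cast<ToyIdentifierID>(ModuleFileIndex) << 32) | LocalID;
}

// Same shift as the `StoredID >> 32` context line in the hunk above.
constexpr std::uint32_t moduleFileIndexOf(ToyIdentifierID ID) {
  return static_cast<std::uint32_t>(ID >> 32);
}

constexpr std::uint32_t localIDOf(ToyIdentifierID ID) {
  return static_cast<std::uint32_t>(ID & 0xFFFFFFFFu);
}

// The same local ID in a different module file yields a different packed
// value, so an ID copied from an import ties the output to that import.
static_assert(packID(0, 5) != packID(1, 5), "index is part of the ID");
static_assert(moduleFileIndexOf(packID(2, 7)) == 2, "upper half round-trips");
static_assert(localIDOf(packID(2, 7)) == 7, "lower half round-trips");
```

Under this assumed layout, an ID taken over from an imported file names that file directly, which is the kind of dependency the early returns above are meant to keep out of a named module's output.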
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+// Testing that we won't record the identifier ID from external modules.
+//
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 %t/b.cppm -emit-module-interface -o %t/b.pcm \
+// RUN: -fmodule-file=a=%t/a.pcm
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/b.pcm | FileCheck %t/b.cppm
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.v1.cppm -emit-module-interface -o %t/a.v1.pcm
+// RUN: %clang_cc1 -std=c++20 %t/b.cppm -emit-module-interface -o %t/b.v1.pcm \
+// RUN: -fmodule-file=a=%t/a.v1.pcm
+// RUN: diff %t/b.pcm %t/b.v1.pcm &> /dev/null
+
+//--- a.cppm
+export module a;
+export inline int a() {
+  int foo = 43;
+  return foo;
+}
+
+//--- b.cppm
+export module b;
+import a;
+export inline int b() {
+  int foo = 43;
+  return foo;
+}
+
+// CHECK: <DECL_VAR {{.*}} op5=4
+
+//--- a.v1.cppm
+// We remove the unused the function and testing if the format of the BMI of B will change.
+export module a;
+
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+// Testing that we won't record the type ID from external modules.
+//
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -std=c++20 %t/b.cppm -emit-module-interface -o %t/b.pcm \
+// RUN: -fmodule-file=a=%t/a.pcm
+// RUN: llvm-bcanalyzer --dump --disable-histogram %t/b.pcm | FileCheck %t/b.cppm
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.v1.cppm -emit-module-interface -o %t/a.v1.pcm
+// RUN: %clang_cc1 -std=c++20 %t/b.cppm -emit-module-interface -o %t/b.v1.pcm \
+// RUN: -fmodule-file=a=%t/a.v1.pcm
+// RUN: diff %t/b.pcm %t/b.v1.pcm &> /dev/null
+
+//--- a.cppm
+export module a;
+export int a();
+
+//--- b.cppm
+export module b;
+import a;
+export int b();
+
+// CHECK: <DECL_FUNCTION {{.*}} op8=4048
+// CHECK: <TYPE_FUNCTION_PROTO
+
+//--- a.v1.cppm
+// We remove the unused the function and testing if the format of the BMI of B will change.
+export module a;
+
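Both tests end with `diff %t/b.pcm %t/b.v1.pcm`, which passes only if the two BMIs of `b` are byte-for-byte identical. For readers who want the same check outside of lit, here is a minimal sketch of a byte comparison in standard C++. The helper name `sameBytes` is hypothetical and is not part of this commit or of Clang.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Returns true only if both files can be opened and hold the same bytes.
bool sameBytes(const std::string &PathA, const std::string &PathB) {
  std::ifstream A(PathA, std::ios::binary);
  std::ifstream B(PathB, std::ios::binary);
  if (!A || !B)
    return false;
  std::istreambuf_iterator<char> BeginA(A), BeginB(B), End;
  // The four-iterator overload also reports a mismatch when one file is
  // a strict prefix of the other.
  return std::equal(BeginA, End, BeginB, End);
}
```

A call such as `sameBytes("b.pcm", "b.v1.pcm")` would mirror the final `RUN` line, assuming both files were produced by the preceding commands.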
