Skip to content

[C++20] [Modules] Warn if we found #include <filename> in module purview #69555

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 2, 2023

Conversation

ChuanqiXu9
Copy link
Member

Close #68615.

It is generally wrong to include in the module purview. Although there are cases to include files in the module purview, generally these use cases should include files by quotes instead of by angles. Here we think the files got included by angles are the system headers.

This is consistency with MSVC too: https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warnings-by-compiler-version?view=msvc-170#warnings-introduced-in-visual-studio-2022-version-170-compiler-version-1930

@ChuanqiXu9 ChuanqiXu9 requested review from iains and dwblaikie October 19, 2023 03:31
@ChuanqiXu9 ChuanqiXu9 self-assigned this Oct 19, 2023
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 19, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 19, 2023

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: Chuanqi Xu (ChuanqiXu9)

Changes

Close #68615.

It is generally wrong to include <filename> in the module purview. Although there are cases to include files in the module purview, generally these use cases should include files by quotes instead of by angles. Here we think the files got included by angles are the system headers.

This is consistency with MSVC too: https://learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/compiler-warnings-by-compiler-version?view=msvc-170#warnings-introduced-in-visual-studio-2022-version-170-compiler-version-1930


Full diff: https://github.com/llvm/llvm-project/pull/69555.diff

3 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticLexKinds.td (+4)
  • (modified) clang/lib/Lex/PPDirectives.cpp (+3)
  • (added) clang/test/Preprocessor/include-in-module-purview.cppm (+60)
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index 940cca67368492f..5d96bcb9359951f 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -922,6 +922,10 @@ def err_header_import_semi_in_macro : Error<
 def err_header_import_not_header_unit : Error<
   "header file %0 (aka '%1') cannot be imported because "
   "it is not known to be a header unit">;
+def warn_pp_include_angled_in_module_purview : Warning<
+  "'#include <filename>' in the module purview appears to be erroneous; "
+  "consider moving that directive before the module declaration">,
+  InGroup<DiagGroup<"include-angled-in-module-purview">>;
 
 def warn_header_guard : Warning<
   "%0 is used as a header guard here, followed by #define of a different macro">,
diff --git a/clang/lib/Lex/PPDirectives.cpp b/clang/lib/Lex/PPDirectives.cpp
index 2892d4b777846ff..243f7a729681ce9 100644
--- a/clang/lib/Lex/PPDirectives.cpp
+++ b/clang/lib/Lex/PPDirectives.cpp
@@ -2530,6 +2530,9 @@ Preprocessor::ImportAction Preprocessor::HandleHeaderIncludeOrImport(
     return {ImportAction::None};
   }
 
+  if (isAngled && isInNamedModule())
+    Diag(FilenameTok, diag::warn_pp_include_angled_in_module_purview);
+
   // Look up the file, create a File ID for it.
   SourceLocation IncludePos = FilenameTok.getLocation();
   // If the filename string was the result of macro expansions, set the include
diff --git a/clang/test/Preprocessor/include-in-module-purview.cppm b/clang/test/Preprocessor/include-in-module-purview.cppm
new file mode 100644
index 000000000000000..a06088b3b408fbc
--- /dev/null
+++ b/clang/test/Preprocessor/include-in-module-purview.cppm
@@ -0,0 +1,60 @@
+// RUN: rm -rf %t
+// RUN: mkdir %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/a.cppm -E -P -I%t -o %t/tmp 2>&1 | FileCheck %t/a.cppm
+// RUN: %clang_cc1 -std=c++20 %t/a.cppm -E -P -I%t -o - 2>&1 \
+// RUN:     -Wno-include-angled-in-module-purview | FileCheck %t/a.cppm --check-prefix=CHECK-NO-WARN
+
+//--- a.h
+// left empty
+
+//--- b.h
+#include <stddef.h>
+// The headers not get included shouldn't be affected.
+#ifdef WHATEVER
+#include <stdint.h>
+#endif
+
+//--- a.cppm
+module;
+#include <stddef.h>
+#include <a.h>
+#include <b.h>
+#include "a.h"
+#include "b.h"
+export module a;
+
+#include <stddef.h>
+#include <a.h>
+#include <b.h>
+#include "a.h"
+#include "b.h"
+
+// CHECK: a.cppm:9:10: warning: '#include <filename>' in the module purview is probably incorrect
+// CHECK: a.cppm:10:10: warning: '#include <filename>' in the module purview is probably incorrect
+// CHECK: a.cppm:11:10: warning: '#include <filename>' in the module purview is probably incorrect
+// CHECK: In file included from {{.*}}/a.cppm:11
+// CHECK-NEXT: b.h:1:10: warning: '#include <filename>' in the module purview is probably incorrect; 
+// CHECK: In file included from {{.*}}/a.cppm:13
+// CHECK-NEXT: b.h:1:10: warning: '#include <filename>' in the module purview is probably incorrect; 
+
+module :private;
+#include <stddef.h>
+#include <a.h>
+#include <b.h>
+#include "a.h"
+#include "b.h"
+
+// CHECK: a.cppm:24:10: warning: '#include <filename>' in the module purview is probably incorrect;
+// CHECK: a.cppm:25:10: warning: '#include <filename>' in the module purview is probably incorrect
+// CHECK: a.cppm:26:10: warning: '#include <filename>' in the module purview is probably incorrect
+// CHECK: In file included from {{.*}}/a.cppm:26
+// CHECK-NEXT: b.h:1:10: warning: '#include <filename>' in the module purview is probably incorrect; 
+// CHECK: In file included from {{.*}}/a.cppm:28
+// CHECK-NEXT: b.h:1:10: warning: '#include <filename>' in the module purview is probably incorrect; 
+
+// We should have catched all warnings.
+// CHECK: 10 warnings generated.
+
+// CHECK-NO-WARN-NOT: warning

Address llvm#68615.

It is generally wrong to include <filename> in the module purview.
Although there are cases to include files in the module purview,
generally these use cases should include files by quotes instead of by
angles. Here we think the files got included by angles are the system
headers.
@ChuanqiXu9 ChuanqiXu9 force-pushed the WarnIncludeInModulePurview branch from 95427d2 to 86663a3 Compare October 19, 2023 03:39
@ChuanqiXu9 ChuanqiXu9 added the clang:modules C++20 modules and Clang Header Modules label Oct 19, 2023
@@ -922,6 +922,10 @@ def err_header_import_semi_in_macro : Error<
def err_header_import_not_header_unit : Error<
"header file %0 (aka '%1') cannot be imported because "
"it is not known to be a header unit">;
def warn_pp_include_angled_in_module_purview : Warning<
"'#include <filename>' in the module purview appears to be erroneous; "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'appears to be erroneous' seems a bit strong (since we know there are valid cases to do this). Maybe we can think of a way to explain the actual problem without ending up with too many words (or perhaps have a short warning message and attach a note with explanation)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts are that all the known valid use cases should use the form #include "filename" instead of #include <filename>. The angle or the quote matters here. I think it may be a general sense that the headers in the <> should be a system header and other things should be put in quotes "" (I know this is not strictly true). How do you think about this?

Copy link
Contributor

@iains iains Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in general that limiting it to '<...>' seems a reasonable filter; either people should be import-ing or #include "..." - but let's also get Michael's opinion.

text-wise - maybe something like:
#include <xxxx> attaches the declarations to the named module 'M', which is not usually intended; (where we can name M in the message).

edit - or maybe even more specific:
or #include <xxxx> attaches system header declarations to the named module 'M', which is not usually intended;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@iains iains left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I suggested this, it was an intention to limit it to C++ std library headers. Do we not have some existing mechanism that knows which headers are in this category (for typo correction etc?)

My concern with making it more general is that it then covers things like frameworks which are also included like <Framework/....> (of course, it's also probably not what was intended there as well)

Let's add @Bigcheese to the reviewers, for an opinion on Frameworks

@iains iains requested a review from Bigcheese October 19, 2023 07:39
@ChuanqiXu9
Copy link
Member Author

When I suggested this, it was an intention to limit it to C++ std library headers. Do we not have some existing mechanism that knows which headers are in this category (for typo correction etc?)

As far as I can reach, we don't have such a mechanism. I was wondering if we can do this by using # pragma GCC system_header. But it shows that the libcxx don't always use this pragma.

Also I feel it makes sense to not include other system headers. e.g., in our coding standards, we need <> to include boost headers and the sys headers (e.g., #include <sys/mman.h>)

My concern with making it more general is that it then covers things like frameworks which are also included like <Framework/....> (of course, it's also probably not what was intended there as well)

Let's add @Bigcheese to the reviewers, for an opinion on Frameworks

What do you mean by frameworks? Do you mean something used in apple modules? If yes, the current implementation shouldn't cover that. Since I add this warning only after we handled import.

@iains
Copy link
Contributor

iains commented Oct 19, 2023

When I suggested this, it was an intention to limit it to C++ std library headers. Do we not have some existing mechanism that knows which headers are in this category (for typo correction etc?)

As far as I can reach, we don't have such a mechanism.

Ah, that's unfortunate, if we could limit the initial warning to items within the language and only expand it if there are reported problems elsewhere it would be easier.

I was wondering if we can do this by using # pragma GCC system_header. But it shows that the libcxx don't always use this pragma.

agree that this would seem unreliable/unusable.

Also I feel it makes sense to not include other system headers. e.g., in our coding standards, we need <> to include boost headers and the sys headers (e.g., #include <sys/mman.h>)

yeah - that is both a good case for making it more general and also a warning that project policies could do something unexpected.

My concern with making it more general is that it then covers things like frameworks which are also included like <Framework/....> (of course, it's also probably not what was intended there as well)
Let's add @Bigcheese to the reviewers, for an opinion on Frameworks

What do you mean by frameworks? Do you mean something used in apple modules? If yes, the current implementation shouldn't cover that. Since I add this warning only after we handled import.

(in this context) Framework headers are like system (or user headers) but searched by a different mechanism (so they are #include-ed). In principle, the same constraints should apply - <Framework/....> indicates a "system" header and probably should not be included in the module purview.

@ChuanqiXu9
Copy link
Member Author

(in this context) Framework headers are like system (or user headers) but searched by a different mechanism (so they are #include-ed). In principle, the same constraints should apply - <Framework/....> indicates a "system" header and probably should not be included in the module purview.

Got it. Then it looks like not a problem for Frameworks. Let's wait for the opinion from @Bigcheese

@github-actions
Copy link

github-actions bot commented Oct 19, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@ChuanqiXu9
Copy link
Member Author

@Bigcheese ping

@ChuanqiXu9
Copy link
Member Author

I'd like to land this next week if no objection comes in. Since the frameworks looks not intended to be the places too.

@Bigcheese
Copy link
Contributor

This is fine, it's definitely wrong to include framework headers like this too.

@ChuanqiXu9
Copy link
Member Author

Got it. Thanks. Then I feel this is ready to land.

@ChuanqiXu9 ChuanqiXu9 merged commit 0d21436 into llvm:main Nov 2, 2023
@dwblaikie
Copy link
Collaborator

FWIW, we saw failures at Google (where, to the best of my knowledge, we aren't using named modules at all) that look like this:

error: '#include <filename>' attaches the declarations to the named module '.get', which is not usually intended; consider moving that directive before the module declaration [-Werror,-Winclude-angled-in-module-purview]
   14 | #include <stddef.h>

So there's probably some problems with this patch? is this enough to go on, or would a reduced test case be required to address the issue? (reducing modules issues is a bit difficult/expensive, or I'd have provided it up front)

@ChuanqiXu9
Copy link
Member Author

FWIW, we saw failures at Google (where, to the best of my knowledge, we aren't using named modules at all) that look like this:

error: '#include <filename>' attaches the declarations to the named module '.get', which is not usually intended; consider moving that directive before the module declaration [-Werror,-Winclude-angled-in-module-purview]
   14 | #include <stddef.h>

So there's probably some problems with this patch? is this enough to go on, or would a reduced test case be required to address the issue? (reducing modules issues is a bit difficult/expensive, or I'd have provided it up front)

Since the patch itself is pretty simple, if there is a problem, it should come from the implementation Preprocessor::isInNamedModule(). It should be helpful to provide a reproducer so that we can be sure we're facing the same issue.

MaskRay added a commit to MaskRay/llvm-project that referenced this pull request Nov 3, 2023
…ular identifier

`ModuleDeclState` is incorrectly changed to `NamedModuleImplementation`
for `struct module {}; void foo(module a);`. This is mostly benign but
leads to a spurious warning after llvm#69555.

A real world example is:
```
// pybind11.h
class module_ { ... };
using module = module_;

// tensorflow
void DefineMetricsModule(pybind11::module main_module);
// `module main_module);` incorrectly changes `ModuleDeclState` to `NamedModuleImplementation`
```
@MaskRay
Copy link
Member

MaskRay commented Nov 3, 2023

FWIW, we saw failures at Google (where, to the best of my knowledge, we aren't using named modules at all) that look like this:

error: '#include <filename>' attaches the declarations to the named module '.get', which is not usually intended; consider moving that directive before the module declaration [-Werror,-Winclude-angled-in-module-purview]
   14 | #include <stddef.h>

So there's probably some problems with this patch? is this enough to go on, or would a reduced test case be required to address the issue? (reducing modules issues is a bit difficult/expensive, or I'd have provided it up front)

Since the patch itself is pretty simple, if there is a problem, it should come from the implementation Preprocessor::isInNamedModule(). It should be helpful to provide a reproducer so that we can be sure we're facing the same issue.

See #71134 :)

struct module {}; void foo(module a);
#include <stdio.h>

MaskRay added a commit that referenced this pull request Nov 3, 2023
…ular identifier (#71134)

`ModuleDeclState` is incorrectly changed to `NamedModuleImplementation`
for `struct module {}; void foo(module a);`. This is mostly benign but
leads to a spurious warning after #69555.

A real world example is:
```
// pybind11.h
class module_ { ... };
using module = module_;

// tensorflow
void DefineMetricsModule(pybind11::module main_module);
// `module main_module);` incorrectly changes `ModuleDeclState` to `NamedModuleImplementation`

#include <algorithm> // spurious warning
```
gnoliyil pushed a commit to gnoliyil/fuchsia that referenced this pull request Jan 27, 2024
ktl/enforce.h is included as a system header and purposefully placed as
the last header following all other #include lines.

This particular placement is flagged by the
include-angled-in-module-purview check introduced in
llvm/llvm-project#69555.

This CL disables the check for multiboot-init.cc so that it may preserve
the system header placement has intended.

Fixed: 136187
Change-Id: I9cbd918b4369364b4c160130a6c3ee77bd24f2a3
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/943924
Reviewed-by: Nick Maniscalco <[email protected]>
Fuchsia-Auto-Submit: Caslyn Tonelli <[email protected]>
Commit-Queue: Auto-Submit <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang] [Modules] Add warning for including header in module purview
6 participants