Skip to content

[Driver] Add ExclusiveGroup feature to multilib.yaml. #69447

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 1, 2023

Conversation

statham-arm
Copy link
Collaborator

This allows a YAML-based multilib configuration to specify explicitly that a subset of its library directories are alternatives to each other, i.e. at most one of that subset should be selected.

So if you have multiple sysroots each including a full set of headers and libraries, you can mark them as members of the same ExclusiveGroup, and then you'll be sure that only one of them is selected, even if two or more are compatible with the compile options.

This is particularly important in multilib setups including the libc++ headers, where selecting the include directories from two different sysroots can cause an actual build failure. This occurs when including <stdio.h>, for example: libc++'s stdio.h is included first, and will try to use #include_next to fetch the underlying libc's version. But if there are two include directories from separate multilibs, then both of their C++ include directories will end up on the include path first, followed by both the C directories. So the #include_next from the first libc++ stdio.h will include the second libc++ stdio.h, which will do nothing because it has the same include guard macro, and the libc header won't ever be included at all.

If more than one of the options in an ExclusiveGroup matches the given flags, the last one wins.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Oct 18, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Simon Tatham (statham-arm)

Changes

This allows a YAML-based multilib configuration to specify explicitly that a subset of its library directories are alternatives to each other, i.e. at most one of that subset should be selected.

So if you have multiple sysroots each including a full set of headers and libraries, you can mark them as members of the same ExclusiveGroup, and then you'll be sure that only one of them is selected, even if two or more are compatible with the compile options.

This is particularly important in multilib setups including the libc++ headers, where selecting the include directories from two different sysroots can cause an actual build failure. This occurs when including <stdio.h>, for example: libc++'s stdio.h is included first, and will try to use #include_next to fetch the underlying libc's version. But if there are two include directories from separate multilibs, then both of their C++ include directories will end up on the include path first, followed by both the C directories. So the #include_next from the first libc++ stdio.h will include the second libc++ stdio.h, which will do nothing because it has the same include guard macro, and the libc header won't ever be included at all.

If more than one of the options in an ExclusiveGroup matches the given flags, the last one wins.


Full diff: https://github.com/llvm/llvm-project/pull/69447.diff

3 Files Affected:

  • (modified) clang/include/clang/Driver/Multilib.h (+14-1)
  • (modified) clang/lib/Driver/Multilib.cpp (+39-10)
  • (added) clang/test/Driver/baremetal-multilib-exclusive-group.yaml (+69)
diff --git a/clang/include/clang/Driver/Multilib.h b/clang/include/clang/Driver/Multilib.h
index 1416559414f894b..46f23a2ff5fabac 100644
--- a/clang/include/clang/Driver/Multilib.h
+++ b/clang/include/clang/Driver/Multilib.h
@@ -39,13 +39,23 @@ class Multilib {
   std::string IncludeSuffix;
   flags_list Flags;
 
+  // Optionally, a multilib can be assigned a string tag indicating that it's
+  // part of a group of mutually exclusive possibilities. If two or more
+  // multilibs have the same non-empty value of ExclusiveGroup, then only the
+  // last matching one of them will be selected.
+  //
+  // Setting this to the empty string is a special case, indicating that the
+  // directory is not mutually exclusive with anything else.
+  std::string ExclusiveGroup;
+
 public:
   /// GCCSuffix, OSSuffix & IncludeSuffix will be appended directly to the
   /// sysroot string so they must either be empty or begin with a '/' character.
   /// This is enforced with an assert in the constructor.
   Multilib(StringRef GCCSuffix = {}, StringRef OSSuffix = {},
            StringRef IncludeSuffix = {},
-           const flags_list &Flags = flags_list());
+           const flags_list &Flags = flags_list(),
+           StringRef ExclusiveGroup = {});
 
   /// Get the detected GCC installation path suffix for the multi-arch
   /// target variant. Always starts with a '/', unless empty
@@ -63,6 +73,9 @@ class Multilib {
   /// All elements begin with either '-' or '!'
   const flags_list &flags() const { return Flags; }
 
+  /// Get the exclusive group label.
+  const std::string &exclusiveGroup() const { return ExclusiveGroup; }
+
   LLVM_DUMP_METHOD void dump() const;
   /// print summary of the Multilib
   void print(raw_ostream &OS) const;
diff --git a/clang/lib/Driver/Multilib.cpp b/clang/lib/Driver/Multilib.cpp
index ba466af39e2dcaf..a8eff30f1416852 100644
--- a/clang/lib/Driver/Multilib.cpp
+++ b/clang/lib/Driver/Multilib.cpp
@@ -29,9 +29,10 @@ using namespace driver;
 using namespace llvm::sys;
 
 Multilib::Multilib(StringRef GCCSuffix, StringRef OSSuffix,
-                   StringRef IncludeSuffix, const flags_list &Flags)
+                   StringRef IncludeSuffix, const flags_list &Flags,
+                   StringRef ExclusiveGroup)
     : GCCSuffix(GCCSuffix), OSSuffix(OSSuffix), IncludeSuffix(IncludeSuffix),
-      Flags(Flags) {
+      Flags(Flags), ExclusiveGroup(ExclusiveGroup) {
   assert(GCCSuffix.empty() ||
          (StringRef(GCCSuffix).front() == '/' && GCCSuffix.size() > 1));
   assert(OSSuffix.empty() ||
@@ -96,13 +97,39 @@ bool MultilibSet::select(const Multilib::flags_list &Flags,
                          llvm::SmallVector<Multilib> &Selected) const {
   llvm::StringSet<> FlagSet(expandFlags(Flags));
   Selected.clear();
-  llvm::copy_if(Multilibs, std::back_inserter(Selected),
-                [&FlagSet](const Multilib &M) {
-                  for (const std::string &F : M.flags())
-                    if (!FlagSet.contains(F))
-                      return false;
-                  return true;
-                });
+
+  // Decide which multilibs we're going to select at all
+  std::vector<bool> IsSelected(Multilibs.size(), false);
+  std::map<std::string, size_t> ExclusiveGroupMembers;
+  for (size_t i = 0, e = Multilibs.size(); i < e; ++i) {
+    const Multilib &M = Multilibs[i];
+
+    // If this multilib doesn't match all our flags, don't select it
+    if (!llvm::all_of(M.flags(), [&FlagSet](const std::string &F) {
+          return FlagSet.contains(F);
+        }))
+      continue;
+
+    // If this multilib has the same ExclusiveGroup as one we've already
+    // selected, de-select the previous one
+    const std::string &group = M.exclusiveGroup();
+    if (!group.empty()) {
+      auto it = ExclusiveGroupMembers.find(group);
+      if (it != ExclusiveGroupMembers.end())
+        IsSelected[it->second] = false;
+    }
+
+    // Select this multilib
+    IsSelected[i] = true;
+    if (!group.empty())
+      ExclusiveGroupMembers[group] = i;
+  }
+
+  // Now write the selected multilibs into the output
+  for (size_t i = 0, e = Multilibs.size(); i < e; ++i)
+    if (IsSelected[i])
+      Selected.push_back(Multilibs[i]);
+
   return !Selected.empty();
 }
 
@@ -139,6 +166,7 @@ static const VersionTuple MultilibVersionCurrent(1, 0);
 struct MultilibSerialization {
   std::string Dir;
   std::vector<std::string> Flags;
+  std::string ExclusiveGroup;
 };
 
 struct MultilibSetSerialization {
@@ -153,6 +181,7 @@ template <> struct llvm::yaml::MappingTraits<MultilibSerialization> {
   static void mapping(llvm::yaml::IO &io, MultilibSerialization &V) {
     io.mapRequired("Dir", V.Dir);
     io.mapRequired("Flags", V.Flags);
+    io.mapOptional("ExclusiveGroup", V.ExclusiveGroup);
   }
   static std::string validate(IO &io, MultilibSerialization &V) {
     if (StringRef(V.Dir).starts_with("/"))
@@ -215,7 +244,7 @@ MultilibSet::parseYaml(llvm::MemoryBufferRef Input,
     std::string Dir;
     if (M.Dir != ".")
       Dir = "/" + M.Dir;
-    Multilibs.emplace_back(Dir, Dir, Dir, M.Flags);
+    Multilibs.emplace_back(Dir, Dir, Dir, M.Flags, M.ExclusiveGroup);
   }
 
   return MultilibSet(std::move(Multilibs), std::move(MS.FlagMatchers));
diff --git a/clang/test/Driver/baremetal-multilib-exclusive-group.yaml b/clang/test/Driver/baremetal-multilib-exclusive-group.yaml
new file mode 100644
index 000000000000000..0a0c98d245a1fbc
--- /dev/null
+++ b/clang/test/Driver/baremetal-multilib-exclusive-group.yaml
@@ -0,0 +1,69 @@
+# REQUIRES: shell
+# UNSUPPORTED: system-windows
+
+# RUN: rm -rf %T/baremetal_multilib
+
+# RUN: mkdir -p %T/baremetal_multilib/bin
+# RUN: ln -s %clang %T/baremetal_multilib/bin/clang
+
+# RUN: mkdir -p %T/baremetal_multilib/lib/clang-runtimes
+# RUN: ln -s %s %T/baremetal_multilib/lib/clang-runtimes/multilib.yaml
+
+# RUN: %T/baremetal_multilib/bin/clang -no-canonical-prefixes -x c++ %s -### -o %t.out --target=thumbv7em-none-unknown-eabi --sysroot= 2>%t
+
+# RUN: FileCheck -DSYSROOT=%T/baremetal_multilib %s < %t --check-prefix=TESTDIR1_NON_EXCLUSIVE
+# RUN: FileCheck -DSYSROOT=%T/baremetal_multilib %s < %t --check-prefix=TESTDIR2_NON_EXCLUSIVE
+# RUN: FileCheck -DSYSROOT=%T/baremetal_multilib %s < %t --check-prefix=TESTDIR1_EXCLUSIVE
+# RUN: FileCheck -DSYSROOT=%T/baremetal_multilib %s < %t --check-prefix=TESTDIR2_EXCLUSIVE
+# RUN: FileCheck -DSYSROOT=%T/baremetal_multilib %s < %t --check-prefix=TESTDIR1_OWN_GROUP
+# RUN: FileCheck -DSYSROOT=%T/baremetal_multilib %s < %t --check-prefix=TESTDIR2_OWN_GROUP
+
+# Expected results:
+#
+# Due to the Mappings section, all six of these library directories should
+# match the command-line flag --target=thumbv7em-none-unknown-eabi.
+#
+# The two "non_exclusive" directories, which don't have an ExclusiveGroup at
+# all, should both be selected. So should the two "own_group", each of which
+# specifies a different value of ExclusiveGroup. But the two "exclusive", which
+# have the _same_ ExclusiveGroup value, should not: the second one wins. So we
+# expect five of these six directories to show up in the clang-cc1 command
+# line, but not testdir1_exclusive.
+
+# TESTDIR1_NON_EXCLUSIVE: "-internal-isystem" "[[SYSROOT]]/bin/../lib/clang-runtimes/testdir1_non_exclusive/include/c++/v1"
+# TESTDIR2_NON_EXCLUSIVE: "-internal-isystem" "[[SYSROOT]]/bin/../lib/clang-runtimes/testdir2_non_exclusive/include/c++/v1"
+# TESTDIR2_EXCLUSIVE: "-internal-isystem" "[[SYSROOT]]/bin/../lib/clang-runtimes/testdir2_exclusive/include/c++/v1"
+# TESTDIR1_OWN_GROUP: "-internal-isystem" "[[SYSROOT]]/bin/../lib/clang-runtimes/testdir1_own_group/include/c++/v1"
+# TESTDIR2_OWN_GROUP: "-internal-isystem" "[[SYSROOT]]/bin/../lib/clang-runtimes/testdir2_own_group/include/c++/v1"
+
+# TESTDIR1_EXCLUSIVE-NOT: "-internal-isystem" "[[SYSROOT]]/bin/../lib/clang-runtimes/testdir1_exclusive/include/c++/v1"
+
+---
+MultilibVersion: 1.0
+
+Variants:
+- Dir: testdir1_non_exclusive
+  Flags: [--target=thumbv7m-none-unknown-eabi]
+
+- Dir: testdir2_non_exclusive
+  Flags: [--target=thumbv7em-none-unknown-eabi]
+
+- Dir: testdir1_exclusive
+  Flags: [--target=thumbv7m-none-unknown-eabi]
+  ExclusiveGroup: actually_exclude_something
+
+- Dir: testdir2_exclusive
+  Flags: [--target=thumbv7em-none-unknown-eabi]
+  ExclusiveGroup: actually_exclude_something
+
+- Dir: testdir1_own_group
+  Flags: [--target=thumbv7m-none-unknown-eabi]
+  ExclusiveGroup: foo
+
+- Dir: testdir2_own_group
+  Flags: [--target=thumbv7em-none-unknown-eabi]
+  ExclusiveGroup: bar
+
+Mappings:
+- Match: --target=thumbv7em-none-unknown-eabi
+  Flags: [--target=thumbv7m-none-unknown-eabi]

@github-actions
Copy link

github-actions bot commented Oct 18, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@statham-arm statham-arm force-pushed the exclusivegroup branch 3 times, most recently from e4d860c to 2a65ae7 Compare October 24, 2023 11:08
Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but other reviewers likely have opinions

@statham-arm
Copy link
Collaborator Author

@petrhosek, do you have any further comments? I'll merge this change based on @MaskRay's approval if I haven't heard back in another week.

@petrhosek
Copy link
Member

I'm fine with the feature, my only concern is to make sure we don't unintentionally make it harder to integrate potential future extensions such as the mutually dependent groups.

The only alternative I could come up with is representing groups as first class concept with properties, such as being mutually exclusive. To give a concrete example:

Groups:
- Name: actually_exclude_something
  Exclusive: True

Variants:
- Dir: testdir1_non_exclusive
  Flags: [--target=thumbv7m-none-unknown-eabi]

- Dir: testdir2_non_exclusive
  Flags: [--target=thumbv7em-none-unknown-eabi]

- Dir: testdir1_exclusive
  Flags: [--target=thumbv7m-none-unknown-eabi]
  Group: actually_exclude_something

- Dir: testdir2_exclusive
  Flags: [--target=thumbv7em-none-unknown-eabi]
  Group: actually_exclude_something

This makes it possible to extend the group concept in the future but it's a little more verbose since you need to define the group first, although that may not necessarily be a bad thing since you could also warn if someone accidentally tries to use a group that wasn't previously defined (e.g. when making a typo).

@statham-arm
Copy link
Collaborator Author

my only concern is to make sure we don't unintentionally make it harder to integrate potential future extensions such as the mutually dependent groups.

Hmmm. So if you had both ME and MD groups, you might also need a group to be able to be a member of another group? That way you could specify hierarchies such as "must have all of: A, B, and exactly one of C,D" (a MD group one of whose members is a ME group), or "must have at most one of: (all of A,B,C) or (all of U,V,W)" (a ME group containing MD groups).

I suppose that makes sense, and the only change it needs to your structure is that maybe later a group record might also need to have a Group: or Parent: header. But there's no need to put that part in now, only to make sure there's room to add it in future if needed.

Would you accept Type: Exclusive instead of Exclusive: True? It seems more plausible to me that there might be three kinds of group that can't go together than three group-type flags that you can have in any combination.

although that may not necessarily be a bad thing since you could also warn if someone accidentally tries to use a group that wasn't previously defined (e.g. when making a typo).

That is true.

@petrhosek
Copy link
Member

Would you accept Type: Exclusive instead of Exclusive: True? It seems more plausible to me that there might be three kinds of group that can't go together than three group-type flags that you can have in any combination.

I like this as it scales better to other types of groups we might have in the future.

@statham-arm
Copy link
Collaborator Author

OK, here's a version with the syntax that way. I've added another test to demonstrate the new error checks.

The implementation of exclusion is still done by having an ExclusiveGroup field in the actual Multilib class. Implementing mutually-dependent groups or nested groups is enough extra effort that I'd rather leave it until we actually need it! But now the user-facing syntax in multilib.yaml is futureproof against wanting to add those features later, so only the implementation should need to change.

@statham-arm
Copy link
Collaborator Author

(btw, that squash! commit contains the revised commit message I plan to put on the final version, so I need to not forget to do the squash by hand to get that right)

@MaskRay
Copy link
Member

MaskRay commented Nov 27, 2023

(btw, that squash! commit contains the revised commit message I plan to put on the final version, so I need to not forget to do the squash by hand to get that right)

You may edit the first comment in this PR now, and it will be propagated to the textbox when you click "Squash and merge" :)

https://discourse.llvm.org/t/force-push-and-rebase/73766 I believe the primary value appending new commits instead of rebasing is to demonstrate that the PR has been updated in the "Commits" tab. An amend without rebasing is equally good, just requiring compare links inserted by GitHub.

This allows a YAML-based multilib configuration to specify explicitly
that a subset of its library directories are alternatives to each
other, i.e. at most one of that subset should be selected.

So if you have multiple sysroots each including a full set of headers
and libraries, you can mark them as members of the same mutually
exclusive group, and then you'll be sure that only one of them is
selected, even if two or more are compatible with the compile options.

This is particularly important in multilib setups including the libc++
headers, where selecting the include directories from two different
sysroots can cause an actual build failure. This occurs when including
<stdio.h>, for example: libc++'s stdio.h is included first, and will
try to use `#include_next` to fetch the underlying libc's version. But
if there are two include directories from separate multilibs, then
both of their C++ include directories will end up on the include path
first, followed by both the C directories. So the `#include_next` from
the first libc++ stdio.h will include the second libc++ stdio.h, which
will do nothing because it has the same include guard macro, and the
libc header won't ever be included at all.

If more than one of the options in an exclusive group matches the
given flags, the last one wins.

The syntax for specifying this in multilib.yaml is to define a Groups
section in which you specify your group names, and for each one,
declare it to have Type: Exclusive. (This reserves space in the syntax
for maybe adding other group types later, such as a group of mutually
_dependent_ things that you must have all or none of.) Then each
Variant record that's a member of a group has a Group: property giving
that group's name.
@statham-arm
Copy link
Collaborator Author

(This final force-push is the squashed version of the previous stack, rebased to the current head of main, so that the builder can run a last test. Thanks both for the approvals; I'll merge it once the tests have finished.)

@statham-arm statham-arm merged commit 8727982 into llvm:main Dec 1, 2023
@statham-arm statham-arm deleted the exclusivegroup branch December 1, 2023 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants