Skip to content

[RISCV] Allow extra underscores in parseNormalizedArchString and parseArchString. #91532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented May 8, 2024

Allow double underscores and trailing underscores. gcc and binutils allow extra underscores without error.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels May 8, 2024
@llvmbot
Copy link
Member

llvmbot commented May 8, 2024

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-clang-driver

Author: Craig Topper (topperc)

Changes

Allow double underscores and trailing underscores. gcc and binutils allow extra underscores without error.


Full diff: https://github.com/llvm/llvm-project/pull/91532.diff

3 Files Affected:

  • (modified) clang/test/Driver/riscv-arch.c (-5)
  • (modified) llvm/lib/TargetParser/RISCVISAInfo.cpp (+3-26)
  • (modified) llvm/unittests/TargetParser/RISCVISAInfoTest.cpp (+1-13)
diff --git a/clang/test/Driver/riscv-arch.c b/clang/test/Driver/riscv-arch.c
index ddf617bbb6237..418d8e91595de 100644
--- a/clang/test/Driver/riscv-arch.c
+++ b/clang/test/Driver/riscv-arch.c
@@ -308,11 +308,6 @@
 // RV32-SMINOR0: error: invalid arch name 'rv32ist2p0',
 // RV32-SMINOR0: unsupported standard supervisor-level extension 'st'
 
-// RUN: not %clang --target=riscv32-unknown-elf -march=rv32ixabc_ -### %s \
-// RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32-XSEP %s
-// RV32-XSEP: error: invalid arch name 'rv32ixabc_',
-// RV32-XSEP: extension name missing after separator '_'
-
 // RUN: not %clang --target=riscv32-unknown-elf -march=rv32ixabc_a -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32-PREFIX %s
 // RV32-PREFIX: error: invalid arch name 'rv32ixabc_a',
diff --git a/llvm/lib/TargetParser/RISCVISAInfo.cpp b/llvm/lib/TargetParser/RISCVISAInfo.cpp
index 96590745b2ebc..dda2eeb515666 100644
--- a/llvm/lib/TargetParser/RISCVISAInfo.cpp
+++ b/llvm/lib/TargetParser/RISCVISAInfo.cpp
@@ -452,7 +452,7 @@ RISCVISAInfo::parseNormalizedArchString(StringRef Arch) {
   // and separated by _. Split by _ and then extract the name and version
   // information for each extension.
   SmallVector<StringRef, 8> Split;
-  Arch.split(Split, '_');
+  Arch.split(Split, '_', /*MaxSplit=*/-1, /*KeepEmpty=*/false);
   for (StringRef Ext : Split) {
     StringRef Prefix, MinorVersionStr;
     std::tie(Prefix, MinorVersionStr) = Ext.rsplit('p');
@@ -500,24 +500,6 @@ RISCVISAInfo::parseNormalizedArchString(StringRef Arch) {
   return std::move(ISAInfo);
 }
 
-static Error splitExtsByUnderscore(StringRef Exts,
-                                   std::vector<std::string> &SplitExts) {
-  SmallVector<StringRef, 8> Split;
-  if (Exts.empty())
-    return Error::success();
-
-  Exts.split(Split, "_");
-
-  for (auto Ext : Split) {
-    if (Ext.empty())
-      return createStringError(errc::invalid_argument,
-                               "extension name missing after separator '_'");
-
-    SplitExts.push_back(Ext.str());
-  }
-  return Error::success();
-}
-
 static Error processMultiLetterExtension(
     StringRef RawExt,
     MapVector<std::string, RISCVISAUtils::ExtensionVersion,
@@ -669,10 +651,6 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
     break;
   }
 
-  if (Arch.back() == '_')
-    return createStringError(errc::invalid_argument,
-                             "extension name missing after separator '_'");
-
   // Skip baseline.
   StringRef Exts = Arch.drop_front(1);
 
@@ -714,9 +692,8 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
   Exts = Exts.drop_front(ConsumeLength);
   Exts.consume_front("_");
 
-  std::vector<std::string> SplitExts;
-  if (auto E = splitExtsByUnderscore(Exts, SplitExts))
-    return std::move(E);
+  SmallVector<StringRef, 8> SplitExts;
+  Exts.split(SplitExts, '_', /*MaxSplit=*/-1, /*KeepEmpty=*/false);
 
   for (auto &Ext : SplitExts) {
     StringRef CurrExt = Ext;
diff --git a/llvm/unittests/TargetParser/RISCVISAInfoTest.cpp b/llvm/unittests/TargetParser/RISCVISAInfoTest.cpp
index f9e386a85fea8..95a03b2a90ec6 100644
--- a/llvm/unittests/TargetParser/RISCVISAInfoTest.cpp
+++ b/llvm/unittests/TargetParser/RISCVISAInfoTest.cpp
@@ -39,7 +39,7 @@ TEST(ParseNormalizedArchString, RejectsInvalidBaseISA) {
 
 TEST(ParseNormalizedArchString, RejectsMalformedInputs) {
   for (StringRef Input :
-       {"rv64i2p0_", "rv32i2p0__a2p0", "rv64e2p", "rv32i", "rv64ip1"}) {
+       {"rv64e2p", "rv32i", "rv64ip1"}) {
     EXPECT_EQ(
         toString(RISCVISAInfo::parseNormalizedArchString(Input).takeError()),
         "extension lacks version in expected format");
@@ -518,18 +518,6 @@ TEST(ParseArchString,
       "unsupported standard user-level extension 'zba1p0m'");
 }
 
-TEST(ParseArchString, RejectsDoubleOrTrailingUnderscore) {
-  EXPECT_EQ(
-      toString(RISCVISAInfo::parseArchString("rv64i__m", true).takeError()),
-      "extension name missing after separator '_'");
-
-  for (StringRef Input :
-       {"rv32ezicsr__zifencei", "rv32i_", "rv32izicsr_", "rv64im_"}) {
-    EXPECT_EQ(toString(RISCVISAInfo::parseArchString(Input, true).takeError()),
-              "extension name missing after separator '_'");
-  }
-}
-
 TEST(ParseArchString, RejectsDuplicateExtensionNames) {
   EXPECT_EQ(toString(RISCVISAInfo::parseArchString("rv64ii", true).takeError()),
             "invalid standard user-level extension 'i'");

Copy link

github-actions bot commented May 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 8, 2024

This one seems lazy, avoiding trailing or duplicate underscores is easy.

@topperc
Copy link
Collaborator Author

topperc commented May 8, 2024

This one seems lazy, avoiding trailing or duplicate underscores is easy.

Yeah I don't really expect anyone to use it. It seemed like an unnecessary complication in our parsing code.

I was considering getting rid of split call before the loop and make the loop work directly on the raw string. Not having to care about extra underscores makes that easier.

topperc added a commit that referenced this pull request May 9, 2024
…SCVISAInfo::parseArchString. NFC (#91538)

We can use a SmallVector<StringRef>.

Adjust the code so we check for empty strings in the loop instead of
making a copy of the vector returned from StringRef::split.

This overlaps with #91532 which also removed the std::vector, but
that PR may be more controversial.
…eArchString.

Allow double underscores and trailing underscores. gcc and binutils
allow extra underscores without error.
@topperc topperc force-pushed the pr/extra-underscores branch from 25a6786 to 0f5fcf5 Compare May 9, 2024 05:31
@topperc topperc closed this May 9, 2024
@topperc topperc deleted the pr/extra-underscores branch May 9, 2024 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants