Skip to content

[RISCV] Relax march string order constraint #78120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jan 30, 2024

Conversation

BeMg
Copy link
Contributor

@BeMg BeMg commented Jan 15, 2024

Follow riscv-non-isa/riscv-toolchain-conventions#14 by dropping the order requirement of -march.

  1. single-letter extension can be arbitrary order
    • march=rv32iamdf
  2. single-letter extension and multi-letter extension can be mixed
    • march=rv32i_zihintntl_m_a_f_d_svinval
  3. multi-letter extension need seperate the following extension by underscore, otherwise it will be intreprete as one extension.
    • march=rv32i_zbam -> i,zbam
    • march=rv32i_zba_m -> i,zba,m

@BeMg BeMg requested review from asb, MaskRay and kito-cheng January 15, 2024 04:17
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:RISC-V clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' llvm:support labels Jan 15, 2024
@llvmbot
Copy link
Member

llvmbot commented Jan 15, 2024

@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-llvm-support

Author: Piyou Chen (BeMg)

Changes

Address the riscv-non-isa/riscv-toolchain-conventions#14

This patch relax the -march string for accept any order.

  1. single-letter extension can be arbitrary order
    • march=rv32iamdf
  2. single-letter extension and multi-letter extension can be mixed
    • march=rv32i_zihintntl_m_a_f_d_svinval
  3. multi-letter extension need seperate the following extension by underscore, otherwise it will be intreprete as one extension.
    • march=rv32i_zbam -> i,zbam
    • march=rv32i_zba_m -> i,zba,m

Patch is 20.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78120.diff

3 Files Affected:

  • (modified) clang/test/Driver/riscv-arch.c (+6-8)
  • (modified) llvm/lib/Support/RISCVISAInfo.cpp (+149-148)
  • (modified) llvm/unittests/Support/RISCVISAInfoTest.cpp (+71-20)
diff --git a/clang/test/Driver/riscv-arch.c b/clang/test/Driver/riscv-arch.c
index 0ac81ea982f1b6..38de95e4fbf7aa 100644
--- a/clang/test/Driver/riscv-arch.c
+++ b/clang/test/Driver/riscv-arch.c
@@ -156,9 +156,8 @@
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32L %s
 // RV32L: error: invalid arch name 'rv32l'
 
-// RUN: not %clang --target=riscv32-unknown-elf -march=rv32imadf -### %s \
-// RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32IMADF %s
-// RV32IMADF: error: invalid arch name 'rv32imadf'
+// RUN: %clang --target=riscv32-unknown-elf -march=rv32imadf -### %s \
+// RUN: -fsyntax-only 2>&1 | FileCheck %s
 
 // RUN: not %clang --target=riscv32-unknown-elf -march=rv32imm -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32IMM %s
@@ -184,9 +183,8 @@
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV64L %s
 // RV64L: error: invalid arch name 'rv64l'
 
-// RUN: not %clang --target=riscv64-unknown-elf -march=rv64imadf -### %s \
-// RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV64IMADF %s
-// RV64IMADF: error: invalid arch name 'rv64imadf'
+// RUN: %clang --target=riscv64-unknown-elf -march=rv64imadf -### %s \
+// RUN: -fsyntax-only 2>&1 | FileCheck %s
 
 // RUN: not %clang --target=riscv64-unknown-elf -march=rv64imm -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV64IMM %s
@@ -216,7 +214,7 @@
 // RUN: not %clang --target=riscv32-unknown-elf -march=rv32imcq -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32-ORDER %s
 // RV32-ORDER: error: invalid arch name 'rv32imcq',
-// RV32-ORDER: standard user-level extension not given in canonical order 'q'
+// RV32-ORDER: unsupported standard user-level extension 'q'
 
 // RUN: not %clang --target=riscv32-unknown-elf -march=rv32izvl64b -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32-ZVL64B-ER %s
@@ -318,7 +316,7 @@
 // RUN: not %clang --target=riscv32-unknown-elf -march=rv32ixabc_a -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32-PREFIX %s
 // RV32-PREFIX: error: invalid arch name 'rv32ixabc_a',
-// RV32-PREFIX: invalid extension prefix 'a'
+// RV32-PREFIX: unsupported non-standard user-level extension 'xabc'
 
 // RUN: not %clang --target=riscv32-unknown-elf -march=rv32ixdef_sabc -### %s \
 // RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32-X-ORDER %s
diff --git a/llvm/lib/Support/RISCVISAInfo.cpp b/llvm/lib/Support/RISCVISAInfo.cpp
index 390d950486a795..865ca48aeb90d1 100644
--- a/llvm/lib/Support/RISCVISAInfo.cpp
+++ b/llvm/lib/Support/RISCVISAInfo.cpp
@@ -695,6 +695,106 @@ RISCVISAInfo::parseNormalizedArchString(StringRef Arch) {
   return std::move(ISAInfo);
 }
 
+static Error splitExtsByUnderscore(StringRef Exts,
+                                   std::vector<std::string> &SplitedExts) {
+  SmallVector<StringRef, 8> Split;
+  if (Exts.empty())
+    return Error::success();
+
+  Exts.split(Split, "_");
+
+  for (auto Ext : Split) {
+    if (Ext.empty())
+      return createStringError(errc::invalid_argument,
+                               "extension name missing after separator '_'");
+
+    SplitedExts.push_back(Ext.str());
+  }
+  return Error::success();
+}
+
+static Error processMultiLetterExtension(
+    StringRef RawExt, SmallVector<std::string, 8> &SeenExts,
+    SmallVector<RISCVISAInfo::ExtensionVersion, 8> &ExtsVersion,
+    bool IgnoreUnknown, bool EnableExperimentalExtension,
+    bool ExperimentalExtensionVersionCheck) {
+  StringRef Type = getExtensionType(RawExt);
+  StringRef Desc = getExtensionTypeDesc(RawExt);
+  auto Pos = findLastNonVersionCharacter(RawExt) + 1;
+  StringRef Name(RawExt.substr(0, Pos));
+  StringRef Vers(RawExt.substr(Pos));
+
+  if (Type.empty()) {
+    if (IgnoreUnknown)
+      return Error::success();
+    return createStringError(errc::invalid_argument,
+                             "invalid extension prefix '" + RawExt + "'");
+  }
+
+  if (!IgnoreUnknown && Name.size() == Type.size())
+    return createStringError(errc::invalid_argument,
+                             "%s name missing after '%s'", Desc.str().c_str(),
+                             Type.str().c_str());
+
+  unsigned Major, Minor, ConsumeLength;
+  if (auto E = getExtensionVersion(Name, Vers, Major, Minor, ConsumeLength,
+                                   EnableExperimentalExtension,
+                                   ExperimentalExtensionVersionCheck)) {
+    if (IgnoreUnknown) {
+      consumeError(std::move(E));
+      return Error::success();
+    }
+    return E;
+  }
+
+  // Check if duplicated extension.
+  if (!IgnoreUnknown && llvm::is_contained(SeenExts, Name))
+    return createStringError(errc::invalid_argument, "duplicated %s '%s'",
+                             Desc.str().c_str(), Name.str().c_str());
+
+  if (IgnoreUnknown && !RISCVISAInfo::isSupportedExtension(Name))
+    return Error::success();
+
+  SeenExts.push_back(Name.str());
+  ExtsVersion.push_back({Major, Minor});
+  return Error::success();
+}
+
+static Error processSingleLetterExtension(
+    StringRef &RawExt, SmallVector<std::string, 8> &SeenExts,
+    SmallVector<RISCVISAInfo::ExtensionVersion, 8> &ExtsVersion,
+    bool IgnoreUnknown, bool EnableExperimentalExtension,
+    bool ExperimentalExtensionVersionCheck) {
+  unsigned Major, Minor, ConsumeLength;
+  StringRef Name = RawExt.take_front(1);
+  RawExt.consume_front(Name);
+  if (auto E = getExtensionVersion(Name, RawExt, Major, Minor, ConsumeLength,
+                                   EnableExperimentalExtension,
+                                   ExperimentalExtensionVersionCheck)) {
+    if (IgnoreUnknown) {
+      consumeError(std::move(E));
+      RawExt = RawExt.substr(ConsumeLength);
+      return Error::success();
+    }
+    return E;
+  }
+
+  RawExt = RawExt.substr(ConsumeLength);
+
+  // Check if duplicated extension.
+  if (!IgnoreUnknown && llvm::is_contained(SeenExts, Name))
+    return createStringError(errc::invalid_argument,
+                             "duplicated standard user-level extension '%s'",
+                             Name.str().c_str());
+
+  if (IgnoreUnknown && !RISCVISAInfo::isSupportedExtension(Name))
+    return Error::success();
+
+  SeenExts.push_back(Name.str());
+  ExtsVersion.push_back({Major, Minor});
+  return Error::success();
+}
+
 llvm::Expected<std::unique_ptr<RISCVISAInfo>>
 RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
                               bool ExperimentalExtensionVersionCheck,
@@ -715,6 +815,8 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
 
   unsigned XLen = HasRV64 ? 64 : 32;
   std::unique_ptr<RISCVISAInfo> ISAInfo(new RISCVISAInfo(XLen));
+  SmallVector<std::string, 8> SeenExts;
+  SmallVector<RISCVISAInfo::ExtensionVersion, 8> ExtsVersion;
 
   // The canonical order specified in ISA manual.
   // Ref: Table 22.1 in RISC-V User-Level ISA V2.2
@@ -745,17 +847,6 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
   // Skip rvxxx
   StringRef Exts = Arch.substr(5);
 
-  // Remove multi-letter standard extensions, non-standard extensions and
-  // supervisor-level extensions. They have 'z', 'x', 's' prefixes.
-  // Parse them at the end.
-  // Find the very first occurrence of 's', 'x' or 'z'.
-  StringRef OtherExts;
-  size_t Pos = Exts.find_first_of("zsx");
-  if (Pos != StringRef::npos) {
-    OtherExts = Exts.substr(Pos);
-    Exts = Exts.substr(0, Pos);
-  }
-
   unsigned Major, Minor, ConsumeLength;
   if (Baseline == 'g') {
     // Versions for g are disallowed, and this was checked for previously.
@@ -765,9 +856,11 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
     // version since the we don't have clear version scheme for that on
     // ISA spec.
     for (const auto *Ext : RISCVGImplications) {
-      if (auto Version = findDefaultVersion(Ext))
-        ISAInfo->addExtension(Ext, *Version);
-      else
+      if (auto Version = findDefaultVersion(Ext)) {
+        // Postpone AddExtension until end of this function
+        SeenExts.push_back(Ext);
+        ExtsVersion.push_back({Version->Major, Version->Minor});
+      } else
         llvm_unreachable("Default extension version not found?");
     }
   } else {
@@ -785,7 +878,9 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
       Minor = Version->Minor;
     }
 
-    ISAInfo->addExtension(StringRef(&Baseline, 1), {Major, Minor});
+    // Postpone AddExtension until end of this function
+    SeenExts.push_back(StringRef(&Baseline, 1).str());
+    ExtsVersion.push_back({Major, Minor});
   }
 
   // Consume the base ISA version number and any '_' between rvxxx and the
@@ -793,145 +888,51 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,
   Exts = Exts.drop_front(ConsumeLength);
   Exts.consume_front("_");
 
-  auto StdExtsItr = StdExts.begin();
-  auto StdExtsEnd = StdExts.end();
-  auto GoToNextExt = [](StringRef::iterator &I, unsigned ConsumeLength,
-                        StringRef::iterator E) {
-    I += 1 + ConsumeLength;
-    if (I != E && *I == '_')
-      ++I;
-  };
-  for (auto I = Exts.begin(), E = Exts.end(); I != E;) {
-    char C = *I;
-
-    // Check ISA extensions are specified in the canonical order.
-    while (StdExtsItr != StdExtsEnd && *StdExtsItr != C)
-      ++StdExtsItr;
-
-    if (StdExtsItr == StdExtsEnd) {
-      // Either c contains a valid extension but it was not given in
-      // canonical order or it is an invalid extension.
-      if (StdExts.contains(C)) {
-        return createStringError(
-            errc::invalid_argument,
-            "standard user-level extension not given in canonical order '%c'",
-            C);
-      }
-
-      return createStringError(errc::invalid_argument,
-                               "invalid standard user-level extension '%c'", C);
-    }
-
-    // Move to next char to prevent repeated letter.
-    ++StdExtsItr;
-
-    StringRef Next;
-    unsigned Major, Minor, ConsumeLength;
-    if (std::next(I) != E)
-      Next = StringRef(std::next(I), E - std::next(I));
-    if (auto E = getExtensionVersion(StringRef(&C, 1), Next, Major, Minor,
-                                     ConsumeLength, EnableExperimentalExtension,
-                                     ExperimentalExtensionVersionCheck)) {
-      if (IgnoreUnknown) {
-        consumeError(std::move(E));
-        GoToNextExt(I, ConsumeLength, Exts.end());
-        continue;
-      }
-      return std::move(E);
-    }
-
-    // The order is OK, then push it into features.
-    // Currently LLVM supports only "mafdcvh".
-    if (!isSupportedExtension(StringRef(&C, 1))) {
-      if (IgnoreUnknown) {
-        GoToNextExt(I, ConsumeLength, Exts.end());
-        continue;
-      }
-      return createStringError(errc::invalid_argument,
-                               "unsupported standard user-level extension '%c'",
-                               C);
-    }
-    ISAInfo->addExtension(StringRef(&C, 1), {Major, Minor});
-
-    // Consume full extension name and version, including any optional '_'
-    // between this extension and the next
-    GoToNextExt(I, ConsumeLength, Exts.end());
-  }
-
-  // Handle other types of extensions other than the standard
-  // general purpose and standard user-level extensions.
-  // Parse the ISA string containing non-standard user-level
-  // extensions, standard supervisor-level extensions and
-  // non-standard supervisor-level extensions.
-  // These extensions start with 'z', 's', 'x' prefixes, might have a version
-  // number (major, minor) and are separated by a single underscore '_'. We do
-  // not enforce a canonical order for them.
-  // Set the hardware features for the extensions that are supported.
-
-  // Multi-letter extensions are seperated by a single underscore
-  // as described in RISC-V User-Level ISA V2.2.
-  SmallVector<StringRef, 8> Split;
-  OtherExts.split(Split, '_');
-
-  SmallVector<StringRef, 8> AllExts;
-  if (Split.size() > 1 || Split[0] != "") {
-    for (StringRef Ext : Split) {
-      if (Ext.empty())
-        return createStringError(errc::invalid_argument,
-                                 "extension name missing after separator '_'");
-
-      StringRef Type = getExtensionType(Ext);
-      StringRef Desc = getExtensionTypeDesc(Ext);
-      auto Pos = findLastNonVersionCharacter(Ext) + 1;
-      StringRef Name(Ext.substr(0, Pos));
-      StringRef Vers(Ext.substr(Pos));
-
-      if (Type.empty()) {
-        if (IgnoreUnknown)
-          continue;
-        return createStringError(errc::invalid_argument,
-                                 "invalid extension prefix '" + Ext + "'");
-      }
-
-      if (!IgnoreUnknown && Name.size() == Type.size()) {
+  std::vector<std::string> SplitedExts;
+  if (auto E = splitExtsByUnderscore(Exts, SplitedExts))
+    return std::move(E);
+
+  for (auto Ext : SplitedExts) {
+    StringRef CurrExt = Ext;
+    while (!CurrExt.empty()) {
+      if (AllStdExts.contains(CurrExt.front())) {
+        if (auto E = processSingleLetterExtension(
+                CurrExt, SeenExts, ExtsVersion, IgnoreUnknown,
+                EnableExperimentalExtension, ExperimentalExtensionVersionCheck))
+          return E;
+      } else if (CurrExt.front() == 'z' || CurrExt.front() == 's' ||
+                 CurrExt.front() == 'x') {
+        if (auto E = processMultiLetterExtension(
+                CurrExt, SeenExts, ExtsVersion, IgnoreUnknown,
+                EnableExperimentalExtension, ExperimentalExtensionVersionCheck))
+          return E;
+        // Multi-letter extension must be seperate following extension with
+        // underscore
+        break;
+      } else {
+        // FIXME: Could it be ignored by IgnoreUnknown?
         return createStringError(errc::invalid_argument,
-                                 "%s name missing after '%s'",
-                                 Desc.str().c_str(), Type.str().c_str());
-      }
-
-      unsigned Major, Minor, ConsumeLength;
-      if (auto E = getExtensionVersion(Name, Vers, Major, Minor, ConsumeLength,
-                                       EnableExperimentalExtension,
-                                       ExperimentalExtensionVersionCheck)) {
-        if (IgnoreUnknown) {
-          consumeError(std::move(E));
-          continue;
-        }
-        return std::move(E);
-      }
-
-      // Check if duplicated extension.
-      if (!IgnoreUnknown && llvm::is_contained(AllExts, Name)) {
-        return createStringError(errc::invalid_argument, "duplicated %s '%s'",
-                                 Desc.str().c_str(), Name.str().c_str());
+                                 "invalid standard user-level extension '%c'",
+                                 CurrExt.front());
       }
-
-      if (IgnoreUnknown && !isSupportedExtension(Name))
-        continue;
-
-      ISAInfo->addExtension(Name, {Major, Minor});
-      // Extension format is correct, keep parsing the extensions.
-      // TODO: Save Type, Name, Major, Minor to avoid parsing them later.
-      AllExts.push_back(Name);
     }
   }
 
-  for (auto Ext : AllExts) {
-    if (!isSupportedExtension(Ext)) {
-      StringRef Desc = getExtensionTypeDesc(getExtensionType(Ext));
-      return createStringError(errc::invalid_argument, "unsupported %s '%s'",
-                               Desc.str().c_str(), Ext.str().c_str());
+  // Check all Extensions are supported.
+  for (size_t Idx = 0; Idx < SeenExts.size(); Idx++) {
+    if (!RISCVISAInfo::isSupportedExtension(SeenExts[Idx])) {
+      if (SeenExts[Idx].size() == 1) {
+        return createStringError(
+            errc::invalid_argument,
+            "unsupported standard user-level extension '%s'",
+            SeenExts[Idx].c_str());
+      }
+      return createStringError(
+          errc::invalid_argument, "unsupported %s '%s'",
+          getExtensionTypeDesc(SeenExts[Idx]).str().c_str(),
+          SeenExts[Idx].c_str());
     }
+    ISAInfo->addExtension(SeenExts[Idx], ExtsVersion[Idx]);
   }
 
   return RISCVISAInfo::postProcessAndChecking(std::move(ISAInfo));
diff --git a/llvm/unittests/Support/RISCVISAInfoTest.cpp b/llvm/unittests/Support/RISCVISAInfoTest.cpp
index 997551e5c44c09..41d40f143eb391 100644
--- a/llvm/unittests/Support/RISCVISAInfoTest.cpp
+++ b/llvm/unittests/Support/RISCVISAInfoTest.cpp
@@ -203,24 +203,6 @@ TEST(ParseArchString, AcceptsSupportedBaseISAsAndSetsXLenAndFLen) {
   EXPECT_EQ(InfoRV64G.getFLen(), 64U);
 }
 
-TEST(ParseArchString, RequiresCanonicalOrderForSingleLetterExtensions) {
-  EXPECT_EQ(
-      toString(RISCVISAInfo::parseArchString("rv64idf", true).takeError()),
-      "standard user-level extension not given in canonical order 'f'");
-  EXPECT_EQ(
-      toString(RISCVISAInfo::parseArchString("rv32iam", true).takeError()),
-      "standard user-level extension not given in canonical order 'm'");
-  EXPECT_EQ(
-      toString(
-          RISCVISAInfo::parseArchString("rv32i_zfinx_a", true).takeError()),
-      "invalid extension prefix 'a'");
-  // Canonical ordering not required for z*, s*, and x* extensions.
-  EXPECT_THAT_EXPECTED(
-      RISCVISAInfo::parseArchString(
-          "rv64imafdc_xsfvcp_zicsr_xtheadba_svnapot_zawrs", true),
-      Succeeded());
-}
-
 TEST(ParseArchString, RejectsUnrecognizedExtensionNamesByDefault) {
   EXPECT_EQ(toString(RISCVISAInfo::parseArchString("rv64ib", true).takeError()),
             "unsupported standard user-level extension 'b'");
@@ -337,10 +319,79 @@ TEST(ParseArchString, AcceptsUnderscoreSplittingExtensions) {
   }
 }
 
+TEST(ParseArchString, AcceptsRelaxSingleLetterExtensions) {
+  for (StringRef Input :
+       {"rv32imfad", "rv32im_fa_d", "rv32im2p0fad", "rv32i2p1m2p0fad"}) {
+    auto MaybeISAInfo = RISCVISAInfo::parseArchString(Input, true);
+    ASSERT_THAT_EXPECTED(MaybeISAInfo, Succeeded());
+    RISCVISAInfo::OrderedExtensionMap Exts = (*MaybeISAInfo)->getExtensions();
+    EXPECT_EQ(Exts.size(), 6UL);
+    EXPECT_EQ(Exts.count("i"), 1U);
+    EXPECT_EQ(Exts.count("m"), 1U);
+    EXPECT_EQ(Exts.count("f"), 1U);
+    EXPECT_EQ(Exts.count("a"), 1U);
+    EXPECT_EQ(Exts.count("d"), 1U);
+    EXPECT_EQ(Exts.count("zicsr"), 1U);
+  }
+}
+
+TEST(ParseArchString, AcceptsRelaxMixedLetterExtensions) {
+  for (StringRef Input :
+       {"rv32i_zihintntl_m_a_f_d_svinval", "rv32izihintntl_mafdsvinval",
+        "rv32i_zihintntl_mafd_svinval"}) {
+    auto MaybeISAInfo = RISCVISAInfo::parseArchString(Input, true);
+    ASSERT_THAT_EXPECTED(MaybeISAInfo, Succeeded());
+    RISCVISAInfo::OrderedExtensionMap Exts = (*MaybeISAInfo)->getExtensions();
+    EXPECT_EQ(Exts.size(), 8UL);
+    EXPECT_EQ(Exts.count("i"), 1U);
+    EXPECT_EQ(Exts.count("m"), 1U);
+    EXPECT_EQ(Exts.count("a"), 1U);
+    EXPECT_EQ(Exts.count("f"), 1U);
+    EXPECT_EQ(Exts.count("d"), 1U);
+    EXPECT_EQ(Exts.count("zihintntl"), 1U);
+    EXPECT_EQ(Exts.count("svinval"), 1U);
+    EXPECT_EQ(Exts.count("zicsr"), 1U);
+  }
+}
+
+TEST(ParseArchString, AcceptsAmbiguousFromRelaxExtensions) {
+  for (StringRef Input : {"rv32i_zba_m", "rv32izba_m", "rv32izba1p0_m2p0"}) {
+    auto MaybeISAInfo = RISCVISAInfo::parseArchString(Input, true);
+    ASSERT_THAT_EXPECTED(MaybeISAInfo, Succeeded());
+    RISCVISAInfo::OrderedExtensionMap Exts = (*MaybeISAInfo)->getExtensions();
+    EXPECT_EQ(Exts.size(), 3UL);
+    EXPECT_EQ(Exts.count("i"), 1U);
+    EXPECT_EQ(Exts.count("zba"), 1U);
+    EXPECT_EQ(Exts.count("m"), 1U);
+  }
+  for (StringRef Input :
+       {"rv32ia_zba_m", "rv32iazba_m", "rv32ia2p1zba1p0_m2p0"}) {
+    auto MaybeISAInfo = RISCVISAInfo::parseArchString(Input, true);
+    ASSERT_THAT_EXPECTED(MaybeISAInfo, Succeeded());
+    RISCVISAInfo::OrderedExtensionMap Exts = (*MaybeISAInfo)->getExtensions();
+    EXPECT_EQ(Exts.size(), 4UL);
+    EXPECT_EQ(Exts.count("i"), 1U);
+    EXPECT_EQ(Exts.count("zba"), 1U);
+    EXPECT_EQ(Exts.count("m"), 1U);
+    EXPECT_EQ(Exts.count("a"), 1U);
+  }
+}
+
+TEST(ParseArchString,
+     RejectsMultiLetterExtensionFollowBySingleLetterExtensions) {
+  for (StringRef Input : {"rv32izbam", "rv32i_zbam"})
+    EXPECT_EQ(toString(RISCVISAInfo::parseArchString(Input, true).takeError()),
+              "unsupported standard user-level extension 'zbam'");
+  EXPECT_EQ(
+      toString(
+          RISCVISAInfo::parseArchString("rv32i_zba1p0m", true).takeError()),
+      "unsupported standard user-level ...
[truncated]

@asb
Copy link
Contributor

asb commented Jan 16, 2024

Just to check, can someone confirm if gcc is now handling ISA strings in this way too? I do think this is the right direction to go but haven't done a detailed code review of this approach yet.

@kito-cheng
Copy link
Member

I am working on GCC part[1], and it's still under review, also @BeMg is working very closely with me :)

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642151.html

// RUN: -fsyntax-only 2>&1 | FileCheck -check-prefix=RV32IMADF %s
// RV32IMADF: error: invalid arch name 'rv32imadf'
// RUN: %clang --target=riscv32-unknown-elf -march=rv32imadf -### %s \
// RUN: -fsyntax-only 2>&1 | FileCheck %s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention is to indent continuation lines by 2 spaces. Some tests do not follow the convention. You can make them follow the convention as you are modifying lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add two more space for some RUN LINE.

// Parse the ISA string containing non-standard user-level
// extensions, standard supervisor-level extensions and
// non-standard supervisor-level extensions.
// These extensions start with 'z', 's', 'x' prefixes, might have a version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment appears to be removed. Consider retaining it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment has been restored.

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaskRay
Copy link
Member

MaskRay commented Jan 19, 2024

Address the riscv-non-isa/riscv-toolchain-conventions#14

This patch relax the -march string for accept any order.

Perhaps:

Follow riscv-non-isa/riscv-toolchain-conventions#14 by dropping the order requirement of -march.

}

TEST(ParseArchString, AcceptsAmbiguousFromRelaxExtensions) {
for (StringRef Input : {"rv32i_zba_m", "rv32izba_m", "rv32izba1p0_m2p0"}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider another test where i is after zba

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add three testcases

EXPECT_EQ(
    toString(RISCVISAInfo::parseArchString("rv32zba_im", true).takeError()),
    "first letter should be 'e', 'i' or 'g'");
EXPECT_EQ(
    toString(RISCVISAInfo::parseArchString("rv32izbai_m", true).takeError()),
    "unsupported standard user-level extension 'zbai'");
EXPECT_EQ(
    toString(RISCVISAInfo::parseArchString("rv32izbaim", true).takeError()),
    "unsupported standard user-level extension 'zbaim'");

@@ -695,6 +695,106 @@ RISCVISAInfo::parseNormalizedArchString(StringRef Arch) {
return std::move(ISAInfo);
}

static Error splitExtsByUnderscore(StringRef Exts,
std::vector<std::string> &SplitedExts) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SplitExts

Splitted has two ts but I don't think the ed is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -715,6 +815,8 @@ RISCVISAInfo::parseArchString(StringRef Arch, bool EnableExperimentalExtension,

unsigned XLen = HasRV64 ? 64 : 32;
std::unique_ptr<RISCVISAInfo> ISAInfo(new RISCVISAInfo(XLen));
SmallVector<std::string, 8> SeenExts;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe SeenExts should be a SetVector to make it more efficient to check for duplicates?

Or maybe a MapVector storing the version as the value so we don't need ExtsVersion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use MapVector instead of SmallVector

@BeMg BeMg requested review from MaskRay and topperc January 23, 2024 13:26
}

// Check if duplicated extension.
if (!IgnoreUnknown && (SeenExtMap.find(Name.str()) != SeenExtMap.end()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SeenExtMap.contains(Name.str())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

RawExt = RawExt.substr(ConsumeLength);

// Check if duplicated extension.
if (!IgnoreUnknown && (SeenExtMap.find(Name.str()) != SeenExtMap.end()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SeenExtMap.contains

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@BeMg BeMg requested a review from topperc January 24, 2024 03:09
@BeMg BeMg force-pushed the relax-riscv-march-order branch from 1873aec to 06ac457 Compare January 25, 2024 09:13
@BeMg BeMg requested a review from topperc January 25, 2024 09:27
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BeMg BeMg force-pushed the relax-riscv-march-order branch 2 times, most recently from 4a207be to a86f982 Compare January 29, 2024 10:08
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BeMg BeMg force-pushed the relax-riscv-march-order branch from a86f982 to 0accb78 Compare January 30, 2024 00:49
@BeMg BeMg merged commit d09082f into llvm:main Jan 30, 2024
@BeMg BeMg deleted the relax-riscv-march-order branch January 30, 2024 06:33
joker-eph added a commit that referenced this pull request Jan 30, 2024
Reverts #78120

Buildbot is broken:

llvm/lib/Support/RISCVISAInfo.cpp:910:18: error: call to deleted
constructor of 'llvm::Error'
          return E;
                 ^
@joker-eph
Copy link
Collaborator

Sorry, had to revert because this broke a buildbot:

llvm/lib/Support/RISCVISAInfo.cpp:910:18: error: call to deleted constructor of 'llvm::Error'
return E;
^

https://lab.llvm.org/buildbot/#/builders/61/builds/53704

topperc pushed a commit that referenced this pull request Jan 30, 2024
With std::move added to fix build bot failure.

Original commit message:

Follow
riscv-non-isa/riscv-toolchain-conventions#14 by
dropping the order requirement of `-march`.

1. single-letter extension can be arbitrary order
    - march=rv32iamdf
2. single-letter extension and multi-letter extension can be mixed
    - march=rv32i_zihintntl_m_a_f_d_svinval
3. multi-letter extension need seperate the following extension by
underscore, otherwise it will be intreprete as one extension.
    - march=rv32i_zbam -> i,zbam
    - march=rv32i_zba_m -> i,zba,m
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category llvm:support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants