[Clang] Allow raw string literals in C as an extension #88265

Merged
Sirraide merged 11 commits into llvm:main from raw-string-literals-ext
Jul 10, 2024

Conversation

Sirraide
Member

This is a tentative implementation of support for raw string literals in C following the discussion on #85703.

GCC supports raw string literals in C in -std=gnuXY mode. This PR both enables raw string literals in -std=gnuXY mode in C and adds a -f[no-]raw-string-literals flag to override this behaviour. There are a few questions I still have though:

  1. GCC does not seem to support raw string literals in C++ before C++11, even if e.g. -std=gnu++03 is passed. Should we follow this behaviour or should we enable raw string literals in earlier C++ language modes as well if -std=gnu++XY is passed? -fraw-string-literals currently makes it possible to enable them in e.g. C++03.
  2. -fno-raw-string-literals allows users to disable raw string literals in -std=gnuXY mode. I thought it might be useful to have this, but do we want it?
  3. The implementation of this currently adds a RawStringLiterals option to the LangOpts; -f[no-]raw-string-literals overrides the default value for it which depends on the language standard. As a consequence, passing e.g. -std=c++11 -fno-raw-string-literals will disable raw string literals even though we’re in C++11 mode. Do we want to allow this or should we just ignore -f[no-]raw-string-literals if we’re in C++11 or later?
  4. This probably deserves a note in LanguageExtensions.rst, but I’m not exactly sure where.
  5. Should we add a flag for this to __has_feature/__has_extension?
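
For readers less familiar with the syntax under discussion, here is a minimal sketch of what a raw string literal does. It is written as C++11, where the feature is standard, so it compiles without the extension; the helper name `rawMatchesEscaped` is made up for illustration. With this PR the same `R"delim(...)delim"` spelling is meant to be accepted in C when the option is on.

```cpp
#include <cstring>

// Hypothetical helper: compares a raw literal with its escaped spelling.
inline bool rawMatchesEscaped() {
  // Between R"x( and )x" backslashes and quotes are taken literally.
  const char *Raw = R"x(C:\tmp\"a")x";
  // The same ten characters written with ordinary escapes.
  const char *Escaped = "C:\\tmp\\\"a\"";
  return std::strcmp(Raw, Escaped) == 0;
}
```

The delimiter (`x` here) is arbitrary; it only has to be chosen so that `)x"` does not occur inside the body.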

@Sirraide Sirraide added clang:frontend Language frontend issues, e.g. anything involving "Sema" extension:gnu labels Apr 10, 2024
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang-format labels Apr 10, 2024
@llvmbot
Member

llvmbot commented Apr 10, 2024

@llvm/pr-subscribers-clang-format

@llvm/pr-subscribers-clang-driver

Author: None (Sirraide)

Full diff: https://github.com/llvm/llvm-project/pull/88265.diff

9 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/include/clang/Basic/LangOptions.def (+2)
  • (modified) clang/include/clang/Basic/LangStandard.h (+6)
  • (modified) clang/include/clang/Driver/Options.td (+6)
  • (modified) clang/lib/Basic/LangOptions.cpp (+1)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+2)
  • (modified) clang/lib/Format/Format.cpp (+1)
  • (modified) clang/lib/Lex/Lexer.cpp (+5-5)
  • (added) clang/test/Lexer/raw-string-ext.c (+18)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index f96cebbde3d825..20d14130fb62bc 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -43,6 +43,9 @@ code bases.
 C/C++ Language Potentially Breaking Changes
 -------------------------------------------
 
+- Clang now supports raw string literals in ``-std=gnuXY`` mode as an extension in
+  C. This behaviour can also be overridden using ``-f[no-]raw-string-literals``.
+
 C++ Specific Potentially Breaking Changes
 -----------------------------------------
 - Clang now diagnoses function/variable templates that shadow their own template parameters, e.g. ``template<class T> void T();``.
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 8ef6700ecdc78e..96bd339bb1851d 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -454,6 +454,8 @@ LANGOPT(MatrixTypes, 1, 0, "Enable or disable the builtin matrix type")
 
 LANGOPT(CXXAssumptions, 1, 1, "Enable or disable codegen and compile-time checks for C++23's [[assume]] attribute")
 
+LANGOPT(RawStringLiterals, 1, 0, "Enable or disable raw string literals")
+
 ENUM_LANGOPT(StrictFlexArraysLevel, StrictFlexArraysLevelKind, 2,
              StrictFlexArraysLevelKind::Default,
              "Rely on strict definition of flexible arrays")
diff --git a/clang/include/clang/Basic/LangStandard.h b/clang/include/clang/Basic/LangStandard.h
index 8e25afc833661c..0a308b93ada746 100644
--- a/clang/include/clang/Basic/LangStandard.h
+++ b/clang/include/clang/Basic/LangStandard.h
@@ -130,6 +130,12 @@ struct LangStandard {
   /// hasDigraphs - Language supports digraphs.
   bool hasDigraphs() const { return Flags & Digraphs; }
 
+  /// hasRawStringLiterals - Language supports R"()" raw string literals.
+  bool hasRawStringLiterals() const {
+    // GCC supports raw string literals in C, but not in C++ before C++11.
+    return isCPlusPlus11() || (!isCPlusPlus() && isGNUMode());
+  }
+
   /// isGNUMode - Language includes GNU extensions.
   bool isGNUMode() const { return Flags & GNUMode; }
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index f745e573eb2686..32e6c10e1251b7 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4142,6 +4142,12 @@ def fenable_matrix : Flag<["-"], "fenable-matrix">, Group<f_Group>,
     HelpText<"Enable matrix data type and related builtin functions">,
     MarshallingInfoFlag<LangOpts<"MatrixTypes">>;
 
+defm raw_string_literals : BoolFOption<"raw-string-literals",
+    LangOpts<"RawStringLiterals">, Default<std#".hasRawStringLiterals()">,
+    PosFlag<SetTrue, [], [], "Enable">,
+    NegFlag<SetFalse, [], [], "Disable">,
+    BothFlags<[], [ClangOption, CC1Option], " raw string literals">>;
+
 def fzero_call_used_regs_EQ
     : Joined<["-"], "fzero-call-used-regs=">, Group<f_Group>,
     Visibility<[ClangOption, CC1Option]>,
diff --git a/clang/lib/Basic/LangOptions.cpp b/clang/lib/Basic/LangOptions.cpp
index a0adfbf61840e3..c34f0ed5ed7174 100644
--- a/clang/lib/Basic/LangOptions.cpp
+++ b/clang/lib/Basic/LangOptions.cpp
@@ -124,6 +124,7 @@ void LangOptions::setLangDefaults(LangOptions &Opts, Language Lang,
   Opts.HexFloats = Std.hasHexFloats();
   Opts.WChar = Std.isCPlusPlus();
   Opts.Digraphs = Std.hasDigraphs();
+  Opts.RawStringLiterals = Std.hasRawStringLiterals();
 
   Opts.HLSL = Lang == Language::HLSL;
   if (Opts.HLSL && Opts.IncludeDefaultHeader)
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 766a9b91e3c0ad..c99bfe4efc4137 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6536,6 +6536,8 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   Args.AddLastArg(CmdArgs, options::OPT_fheinous_gnu_extensions);
   Args.AddLastArg(CmdArgs, options::OPT_fdigraphs, options::OPT_fno_digraphs);
   Args.AddLastArg(CmdArgs, options::OPT_fzero_call_used_regs_EQ);
+  Args.AddLastArg(CmdArgs, options::OPT_fraw_string_literals,
+                  options::OPT_fno_raw_string_literals);
 
   if (Args.hasFlag(options::OPT_femulated_tls, options::OPT_fno_emulated_tls,
                    Triple.hasDefaultEmulatedTLS()))
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 89e6c19b0af45c..71865bb061f57e 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -3850,6 +3850,7 @@ LangOptions getFormattingLangOpts(const FormatStyle &Style) {
   // the sequence "<::" will be unconditionally treated as "[:".
   // Cf. Lexer::LexTokenInternal.
   LangOpts.Digraphs = LexingStd >= FormatStyle::LS_Cpp11;
+  LangOpts.RawStringLiterals = LexingStd >= FormatStyle::LS_Cpp11;
 
   LangOpts.LineComment = 1;
   bool AlternativeOperators = Style.isCpp();
diff --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index c98645993abe07..67d75c1140b232 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -3867,7 +3867,7 @@ bool Lexer::LexTokenInternal(Token &Result, bool TokAtPhysicalStartOfLine) {
                                tok::utf16_char_constant);
 
       // UTF-16 raw string literal
-      if (Char == 'R' && LangOpts.CPlusPlus11 &&
+      if (Char == 'R' && LangOpts.RawStringLiterals &&
           getCharAndSize(CurPtr + SizeTmp, SizeTmp2) == '"')
         return LexRawStringLiteral(Result,
                                ConsumeChar(ConsumeChar(CurPtr, SizeTmp, Result),
@@ -3889,7 +3889,7 @@ bool Lexer::LexTokenInternal(Token &Result, bool TokAtPhysicalStartOfLine) {
                                   SizeTmp2, Result),
               tok::utf8_char_constant);
 
-        if (Char2 == 'R' && LangOpts.CPlusPlus11) {
+        if (Char2 == 'R' && LangOpts.RawStringLiterals) {
           unsigned SizeTmp3;
           char Char3 = getCharAndSize(CurPtr + SizeTmp + SizeTmp2, SizeTmp3);
           // UTF-8 raw string literal
@@ -3925,7 +3925,7 @@ bool Lexer::LexTokenInternal(Token &Result, bool TokAtPhysicalStartOfLine) {
                                tok::utf32_char_constant);
 
       // UTF-32 raw string literal
-      if (Char == 'R' && LangOpts.CPlusPlus11 &&
+      if (Char == 'R' && LangOpts.RawStringLiterals &&
           getCharAndSize(CurPtr + SizeTmp, SizeTmp2) == '"')
         return LexRawStringLiteral(Result,
                                ConsumeChar(ConsumeChar(CurPtr, SizeTmp, Result),
@@ -3940,7 +3940,7 @@ bool Lexer::LexTokenInternal(Token &Result, bool TokAtPhysicalStartOfLine) {
     // Notify MIOpt that we read a non-whitespace/non-comment token.
     MIOpt.ReadToken();
 
-    if (LangOpts.CPlusPlus11) {
+    if (LangOpts.RawStringLiterals) {
       Char = getCharAndSize(CurPtr, SizeTmp);
 
       if (Char == '"')
@@ -3963,7 +3963,7 @@ bool Lexer::LexTokenInternal(Token &Result, bool TokAtPhysicalStartOfLine) {
                               tok::wide_string_literal);
 
     // Wide raw string literal.
-    if (LangOpts.CPlusPlus11 && Char == 'R' &&
+    if (LangOpts.RawStringLiterals && Char == 'R' &&
         getCharAndSize(CurPtr + SizeTmp, SizeTmp2) == '"')
       return LexRawStringLiteral(Result,
                                ConsumeChar(ConsumeChar(CurPtr, SizeTmp, Result),
diff --git a/clang/test/Lexer/raw-string-ext.c b/clang/test/Lexer/raw-string-ext.c
new file mode 100644
index 00000000000000..45e3990cadf3d2
--- /dev/null
+++ b/clang/test/Lexer/raw-string-ext.c
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -fsyntax-only -std=gnu11 -verify=gnu -DGNU %s
+// RUN: %clang_cc1 -fsyntax-only -std=c11 -fraw-string-literals -verify=gnu -DGNU %s
+// RUN: %clang_cc1 -fsyntax-only -std=c11 -verify=std %s
+// RUN: %clang_cc1 -fsyntax-only -std=gnu11 -fno-raw-string-literals -verify=std %s
+
+void f() {
+  (void) R"foo()foo"; // std-error {{use of undeclared identifier 'R'}}
+  (void) LR"foo()foo"; // std-error {{use of undeclared identifier 'LR'}}
+  (void) uR"foo()foo"; // std-error {{use of undeclared identifier 'uR'}}
+  (void) u8R"foo()foo"; // std-error {{use of undeclared identifier 'u8R'}}
+  (void) UR"foo()foo"; // std-error {{use of undeclared identifier 'UR'}}
+}
+
+// gnu-error@* {{missing terminating delimiter}}
+// gnu-error@* {{expected expression}}
+// gnu-error@* {{expected ';' after top level declarator}}
+#define R "bar"
+const char* s =  R"foo(";

@Sirraide Sirraide linked an issue Apr 10, 2024 that may be closed by this pull request
Collaborator

@AaronBallman AaronBallman left a comment

Thank you for working on this! Broadly speaking, I think the idea makes a lot of sense.

> GCC does not seem to support raw string literals in C++ before C++11, even if e.g. -std=gnu++03 is passed. Should we follow this behaviour or should we enable raw string literals in earlier C++ language modes as well if -std=gnu++XY is passed? -fraw-string-literals currently makes it possible to enable them in e.g. C++03.

I think we should follow that behavior; because R can be a valid macro identifier, being conservative is defensible.
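
The macro concern can be made concrete with a small sketch. It assumes the file is compiled as C++11 or later (so raw strings are on); the two function names are hypothetical. The point is that the spelling `R"..."` changes meaning depending on whether the lexer recognises raw strings, which is why enabling them in older modes can silently change existing code.

```cpp
#include <cstring>

// 'R' is an ordinary identifier, so existing code may define it as a macro.
#define R "bar"

// With a space, R is its own token: it expands to "bar" and the two
// adjacent string literals are concatenated.
inline const char *macroThenString() { return R "foo(x)foo"; }

// Without a space, a lexer with raw strings enabled forms one raw string
// literal (delimiter foo, body x) and never looks at the macro.
inline const char *rawString() { return R"foo(x)foo"; }
```

With raw strings disabled, the second function would instead behave like the first and yield `barfoo(x)foo`.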

> -fno-raw-string-literals allows users to disable raw string literals in -std=gnuXY mode. I thought it might be useful to have this, but do we want it?

I think it's reasonable to have it, but I don't think we should allow it for C++11 and later modes unless there's some rationale I'm missing. (I don't think we want to let users disable language features in standards modes where the feature is standardized without some sort of reasonable justification.)

> The implementation of this currently adds a RawStringLiterals option to the LangOpts; -f[no-]raw-string-literals overrides the default value for it which depends on the language standard. As a consequence, passing e.g. -std=c++11 -fno-raw-string-literals will disable raw string literals even though we’re in C++11 mode. Do we want to allow this or should we just ignore -f[no-]raw-string-literals if we’re in C++11 or later?

I think we should either ignore or diagnose it in C++11 or later.

> This probably deserves a note in LanguageExtensions.rst, but I’m not exactly sure where.

It definitely should be noted in there; I would probably recommend https://clang.llvm.org/docs/LanguageExtensions.html#c-11-raw-string-literals for the C++ side of things and then something similar for C around where we document those.

> Should we add a flag for this to __has_feature/__has_extension?

Yes, but it's a fun question as to which one. We currently use __has_feature for it in C++:

FEATURE(cxx_raw_string_literals, LangOpts.CPlusPlus11)

and it seems like it would make sense to continue to do so for C++. But this isn't a language feature of C, so __has_extension makes sense there. But that's confusing because then we've got both, so I'm not entirely certain that's the right approach. Perhaps using __has_feature for both C and C++ makes the most sense?
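
As a sketch of how user code would consume such a check, the snippet below queries only the existing C++ feature name quoted above, `cxx_raw_string_literals`. Which name (if any) C code should query is exactly the open question here, so no C-specific name is assumed. The function names are hypothetical, and the fallback definitions follow the usual pattern for compilers that lack these builtins.

```cpp
// Compilers without these builtins get a fallback that reports "no".
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#ifndef __has_extension
#define __has_extension(x) __has_feature(x)
#endif

// True when the compiler advertises raw strings as a standard feature
// of the current language mode.
inline bool reportedAsFeature() {
#if __has_feature(cxx_raw_string_literals)
  return true;
#else
  return false;
#endif
}

// True when the compiler advertises raw strings at all, whether as a
// standard feature or as an extension in the current mode.
inline bool reportedAsExtension() {
#if __has_extension(cxx_raw_string_literals)
  return true;
#else
  return false;
#endif
}
```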

@Sirraide
Member Author

> I don't think we should allow it for C++11 and later modes

To clarify, should we allow enabling them in e.g. c++03 mode if -fraw-string-literals is passed? I don’t see why not, but I’m not entirely sure whether you’re saying we should not support that flag in C++ (neither the positive nor the negative variant) at all or just not in C++11 and later.

@mydeveloperday mydeveloperday requested a review from owenca May 1, 2024 08:51
@AaronBallman
Collaborator

> I don't think we should allow it for C++11 and later modes

> To clarify, should we allow enabling them in e.g. c++03 mode if -fraw-string-literals is passed? I don’t see why not, but I’m not entirely sure whether you’re saying we should not support that flag in C++ (neither the positive nor the negative variant) at all or just not in C++11 and later.

I think we should allow users to enable them in C++03 modes if -fraw-string-literals is passed. I think it's fine to have -fno-raw-string-literals that allows users to disable them in C++03 mode (in case an earlier command line option opted into them and the user wants to disable the feature for some reason), but I don't know if that's worth the effort to support because I don't think we should allow -fno-raw-string-literals in C++11 mode.

@AaronBallman
Collaborator

Btw, it seems that precommit CI found some valid issues to be addressed

@Sirraide
Member Author

> I think we should allow users to enable them in C++03 modes if -fraw-string-literals is passed. I think it's fine to have -fno-raw-string-literals that allows users to disable them in C++03 mode (in case an earlier command line option opted into them and the user wants to disable the feature for some reason), but I don't know if that's worth the effort to support because I don't think we should allow -fno-raw-string-literals in C++11 mode.

In that case I think it might just make sense to ignore the flag in C++11 and later then and allow it before C++11.

> Btw, it seems that precommit CI found some valid issues to be addressed

Ah, I didn’t know that all new warnings should have a -W flag associated with them; it seems I’ve somehow only ever added errors so far. I’ll take a look at what happened there.

@Sirraide
Member Author

@AaronBallman I just noticed something that I’ve somehow not realised until now even though I’d already written a test case for it: Not only does GCC allow raw string literals in gnuXY mode, but also UTF string literals, e.g. u"foo" (https://godbolt.org/z/771s8ne5d).

Should we follow suit here? And if so, should we add a separate flag for that or rename -fraw-string-literals (and the LangOption) to -fext-string-literals or something similar?

Collaborator

C has Unicode string literals as well: https://godbolt.org/z/chdjYrK9v and so if we're allowing raw string literals, it makes sense to also allow raw unicode string literals IMO. I don't think we need to rename the flag though.

Collaborator

> In that case I think it might just make sense to ignore the flag in C++11 and later then and allow it before C++11.

I think that makes the most sense.

@cor3ntin
Contributor

cor3ntin commented Jun 3, 2024

> C has Unicode string literals as well: https://godbolt.org/z/chdjYrK9v and so if we're allowing raw string literals, it makes sense to also allow raw unicode string literals IMO. I don't think we need to rename the flag though.

Yes, these things are completely orthogonal, it makes no sense to treat raw strings with an encoding prefix differently
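
The orthogonality can be illustrated with a short sketch, compiled as C++11 or later. Each literal below has the same three-character body (`a`, a backslash, `n`); the prefix only selects the element type. The function name `rawBodyLength` is made up for the example.

```cpp
// Every literal holds three characters plus a terminator, so each array
// has four elements regardless of the encoding prefix.
static_assert(sizeof(R"(a\n)") / sizeof(char) == 4, "plain");
static_assert(sizeof(u8R"(a\n)") == 4, "UTF-8");
static_assert(sizeof(uR"(a\n)") / sizeof(char16_t) == 4, "UTF-16");
static_assert(sizeof(UR"(a\n)") / sizeof(char32_t) == 4, "UTF-32");
static_assert(sizeof(LR"(a\n)") / sizeof(wchar_t) == 4, "wide");

// Number of characters in the UTF-16 raw literal, excluding the terminator.
inline unsigned rawBodyLength() {
  return sizeof(uR"(a\n)") / sizeof(char16_t) - 1;
}
```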

@Sirraide
Member Author

Sirraide commented Jun 19, 2024

Alright, I think this has the behaviour that we want now:

  • raw string literals are enabled in C++11 and later, as well as in C in gnu99 mode and later;
  • raw string literals can be explicitly enabled or disabled in C (and in C++ standards before C++11) using -f[no-]raw-string-literals;
  • in C++11 and later -f[no-]raw-string-literals is ignored and a warning is issued.
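
The three rules above can be written down as a small decision function. This is only a model of the behaviour described in this comment, not Clang's code: `Lang`, `Flag`, `Mode`, and `rawStringsEnabled` are hypothetical names, and language versions are encoded as plain years for simplicity.

```cpp
// Hypothetical stand-ins; none of these names come from Clang.
enum class Lang { C, CXX };
enum class Flag { Unset, On, Off };

struct Mode {
  Lang Language;
  int Year;  // e.g. 1989, 1999, 2011 for C; 2003, 2011 for C++
  bool GNU;  // true for -std=gnuXY or -std=gnu++XY
};

inline bool rawStringsEnabled(const Mode &M, Flag F) {
  // C++11 and later: always on; the flag is ignored (with a warning).
  if (M.Language == Lang::CXX && M.Year >= 2011)
    return true;
  // Everywhere else an explicit flag wins.
  if (F != Flag::Unset)
    return F == Flag::On;
  // Default: only C in gnu99 mode or later.
  return M.Language == Lang::C && M.GNU && M.Year >= 1999;
}
```

For instance, `rawStringsEnabled({Lang::C, 2011, true}, Flag::Unset)` models `-std=gnu11` and `rawStringsEnabled({Lang::CXX, 2011, false}, Flag::Off)` models `-std=c++11 -fno-raw-string-literals`.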

@Sirraide Sirraide requested a review from AaronBallman June 19, 2024 14:54
@Sirraide
Member Author

So, apparently, this test here

    // FIXME: R"()" strings depend on using C++11 language mode
    ASSERT_FALSE(minimizeSourceToDependencyDirectives(
        R"(_Pragma(R"abc(clang module import)abc"))", Out));
    EXPECT_STREQ("<TokBeforeEOF>\n", Out.data());

is now failing, presumably because of this:

    static LangOptions getLangOptsForDepScanning() {
      LangOptions LangOpts;
      // Set the lexer to use 'tok::at' for '@', instead of 'tok::unknown'.
      LangOpts.ObjC = true;
      LangOpts.LineComment = true;
      // FIXME: we do not enable C11 or C++11, so we are missing u/u8/U"" and
      // R"()" literals.
      return LangOpts;
    }

Honestly, I’m not entirely sure how to fix this. It doesn’t look like unconditionally enabling raw string literals is an option here... This situation reminds me of a similar issue we’re having with ' in numeric literals (#88896). I’m personally not too familiar with the lexer, but would it be possible to pass through the original lang options here from wherever this is invoked?

Collaborator

> raw string literals are enabled in C++11 and later, as well as in C in gnu99 mode and later;

Why gnu99 mode and not gnu89 mode? I see GCC has that behavior, but I'm not certain why.

> Honestly, I’m not entirely sure how to fix this. It doesn’t look like unconditionally enabling raw string literals is an option here... This situation reminds me of a similar issue we’re having with ' in numeric literals (#88896). I’m personally not too familiar with the lexer, but would it be possible to pass through the original lang options here from wherever this is invoked?

Yeah, it's pretty frustrating that we've found two instances of this in such a short period of time. :-/

That test was added in ee8ed0b and it seems to be a bit of a drive-by as the author noticed the behavior. Given that dependency scanning is never going to care about raw string literals to begin with (at least that I can think of), I'm not certain there's any harm in always supporting raw string literals from dependency scanning, so we could probably do that in the worst case.

But my concerns from #93753 (comment) are still relevant too. CC @jansvoboda11

@Sirraide
Member Author

Sirraide commented Jun 20, 2024

> Why gnu99 mode and not gnu89 mode? I see GCC has that behavior, but I'm not certain why.

We went over this a while back: #88265 (comment)

> I'm not certain there's any harm in always supporting raw string literals from dependency scanning, so we could probably do that in the worst case.
>
> But my concerns from #93753 (comment) are still relevant too. CC @jansvoboda11

👍

@AaronBallman
Collaborator

> Why gnu99 mode and not gnu89 mode? I see GCC has that behavior, but I'm not certain why.

> We went over this a while back: #88265 (comment)

THAT is why this was so familiar to me! :-D Thanks!

Collaborator

@AaronBallman AaronBallman left a comment

Clang changes LGTM modulo the dependency scanner bits.

@Sirraide
Member Author

> Clang changes LGTM modulo the dependency scanner bits.

Alright, I’ll wait for a reply from @jansvoboda11 then

@jansvoboda11
Contributor

I assume that @benlangmuir added the scanner unit-test to demonstrate the current behavior instead of trying to make sure it's preserved. I think making it so that the test passes (actually handles raw string literals) and updating the FIXME in DependencyDirectivesScanner.cpp should be fine. CC @Bigcheese

@Sirraide
Member Author

> I think making it so that the test passes (actually handles raw string literals) and updating the FIXME in DependencyDirectivesScanner.cpp should be fine

To clarify, that means setting the RawStringLiterals LangOpt in DependencyDirectivesScanner.cpp, right? I’m assuming yes, but I just want to make sure.

@jansvoboda11
Contributor

> To clarify, that means setting the RawStringLiterals LangOpt in DependencyDirectivesScanner.cpp, right? I’m assuming yes, but I just want to make sure.

Yes, that's what I had in mind 👍

@Sirraide
Member Author

Sirraide commented Jul 4, 2024

Alright, I just enabled raw string literals in the dependency scanner by default; barring any further complications, I’ll merge this once CI is done.
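
A rough sketch of what "enabled by default in the dependency scanner" amounts to, based on the helper quoted earlier in the thread. `MockLangOptions` and `getLangOptsForDepScanningSketch` are stand-ins invented for this example so that it compiles on its own; the actual change lives in DependencyDirectivesScanner.cpp and may differ in detail.

```cpp
// Stand-in for clang::LangOptions with only the fields used here.
struct MockLangOptions {
  bool ObjC = false;
  bool LineComment = false;
  bool RawStringLiterals = false;
};

// Mirrors the shape of the quoted helper, with raw strings switched on.
inline MockLangOptions getLangOptsForDepScanningSketch() {
  MockLangOptions LangOpts;
  // Lex '@' as its own token kind, as in the original helper.
  LangOpts.ObjC = true;
  LangOpts.LineComment = true;
  // Assumption: the scanner now always lexes R"()" literals, since it does
  // not know the language mode of the file it is scanning.
  LangOpts.RawStringLiterals = true;
  return LangOpts;
}
```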

@benlangmuir
Collaborator

> I assume that @benlangmuir added the scanner unit-test to demonstrate the current behavior instead of trying to make sure it's preserved.

Correct. They're only interesting to the scanner insofar as they're used in _Pragma() as far as I know. If we can handle them, great! You just need to update the test expectations for those cases.

Thanks for working on this!

@Sirraide
Member Author

Sirraide commented Jul 8, 2024

Silly me forgot to actually update the test after enabling raw string literals in the dependency scanner, but now everything should pass.

@Sirraide Sirraide merged commit e464684 into llvm:main Jul 10, 2024
8 checks passed
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
This enables raw R"" string literals in C in some language modes
and adds an option to disable or enable them explicitly as an
extension.

Background: GCC supports raw string literals in C in `-std=gnuXY` modes
starting with gnu99. This PR both enables raw string literals in gnu99
mode and later in C and adds an `-f[no-]raw-string-literals` flag to override
this behaviour. The decision not to enable raw string literals in gnu89
mode, according to the GCC devs, is intentional as that mode is supposed
to be used for ‘old code’ that they don’t want to break; we’ve decided to
match GCC’s behaviour here as well.

The `-fraw-string-literals` flag can additionally be used to enable raw string
literals in modes where they aren’t enabled by default (such as c99, as
opposed to gnu99, or even e.g. C++03); conversely, the negated flag can
be used to disable them in any gnuXY modes that *do* provide them by
default, or to override a previous flag. However, we do *not* support
disabling raw string literals (or indeed either of these two options) in
C++11 mode and later, because we don’t want to just start supporting
disabling features that are actually part of the language in the general case.

This fixes llvm#85703.
@Sirraide Sirraide deleted the raw-string-literals-ext branch October 14, 2024 16:06
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category extension:gnu
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GCC compatibility: Raw strings in C mode
7 participants