Skip to content

Fix a crash with empty escape sequences when lexing #102339

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 8, 2024

Conversation

AaronBallman
Copy link
Collaborator

The utilities we use for lexing string and character literals can be run in a mode where we pass a null pointer for the diagnostics engine. This mode is used by the format string checkers, for example. However, there were two places that failed to account for a null diagnostic engine pointer: \o{} and \x{}.

This patch adds a check for a null pointer and correctly handles fallback behavior.

Fixes #102218

The utilities we use for lexing string and character literals can be
run in a mode where we pass a null pointer for the diagnostics engine.
This mode is used by the format string checkers, for example. However,
there were two places that failed to account for a null diagnostic
engine pointer: `\o{}` and `\x{}`.

This patch adds a check for a null pointer and correctly handles
fallback behavior.

Fixes llvm#102218
@AaronBallman AaronBallman added c clang:frontend Language frontend issues, e.g. anything involving "Sema" crash-on-invalid labels Aug 7, 2024
@AaronBallman AaronBallman requested a review from cor3ntin August 7, 2024 17:38
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Aug 7, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 7, 2024

@llvm/pr-subscribers-clang

Author: Aaron Ballman (AaronBallman)

Changes

The utilities we use for lexing string and character literals can be run in a mode where we pass a null pointer for the diagnostics engine. This mode is used by the format string checkers, for example. However, there were two places that failed to account for a null diagnostic engine pointer: \o{} and \x{}.

This patch adds a check for a null pointer and correctly handles fallback behavior.

Fixes #102218


Full diff: https://github.com/llvm/llvm-project/pull/102339.diff

3 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+2)
  • (modified) clang/lib/Lex/LiteralSupport.cpp (+8-6)
  • (modified) clang/test/Lexer/char-escapes-delimited.c (+13)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 978b4ac8ea2e3..565812e9b503f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -170,6 +170,8 @@ Bug Fixes in This Version
   be used in C++.
 - Fixed a failed assertion when checking required literal types in C context. (#GH101304).
 - Fixed a crash when trying to transform a dependent address space type. Fixes #GH101685.
+- Fixed a crash when diagnosing format strings and encountering an empty
+  delimited escape sequence (e.g., ``"\o{}"``). #GH102218
 
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/lib/Lex/LiteralSupport.cpp b/clang/lib/Lex/LiteralSupport.cpp
index 9d2720af5dbd9..225a6c2d15baa 100644
--- a/clang/lib/Lex/LiteralSupport.cpp
+++ b/clang/lib/Lex/LiteralSupport.cpp
@@ -190,9 +190,10 @@ static unsigned ProcessCharEscape(const char *ThisTokBegin,
       Delimited = true;
       ThisTokBuf++;
       if (*ThisTokBuf == '}') {
-        Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
-             diag::err_delimited_escape_empty);
-        return ResultChar;
+        HadError = true;
+        if (Diags)
+          Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
+               diag::err_delimited_escape_empty);
       }
     } else if (ThisTokBuf == ThisTokEnd || !isHexDigit(*ThisTokBuf)) {
       if (Diags)
@@ -283,9 +284,10 @@ static unsigned ProcessCharEscape(const char *ThisTokBegin,
     Delimited = true;
     ++ThisTokBuf;
     if (*ThisTokBuf == '}') {
-      Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
-           diag::err_delimited_escape_empty);
-      return ResultChar;
+      HadError = true;
+      if (Diags)
+        Diag(Diags, Features, Loc, ThisTokBegin, EscapeBegin, ThisTokBuf,
+             diag::err_delimited_escape_empty);
     }
 
     while (ThisTokBuf != ThisTokEnd) {
diff --git a/clang/test/Lexer/char-escapes-delimited.c b/clang/test/Lexer/char-escapes-delimited.c
index 5327ef700b0e2..7a8986bc5f867 100644
--- a/clang/test/Lexer/char-escapes-delimited.c
+++ b/clang/test/Lexer/char-escapes-delimited.c
@@ -123,3 +123,16 @@ void separators(void) {
 static_assert('\N??<DOLLAR SIGN??>' == '$'); // expected-warning 2{{trigraph converted}} \
                                              // ext-warning {{extension}} cxx23-warning {{C++23}}
 #endif
+
+void GH102218(void) {
+  // The format specifier checking code runs the lexer with diagnostics
+  // disabled. This used to crash Clang for malformed \o and \x because the
+  // lexer missed a null pointer check for the diagnostics engine in that case.
+  extern int printf(const char *, ...);
+  printf("\o{}"); // expected-error {{delimited escape sequence cannot be empty}}
+  printf("\x{}"); // expected-error {{delimited escape sequence cannot be empty}}
+
+  // These cases always worked but are here for completeness.
+  printf("\u{}"); // expected-error {{delimited escape sequence cannot be empty}}
+  printf("\N{}"); // expected-error {{delimited escape sequence cannot be empty}}
+}

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AaronBallman AaronBallman merged commit 8f0c865 into llvm:main Aug 8, 2024
13 checks passed
@AaronBallman AaronBallman deleted the aballman-escape-seq-crash branch August 8, 2024 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category crash-on-invalid
Projects
None yet
Development

Successfully merging this pull request may close these issues.

clang / llvm segmentation fault when compiling code with empty "delimited escape sequence"
3 participants