Skip to content

[flang] Improve error recovery in tricky situation #95168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 13, 2024
Merged

Conversation

klausler
Copy link
Contributor

When the very first statement of the executable part has syntax errors, it's not at all obvious whether the error messages that are reported to the user should be those from its failure to be the last statement of the specification part or its failure to be the first executable statement when both failures are at the same character in the cooked character stream. Fortran makes this problem more exciting by allowing statement function definitions look a lot like several executable statements.

The current error recovery scheme for declaration constructs depends on a look-ahead test to see whether the failed construct is actually the first executable statement. This works fine when the first executable statement is not in error, but should also allow for some error cases that begin with the tokens of an executable statement.

This can obviously still go wrong for declaration constructs that are unparseable and also have ambiguity in their leading tokens with executable statements, but that seems to be a less likely case.

Also improves error recovery for parenthesized items.

@klausler klausler requested a review from jeffhammond June 11, 2024 20:27
When the very first statement of the executable part has syntax
errors, it's not at all obvious whether the error messages that
are reported to the user should be those from its failure to be
the last statement of the specification part or its failure to
be the first executable statement when both failures are at the
same character in the cooked character stream.  Fortran makes
this problem more exciting by allowing statement function definitions
look a lot like several executable statements.

The current error recovery scheme for declaration constructs
depends on a look-ahead test to see whether the failed construct
is actually the first executable statement.  This works fine
when the first executable statement is not in error, but should
also allow for some error cases that begin with the tokens of an
executable statement.

This can obviously still go wrong for declaration constructs that
are unparseable and also have ambiguity in their leading tokens with
executable statements, but that seems to be a less likely case.

Also improves error recovery for parenthesized items.
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:parser labels Jun 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2024

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)

Changes

When the very first statement of the executable part has syntax errors, it's not at all obvious whether the error messages that are reported to the user should be those from its failure to be the last statement of the specification part or its failure to be the first executable statement when both failures are at the same character in the cooked character stream. Fortran makes this problem more exciting by allowing statement function definitions look a lot like several executable statements.

The current error recovery scheme for declaration constructs depends on a look-ahead test to see whether the failed construct is actually the first executable statement. This works fine when the first executable statement is not in error, but should also allow for some error cases that begin with the tokens of an executable statement.

This can obviously still go wrong for declaration constructs that are unparseable and also have ambiguity in their leading tokens with executable statements, but that seems to be a less likely case.

Also improves error recovery for parenthesized items.


Full diff: https://github.com/llvm/llvm-project/pull/95168.diff

5 Files Affected:

  • (modified) flang/lib/Parser/expr-parsers.cpp (+2-1)
  • (modified) flang/lib/Parser/program-parsers.cpp (+16-8)
  • (modified) flang/lib/Parser/token-parsers.h (+27-1)
  • (added) flang/test/Parser/recovery01.f90 (+10)
  • (added) flang/test/Parser/recovery02.f90 (+8)
diff --git a/flang/lib/Parser/expr-parsers.cpp b/flang/lib/Parser/expr-parsers.cpp
index a47aae166b575..77a13de7fd02d 100644
--- a/flang/lib/Parser/expr-parsers.cpp
+++ b/flang/lib/Parser/expr-parsers.cpp
@@ -70,7 +70,8 @@ TYPE_PARSER(construct<AcImpliedDoControl>(
 constexpr auto primary{instrumented("primary"_en_US,
     first(construct<Expr>(indirect(Parser<CharLiteralConstantSubstring>{})),
         construct<Expr>(literalConstant),
-        construct<Expr>(construct<Expr::Parentheses>(parenthesized(expr))),
+        construct<Expr>(construct<Expr::Parentheses>("(" >>
+            expr / !","_tok / recovery(")"_tok, SkipPastNested<'(', ')'>{}))),
         construct<Expr>(indirect(functionReference) / !"("_tok / !"%"_tok),
         construct<Expr>(designator / !"("_tok / !"%"_tok),
         construct<Expr>(indirect(Parser<SubstringInquiry>{})), // %LEN or %KIND
diff --git a/flang/lib/Parser/program-parsers.cpp b/flang/lib/Parser/program-parsers.cpp
index 6f25ba4827220..b51b60157f39c 100644
--- a/flang/lib/Parser/program-parsers.cpp
+++ b/flang/lib/Parser/program-parsers.cpp
@@ -86,10 +86,15 @@ TYPE_CONTEXT_PARSER("specification part"_en_US,
 // are in contexts that impose constraints on the kinds of statements that
 // are allowed, and so we have a variant production for declaration-construct
 // that implements those constraints.
-constexpr auto execPartLookAhead{first(actionStmt >> ok, openaccConstruct >> ok,
-    openmpConstruct >> ok, "ASSOCIATE ("_tok, "BLOCK"_tok, "SELECT"_tok,
-    "CHANGE TEAM"_sptok, "CRITICAL"_tok, "DO"_tok, "IF ("_tok, "WHERE ("_tok,
-    "FORALL ("_tok, "!$CUF"_tok)};
+constexpr auto actionStmtLookAhead{first(actionStmt >> ok,
+    // Also accept apparent action statements with errors if they might be
+    // first in the execution part
+    "ALLOCATE ("_tok, "CALL" >> name >> "("_tok, "GO TO"_tok, "OPEN ("_tok,
+    "PRINT"_tok / space / !"("_tok, "READ ("_tok, "WRITE ("_tok)};
+constexpr auto execPartLookAhead{first(actionStmtLookAhead >> ok,
+    openaccConstruct >> ok, openmpConstruct >> ok, "ASSOCIATE ("_tok,
+    "BLOCK"_tok, "SELECT"_tok, "CHANGE TEAM"_sptok, "CRITICAL"_tok, "DO"_tok,
+    "IF ("_tok, "WHERE ("_tok, "FORALL ("_tok, "!$CUF"_tok)};
 constexpr auto declErrorRecovery{
     stmtErrorRecoveryStart >> !execPartLookAhead >> skipStmtErrorRecovery};
 constexpr auto misplacedSpecificationStmt{Parser<UseStmt>{} >>
@@ -446,10 +451,13 @@ TYPE_PARSER(extension<LanguageFeature::CUDA>(
     "<<<" >> construct<CallStmt::Chevrons>(scalarExpr, "," >> scalarExpr,
                  maybe("," >> scalarIntExpr), maybe("," >> scalarIntExpr)) /
         ">>>"))
-TYPE_PARSER(construct<CallStmt>(
-    sourced(construct<CallStmt>("CALL" >> Parser<ProcedureDesignator>{},
-        maybe(Parser<CallStmt::Chevrons>{}),
-        defaulted(parenthesized(optionalList(actualArgSpec)))))))
+constexpr auto actualArgSpecList{optionalList(actualArgSpec)};
+TYPE_CONTEXT_PARSER("CALL statement"_en_US,
+    construct<CallStmt>(
+        sourced(construct<CallStmt>("CALL" >> Parser<ProcedureDesignator>{},
+            maybe(Parser<CallStmt::Chevrons>{}) / space,
+            "(" >> actualArgSpecList / ")" ||
+                lookAhead(endOfStmt) >> defaulted(actualArgSpecList)))))
 
 // R1522 procedure-designator ->
 //         procedure-name | proc-component-ref | data-ref % binding-name
diff --git a/flang/lib/Parser/token-parsers.h b/flang/lib/Parser/token-parsers.h
index 2495017d19649..fe6bc1f69f576 100644
--- a/flang/lib/Parser/token-parsers.h
+++ b/flang/lib/Parser/token-parsers.h
@@ -560,6 +560,8 @@ template <char goal> struct SkipPast {
     while (std::optional<const char *> p{state.GetNextChar()}) {
       if (**p == goal) {
         return {Success{}};
+      } else if (**p == '\n') {
+        break;
       }
     }
     return std::nullopt;
@@ -574,8 +576,32 @@ template <char goal> struct SkipTo {
     while (std::optional<const char *> p{state.PeekAtNextChar()}) {
       if (**p == goal) {
         return {Success{}};
+      } else if (**p == '\n') {
+        break;
+      } else {
+        state.UncheckedAdvance();
+      }
+    }
+    return std::nullopt;
+  }
+};
+
+template <char left, char right> struct SkipPastNested {
+  using resultType = Success;
+  constexpr SkipPastNested() {}
+  constexpr SkipPastNested(const SkipPastNested &) {}
+  static std::optional<Success> Parse(ParseState &state) {
+    int nesting{1};
+    while (std::optional<const char *> p{state.GetNextChar()}) {
+      if (**p == right) {
+        if (!--nesting) {
+          return {Success{}};
+        }
+      } else if (**p == left) {
+        ++nesting;
+      } else if (**p == '\n') {
+        break;
       }
-      state.UncheckedAdvance();
     }
     return std::nullopt;
   }
diff --git a/flang/test/Parser/recovery01.f90 b/flang/test/Parser/recovery01.f90
new file mode 100644
index 0000000000000..674abaccc7c7d
--- /dev/null
+++ b/flang/test/Parser/recovery01.f90
@@ -0,0 +1,10 @@
+! RUN: not %flang_fc1 -fsyntax-only %s 2>&1 | FileCheck %s
+program main
+    call foo(i, &
+             j, &
+             k, &
+             1$)
+end
+
+!CHECK: error: expected ')'
+!CHECK: in the context: CALL statement
diff --git a/flang/test/Parser/recovery02.f90 b/flang/test/Parser/recovery02.f90
new file mode 100644
index 0000000000000..0d0d15545cf39
--- /dev/null
+++ b/flang/test/Parser/recovery02.f90
@@ -0,0 +1,8 @@
+! RUN: not %flang_fc1 -fsyntax-only %s 2>&1 | FileCheck %s
+continue ! force executable part
+CALL ADD_HASH_BLOCK(d_c,f_c,dimc, &
+  (h2b-1+noab*(h1b-1+noab*(p4b-noab-1+nvab*(p3b-noab-1$)))))
+end
+
+!CHECK: error: expected ')'
+!CHECK: in the context: CALL statement

@klausler klausler merged commit 010c55b into llvm:main Jun 13, 2024
9 checks passed
@klausler klausler deleted the jeff1 branch June 13, 2024 17:46
EthanLuisMcDonough pushed a commit to EthanLuisMcDonough/llvm-project that referenced this pull request Aug 13, 2024
When the very first statement of the executable part has syntax errors,
it's not at all obvious whether the error messages that are reported to
the user should be those from its failure to be the last statement of
the specification part or its failure to be the first executable
statement when both failures are at the same character in the cooked
character stream. Fortran makes this problem more exciting by allowing
statement function definitions look a lot like several executable
statements.

The current error recovery scheme for declaration constructs depends on
a look-ahead test to see whether the failed construct is actually the
first executable statement. This works fine when the first executable
statement is not in error, but should also allow for some error cases
that begin with the tokens of an executable statement.

This can obviously still go wrong for declaration constructs that are
unparseable and also have ambiguity in their leading tokens with
executable statements, but that seems to be a less likely case.

Also improves error recovery for parenthesized items.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:parser flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants