Skip to content

[flang][runtime] Don't use endfile record number for EOF detection on… #74640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 11, 2023

Conversation

klausler
Copy link
Contributor

@klausler klausler commented Dec 6, 2023

… input

The current EOF detection method (IsAtEOF()) depends on comparing the current record number with the record number of the endfile record, if it is known, which it is for units that have been written and then rewound for input. For formatted input, this is wrong in the case of a unit written with in-band newline characters. Rather than scan output data to count newlines, it's best to just organically determine EOF by detecting a failed or short read(), as we would have done anyway had the endfile record number not been known. (I considered resetting the endfile record number at the point of a REWIND, but not all rewinds are followed by input; it seems wiser to defer the resetting until an actual READ takes place.)

Fixes llvm-test-suite/Fortran/gfortran/regression/backslash_2.f90

@klausler klausler requested a review from vdonaldson December 6, 2023 18:45
@llvmbot llvmbot added flang:runtime flang Flang issues not falling into any other category labels Dec 6, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 6, 2023

@llvm/pr-subscribers-flang-runtime

Author: Peter Klausler (klausler)

Changes

… input

The current EOF detection method (IsAtEOF()) depends on comparing the current record number with the record number of the endfile record, if it is known, which it is for units that have been written and then rewound for input. For formatted input, this is wrong in the case of a unit written with in-band newline characters. Rather than scan output data to count newlines, it's best to just organically determine EOF by detecting a failed or short read(), as we would have done anyway had the endfile record number not been known. (I considered resetting the endfile record number at the point of a REWIND, but not all rewinds are followed by input; it seems wiser to defer the resetting until an actual READ takes place.)

Fixes llvm-test-suite/Fortran/gfortran/regression/backslash_2.f90


Full diff: https://github.com/llvm/llvm-project/pull/74640.diff

1 Files Affected:

  • (modified) flang/runtime/unit.cpp (+14-11)
diff --git a/flang/runtime/unit.cpp b/flang/runtime/unit.cpp
index 995656b9480c4..15855bff753a4 100644
--- a/flang/runtime/unit.cpp
+++ b/flang/runtime/unit.cpp
@@ -436,6 +436,14 @@ bool ExternalFileUnit::BeginReadingRecord(IoErrorHandler &handler) {
   RUNTIME_CHECK(handler, direction_ == Direction::Input);
   if (!beganReadingRecord_) {
     beganReadingRecord_ = true;
+    // Don't use IsAtEOF() to check for an EOF condition here, just detect
+    // it from a failed or short read from the file.  IsAtEOF() could be
+    // wrong for formatted input if actual newline characters had been
+    // written in-band by previous WRITEs before a REWIND.  In fact,
+    // now that we know that the unit is being used for input (again),
+    // it's best to reset endfileRecordNumber and ensure IsAtEOF() will
+    // now be true on return only if it gets set by HitEndOnRead().
+    endfileRecordNumber.reset();
     if (access == Access::Direct) {
       CheckDirectAccess(handler);
       auto need{static_cast<std::size_t>(recordOffsetInFrame_ + *openRecl)};
@@ -448,17 +456,11 @@ bool ExternalFileUnit::BeginReadingRecord(IoErrorHandler &handler) {
       }
     } else {
       recordLength.reset();
-      if (IsAtEOF()) {
-        handler.SignalEnd();
-      } else {
-        RUNTIME_CHECK(handler, isUnformatted.has_value());
-        if (*isUnformatted) {
-          if (access == Access::Sequential) {
-            BeginSequentialVariableUnformattedInputRecord(handler);
-          }
-        } else { // formatted sequential or stream
-          BeginVariableFormattedInputRecord(handler);
-        }
+      RUNTIME_CHECK(handler, isUnformatted.has_value());
+      if (*isUnformatted) {
+        BeginSequentialVariableUnformattedInputRecord(handler);
+      } else { // formatted sequential or stream
+        BeginVariableFormattedInputRecord(handler);
       }
     }
   }
@@ -723,6 +725,7 @@ void ExternalFileUnit::EndIoStatement() {
 
 void ExternalFileUnit::BeginSequentialVariableUnformattedInputRecord(
     IoErrorHandler &handler) {
+  RUNTIME_CHECK(handler, access == Access::Sequential);
   std::int32_t header{0}, footer{0};
   std::size_t need{recordOffsetInFrame_ + sizeof header};
   std::size_t got{ReadFrame(frameOffsetInFile_, need, handler)};

… input

The current EOF detection method (IsAtEOF()) depends on comparing
the current record number with the record number of the endfile
record, if it is known, which it is for units that have been
written and then rewound for input.  For formatted input, this is wrong
in the case of a unit written with in-band newline characters.
Rather than scan output data to count newlines, it's best to just
organically determine EOF by detecting a failed or short read(),
as we would have done anyway had the endfile record number not been
known.  (I considered resetting the endfile record number at the
point of a REWIND, but not all rewinds are followed by input; it
seems wiser to defer the resetting until an actual READ takes place.)

Fixes llvm-test-suite/Fortran/gfortran/regression/backslash_2.f90
@klausler klausler merged commit bf1c89c into llvm:main Dec 11, 2023
@klausler klausler deleted the backslash branch December 11, 2023 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:runtime flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants