Skip to content

[flang][runtime] Fix fixed-width field internal wide character input #74683

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 11, 2023

Conversation

klausler
Copy link
Contributor

@klausler klausler commented Dec 7, 2023

There was some confusion about units (bytes vs characters) in the handling of the amount of input remaining in fixed-width formatted input fields. Clarify that any variable or parameter counting "remaining" space in a field in the I/O runtime is always in units of bytes, and make it so where it wasn't.

Fixes the bug(s) in llvm-test-suite/Fortran/gfortran/regression/char4_iunit_2.f03, although the test still won't pass due to its dependence on gfortran's list-directed output spacing.

@klausler klausler requested a review from clementval December 7, 2023 00:36
@llvmbot llvmbot added flang:runtime flang Flang issues not falling into any other category labels Dec 7, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 7, 2023

@llvm/pr-subscribers-flang-runtime

Author: Peter Klausler (klausler)

Changes

There was some confusion about units (bytes vs characters) in the handling of the amount of input remaining in fixed-width formatted input fields. Clarify that any variable or parameter counting "remaining" space in a field in the I/O runtime is always in units of bytes, and make it so where it wasn't.

Fixes the bug(s) in llvm-test-suite/Fortran/gfortran/regression/char4_iunit_2.f03, although the test still won't pass due to its dependence on gfortran's list-directed output spacing.


Full diff: https://github.com/llvm/llvm-project/pull/74683.diff

2 Files Affected:

  • (modified) flang/runtime/edit-input.cpp (+10-9)
  • (modified) flang/runtime/io-stmt.h (+11-3)
diff --git a/flang/runtime/edit-input.cpp b/flang/runtime/edit-input.cpp
index 4e8c9aa868a69..9dcca5bea6d93 100644
--- a/flang/runtime/edit-input.cpp
+++ b/flang/runtime/edit-input.cpp
@@ -915,18 +915,22 @@ bool EditCharacterInput(
     return false;
   }
   const ConnectionState &connection{io.GetConnectionState()};
-  std::size_t remaining{length};
+  int unitBytesPerChar{std::max<int>(connection.internalIoCharKind, 1)};
+  std::size_t remaining{length * unitBytesPerChar};
+  // Skip leading characters.
+  // Their bytes don't count towards INQUIRE(IOLENGTH=).
+  std::size_t skip{0};
   if (edit.width && *edit.width > 0) {
-    remaining = *edit.width;
+    remaining = *edit.width * unitBytesPerChar;
+    if (static_cast<std::size_t>(*edit.width) > length) {
+      skip = *edit.width - length;
+    }
   }
   // When the field is wider than the variable, we drop the leading
   // characters.  When the variable is wider than the field, there can be
   // trailing padding or an EOR condition.
   const char *input{nullptr};
   std::size_t ready{0};
-  // Skip leading bytes.
-  // These bytes don't count towards INQUIRE(IOLENGTH=).
-  std::size_t skip{remaining > length ? remaining - length : 0};
   // Transfer payload bytes; these do count.
   while (remaining > 0) {
     if (ready == 0) {
@@ -958,7 +962,6 @@ bool EditCharacterInput(
         // error recovery: skip bad encoding
         chunk = 1;
       }
-      --remaining;
     } else if (connection.internalIoCharKind > 1) {
       // Reading from non-default character internal unit
       chunk = connection.internalIoCharKind;
@@ -970,7 +973,6 @@ bool EditCharacterInput(
         *x++ = buffer;
         --length;
       }
-      --remaining;
     } else if constexpr (sizeof *x > 1) {
       // Read single byte with expansion into multi-byte CHARACTER
       chunk = 1;
@@ -980,7 +982,6 @@ bool EditCharacterInput(
         *x++ = static_cast<unsigned char>(*input);
         --length;
       }
-      --remaining;
     } else { // single bytes -> default CHARACTER
       if (skipping) {
         chunk = std::min<std::size_t>(skip, ready);
@@ -991,9 +992,9 @@ bool EditCharacterInput(
         x += chunk;
         length -= chunk;
       }
-      remaining -= chunk;
     }
     input += chunk;
+    remaining -= chunk;
     if (!skipping) {
       io.GotChar(chunk);
     }
diff --git a/flang/runtime/io-stmt.h b/flang/runtime/io-stmt.h
index d4ceb83265246..91169f6c6e323 100644
--- a/flang/runtime/io-stmt.h
+++ b/flang/runtime/io-stmt.h
@@ -92,8 +92,8 @@ class IoStatementState {
   std::size_t GetNextInputBytes(const char *&);
   bool AdvanceRecord(int = 1);
   void BackspaceRecord();
-  void HandleRelativePosition(std::int64_t);
-  void HandleAbsolutePosition(std::int64_t); // for r* in list I/O
+  void HandleRelativePosition(std::int64_t byteOffset);
+  void HandleAbsolutePosition(std::int64_t byteOffset); // for r* in list I/O
   std::optional<DataEdit> GetNextDataEdit(int maxRepeat = 1);
   ExternalFileUnit *GetExternalFileUnit() const; // null if internal unit
   bool BeginReadingRecord();
@@ -124,7 +124,11 @@ class IoStatementState {
   // Vacant after the end of the current record
   std::optional<char32_t> GetCurrentChar(std::size_t &byteCount);
 
-  // For fixed-width fields, return the number of remaining characters.
+  // The "remaining" arguments to CueUpInput(), SkipSpaces(), & NextInField()
+  // are always in units of bytes, not characters; the distinction matters
+  // for internal input from CHARACTER(KIND=2 and 4).
+
+  // For fixed-width fields, return the number of remaining bytes.
   // Skip over leading blanks.
   std::optional<int> CueUpInput(const DataEdit &edit) {
     std::optional<int> remaining;
@@ -134,6 +138,10 @@ class IoStatementState {
     } else {
       if (edit.width.value_or(0) > 0) {
         remaining = *edit.width;
+        if (int bytesPerChar{GetConnectionState().internalIoCharKind};
+            bytesPerChar > 1) {
+          *remaining *= bytesPerChar;
+        }
       }
       SkipSpaces(remaining);
     }

Copy link
Contributor

@clementval clementval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

There was some confusion about units (bytes vs characters) in the
handling of the amount of input remaining in fixed-width formatted
input fields.  Clarify that any variable or parameter counting "remaining"
space in a field in the I/O runtime is always in units of bytes, and
make it so where it wasn't.  Rename many local variables so that their
units (characters or bytes) are more clear.

Fixes the bug(s) in llvm-test-suite/Fortran/gfortran/regression/char4_iunit_2.f03,
although the test still won't pass due to its dependence on gfortran's
list-directed output spacing.
@klausler klausler merged commit 353d56d into llvm:main Dec 11, 2023
@klausler klausler deleted the char4 branch December 11, 2023 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:runtime flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants