Skip to content

Propagate IsText parameter to openFileForRead function #110661

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

abhina-sree
Copy link
Contributor

@abhina-sree abhina-sree commented Oct 1, 2024

This patch adds an IsText parameter to the following functions openFileForRead, getBufferForFile, getBufferForFileImpl. The default is set to OF_Text instead of OF_None, this change in value does not affect any other platforms other than z/OS. Setting this parameter correctly is required to open files on z/OS in the correct encoding. The IsText parameter is based on the context of where we open files, for example, the embed directive requires that files always be opened in binary even though they might be tagged as text.

Some more context:
Automatic conversion is a z/OS specific functionality to help ASCII applications work on z/OS. The native encoding on z/OS is EBCDIC, not ASCII-based, so without this tool enabled, we aren't able to read files correctly. This is some documentation on this feature: https://www.ibm.com/docs/en/zos/3.1.0?topic=pages-enhanced-ascii.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang-tools-extra lldb clangd clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules llvm:support labels Oct 1, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 1, 2024

@llvm/pr-subscribers-clang-tools-extra
@llvm/pr-subscribers-clangd
@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-lldb

Author: Abhina Sree (abhina-sree)

Changes

This patch adds an IsText parameter to the following functions openFileForRead, getBufferForFile, getBufferForFileImpl. The default is set to OF_Text instead of OF_None, this change in value does not affect any other platforms other than z/OS. Setting this parameter correctly is required to open files on z/OS in the correct encoding.


Patch is 21.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110661.diff

19 Files Affected:

  • (modified) clang-tools-extra/clangd/FS.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/Preamble.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/support/ThreadsafeFS.cpp (+1-1)
  • (modified) clang-tools-extra/clangd/unittests/ClangdTests.cpp (+1-1)
  • (modified) clang/include/clang/Basic/FileManager.h (+4-4)
  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h (+1-1)
  • (modified) clang/lib/Basic/FileManager.cpp (+6-6)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+2-1)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp (+3-2)
  • (modified) clang/unittests/Driver/DistroTest.cpp (+2-2)
  • (modified) clang/unittests/Driver/ToolChainTest.cpp (+1-1)
  • (modified) clang/unittests/Frontend/PCHPreambleTest.cpp (+3-3)
  • (modified) clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp (+2-2)
  • (modified) lldb/unittests/Host/FileSystemTest.cpp (+1-1)
  • (modified) lldb/unittests/Utility/MockSymlinkFileSystem.h (+1-1)
  • (modified) llvm/include/llvm/Support/VirtualFileSystem.h (+12-9)
  • (modified) llvm/lib/Support/FileCollector.cpp (+2-2)
  • (modified) llvm/lib/Support/VirtualFileSystem.cpp (+12-9)
  • (modified) llvm/unittests/Support/VirtualFileSystemTest.cpp (+3-3)
diff --git a/clang-tools-extra/clangd/FS.cpp b/clang-tools-extra/clangd/FS.cpp
index 5729b9341d9d4b..bd3c6440c24b0f 100644
--- a/clang-tools-extra/clangd/FS.cpp
+++ b/clang-tools-extra/clangd/FS.cpp
@@ -64,8 +64,8 @@ PreambleFileStatusCache::getProducingFS(
         : ProxyFileSystem(std::move(FS)), StatCache(StatCache) {}
 
     llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-    openFileForRead(const llvm::Twine &Path) override {
-      auto File = getUnderlyingFS().openFileForRead(Path);
+    openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
+      auto File = getUnderlyingFS().openFileForRead(Path, IsText);
       if (!File || !*File)
         return File;
       // Eagerly stat opened file, as the followup `status` call on the file
diff --git a/clang-tools-extra/clangd/Preamble.cpp b/clang-tools-extra/clangd/Preamble.cpp
index c14c4d1ba103f8..55cfe1c97fc6f4 100644
--- a/clang-tools-extra/clangd/Preamble.cpp
+++ b/clang-tools-extra/clangd/Preamble.cpp
@@ -479,9 +479,9 @@ class TimerFS : public llvm::vfs::ProxyFileSystem {
       : ProxyFileSystem(std::move(FS)) {}
 
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const llvm::Twine &Path) override {
+  openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
     WallTimerRegion T(Timer);
-    auto FileOr = getUnderlyingFS().openFileForRead(Path);
+    auto FileOr = getUnderlyingFS().openFileForRead(Path, IsText);
     if (!FileOr)
       return FileOr;
     return std::make_unique<TimerFile>(Timer, std::move(FileOr.get()));
diff --git a/clang-tools-extra/clangd/support/ThreadsafeFS.cpp b/clang-tools-extra/clangd/support/ThreadsafeFS.cpp
index 7398e4258527ba..bc0b984e577cb8 100644
--- a/clang-tools-extra/clangd/support/ThreadsafeFS.cpp
+++ b/clang-tools-extra/clangd/support/ThreadsafeFS.cpp
@@ -29,7 +29,7 @@ class VolatileFileSystem : public llvm::vfs::ProxyFileSystem {
       : ProxyFileSystem(std::move(FS)) {}
 
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const llvm::Twine &InPath) override {
+  openFileForRead(const llvm::Twine &InPath, bool IsText = true) override {
     llvm::SmallString<128> Path;
     InPath.toVector(Path);
 
diff --git a/clang-tools-extra/clangd/unittests/ClangdTests.cpp b/clang-tools-extra/clangd/unittests/ClangdTests.cpp
index 643b8e9f12d751..e86385c2072b34 100644
--- a/clang-tools-extra/clangd/unittests/ClangdTests.cpp
+++ b/clang-tools-extra/clangd/unittests/ClangdTests.cpp
@@ -1010,7 +1010,7 @@ TEST(ClangdTests, PreambleVFSStatCache) {
             : ProxyFileSystem(std::move(FS)), CountStats(CountStats) {}
 
         llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-        openFileForRead(const Twine &Path) override {
+        openFileForRead(const Twine &Path, bool IsText = true) override {
           ++CountStats[llvm::sys::path::filename(Path.str())];
           return ProxyFileSystem::openFileForRead(Path);
         }
diff --git a/clang/include/clang/Basic/FileManager.h b/clang/include/clang/Basic/FileManager.h
index ce4e8c1fbe16eb..d987fb05a94a37 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -292,21 +292,21 @@ class FileManager : public RefCountedBase<FileManager> {
   /// MemoryBuffer if successful, otherwise returning null.
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFile(FileEntryRef Entry, bool isVolatile = false,
-                   bool RequiresNullTerminator = true,
+                   bool RequiresNullTerminator = true, bool IsText = true,
                    std::optional<int64_t> MaybeLimit = std::nullopt);
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFile(StringRef Filename, bool isVolatile = false,
-                   bool RequiresNullTerminator = true,
+                   bool RequiresNullTerminator = true, bool IsText = true,
                    std::optional<int64_t> MaybeLimit = std::nullopt) const {
     return getBufferForFileImpl(Filename,
                                 /*FileSize=*/MaybeLimit.value_or(-1),
-                                isVolatile, RequiresNullTerminator);
+                                isVolatile, RequiresNullTerminator, IsText);
   }
 
 private:
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFileImpl(StringRef Filename, int64_t FileSize, bool isVolatile,
-                       bool RequiresNullTerminator) const;
+                       bool RequiresNullTerminator, bool IsText) const;
 
   DirectoryEntry *&getRealDirEntry(const llvm::vfs::Status &Status);
 
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
index d12814e7c9253e..635fdd0e00c433 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -346,7 +346,7 @@ class DependencyScanningWorkerFilesystem
 
   llvm::ErrorOr<llvm::vfs::Status> status(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const Twine &Path) override;
+  openFileForRead(const Twine &Path, bool IsText = true) override;
 
   std::error_code getRealPath(const Twine &Path,
                               SmallVectorImpl<char> &Output) override;
diff --git a/clang/lib/Basic/FileManager.cpp b/clang/lib/Basic/FileManager.cpp
index 6097b85a03064b..27075cefafdc2f 100644
--- a/clang/lib/Basic/FileManager.cpp
+++ b/clang/lib/Basic/FileManager.cpp
@@ -530,7 +530,7 @@ void FileManager::fillRealPathName(FileEntry *UFE, llvm::StringRef FileName) {
 
 llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
 FileManager::getBufferForFile(FileEntryRef FE, bool isVolatile,
-                              bool RequiresNullTerminator,
+                              bool RequiresNullTerminator, bool IsText,
                               std::optional<int64_t> MaybeLimit) {
   const FileEntry *Entry = &FE.getFileEntry();
   // If the content is living on the file entry, return a reference to it.
@@ -558,21 +558,21 @@ FileManager::getBufferForFile(FileEntryRef FE, bool isVolatile,
 
   // Otherwise, open the file.
   return getBufferForFileImpl(Filename, FileSize, isVolatile,
-                              RequiresNullTerminator);
+                              RequiresNullTerminator, IsText);
 }
 
 llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
 FileManager::getBufferForFileImpl(StringRef Filename, int64_t FileSize,
-                                  bool isVolatile,
-                                  bool RequiresNullTerminator) const {
+                                  bool isVolatile, bool RequiresNullTerminator,
+                                  bool IsText) const {
   if (FileSystemOpts.WorkingDir.empty())
     return FS->getBufferForFile(Filename, FileSize, RequiresNullTerminator,
-                                isVolatile);
+                                isVolatile, IsText);
 
   SmallString<128> FilePath(Filename);
   FixupRelativePath(FilePath);
   return FS->getBufferForFile(FilePath, FileSize, RequiresNullTerminator,
-                              isVolatile);
+                              isVolatile, IsText);
 }
 
 /// getStatValue - Get the 'stat' information for the specified path,
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 8623c030b6d593..c575869a9061dd 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -5328,7 +5328,8 @@ std::string ASTReader::getOriginalSourceFile(
     const PCHContainerReader &PCHContainerRdr, DiagnosticsEngine &Diags) {
   // Open the AST file.
   auto Buffer = FileMgr.getBufferForFile(ASTFileName, /*IsVolatile=*/false,
-                                         /*RequiresNullTerminator=*/false);
+                                         /*RequiresNullTerminator=*/false,
+                                         /*IsText=*/false);
   if (!Buffer) {
     Diags.Report(diag::err_fe_unable_to_read_pch_file)
         << ASTFileName << Buffer.getError().message();
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
index 4d738e4bea41a6..7d6239a0732fe6 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -353,12 +353,13 @@ DepScanFile::create(EntryRef Entry) {
 }
 
 llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-DependencyScanningWorkerFilesystem::openFileForRead(const Twine &Path) {
+DependencyScanningWorkerFilesystem::openFileForRead(const Twine &Path,
+                                                    bool IsText) {
   SmallString<256> OwnedFilename;
   StringRef Filename = Path.toStringRef(OwnedFilename);
 
   if (shouldBypass(Filename))
-    return getUnderlyingFS().openFileForRead(Path);
+    return getUnderlyingFS().openFileForRead(Path, IsText);
 
   llvm::ErrorOr<EntryRef> Result = getOrCreateFileSystemEntry(Filename);
   if (!Result)
diff --git a/clang/unittests/Driver/DistroTest.cpp b/clang/unittests/Driver/DistroTest.cpp
index 43efc0dd8f51ec..4c13b8bbb94089 100644
--- a/clang/unittests/Driver/DistroTest.cpp
+++ b/clang/unittests/Driver/DistroTest.cpp
@@ -352,9 +352,9 @@ TEST(DistroTest, DetectWindowsAndCrossCompile) {
     }
 
     llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-    openFileForRead(const llvm::Twine &Path) override {
+    openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
       ++Count;
-      return llvm::vfs::ProxyFileSystem::openFileForRead(Path);
+      return llvm::vfs::ProxyFileSystem::openFileForRead(Path, IsText);
     }
 
     unsigned Count{};
diff --git a/clang/unittests/Driver/ToolChainTest.cpp b/clang/unittests/Driver/ToolChainTest.cpp
index 8542a168f93c27..672742cea5d6ba 100644
--- a/clang/unittests/Driver/ToolChainTest.cpp
+++ b/clang/unittests/Driver/ToolChainTest.cpp
@@ -683,7 +683,7 @@ struct FileSystemWithError : public llvm::vfs::FileSystem {
     return std::make_error_code(std::errc::no_such_file_or_directory);
   }
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const Twine &Path) override {
+  openFileForRead(const Twine &Path, bool IsText = true) override {
     return std::make_error_code(std::errc::permission_denied);
   }
   llvm::vfs::directory_iterator dir_begin(const Twine &Dir,
diff --git a/clang/unittests/Frontend/PCHPreambleTest.cpp b/clang/unittests/Frontend/PCHPreambleTest.cpp
index 2ce24c91ac0f13..137dcf2377be47 100644
--- a/clang/unittests/Frontend/PCHPreambleTest.cpp
+++ b/clang/unittests/Frontend/PCHPreambleTest.cpp
@@ -36,10 +36,10 @@ class ReadCountingInMemoryFileSystem : public vfs::InMemoryFileSystem
   std::map<std::string, unsigned> ReadCounts;
 
 public:
-  ErrorOr<std::unique_ptr<vfs::File>> openFileForRead(const Twine &Path) override
-  {
+  ErrorOr<std::unique_ptr<vfs::File>>
+  openFileForRead(const Twine &Path, bool IsText = true) override {
     ++ReadCounts[Canonicalize(Path)];
-    return InMemoryFileSystem::openFileForRead(Path);
+    return InMemoryFileSystem::openFileForRead(Path, IsText);
   }
 
   unsigned GetReadCount(const Twine &Path) const
diff --git a/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp b/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp
index ec0e143be4a209..1012016ed03c7d 100644
--- a/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp
+++ b/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp
@@ -277,9 +277,9 @@ TEST(DependencyScanner, ScanDepsWithModuleLookup) {
     }
 
     llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-    openFileForRead(const Twine &Path) override {
+    openFileForRead(const Twine &Path, bool IsText = true) override {
       ReadFiles.push_back(Path.str());
-      return ProxyFileSystem::openFileForRead(Path);
+      return ProxyFileSystem::openFileForRead(Path, IsText);
     }
   };
 
diff --git a/lldb/unittests/Host/FileSystemTest.cpp b/lldb/unittests/Host/FileSystemTest.cpp
index 58887f6b2467e0..3c76f1d1bed21a 100644
--- a/lldb/unittests/Host/FileSystemTest.cpp
+++ b/lldb/unittests/Host/FileSystemTest.cpp
@@ -59,7 +59,7 @@ class DummyFileSystem : public vfs::FileSystem {
     return I->second;
   }
   ErrorOr<std::unique_ptr<vfs::File>>
-  openFileForRead(const Twine &Path) override {
+  openFileForRead(const Twine &Path, bool Text = true) override {
     auto S = status(Path);
     if (S)
       return std::unique_ptr<vfs::File>(new DummyFile{*S});
diff --git a/lldb/unittests/Utility/MockSymlinkFileSystem.h b/lldb/unittests/Utility/MockSymlinkFileSystem.h
index 7fa1f93bfa38a9..32a86cd669fb14 100644
--- a/lldb/unittests/Utility/MockSymlinkFileSystem.h
+++ b/lldb/unittests/Utility/MockSymlinkFileSystem.h
@@ -43,7 +43,7 @@ class MockSymlinkFileSystem : public llvm::vfs::FileSystem {
     return llvm::errc::operation_not_permitted;
   }
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const llvm::Twine &Path) override {
+  openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
     return llvm::errc::operation_not_permitted;
   }
   llvm::vfs::directory_iterator dir_begin(const llvm::Twine &Dir,
diff --git a/llvm/include/llvm/Support/VirtualFileSystem.h b/llvm/include/llvm/Support/VirtualFileSystem.h
index 2531c075f262d7..2256fde2faa661 100644
--- a/llvm/include/llvm/Support/VirtualFileSystem.h
+++ b/llvm/include/llvm/Support/VirtualFileSystem.h
@@ -273,13 +273,14 @@ class FileSystem : public llvm::ThreadSafeRefCountedBase<FileSystem>,
 
   /// Get a \p File object for the file at \p Path, if one exists.
   virtual llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) = 0;
+  openFileForRead(const Twine &Path, bool IsText = true) = 0;
 
   /// This is a convenience method that opens a file, gets its content and then
   /// closes the file.
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFile(const Twine &Name, int64_t FileSize = -1,
-                   bool RequiresNullTerminator = true, bool IsVolatile = false);
+                   bool RequiresNullTerminator = true, bool IsVolatile = false,
+                   bool IsText = true);
 
   /// Get a directory_iterator for \p Dir.
   /// \note The 'end' iterator is directory_iterator().
@@ -392,7 +393,7 @@ class OverlayFileSystem : public RTTIExtends<OverlayFileSystem, FileSystem> {
   llvm::ErrorOr<Status> status(const Twine &Path) override;
   bool exists(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) override;
+  openFileForRead(const Twine &Path, bool IsText = true) override;
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
   llvm::ErrorOr<std::string> getCurrentWorkingDirectory() const override;
   std::error_code setCurrentWorkingDirectory(const Twine &Path) override;
@@ -446,8 +447,8 @@ class ProxyFileSystem : public RTTIExtends<ProxyFileSystem, FileSystem> {
   }
   bool exists(const Twine &Path) override { return FS->exists(Path); }
   llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) override {
-    return FS->openFileForRead(Path);
+  openFileForRead(const Twine &Path, bool IsText = true) override {
+    return FS->openFileForRead(Path, IsText);
   }
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override {
     return FS->dir_begin(Dir, EC);
@@ -615,7 +616,7 @@ class InMemoryFileSystem : public RTTIExtends<InMemoryFileSystem, FileSystem> {
 
   llvm::ErrorOr<Status> status(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) override;
+  openFileForRead(const Twine &Path, bool IsText = true) override;
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
 
   llvm::ErrorOr<std::string> getCurrentWorkingDirectory() const override {
@@ -1051,7 +1052,8 @@ class RedirectingFileSystem
 
   ErrorOr<Status> status(const Twine &Path) override;
   bool exists(const Twine &Path) override;
-  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override;
+  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path,
+                                                 bool IsText = true) override;
 
   std::error_code getRealPath(const Twine &Path,
                               SmallVectorImpl<char> &Output) override;
@@ -1155,9 +1157,10 @@ class TracingFileSystem
     return ProxyFileSystem::status(Path);
   }
 
-  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override {
+  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path,
+                                                 bool IsText = true) override {
     ++NumOpenFileForReadCalls;
-    return ProxyFileSystem::openFileForRead(Path);
+    return ProxyFileSystem::openFileForRead(Path, IsText);
   }
 
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override {
diff --git a/llvm/lib/Support/FileCollector.cpp b/llvm/lib/Support/FileCollector.cpp
index 29436f85c2f23c..966c5ac90e9412 100644
--- a/llvm/lib/Support/FileCollector.cpp
+++ b/llvm/lib/Support/FileCollector.cpp
@@ -268,8 +268,8 @@ class FileCollectorFileSystem : public vfs::FileSystem {
   }
 
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const Twine &Path) override {
-    auto Result = FS->openFileForRead(Path);
+  openFileForRead(const Twine &Path, bool IsText = true) override {
+    auto Result = FS->openFileForRead(Path, IsText);
     if (Result && *Result)
       Collector->addFile(Path);
     return Result;
diff --git a/llvm/lib/Support/VirtualFileSystem.cpp b/llvm/lib/Support/VirtualFileSystem.cpp
index 928c0b5a24ed65..ceca65b46e486d 100644
--- a/llvm/lib/Support/VirtualFileSystem.cpp
+++ b/llvm/lib/Support/VirtualFileSystem.cpp
@@ -117,8 +117,9 @@ FileSystem::~FileSystem() = default;
 
 ErrorOr<std::unique_ptr<MemoryBuffer>>
 FileSystem::getBufferForFile(const llvm::Twine &Name, int64_t FileSize,
-                             bool RequiresNullTerminator, bool IsVolatile) {
-  auto F = openFileForRead(Name);
+                             bool RequiresNullTerminator, bool IsVolatile,
+                             bool IsText) {
+  auto F = openFileForRead(Name, IsText);
   if (!F)
     return F.getError();
 
@@ -278,7 +279,8 @@ class RealFileSystem : public FileSystem {
   }
 
   ErrorOr<Status> status(const Twine &Path) override;
-  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override;
+  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path,
+                                                 bool IsText = true) override;
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
 
   llvm::ErrorOr<std::string> getCurrentWorkingDirectory() const override;
@@ -323,10 +325,11 @@ ErrorOr<Status> RealFileSystem::status(const Twine &Path) {
 }
 
 ErrorOr<std::unique_ptr<File>>
-RealFileSystem::openFileForRead(const Twine &Name) {
+RealFileSystem::openFileForRead(const Twine &Name, bool IsText) {
   SmallString<256> RealName, Storage;
   Expected<file_t> FDOrErr = sys::fs::openNativeFileForRead(
-      adjustPath(Name, Storage), sys::fs::OF_None, &RealName);
+      adjustPath(Name, Storage), IsText ? sys::fs::OF_Text : sys::fs::OF_None,
+      &RealName);
   if (!FDOrErr)
     return errorToErrorCode(FDOrErr.takeError());
   return std::unique_ptr<File>(
@@ -458,10 +461,10 @@ bool OverlayFileSystem::exists(const Twine &Path) {
 }
 
 ErrorOr<std::unique_ptr<File>>
-OverlayFileSystem::openFileForRead(const llvm::Twine &Path) {
+OverlayFileSystem::openFileForRead(const llvm::Twine &Path, bool IsText) {
   // FIXME: handle symlinks that cross file systems
   for (iterator I = overlays_begin(), E = overlays_end(); I != E; ++I) {
-    auto Result = (*I)->openFileForRead(Path);
+    auto Result = (*I)->openFileForRead(Path, IsText);
     if (Result || Result.getError() != llvm::errc::no_such_file_or_directory)
       return Result;
   }
@@ -1073,7 +1076,7 @@ llvm::ErrorOr<Status> InMemoryFileSystem::status(const Twine &Path) {
 }
 
 llvm::ErrorOr<std::unique_ptr<File>>
-InMemoryFileSystem::openFileForRead(const Twine &Path) {
+InMemoryFileSystem::openFileForRead(const...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Oct 1, 2024

@llvm/pr-subscribers-clang-driver

Author: Abhina Sree (abhina-sree)

Changes

This patch adds an IsText parameter to the following functions openFileForRead, getBufferForFile, getBufferForFileImpl. The default is set to OF_Text instead of OF_None, this change in value does not affect any other platforms other than z/OS. Setting this parameter correctly is required to open files on z/OS in the correct encoding.


Patch is 21.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110661.diff

19 Files Affected:

  • (modified) clang-tools-extra/clangd/FS.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/Preamble.cpp (+2-2)
  • (modified) clang-tools-extra/clangd/support/ThreadsafeFS.cpp (+1-1)
  • (modified) clang-tools-extra/clangd/unittests/ClangdTests.cpp (+1-1)
  • (modified) clang/include/clang/Basic/FileManager.h (+4-4)
  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h (+1-1)
  • (modified) clang/lib/Basic/FileManager.cpp (+6-6)
  • (modified) clang/lib/Serialization/ASTReader.cpp (+2-1)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp (+3-2)
  • (modified) clang/unittests/Driver/DistroTest.cpp (+2-2)
  • (modified) clang/unittests/Driver/ToolChainTest.cpp (+1-1)
  • (modified) clang/unittests/Frontend/PCHPreambleTest.cpp (+3-3)
  • (modified) clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp (+2-2)
  • (modified) lldb/unittests/Host/FileSystemTest.cpp (+1-1)
  • (modified) lldb/unittests/Utility/MockSymlinkFileSystem.h (+1-1)
  • (modified) llvm/include/llvm/Support/VirtualFileSystem.h (+12-9)
  • (modified) llvm/lib/Support/FileCollector.cpp (+2-2)
  • (modified) llvm/lib/Support/VirtualFileSystem.cpp (+12-9)
  • (modified) llvm/unittests/Support/VirtualFileSystemTest.cpp (+3-3)
diff --git a/clang-tools-extra/clangd/FS.cpp b/clang-tools-extra/clangd/FS.cpp
index 5729b9341d9d4b..bd3c6440c24b0f 100644
--- a/clang-tools-extra/clangd/FS.cpp
+++ b/clang-tools-extra/clangd/FS.cpp
@@ -64,8 +64,8 @@ PreambleFileStatusCache::getProducingFS(
         : ProxyFileSystem(std::move(FS)), StatCache(StatCache) {}
 
     llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-    openFileForRead(const llvm::Twine &Path) override {
-      auto File = getUnderlyingFS().openFileForRead(Path);
+    openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
+      auto File = getUnderlyingFS().openFileForRead(Path, IsText);
       if (!File || !*File)
         return File;
       // Eagerly stat opened file, as the followup `status` call on the file
diff --git a/clang-tools-extra/clangd/Preamble.cpp b/clang-tools-extra/clangd/Preamble.cpp
index c14c4d1ba103f8..55cfe1c97fc6f4 100644
--- a/clang-tools-extra/clangd/Preamble.cpp
+++ b/clang-tools-extra/clangd/Preamble.cpp
@@ -479,9 +479,9 @@ class TimerFS : public llvm::vfs::ProxyFileSystem {
       : ProxyFileSystem(std::move(FS)) {}
 
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const llvm::Twine &Path) override {
+  openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
     WallTimerRegion T(Timer);
-    auto FileOr = getUnderlyingFS().openFileForRead(Path);
+    auto FileOr = getUnderlyingFS().openFileForRead(Path, IsText);
     if (!FileOr)
       return FileOr;
     return std::make_unique<TimerFile>(Timer, std::move(FileOr.get()));
diff --git a/clang-tools-extra/clangd/support/ThreadsafeFS.cpp b/clang-tools-extra/clangd/support/ThreadsafeFS.cpp
index 7398e4258527ba..bc0b984e577cb8 100644
--- a/clang-tools-extra/clangd/support/ThreadsafeFS.cpp
+++ b/clang-tools-extra/clangd/support/ThreadsafeFS.cpp
@@ -29,7 +29,7 @@ class VolatileFileSystem : public llvm::vfs::ProxyFileSystem {
       : ProxyFileSystem(std::move(FS)) {}
 
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const llvm::Twine &InPath) override {
+  openFileForRead(const llvm::Twine &InPath, bool IsText = true) override {
     llvm::SmallString<128> Path;
     InPath.toVector(Path);
 
diff --git a/clang-tools-extra/clangd/unittests/ClangdTests.cpp b/clang-tools-extra/clangd/unittests/ClangdTests.cpp
index 643b8e9f12d751..e86385c2072b34 100644
--- a/clang-tools-extra/clangd/unittests/ClangdTests.cpp
+++ b/clang-tools-extra/clangd/unittests/ClangdTests.cpp
@@ -1010,7 +1010,7 @@ TEST(ClangdTests, PreambleVFSStatCache) {
             : ProxyFileSystem(std::move(FS)), CountStats(CountStats) {}
 
         llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-        openFileForRead(const Twine &Path) override {
+        openFileForRead(const Twine &Path, bool IsText = true) override {
           ++CountStats[llvm::sys::path::filename(Path.str())];
           return ProxyFileSystem::openFileForRead(Path);
         }
diff --git a/clang/include/clang/Basic/FileManager.h b/clang/include/clang/Basic/FileManager.h
index ce4e8c1fbe16eb..d987fb05a94a37 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -292,21 +292,21 @@ class FileManager : public RefCountedBase<FileManager> {
   /// MemoryBuffer if successful, otherwise returning null.
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFile(FileEntryRef Entry, bool isVolatile = false,
-                   bool RequiresNullTerminator = true,
+                   bool RequiresNullTerminator = true, bool IsText = true,
                    std::optional<int64_t> MaybeLimit = std::nullopt);
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFile(StringRef Filename, bool isVolatile = false,
-                   bool RequiresNullTerminator = true,
+                   bool RequiresNullTerminator = true, bool IsText = true,
                    std::optional<int64_t> MaybeLimit = std::nullopt) const {
     return getBufferForFileImpl(Filename,
                                 /*FileSize=*/MaybeLimit.value_or(-1),
-                                isVolatile, RequiresNullTerminator);
+                                isVolatile, RequiresNullTerminator, IsText);
   }
 
 private:
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFileImpl(StringRef Filename, int64_t FileSize, bool isVolatile,
-                       bool RequiresNullTerminator) const;
+                       bool RequiresNullTerminator, bool IsText) const;
 
   DirectoryEntry *&getRealDirEntry(const llvm::vfs::Status &Status);
 
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
index d12814e7c9253e..635fdd0e00c433 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -346,7 +346,7 @@ class DependencyScanningWorkerFilesystem
 
   llvm::ErrorOr<llvm::vfs::Status> status(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const Twine &Path) override;
+  openFileForRead(const Twine &Path, bool IsText = true) override;
 
   std::error_code getRealPath(const Twine &Path,
                               SmallVectorImpl<char> &Output) override;
diff --git a/clang/lib/Basic/FileManager.cpp b/clang/lib/Basic/FileManager.cpp
index 6097b85a03064b..27075cefafdc2f 100644
--- a/clang/lib/Basic/FileManager.cpp
+++ b/clang/lib/Basic/FileManager.cpp
@@ -530,7 +530,7 @@ void FileManager::fillRealPathName(FileEntry *UFE, llvm::StringRef FileName) {
 
 llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
 FileManager::getBufferForFile(FileEntryRef FE, bool isVolatile,
-                              bool RequiresNullTerminator,
+                              bool RequiresNullTerminator, bool IsText,
                               std::optional<int64_t> MaybeLimit) {
   const FileEntry *Entry = &FE.getFileEntry();
   // If the content is living on the file entry, return a reference to it.
@@ -558,21 +558,21 @@ FileManager::getBufferForFile(FileEntryRef FE, bool isVolatile,
 
   // Otherwise, open the file.
   return getBufferForFileImpl(Filename, FileSize, isVolatile,
-                              RequiresNullTerminator);
+                              RequiresNullTerminator, IsText);
 }
 
 llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
 FileManager::getBufferForFileImpl(StringRef Filename, int64_t FileSize,
-                                  bool isVolatile,
-                                  bool RequiresNullTerminator) const {
+                                  bool isVolatile, bool RequiresNullTerminator,
+                                  bool IsText) const {
   if (FileSystemOpts.WorkingDir.empty())
     return FS->getBufferForFile(Filename, FileSize, RequiresNullTerminator,
-                                isVolatile);
+                                isVolatile, IsText);
 
   SmallString<128> FilePath(Filename);
   FixupRelativePath(FilePath);
   return FS->getBufferForFile(FilePath, FileSize, RequiresNullTerminator,
-                              isVolatile);
+                              isVolatile, IsText);
 }
 
 /// getStatValue - Get the 'stat' information for the specified path,
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 8623c030b6d593..c575869a9061dd 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -5328,7 +5328,8 @@ std::string ASTReader::getOriginalSourceFile(
     const PCHContainerReader &PCHContainerRdr, DiagnosticsEngine &Diags) {
   // Open the AST file.
   auto Buffer = FileMgr.getBufferForFile(ASTFileName, /*IsVolatile=*/false,
-                                         /*RequiresNullTerminator=*/false);
+                                         /*RequiresNullTerminator=*/false,
+                                         /*IsText=*/false);
   if (!Buffer) {
     Diags.Report(diag::err_fe_unable_to_read_pch_file)
         << ASTFileName << Buffer.getError().message();
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
index 4d738e4bea41a6..7d6239a0732fe6 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -353,12 +353,13 @@ DepScanFile::create(EntryRef Entry) {
 }
 
 llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-DependencyScanningWorkerFilesystem::openFileForRead(const Twine &Path) {
+DependencyScanningWorkerFilesystem::openFileForRead(const Twine &Path,
+                                                    bool IsText) {
   SmallString<256> OwnedFilename;
   StringRef Filename = Path.toStringRef(OwnedFilename);
 
   if (shouldBypass(Filename))
-    return getUnderlyingFS().openFileForRead(Path);
+    return getUnderlyingFS().openFileForRead(Path, IsText);
 
   llvm::ErrorOr<EntryRef> Result = getOrCreateFileSystemEntry(Filename);
   if (!Result)
diff --git a/clang/unittests/Driver/DistroTest.cpp b/clang/unittests/Driver/DistroTest.cpp
index 43efc0dd8f51ec..4c13b8bbb94089 100644
--- a/clang/unittests/Driver/DistroTest.cpp
+++ b/clang/unittests/Driver/DistroTest.cpp
@@ -352,9 +352,9 @@ TEST(DistroTest, DetectWindowsAndCrossCompile) {
     }
 
     llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-    openFileForRead(const llvm::Twine &Path) override {
+    openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
       ++Count;
-      return llvm::vfs::ProxyFileSystem::openFileForRead(Path);
+      return llvm::vfs::ProxyFileSystem::openFileForRead(Path, IsText);
     }
 
     unsigned Count{};
diff --git a/clang/unittests/Driver/ToolChainTest.cpp b/clang/unittests/Driver/ToolChainTest.cpp
index 8542a168f93c27..672742cea5d6ba 100644
--- a/clang/unittests/Driver/ToolChainTest.cpp
+++ b/clang/unittests/Driver/ToolChainTest.cpp
@@ -683,7 +683,7 @@ struct FileSystemWithError : public llvm::vfs::FileSystem {
     return std::make_error_code(std::errc::no_such_file_or_directory);
   }
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const Twine &Path) override {
+  openFileForRead(const Twine &Path, bool IsText = true) override {
     return std::make_error_code(std::errc::permission_denied);
   }
   llvm::vfs::directory_iterator dir_begin(const Twine &Dir,
diff --git a/clang/unittests/Frontend/PCHPreambleTest.cpp b/clang/unittests/Frontend/PCHPreambleTest.cpp
index 2ce24c91ac0f13..137dcf2377be47 100644
--- a/clang/unittests/Frontend/PCHPreambleTest.cpp
+++ b/clang/unittests/Frontend/PCHPreambleTest.cpp
@@ -36,10 +36,10 @@ class ReadCountingInMemoryFileSystem : public vfs::InMemoryFileSystem
   std::map<std::string, unsigned> ReadCounts;
 
 public:
-  ErrorOr<std::unique_ptr<vfs::File>> openFileForRead(const Twine &Path) override
-  {
+  ErrorOr<std::unique_ptr<vfs::File>>
+  openFileForRead(const Twine &Path, bool IsText = true) override {
     ++ReadCounts[Canonicalize(Path)];
-    return InMemoryFileSystem::openFileForRead(Path);
+    return InMemoryFileSystem::openFileForRead(Path, IsText);
   }
 
   unsigned GetReadCount(const Twine &Path) const
diff --git a/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp b/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp
index ec0e143be4a209..1012016ed03c7d 100644
--- a/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp
+++ b/clang/unittests/Tooling/DependencyScanning/DependencyScannerTest.cpp
@@ -277,9 +277,9 @@ TEST(DependencyScanner, ScanDepsWithModuleLookup) {
     }
 
     llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-    openFileForRead(const Twine &Path) override {
+    openFileForRead(const Twine &Path, bool IsText = true) override {
       ReadFiles.push_back(Path.str());
-      return ProxyFileSystem::openFileForRead(Path);
+      return ProxyFileSystem::openFileForRead(Path, IsText);
     }
   };
 
diff --git a/lldb/unittests/Host/FileSystemTest.cpp b/lldb/unittests/Host/FileSystemTest.cpp
index 58887f6b2467e0..3c76f1d1bed21a 100644
--- a/lldb/unittests/Host/FileSystemTest.cpp
+++ b/lldb/unittests/Host/FileSystemTest.cpp
@@ -59,7 +59,7 @@ class DummyFileSystem : public vfs::FileSystem {
     return I->second;
   }
   ErrorOr<std::unique_ptr<vfs::File>>
-  openFileForRead(const Twine &Path) override {
+  openFileForRead(const Twine &Path, bool Text = true) override {
     auto S = status(Path);
     if (S)
       return std::unique_ptr<vfs::File>(new DummyFile{*S});
diff --git a/lldb/unittests/Utility/MockSymlinkFileSystem.h b/lldb/unittests/Utility/MockSymlinkFileSystem.h
index 7fa1f93bfa38a9..32a86cd669fb14 100644
--- a/lldb/unittests/Utility/MockSymlinkFileSystem.h
+++ b/lldb/unittests/Utility/MockSymlinkFileSystem.h
@@ -43,7 +43,7 @@ class MockSymlinkFileSystem : public llvm::vfs::FileSystem {
     return llvm::errc::operation_not_permitted;
   }
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const llvm::Twine &Path) override {
+  openFileForRead(const llvm::Twine &Path, bool IsText = true) override {
     return llvm::errc::operation_not_permitted;
   }
   llvm::vfs::directory_iterator dir_begin(const llvm::Twine &Dir,
diff --git a/llvm/include/llvm/Support/VirtualFileSystem.h b/llvm/include/llvm/Support/VirtualFileSystem.h
index 2531c075f262d7..2256fde2faa661 100644
--- a/llvm/include/llvm/Support/VirtualFileSystem.h
+++ b/llvm/include/llvm/Support/VirtualFileSystem.h
@@ -273,13 +273,14 @@ class FileSystem : public llvm::ThreadSafeRefCountedBase<FileSystem>,
 
   /// Get a \p File object for the file at \p Path, if one exists.
   virtual llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) = 0;
+  openFileForRead(const Twine &Path, bool IsText = true) = 0;
 
   /// This is a convenience method that opens a file, gets its content and then
   /// closes the file.
   llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>>
   getBufferForFile(const Twine &Name, int64_t FileSize = -1,
-                   bool RequiresNullTerminator = true, bool IsVolatile = false);
+                   bool RequiresNullTerminator = true, bool IsVolatile = false,
+                   bool IsText = true);
 
   /// Get a directory_iterator for \p Dir.
   /// \note The 'end' iterator is directory_iterator().
@@ -392,7 +393,7 @@ class OverlayFileSystem : public RTTIExtends<OverlayFileSystem, FileSystem> {
   llvm::ErrorOr<Status> status(const Twine &Path) override;
   bool exists(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) override;
+  openFileForRead(const Twine &Path, bool IsText = true) override;
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
   llvm::ErrorOr<std::string> getCurrentWorkingDirectory() const override;
   std::error_code setCurrentWorkingDirectory(const Twine &Path) override;
@@ -446,8 +447,8 @@ class ProxyFileSystem : public RTTIExtends<ProxyFileSystem, FileSystem> {
   }
   bool exists(const Twine &Path) override { return FS->exists(Path); }
   llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) override {
-    return FS->openFileForRead(Path);
+  openFileForRead(const Twine &Path, bool IsText = true) override {
+    return FS->openFileForRead(Path, IsText);
   }
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override {
     return FS->dir_begin(Dir, EC);
@@ -615,7 +616,7 @@ class InMemoryFileSystem : public RTTIExtends<InMemoryFileSystem, FileSystem> {
 
   llvm::ErrorOr<Status> status(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<File>>
-  openFileForRead(const Twine &Path) override;
+  openFileForRead(const Twine &Path, bool IsText = true) override;
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
 
   llvm::ErrorOr<std::string> getCurrentWorkingDirectory() const override {
@@ -1051,7 +1052,8 @@ class RedirectingFileSystem
 
   ErrorOr<Status> status(const Twine &Path) override;
   bool exists(const Twine &Path) override;
-  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override;
+  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path,
+                                                 bool IsText = true) override;
 
   std::error_code getRealPath(const Twine &Path,
                               SmallVectorImpl<char> &Output) override;
@@ -1155,9 +1157,10 @@ class TracingFileSystem
     return ProxyFileSystem::status(Path);
   }
 
-  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override {
+  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path,
+                                                 bool IsText = true) override {
     ++NumOpenFileForReadCalls;
-    return ProxyFileSystem::openFileForRead(Path);
+    return ProxyFileSystem::openFileForRead(Path, IsText);
   }
 
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override {
diff --git a/llvm/lib/Support/FileCollector.cpp b/llvm/lib/Support/FileCollector.cpp
index 29436f85c2f23c..966c5ac90e9412 100644
--- a/llvm/lib/Support/FileCollector.cpp
+++ b/llvm/lib/Support/FileCollector.cpp
@@ -268,8 +268,8 @@ class FileCollectorFileSystem : public vfs::FileSystem {
   }
 
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
-  openFileForRead(const Twine &Path) override {
-    auto Result = FS->openFileForRead(Path);
+  openFileForRead(const Twine &Path, bool IsText = true) override {
+    auto Result = FS->openFileForRead(Path, IsText);
     if (Result && *Result)
       Collector->addFile(Path);
     return Result;
diff --git a/llvm/lib/Support/VirtualFileSystem.cpp b/llvm/lib/Support/VirtualFileSystem.cpp
index 928c0b5a24ed65..ceca65b46e486d 100644
--- a/llvm/lib/Support/VirtualFileSystem.cpp
+++ b/llvm/lib/Support/VirtualFileSystem.cpp
@@ -117,8 +117,9 @@ FileSystem::~FileSystem() = default;
 
 ErrorOr<std::unique_ptr<MemoryBuffer>>
 FileSystem::getBufferForFile(const llvm::Twine &Name, int64_t FileSize,
-                             bool RequiresNullTerminator, bool IsVolatile) {
-  auto F = openFileForRead(Name);
+                             bool RequiresNullTerminator, bool IsVolatile,
+                             bool IsText) {
+  auto F = openFileForRead(Name, IsText);
   if (!F)
     return F.getError();
 
@@ -278,7 +279,8 @@ class RealFileSystem : public FileSystem {
   }
 
   ErrorOr<Status> status(const Twine &Path) override;
-  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path) override;
+  ErrorOr<std::unique_ptr<File>> openFileForRead(const Twine &Path,
+                                                 bool IsText = true) override;
   directory_iterator dir_begin(const Twine &Dir, std::error_code &EC) override;
 
   llvm::ErrorOr<std::string> getCurrentWorkingDirectory() const override;
@@ -323,10 +325,11 @@ ErrorOr<Status> RealFileSystem::status(const Twine &Path) {
 }
 
 ErrorOr<std::unique_ptr<File>>
-RealFileSystem::openFileForRead(const Twine &Name) {
+RealFileSystem::openFileForRead(const Twine &Name, bool IsText) {
   SmallString<256> RealName, Storage;
   Expected<file_t> FDOrErr = sys::fs::openNativeFileForRead(
-      adjustPath(Name, Storage), sys::fs::OF_None, &RealName);
+      adjustPath(Name, Storage), IsText ? sys::fs::OF_Text : sys::fs::OF_None,
+      &RealName);
   if (!FDOrErr)
     return errorToErrorCode(FDOrErr.takeError());
   return std::unique_ptr<File>(
@@ -458,10 +461,10 @@ bool OverlayFileSystem::exists(const Twine &Path) {
 }
 
 ErrorOr<std::unique_ptr<File>>
-OverlayFileSystem::openFileForRead(const llvm::Twine &Path) {
+OverlayFileSystem::openFileForRead(const llvm::Twine &Path, bool IsText) {
   // FIXME: handle symlinks that cross file systems
   for (iterator I = overlays_begin(), E = overlays_end(); I != E; ++I) {
-    auto Result = (*I)->openFileForRead(Path);
+    auto Result = (*I)->openFileForRead(Path, IsText);
     if (Result || Result.getError() != llvm::errc::no_such_file_or_directory)
       return Result;
   }
@@ -1073,7 +1076,7 @@ llvm::ErrorOr<Status> InMemoryFileSystem::status(const Twine &Path) {
 }
 
 llvm::ErrorOr<std::unique_ptr<File>>
-InMemoryFileSystem::openFileForRead(const Twine &Path) {
+InMemoryFileSystem::openFileForRead(const...
[truncated]

Copy link
Member

@kadircet kadircet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry this is same as #107906 (with a bigger impact radius, as you're also changing getBufferForFile) and doesn't address any of the issues mention about explaining the semantics of IsText or justification for changing the core VFS interfaces, for an operation that's can solely be performed on physical fileystem.

@perry-ca raised some concerns in #109664 about this functionality requiring some context awareness, I don't think any of those is addressed by this patch either. Pretty much all of the callers apart from ASTReader is just using IsText = true.

@perry-ca
Copy link
Contributor

perry-ca commented Oct 1, 2024

@perry-ca raised some concerns in #109664 about this functionality requiring some context awareness, I don't think any of those is addressed by this patch either. Pretty much all of the callers apart from ASTReader is just using IsText = true.

It just happens that 90% of the time you are dealing with text files. As you noted, in the context of the ASTReader, the file being read is expected to be a binary file. In that case the parameter is false.

These change mirror interface to getFileOrSTDIN() which has a IsText parameter. This does touch a number of places but I feel the changes are in line with the rest of the file I/O functions in llvm.

@abhina-sree abhina-sree force-pushed the abhina/propagate_text_flag branch from 29e3fff to 1a50964 Compare October 1, 2024 15:24
@abhina-sree
Copy link
Contributor Author

@perry-ca raised some concerns in #109664 about this functionality requiring some context awareness, I don't think any of those is addressed by this patch either. Pretty much all of the callers apart from ASTReader is just using IsText = true.

It just happens that 90% of the time you are dealing with text files. As you noted, in the context of the ASTReader, the file being read is expected to be a binary file. In that case the parameter is false.

These change mirror interface to getFileOrSTDIN() which has a IsText parameter. This does touch a number of places but I feel the changes are in line with the rest of the file I/O functions in llvm.

Yes, I agree with Sean. Most of the time it is a text file when this function is called, there are only a few places where we expect binary files. And the reason for changing in VirtualFileSystem.cpp is because this is where the OF_None flag is being passed unconditionally, and there are other parameters like RequiresNullTerminator and IsVolatile that are being passed here as well which is why it seems like the correct place

@abhina-sree abhina-sree force-pushed the abhina/propagate_text_flag branch from 1a50964 to b4bf7c6 Compare October 1, 2024 17:47
@cor3ntin
Copy link
Contributor

cor3ntin commented Oct 2, 2024

sorry this is same as #107906 (with a bigger impact radius, as you're also changing getBufferForFile) and doesn't address any of the issues mention about explaining the semantics of IsText or justification for changing the core VFS interfaces, for an operation that's can solely be performed on physical fileystem.

@perry-ca raised some concerns in #109664 about this functionality requiring some context awareness, I don't think any of those is addressed by this patch either. Pretty much all of the callers apart from ASTReader is just using IsText = true.

To give some context, the problem we are trying to solve initially is that file opened by #embed should not be utf-8 converted.

@perry-ca
Copy link
Contributor

perry-ca commented Oct 2, 2024

sorry this is same as #107906 (with a bigger impact radius, as you're also changing getBufferForFile) and doesn't address any of the issues mention about explaining the semantics of IsText or justification for changing the core VFS interfaces, for an operation that's can solely be performed on physical fileystem.
@perry-ca raised some concerns in #109664 about this functionality requiring some context awareness, I don't think any of those is addressed by this patch either. Pretty much all of the callers apart from ASTReader is just using IsText = true.

To give some context, the problem we are trying to solve initially is that file opened by #embed should not be utf-8 converted.

Correct. The overall/original problem is that we have many places where text files were being read as binary files. Adding the IsText parameter to the openFileForRead() function just like we have in the getFileOrStdin() group of functions solves that problem. It tells the compiler to read files the right way given the context of the file usage.

@abhina-sree
Copy link
Contributor Author

sorry this is same as #107906 (with a bigger impact radius, as you're also changing getBufferForFile) and doesn't address any of the issues mention about explaining the semantics of IsText or justification for changing the core VFS interfaces, for an operation that's can solely be performed on physical fileystem.
@perry-ca raised some concerns in #109664 about this functionality requiring some context awareness, I don't think any of those is addressed by this patch either. Pretty much all of the callers apart from ASTReader is just using IsText = true.

To give some context, the problem we are trying to solve initially is that file opened by #embed should not be utf-8 converted.

Thanks, I've tested the #embed lit tests with this fix and did not see any regressions

@kadircet
Copy link
Member

kadircet commented Oct 7, 2024

Sorry for not responding here for a while.

These change mirror interface to getFileOrSTDIN() which has a IsText parameter. This does touch a number of places but I feel the changes are in line with the rest of the file I/O functions in llvm.

I think there's a huge difference between getFileOrSTDIN and llvm::vfs::FileSystem interfaces. The former has a single implementation that's meant to always work with a physical filesytem. The latter has many implementations and is meant to decouple physical filesystem, we're now leaking that abstraction solely because physical system is trying to receive some information.

Thanks, I've tested the #embed lit tests with this fix and did not see any regressions

I am still wary of this. Logic that handles #embed directives lives in https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPDirectives.cpp#L3964-L3990. As you can see, all of them use the default filemanager/filesystem interfaces, which will read this file with IsText = true after this patch.


All of that being said;

If you folks are in alignment with this being required, I think we should still make it as least intrusive to rest of the users of this interface as possible.

Looks like this only has affects on RealFileSystem implementation (and probably isn't really useful for the rest anyways).

Can we at least change the implementation here to:

  • Introduce a new virtual methods called openFileForReadBinary and getBufferForFileBinary into llvm::vfs::Filesystem. Make default implementations just delegate to openFileForRead and getBufferForFile. Explain when the binary specific overloads should be preferred and how the default versions assume operating on text files in the comments for these methods.
  • Override these new methods in RealFileSystem to pass different flags to underlying calls
  • Make sure place like #embed handling and astreader calls the read-binary versions of these methods.

@abhina-sree
Copy link
Contributor Author

https://github.com/llvm/llvm-project/blob/main/clang/lib/Lex/PPDirectives.cpp#L3964-L3990

Yes I think we still require specifying the Text/Binary parameter based on the context of where it is called. I will look into adding the new virtual functions as an alternative solution

@abhina-sree
Copy link
Contributor Author

The default is set to OF_Text instead of OF_None, this change in value does not affect any other platforms other than z/OS. Setting this parameter correctly is required to open files on z/OS in the correct encoding. The IsText parameter is based on the context of where we open files, for example, the embed directive requires that files always be opened in binary even though they might be tagged as text.

Thanks for your patience. I've put up this PR #111723 to implement your suggestion. I kept the IsText parameter in the getBufferForFile, getBufferForFileImpl because it already has other attributes like IsVolatile, RequiresNullTerminator but created a new virtual function for openFileForReadBinary which greatly reduces the number of files touched compared to this PR. The current #embed lit tests seem to work without additional changes but I will experiment with creating a testcase with inputs that are non-ascii to expose any issues

@abhina-sree
Copy link
Contributor Author

Closing this because an alternative solution was implemented here #111723

@abhina-sree abhina-sree deleted the abhina/propagate_text_flag branch June 5, 2025 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules clang Clang issues not falling into any other category clang-tools-extra clangd lldb llvm:support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants