Skip to content

Support: use Windows 10 RS1+ API if possible to make moves more atomic #102240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

compnerd
Copy link
Member

@compnerd compnerd commented Aug 6, 2024

This addresses a small window of in-atomicity in the rename support for Windows. If the rename API is stressed sufficiently, it can run afoul of the small window of opportunity where the destination file will not exist. In Swift, this causes the LockFileManager to improperly handle file locking for module compilation resulting in spurious failures as the file file is failed to be read after writing due to multiple processes overwriting the file and exposing the window constantly to the scheduled process when building with -num-threads > for the Swift driver.

The implementation of llvm::sys::fs::rename was changed in SVN r315079 (80e31f1) which even explicitly identifies the issue:

This implementation is still not fully POSIX. Specifically in the case where the destination file is open at the point when rename is called, there will be a short interval of time during which the destination file will not exist. It isn't clear whether it is possible to avoid this using the Windows API.

Special thanks to Ben Barham for the discussions and help with analyzing logging to track down this issue!

This API was introduced with Windows 10 RS1, and although there are LTSC contracts that allow older versions to be supported still, the mainstream support system has declared this release to be obsoleted. We attempt to use the new API, and if we are unable to do so, fallback to the old API. This allows us to maintain compatibility with older releases.

@compnerd
Copy link
Member Author

compnerd commented Aug 6, 2024

CC: @mstorsjo @rnk

@llvmbot
Copy link
Member

llvmbot commented Aug 6, 2024

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-platform-windows

Author: Saleem Abdulrasool (compnerd)

Changes

This addresses a small window of in-atomicity in the rename support for Windows. If the rename API is stressed sufficiently, it can run afoul of the small window of opportunity where the destination file will not exist. In Swift, this causes the LockFileManager to improperly handle file locking for module compilation resulting in spurious failures as the file file is failed to be read after writing due to multiple processes overwriting the file and exposing the window constantly to the scheduled process when building with -num-threads > for the Swift driver.

The implementation of llvm::sys::fs::rename was changed in SVN r315079 (80e31f1) which even explicitly identifies the issue:

This implementation is still not fully POSIX. Specifically in the case where the destination file is open at the point when rename is called, there will be a short interval of time during which the destination file will not exist. It isn't clear whether it is possible to avoid this using the Windows API.

Special thanks to Ben Barham for the discussions and help with analyzing logging to track down this issue!

This API was introduced with Windows 10 RS1, and although there are LTSC contracts that allow older versions to be supported still, the mainstream support system has declared this release to be obsoleted. As such, assuming that the OS is at least as new as this is reasonable.


Full diff: https://github.com/llvm/llvm-project/pull/102240.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Support/Windows/WindowsSupport.h (+3)
  • (modified) llvm/lib/Support/Windows/Path.inc (+3-2)
diff --git a/llvm/include/llvm/Support/Windows/WindowsSupport.h b/llvm/include/llvm/Support/Windows/WindowsSupport.h
index d3aacd14b2097..3c91bc37f651b 100644
--- a/llvm/include/llvm/Support/Windows/WindowsSupport.h
+++ b/llvm/include/llvm/Support/Windows/WindowsSupport.h
@@ -21,6 +21,7 @@
 #ifndef LLVM_SUPPORT_WINDOWSSUPPORT_H
 #define LLVM_SUPPORT_WINDOWSSUPPORT_H
 
+#if defined(__MINGW32__)
 // mingw-w64 tends to define it as 0x0502 in its headers.
 #undef _WIN32_WINNT
 #undef _WIN32_IE
@@ -28,6 +29,8 @@
 // Require at least Windows 7 API.
 #define _WIN32_WINNT 0x0601
 #define _WIN32_IE    0x0800 // MinGW at it again. FIXME: verify if still needed.
+#endif
+
 #define WIN32_LEAN_AND_MEAN
 #ifndef NOMINMAX
 #define NOMINMAX
diff --git a/llvm/lib/Support/Windows/Path.inc b/llvm/lib/Support/Windows/Path.inc
index c4bd5e2472351..9cbebf967050d 100644
--- a/llvm/lib/Support/Windows/Path.inc
+++ b/llvm/lib/Support/Windows/Path.inc
@@ -466,13 +466,14 @@ static std::error_code rename_internal(HANDLE FromHandle, const Twine &To,
                                   (ToWide.size() * sizeof(wchar_t)));
   FILE_RENAME_INFO &RenameInfo =
       *reinterpret_cast<FILE_RENAME_INFO *>(RenameInfoBuf.data());
-  RenameInfo.ReplaceIfExists = ReplaceIfExists;
+  RenameInfo.Flags = FILE_RENAME_FLAG_POSIX_SEMANTICS
+                   | (ReplaceIfExists ? FILE_RENAME_FLAG_REPLACE_IF_EXISTS : 0);
   RenameInfo.RootDirectory = 0;
   RenameInfo.FileNameLength = ToWide.size() * sizeof(wchar_t);
   std::copy(ToWide.begin(), ToWide.end(), &RenameInfo.FileName[0]);
 
   SetLastError(ERROR_SUCCESS);
-  if (!SetFileInformationByHandle(FromHandle, FileRenameInfo, &RenameInfo,
+  if (!SetFileInformationByHandle(FromHandle, FileRenameInfoEx, &RenameInfo,
                                   RenameInfoBuf.size())) {
     unsigned Error = GetLastError();
     if (Error == ERROR_SUCCESS)

Copy link

github-actions bot commented Aug 6, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 4f5d866af7fed0de1671a68530d3023e9762b71e a20dcd94c7bef55b5ebcb78c041812886f5edbbf --extensions h,inc -- llvm/include/llvm/Support/Windows/WindowsSupport.h llvm/lib/Support/Windows/Path.inc
View the diff from clang-format here.
diff --git a/llvm/lib/Support/Windows/Path.inc b/llvm/lib/Support/Windows/Path.inc
index 50a44d490b..70b948fc17 100644
--- a/llvm/lib/Support/Windows/Path.inc
+++ b/llvm/lib/Support/Windows/Path.inc
@@ -474,8 +474,11 @@ static std::error_code rename_internal(HANDLE FromHandle, const Twine &To,
     DWORD dwFlags;
     FILE_INFO_BY_HANDLE_CLASS Class;
   } kRenameStyles[] = {
-    { static_cast<DWORD>(FILE_RENAME_FLAG_POSIX_SEMANTICS | (ReplaceIfExists ? FILE_RENAME_FLAG_REPLACE_IF_EXISTS : 0)), FileRenameInfoEx },
-    { 0, FileRenameInfo },
+      {static_cast<DWORD>(
+           FILE_RENAME_FLAG_POSIX_SEMANTICS |
+           (ReplaceIfExists ? FILE_RENAME_FLAG_REPLACE_IF_EXISTS : 0)),
+       FileRenameInfoEx},
+      {0, FileRenameInfo},
   };
 
   DWORD dwError;

@mstorsjo
Copy link
Member

mstorsjo commented Aug 7, 2024

This API was introduced with Windows 10 RS1, and although there are LTSC contracts that allow older versions to be supported still, the mainstream support system has declared this release to be obsoleted. As such, assuming that the OS is at least as new as this is reasonable.

If we'd go this way, we should update the docs that state our minimum required target. But I would want to object to bumping the requirement that far - within the msys2 project, they still support older versions (CC @lazka @mati865 @jeremyd2019) as far as I know, and likewise llvm-mingw also supports running the toolchain on Windows 7.

In this case, it should be fairly straightforward to change this to try using the new FileRenameInfoEx, and if it fails (because running on an older OS that doesn't support it) fall back on the existing code with FileRenameInfo - I tested amending your patch to do that.

@@ -21,13 +21,16 @@
#ifndef LLVM_SUPPORT_WINDOWSSUPPORT_H
#define LLVM_SUPPORT_WINDOWSSUPPORT_H

#if defined(__MINGW32__)
// mingw-w64 tends to define it as 0x0502 in its headers.
#undef _WIN32_WINNT
#undef _WIN32_IE

// Require at least Windows 7 API.
#define _WIN32_WINNT 0x0601
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this value like this, forcing the headers to target explicitly Windows 7 when building with mingw, obviously would fail. So if we go this way, we'd need to bump this.

To use FileRenameInfoEx, we need a fairly new version of mingw-w64, to include the fix in mingw-w64/mingw-w64@837c48d. But that fix is included in the mingw-w64 v12 release, so that's probably acceptable.

One bit of a gotcha, is that _WIN32_WINNT affects two aspects in Windows headers - it's both the maximum API level to expose in headers (so newer APIs may not be visible at all), but in some cases it also serves as the minimum target API version. E.g. some headers, such as psapi.h use _WIN32_WINNT (or more precisely NTDDI_VERSION) to select a newer ABI construct, which only works on the version selected by _WIN32_WINNT - also CC @cjacek.

So by just bumping the value of _WIN32_WINNT, it's possible to end up requiring ABIs details from a newer version of Windows, even if the calling code isn't changed at all.

Anyway, I tested bumping this to 0x0A00, and running the binaries on Windows 7, and it seems to work, so I guess it should be fine - but it may require being careful with some APIs at times.

In addition to requiring mingw-w64 v12, we'll also need to manually set NTDDI_VERSION to unlock things from mingw-w64 headers past the base version of Windows 10.0.

I amended your patch with these changes, to make it build with mingw-w64 headers, and still run on Windows 7 - see this commit: mstorsjo@windows-rename-fix (This is based on top of #102307.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in some cases it also serves as the minimum target API version. E.g. some headers, such as psapi.h use _WIN32_WINNT (or more precisely NTDDI_VERSION) to select a newer ABI construct, which only works on the version selected by _WIN32_WINNT

Yes, there is the infamous example of psapi.h, but I don't think there are many other cases like that. This one is only about being older than Windows 7 or not, so it's not the problem in this context. I don't recall any example where such checks are done against newer Windows versions, so I think it's safe in this context. I presume that a lesson was learned from psapi.h mistake...

(FWIW, modern mingw-w64 defaults to 0xa00, which could in theory allow to remove the whole thing here instead of ifdefing __MINGW32__. Unfortunately, mingw-w64 was late to do that and in the mean time various assumptions spread among users, making the increase tricky from compatibility point of view and causing some vendors to override that to still define an older version by default.)

@AaronBallman
Copy link
Collaborator

This API was introduced with Windows 10 RS1, and although there are LTSC contracts that allow older versions to be supported still, the mainstream support system has declared this release to be obsoleted. As such, assuming that the OS is at least as new as this is reasonable.

I think this requires an RFC (also, we do not explicitly document what versions of Windows we support, or any OS for that matter, and I think that documentation should be created so users know what to expect).

@mstorsjo
Copy link
Member

mstorsjo commented Aug 7, 2024

also, we do not explicitly document what versions of Windows we support, or any OS for that matter, and I think that documentation should be created so users know what to expect

I was pretty sure we had such a statement written down somewhere - I remember reading it when I got started in LLVM many years ago (stating that we require Windows 7, both for running LLVM itself, and for the runtimes to run on), but I can't find that now. Maybe it was more implicit after all...

@AaronBallman
Copy link
Collaborator

also, we do not explicitly document what versions of Windows we support, or any OS for that matter, and I think that documentation should be created so users know what to expect

I was pretty sure we had such a statement written down somewhere - I remember reading it when I got started in LLVM many years ago (stating that we require Windows 7, both for running LLVM itself, and for the runtimes to run on), but I can't find that now. Maybe it was more implicit after all...

FWIW, I remember the same thing. I could have sworn we documented this somewhere, but the LLVM getting started page just lists "Windows x86" and "Windows x86-64" without version details, and Clang's getting started page points to LLVM's.

@tru
Copy link
Collaborator

tru commented Aug 7, 2024

I think we definitely should document it. And FWIW I am fine with a windows 10 requirement since 7 is EOL and I don't think we should hold back if there are good reasons for it.

@compnerd
Copy link
Member Author

compnerd commented Aug 7, 2024

What's also more interesting is that Windows 10 is also slated for EOL in October.

@mati865
Copy link
Contributor

mati865 commented Aug 7, 2024

But I would want to object to bumping the requirement that far - within the msys2 project, they still support older versions

As stated at https://www.msys2.org/docs/windows_support/ MSYS2 targets Windows 8.1 for mingw-w64 but packages are free to require any Windows version.
Many packages already won't run on anything older than Windows 10 since various compilers and libraries depend on newer APIs (for example, anything that uses Rust).

@mstorsjo
Copy link
Member

mstorsjo commented Aug 7, 2024

But I would want to object to bumping the requirement that far - within the msys2 project, they still support older versions

As stated at https://www.msys2.org/docs/windows_support/ MSYS2 targets Windows 8.1 for mingw-w64 but packages are free to require any Windows version. Many packages already won't run on anything older than Windows 10 since various compilers and libraries depend on newer APIs (for example, anything that uses Rust).

Ok, I see!

Anyway, the amount of APIs of this kind that we use/need is pretty small, so I don’t see a huge benefit in bumping the requirement (and in this case, making it fallback is quite easy). (A far more tricky aspect in these codepaths, is making them work properly with various less common file system drivers, like third party ramdisks and network mounts and similar.)

@compnerd
Copy link
Member Author

compnerd commented Aug 8, 2024

@mstorsjo I'm not sure a fallback is really that simple. This impacts the semantics - the behaviour between the two versions is pretty different.

@compnerd
Copy link
Member Author

compnerd commented Aug 8, 2024

@jeremyd2019
Copy link
Contributor

Anyway, the amount of APIs of this kind that we use/need is pretty small, so I don’t see a huge benefit in bumping the requirement (and in this case, making it fallback is quite easy). (A far more tricky aspect in these codepaths, is making them work properly with various less common file system drivers, like third party ramdisks and network mounts and similar.)

I think Cygwin has a whole pile of fallback and workaround cases for posix semantics due to filesystem drivers, so I think having a fallback is going to be essential regardless of minimum OS version support. Example: cygwin/cygwin@fe2545e

@mstorsjo
Copy link
Member

mstorsjo commented Aug 8, 2024

@mstorsjo I'm not sure a fallback is really that simple. This impacts the semantics - the behaviour between the two versions is pretty different.

I'm not quite sure I understand the distinction. If running on an old version, where the newer API is unavailable and we'd fall back on the old one, things would work like they have worked so far, right? And that behaviour has been acceptable in mosts cases, except in some specific corner cases?

@compnerd
Copy link
Member Author

compnerd commented Aug 9, 2024

True that the fallback world continue to behave as it does today. The reason for this change is that it does break and the failure is very difficult to track down. The swift toolchain has been shipping this change for quite a while (3 years or so) and never actually had any complaint about compatibility. The idea of retrying on ERROR_INVALID_PARAMETER is really good though.

@mstorsjo
Copy link
Member

Thanks for the update - I’ll try it on an older windows later.

You’ll need to update WindowsSupport.h with the changes from my linked commit mstorsjo@windows-rename-fix (and merge the latest changes from main first) for this to compile on mingw targets, whichever way we go with the fallback.

@compnerd compnerd force-pushed the posix branch 2 times, most recently from d9dc35a to c3f96c7 Compare August 10, 2024 18:16
@compnerd
Copy link
Member Author

Fixed and rebased to current main

@mstorsjo
Copy link
Member

Thanks!

I still run into a compiler error due to a technicality:

In file included from /home/wbs/code/llvm-project/llvm/lib/Support/Path.cpp:1203:
/home/wbs/code/llvm-project/llvm/lib/Support/Windows/Path.inc:477:7: error: non-constant-expression cannot be narrowed from type 'int' to 'DWORD' (aka 'unsigned long') in initializer list [-Wc++11-narrowing]
  477 |     { FILE_RENAME_FLAG_POSIX_SEMANTICS | (ReplaceIfExists ? FILE_RENAME_FLAG_REPLACE_IF_EXISTS : 0), FileRenameInfoEx },
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

So we need to add a cast to avoid this.

Other than that, this implementation starts looking quite good to me, and it seems to work on Windows 7.

Can you update the PR subject/description to match the current direction of the implementation?

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - this version builds and runs fine - but sorry, I realized one thing about the loop we probably should tweak still.

This addresses a small window of in-atomicity in the rename support for
Windows. If the rename API is stressed sufficiently, it can run afoul of
the small window of opportunity where the destination file will not
exist. In Swift, this causes the `LockFileManager` to improperly handle
file locking for module compilation resulting in spurious failures as
the file file is failed to be read after writing due to multiple
processes overwriting the file and exposing the window constantly to the
scheduled process when building with `-num-threads` > 1 for the Swift
driver.

The implementation of `llvm::sys::fs::rename` was changed in SVN r315079
(80e31f1) which even explicitly
identifies the issue:

~~~
This implementation is still not fully POSIX. Specifically in the case
where the destination file is open at the point when rename is called,
there will be a short interval of time during which the destination
file will not exist. It isn't clear whether it is possible to avoid
this using the Windows API.
~~~

Special thanks to Ben Barham for the discussions and help with analyzing
logging to track down this issue!

This API was introduced with Windows 10 RS1, and although there are LTSC
contracts that allow older versions to be supported still, the
mainstream support system has declared this release to be obsoleted. We
attempt to use the new API, and if we are unable to do so, fallback to
the old API. This allows us to maintain compatibility with older
releases.
@compnerd compnerd changed the title Support: unlock Windows API support, switch to Windows 10 RS1+ APIs Support: use Windows 10 RS1+ API if possible to make moves more atomic Aug 15, 2024
Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@compnerd
Copy link
Member Author

Hmm, seems that the support test was testing for the existing behaviour:

--
  | GTEST_OUTPUT=json:C:\ws\src\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-63012-25-43.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=43 GTEST_SHARD_INDEX=25 C:\ws\src\build\unittests\Support\.\SupportTests.exe
  | --
  |  
  | Script:
  | --
  | C:\ws\src\build\unittests\Support\.\SupportTests.exe --gtest_filter=rename.ExistingTemp
  | --
  | C:\ws\src\llvm\unittests\Support\ReplaceFileTest.cpp(163): error: Expected equality of these values:
  | errc::permission_denied
  | Which is: generic:13
  | fs::openFileForRead(TargetTmp1FileName, Tmp1FD)
  | Which is: generic:2
  |  
  |  
  | C:\ws\src\llvm\unittests\Support\ReplaceFileTest.cpp:163
  | Expected equality of these values:
  | errc::permission_denied
  | Which is: generic:13
  | fs::openFileForRead(TargetTmp1FileName, Tmp1FD)
  | Which is: generic:2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants