-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[llvm][Support][Windows] Avoid crash calling remove_directories() #118677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We faced an unexpected crash in SHELL32_CallFileCopyHooks() on the buildbot [lldb-remote-linux-win](https://lab.llvm.org/staging/#/builders/197/builds/1066). The host is Windows Server 2022 w/o any 3rd party shell extensions. See llvm#118032 for more details. Based on [this article](https://devblogs.microsoft.com/oldnewthing/20120330-00/?p=7963).
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-platform-windows Author: Dmitry Vasilyev (slydiman) ChangesWe faced an unexpected crash in SHELL32_CallFileCopyHooks() on the buildbot lldb-remote-linux-win. The host is Windows Server 2022 w/o any 3rd party shell extensions. See #118032 for more details. Full diff: https://github.com/llvm/llvm-project/pull/118677.diff 1 Files Affected:
diff --git a/llvm/lib/Support/Windows/Path.inc b/llvm/lib/Support/Windows/Path.inc
index c4bd5e24723517..ecea1efefe0098 100644
--- a/llvm/lib/Support/Windows/Path.inc
+++ b/llvm/lib/Support/Windows/Path.inc
@@ -1387,12 +1387,31 @@ std::error_code remove_directories(const Twine &path, bool IgnoreErrors) {
Path16.push_back(0);
Path16.push_back(0);
- SHFILEOPSTRUCTW shfos = {};
- shfos.wFunc = FO_DELETE;
- shfos.pFrom = Path16.data();
- shfos.fFlags = FOF_NO_UI;
+ HRESULT HR = CoInitializeEx(NULL, COINIT_MULTITHREADED);
+ if (SUCCEEDED(HR)) {
+ IFileOperation *FileOp;
+ HR = CoCreateInstance(CLSID_FileOperation, NULL, CLSCTX_ALL,
+ IID_PPV_ARGS(&FileOp));
+ if (SUCCEEDED(HR)) {
+ HR = FileOp->SetOperationFlags(FOF_NO_UI | FOFX_NOCOPYHOOKS);
+ if (SUCCEEDED(HR)) {
+ IShellItem *ShItem = NULL;
+ HR = SHCreateItemFromParsingName(Path16.data(), NULL,
+ IID_PPV_ARGS(&ShItem));
+ if (SUCCEEDED(HR)) {
+ HR = FileOp->DeleteItem(ShItem, NULL);
+ ShItem->Release();
+ }
+ if (SUCCEEDED(HR)) {
+ HR = FileOp->PerformOperations();
+ }
+ }
+ FileOp->Release();
+ }
+ }
+ CoUninitialize();
- int result = ::SHFileOperationW(&shfos);
+ int result = FAILED(HR) ? HRESULT_CODE(HR) : 0;
if (result != 0 && !IgnoreErrors)
return mapWindowsError(result);
return std::error_code();
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really feel qualified to review this, so take this with a large grain of salt, but my thoughts are:
- I'm not convinced that this is the same problem as in the referenced article. Unlike the article, where the crash was in some third party extension, this appears to be a built-in OS component which (I would hope) doesn't have these kinds of bugs. I'm not saying we shouldn't suppress these hooks (this ties into the second point), but I'm not sure that this will actually resolve the underlying issue (which could be some sort of corruption/race in lldb/llvm which manifests as a crash in this place).
- is there any advantage in doing this instead of the "manual" iterate-over-subdirectories approach (which is what the unix version of this does)? If not, we could probably even share code with that impl. I'm asking this because
SHFileOperationW
sounds like its meant for a shell, and we're not trying to be a shell -- we just want to make a directory disappear.
The article refers buggy Contoso shell extensions. But it seems Contoso is an alias of Microsoft. No guarantees. |
No, Contoso is not an alias of Microsoft. It's a placeholder name of any random third party company/project, used in these docs to not need to name the actual third parties involved. |
Yes, but I don't think that's a sufficient reason to make this change.
This might be a sufficient reason, if true, which I simply just don't know. @mstorsjo, what do you make of this? |
Can we use std::filesystem::remove_all()? Note it requres C++17. BTW, look at llvm/utils/lit/lit/TestRunner.py, executeBuiltinRm(). It calls SHFileOperationW() directly from python. |
We now build llvm for C++17 so I don't see why not. Does it work? And could we then move remove_directories out of the windows/Linux Path.inc custom code? |
std::filesystem::remove_all() works. But it uses std::filesystem::path as the argument. std::filesystem::path uses char on POSIX, wchar_t on Windows. So some custom code is still necessary. But remove_directories() may be simplified and unified very much for both POSIX and Windows. I will try to prepare such PR. |
I'm not entirely sure. Based on the linked blog post, this does seem like a recommended way of doing it, but invoking COM objects just for deleting a set of files/directories does also feel a bit like overkill.
Even though the compilers generally support C++17, So if possible, using common code shared with Unix, for traversing the file tree and recursively deleting files/directories, sounds like my preferred way forward as well, unless we have other arguments for not doing that. (Why was this codepath using shell32 for deleting directories in the first place?) |
Manually performing the DFS and doing the removal would require that we properly handle long paths and that we handle some of the permission work (e.g. IIRC backup semantics may sometimes be required to remove the file). Even |
Please note my second commit is the result of disassembling SHFileOperationW(). I have figured out the parameters for CoInitializeEx() and used ILCreateFromPathW() and SHCreateItemFromIDList() instead of SHCreateItemFromParsingName().
Note executeBuiltinRm() in llvm/utils/lit/lit/TestRunner.py uses SHFileOperationW() on Windows and just shutil.rmtree() for other OS. Why don't we use shutil.rmtree() for Windows too? I know that shutil.rmtree() cannot delete files marked RO and it is possible to fix it using a custom error handler. But I don't see such handler around shutil.rmtree() in executeBuiltinRm() for non-Windows OS. |
FWIW, I’m not sure if the libc++ implementation of these operations have got any extra logic to take care of those corner cases, at least I don’t remember anything extra relating to long paths. |
3f5353f
to
973a23b
Compare
973a23b
to
2cb0548
Compare
Can we make a decision regarding this patch as #119146 seems to be unacceptable due to issues with libstdc++. Thanks. |
Unless we can find a workaround I don't see how we can move to std::filesystem, which is very, very annoying 😞 |
I can try to file a bug to libstdc++ about it, so that there at least is some path towards being able to use it at some point in the future. |
Reported that bug in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118003. That's of course no guarantee that it'll be fixed anytime soon though (and from personal experience, implementing std::filesystem entirely correctly on Windows is a bit fiddly), but at least there is a bug report. |
I think reliably recursively deleting files in a directory is actually a difficult problem (see this 2015 talk), so Zach Turner probably reached for the native all-in-one API that was at hand. LLVM's directory deletion doesn't need to be bullet-proof, it needs to be good enough, especially given that the only clients are LLDB, dsymutil, and some unit tests. So, I think maybe Pavel is right, and maybe we should implement this using generic portable code. |
We should already handle long paths, I think I would be more worried about "backup semantics" and other cases where we empty a directory, but then cannot remove the parent directory because it is not yet empty. Those are the kinds of cases where you just want to ask some Windows system DLL to do it for you. |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/10129 Here is the relevant piece of the build log for the reference
|
Unfortunately, this implementation caused other build issues; it uses I guess this affects the use of Is it possible to write a similar COM pointer wrapper that we can use, instead of relying on ATL for this? Covering the uses we have here should be quite trivial I guess? |
I will prepare the path w/o CComPtr and atlbase.h in few hours. BTW, it seems this patch fixed the issue #118032 |
…and atlbase.h This is the update of llvm#118677.
We faced an unexpected crash in SHELL32_CallFileCopyHooks() on the buildbot lldb-remote-linux-win. The host is Windows Server 2022 w/o any 3rd party shell extensions. See #118032 for more details.
Based on this article.