-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][driver] Fix an issue where clang does not correctly resolve the system header if invoked via symlink (on MacOS) #68091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…olute path (#66704) In clang/tools/driver/driver.cpp, the function SetInstallDir() tries to determine the location where the toolchain is installed, basically by taking the parent folder of the executable absolute path. This is not correct when the compiler is invoked via a link, since it returns the parent of the link. This leads to subtle errors, for example on macOS it silently picks the wrong system headers.
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang ChangesThis patch fixes #66704. Full diff: https://github.com/llvm/llvm-project/pull/68091.diff 1 Files Affected:
diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp
index 531b5b4a61c1804..45e1469aafca95e 100644
--- a/clang/tools/driver/driver.cpp
+++ b/clang/tools/driver/driver.cpp
@@ -57,6 +57,15 @@ using namespace clang;
using namespace clang::driver;
using namespace llvm::opt;
+#if defined(__linux__)
+#include <unistd.h>
+#elif defined(__APPLE__)
+#include <libproc.h>
+#include <unistd.h>
+#elif defined(__MINGW32__)
+#include <windows.h>
+#endif
+
std::string GetExecutablePath(const char *Argv0, bool CanonicalPrefixes) {
if (!CanonicalPrefixes) {
SmallString<128> ExecutablePath(Argv0);
@@ -331,6 +340,56 @@ static void SetInstallDir(SmallVectorImpl<const char *> &argv,
// path being a symlink.
SmallString<128> InstalledPath(argv[0]);
+#if defined(__linux__)
+
+ char ProcessAbsolutePath[PATH_MAX];
+
+ int len = readlink("/proc/self/exe", ProcessAbsolutePath,
+ sizeof(ProcessAbsolutePath) - 1);
+ if (len <= 0) {
+ llvm::errs() << "Internal error: readlink(\"/proc/self/exe\") failed with "
+ << strerror(errno) << "\n";
+ exit(1);
+ }
+
+ ProcessAbsolutePath[len] = 0;
+ InstalledPath = ProcessAbsolutePath;
+
+#elif defined(__APPLE__)
+
+ // The size must be higher than PROC_PIDPATHINFO_SIZE, otherwise the call
+ // fails with ENOMEM (12) - Cannot allocate memory.
+ // https://opensource.apple.com/source/Libc/Libc-498/darwin/libproc.c
+ char ProcessAbsolutePath[PROC_PIDPATHINFO_SIZE + 1];
+
+ int len = proc_pidpath(getpid(), ProcessAbsolutePath,
+ sizeof(ProcessAbsolutePath) - 1);
+ if (len <= 0) {
+ llvm::errs() << "Internal error: proc_pidpath() failed with "
+ << strerror(errno) << "\n";
+ exit(1);
+ }
+
+ ProcessAbsolutePath[len] = 0;
+ InstalledPath = ProcessAbsolutePath;
+
+#elif defined(__MINGW32__)
+
+ char ProcessAbsolutePath[PATH_MAX];
+
+ len = GetModuleFileName(NULL, ProcessAbsolutePath,
+ sizeof(ProcessAbsolutePath) - 1);
+ if (len <= 0) {
+ llvm::errs() << "Internal error: GetModuleFileName() failed with "
+ << strerror(errno) << "\n";
+ exit(1);
+ }
+
+ ProcessAbsolutePath[len] = 0;
+ InstalledPath = ProcessAbsolutePath;
+
+#else
+
// Do a PATH lookup, if there are no directory components.
if (llvm::sys::path::filename(InstalledPath) == InstalledPath)
if (llvm::ErrorOr<std::string> Tmp = llvm::sys::findProgramByName(
@@ -341,6 +400,8 @@ static void SetInstallDir(SmallVectorImpl<const char *> &argv,
if (CanonicalPrefixes)
llvm::sys::fs::make_absolute(InstalledPath);
+#endif
+
StringRef InstalledPathParent(llvm::sys::path::parent_path(InstalledPath));
if (llvm::sys::fs::exists(InstalledPathParent))
TheDriver.setInstalledDir(InstalledPathParent);
|
You have failed tests on Linux CI:
Windows CI has another test in Clang interpreter failing, but that might be not related to you. |
clang/tools/driver/driver.cpp
Outdated
@@ -331,6 +340,56 @@ static void SetInstallDir(SmallVectorImpl<const char *> &argv, | |||
// path being a symlink. | |||
SmallString<128> InstalledPath(argv[0]); | |||
|
|||
#if defined(__linux__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use getMainExecutable()
instead?
(This sort of system-specific logic should live in LLVM's Support directory so that we don't sprinkle host-specific preprocessor conditionals all over, but better still if we can use our existing facilities rather than invent new ones.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was kind of thinking the same. And then started to wonder about why SetInstallDir says "We do this manually, because we want to support that path being a symlink.". I don't really understand if that indicates that the installed dir should be set based on the symlinks path or not.
Afaict Driver::ClangExecutable
already is computed via getMainExecutable()
.
And the Driver c'tor is setting both Driver::Dir
and Driver::InstalledDir
based on the path of the clang executable.
So is the problem here is that SetInstallDir is called after the above (having contructed TheDriver
). Resulting in InstalledDir
being changed into being relative to the symlink?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use
getMainExecutable()
instead?
Good point!
So is the problem here that SetInstallDir is called after the above (having constructed TheDriver). Resulting in InstalledDir being changed into being relative to the symlink?
When I first encountered this problem I tried to run a debug session to understand how things work, but got confused and in the end I patched SetInstallDir
and for my needs it was ok, but since getMainExecutable()
already does the job I'll reconsider and try to understand where the problem comes from.
FWIW, I'd be quite wary about modifying the logic here (whatever it is - I haven't checked closely); if we used to resolve to one point of a symlink before, and now resolve to the other end, I can almost guarantee you that there are dozens of existing users that rely on the current behaviour - that would be broken if it is changed. |
Hi Martin, please check the #66704 bug report, the current behaviour is plainly wrong, clang does not pick the correct libraries and silently uses the system libraries, which leads to very subtle and hard to debug errors if the new version is several versions apart from the system. I'm open to any suggestions to make clang correctly identify the InstalledDir when invoked via a symbolic link. |
I would kind of agree with that in general - resolving this to the actual clang binary would indeed seem like the right thing to do. But it makes me wonder if someone actually is relying on the current behaviour, or is this a case that hasn't been encountered before (symlinks primarily being used within the same bin directory)?
Well whatever the consequences are of using a different install dir is kinda of irrelevant here IMO; there should be a well defined logic for how that's calculated (and I agree that it ideally should be based on the actual executable), and based on that, each target toolchain has various fallbacks for finding the headers to use if it's not colocated with the compiler. |
Great! The new simplified patch does exactly this.
To rephrase your question, you ask if someone is using a configuration with a folder where various custom libraries are and bin folder with a link to the actual clang, and expects for clang to pick the custom libraries instead of the distributed ones? I obviously can't tell, but I would say that the chances are slim. And, in case someone wants such a configuration, the current patch, if
Well, given how much time I spent trying to understand some weird errors caused by this, I would not call them irrelevant :-(
Right, but I would say that the current logic based on the folder where the clang executable is located worked fine till now, and I see no reasons to change it, just that till now there were no use cases with symlinks from different folders (a configuration specific to the npm/xpm ecosystem). |
To me, it actually sounds entirely possible for someone to want to set up a toolchain that way. (I even had a user look into setting up something similar to that recently; building user-local installed llvm-mingw toolchains around a clang binary provided by a linux distribution.)
I'm not diminishing your debugging effort - I'm just saying that how annoying something was to debug (until you realized that the detected installation dir was wrong) doesn't really affect whether clang should change its definition for how the install directory is determined. Yes, having it use the wrong install directory will of course make things not work as they should.
With well defined logic, I mean with respect to how symlinks are handled as well. Without symlinks involved, the logic for determining the clang install directory is near-trivial.
You make it sound like nobody else might ever have used such symlinks before - isn't that quite a big assumption? Anyway, TL;DR, I agree that the current interpretation of symlinks here does seem weird and that it probably would be right to change it. I'm just saying that it's plausible that such a change might disrupt something, if the previous logic has been in place for as long as Clang has existed. And that I don't entirely agree with all the arguments presented. But the end goal seems reasonable. |
Ah, sorry for the confusion, I did not intend to make it sound like a big assumption. I'm convinced that such links were occasionally used before, just that the result was not necessarily an error, I would say that in most cases using the system libraries is functional, and this explains why such cases were not reported till now. However, in the npm/xpm ecosystem, by design, instead of adding lots of paths to the PATH, the installer add symlinks into a local .bin folder, and all binaries are invoked via these links (on windows things are more complicated, So, I don't want to minimise others experiences, I just wanted to explain how I got into this.
Ok, thank you. |
Not necessarily; this depends entirely on what target you're using. If you're cross compiling, if you pick the wrong install dir it probably won't find any usable sysroot at all. The case you've run into, of finding and using the system installed toolchain transparently, is only one of many different outcomes in such a case. |
I took some time to analyse the failed tests. I managed to reproduce them (except The failing tests are:
They all create folders where they place various symlinks to clang, then perform some checks on InstalledDir or other output lines. The question is whether the reliance of these tests on the path where the symlink is located is a feature that must be checked by the tests, or not. As Martin noticed, without a clear definition of the correct way to handle these symlinks, this is a tricky issue. Without any other reference, I checked the Apple clang, on my macOS:
So Apple decided to follow the links, and, in my opinion, this is the less ambiguous and expected behaviour. I think that we first need an agreement from the clang maintainers that this is the correct behaviour, then find a way to fix the current tests, and possibly improve the implementation (for example i'm not sure if |
@jansvoboda11, since the Apple clang already fixed this bug, could you suggest how to proceed? Could you backport the Apple patch to upstream? Please also take a look at the original bug report #66704, since it is related to macOS. |
Another way to rephrase the question: the current implementation that preserves the symlinks, was it an explicitly required feature, or rather a side effect that came into existence by accident when implementing the current tests? |
No, this is a side effect of something else. Note, Let's recreate your experiment like this: % mkdir -p tmp/a
% ln -s /Library/Developer/CommandLineTools/usr/bin/clang tmp/a
% tmp/a/clang -v
Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: arm64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Users/martin/tmp/a There you see that Apple's clang also behaves exactly the same, if you disregard the external wrapper executable.
Regardless of whether it was intentional or not, due to Hyrum's law, it's quite possible that someone has ended up depending on the previous defacto behaviour. We did a similar cleanup for llvm-rc not long ago, and llvm-rc has much much fewer users than clang. In 8c6a0c8 we changed llvm-rc to explicitly look up the executable path (instead of looking for what would be found in |
Then we have to be super creative and find a better solution. How about changing the logic, and where I don't know how complicated the implementation might be, but, at first sight, it should maintain compatibility with curent use cases that create a full toolchain folder and also prevent silently falling back to system headers/libraries (like in my use case). |
Yes, something like that would be possible. I wonder if this would need to be somewhat target specific. Different targets have different logic for fallback when things don't exist in the most expected directory. But perhaps it would be enough to check for On the other hand; I agree that it's kinda implausible that this directory would exist at all anywhere else than next to the actual executable. In that case, perhaps those testcases are the only cases that would need to be handled after all? @nolange - in your efforts in mstorsjo/llvm-mingw#362 - do you rely on making a symlink to the system installed clang executable and locating sysroots and the clang |
I took a first look, and it seems target specific. The strategy was to search for It looks like in many places both locations are used, which means things are not that bad: getProgramPaths().push_back(getDriver().getInstalledDir());
if (getDriver().getInstalledDir() != D.Dir)
getProgramPaths().push_back(D.Dir); At a closer look, my specific issue is related only to picking the system headers, the libraries seem to be identified correctly. This now clearly explains the errors I encountered, compiling with the system headers and linking with the clang own libraries cannot work if the versions are too far apart. For example the code identifying the C++ headers is in Darwin.cpp, where only // Get from '<install>/bin' to '<install>/include/c++/v1'.
// Note that InstallBin can be relative, so we use '..' instead of
// parent_path.
llvm::SmallString<128> InstallBin =
llvm::StringRef(getDriver().getInstalledDir()); // <install>/bin
llvm::sys::path::append(InstallBin, "..", "include", "c++", "v1");
if (getVFS().exists(InstallBin)) {
addSystemInclude(DriverArgs, CC1Args, InstallBin);
return;
} else if (DriverArgs.hasArg(options::OPT_v)) {
llvm::errs() << "ignoring nonexistent directory \"" << InstallBin
<< "\"\n";
} I'll extend the code to also check I'll take a look at the other targets too, but I'm not sure that I'll be able to provide fixes for all, since I don't know them, and I don't want to mess anything. |
If it exists this might cause someone unimaginable pain ;)
Nope, I use a script as I need to set Btw. I understand this is only a bug on MacOS? Because the title suggests that you want to pickup the headers from the symlink and not the clang-installation, and I hope that is exactly the opposite of what you want to archive. |
@nolange, English is not my first language, and I know sometimes I'm very confusing. In my understanding, the title suggest that if clang is not invoked directly but via a symlink, it does not behave correctly, i.e. instead of picking the headers from the folders where clang is installed, it picks the system headers. This is documented with examples in the bug report. If this is not what the title reads, please suggest a better phrasing. |
I'm using clang only to compile my binary packages on macOS (the Linux binaries are compiled with gcc), so I got bitten by this bug only on macOS. The other platforms/targets may or may not be affected, I don't know; but my initial search for InstalledDir showed multiple references not paired with references to Dir. |
(Out of town and I do not have enough time to investigate the behavior.)
The description (first comment, editable) should be self-containing so that readers don't need to read the linked issue. You can summarize the issue and give a short description (if the patch is accepted, the final commit message). A simplified example is needed to help understand the issue.
I think we should figure out how to remove the following fixme as well: // FIXME: We don't actually canonicalize this, we just make it absolute.
if (CanonicalPrefixes)
llvm::sys::fs::make_absolute(InstalledPath); |
This is one of the issues I have with mainly with the problem formulation - as there are two levels of logic involved here. The generic code which sets the install dir, and the target specific files in clang/lib/Driver/ToolChains, which do various things based on the install dir that has been detected elsewhere, plus target specific heuristics for falling back on other ways to detecting things that are needed. If the installation dir is detected as the wrong thing, this would be fatal for all targets, but the exact behaviour you'd see would be quite target dependent. That's why explanations like "silently using system headers from a different version is hard to debug here" are out of scope here; using the wrong install dir is fatal of course. If the darwin specific fallback handling could be improved, that'd of course be great. On the other hand - in principle I kinda agree that the fix itself, locating things based on the actual executable, sounds like the right thing to do in any case. We're early in the 18.x cycle, and if we could get the tests fixed to work despite that, we'd have lots of time to see if someone is affected negatively by it, before 18.x is released. |
On macOS, when clang is invoked via a symlink, since the InstalledDir is where the link is located, the C++ headers were not identified and the default system headers were used. The fix add a second check using the folder where the executable is located.
Yes, my problem with the title is, that it sound like you want to change the way how clang should behave when using symlinks.
Here is how I would phrase it: Main point being that you fix a bug and don't want to change the usual behavior (like gcc, like it works on Linux), |
@nolange, thank you for rephrasing the title. |
The latest patch fixes the issue #66704 without changing the meaning of It is specific to macOS headers, the libraries were ok and do not need fixes. I also checked the other places where There are also several references in One other file where checking for the path where the executable is located might be in // Then look for gcc installed alongside clang.
Prefixes.push_back(D.InstalledDir + "/..");
Fully agree, but I suggest we first fix this specific issue in this PR, and do the rest in a separate PR. |
The Linux tests were fine, but the Windows tests failed in |
Any suggestions how to proceed from here? |
I'm not familiar with this, but @yln might. |
@jansvoboda11, as Martin showed before, Apple clang did not fix this, so this patch applies to it too. |
Add <executable>/../include/c++/v1 to include path
On macOS, when clang is invoked via a symlink, since the InstalledDir is
where the link is located, the C++ headers are not identified and the
default system headers are used.
This fix adds a second check using the folder where the executable is
located.
This patch fixes #66704.