Skip to content

[clang][driver] Fix an issue where clang does not correctly resolve the system header if invoked via symlink (on MacOS) #68091

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

ilg-ul
Copy link
Contributor

@ilg-ul ilg-ul commented Oct 3, 2023

Add <executable>/../include/c++/v1 to include path

On macOS, when clang is invoked via a symlink, since the InstalledDir is
where the link is located, the C++ headers are not identified and the
default system headers are used.

This fix adds a second check using the folder where the executable is
located.

This patch fixes #66704.

ilg-ul added 2 commits October 3, 2023 14:07
…olute path (#66704)

In clang/tools/driver/driver.cpp, the function SetInstallDir() tries to determine the
location where the toolchain is installed, basically by taking the parent folder of
the executable absolute path. This is not correct when the compiler is invoked
via a link, since it returns the parent of the link.

This leads to subtle errors, for example on macOS it silently picks the wrong
system headers.
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 3, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 3, 2023

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Changes

This patch fixes #66704.


Full diff: https://github.com/llvm/llvm-project/pull/68091.diff

1 Files Affected:

  • (modified) clang/tools/driver/driver.cpp (+61)
diff --git a/clang/tools/driver/driver.cpp b/clang/tools/driver/driver.cpp
index 531b5b4a61c1804..45e1469aafca95e 100644
--- a/clang/tools/driver/driver.cpp
+++ b/clang/tools/driver/driver.cpp
@@ -57,6 +57,15 @@ using namespace clang;
 using namespace clang::driver;
 using namespace llvm::opt;
 
+#if defined(__linux__)
+#include <unistd.h>
+#elif defined(__APPLE__)
+#include <libproc.h>
+#include <unistd.h>
+#elif defined(__MINGW32__)
+#include <windows.h>
+#endif
+
 std::string GetExecutablePath(const char *Argv0, bool CanonicalPrefixes) {
   if (!CanonicalPrefixes) {
     SmallString<128> ExecutablePath(Argv0);
@@ -331,6 +340,56 @@ static void SetInstallDir(SmallVectorImpl<const char *> &argv,
   // path being a symlink.
   SmallString<128> InstalledPath(argv[0]);
 
+#if defined(__linux__)
+
+  char ProcessAbsolutePath[PATH_MAX];
+
+  int len = readlink("/proc/self/exe", ProcessAbsolutePath,
+                     sizeof(ProcessAbsolutePath) - 1);
+  if (len <= 0) {
+    llvm::errs() << "Internal error: readlink(\"/proc/self/exe\") failed with "
+                 << strerror(errno) << "\n";
+    exit(1);
+  }
+
+  ProcessAbsolutePath[len] = 0;
+  InstalledPath = ProcessAbsolutePath;
+
+#elif defined(__APPLE__)
+
+  // The size must be higher than PROC_PIDPATHINFO_SIZE, otherwise the call
+  // fails with ENOMEM (12) - Cannot allocate memory.
+  // https://opensource.apple.com/source/Libc/Libc-498/darwin/libproc.c
+  char ProcessAbsolutePath[PROC_PIDPATHINFO_SIZE + 1];
+
+  int len = proc_pidpath(getpid(), ProcessAbsolutePath,
+                         sizeof(ProcessAbsolutePath) - 1);
+  if (len <= 0) {
+    llvm::errs() << "Internal error: proc_pidpath() failed with "
+                 << strerror(errno) << "\n";
+    exit(1);
+  }
+
+  ProcessAbsolutePath[len] = 0;
+  InstalledPath = ProcessAbsolutePath;
+
+#elif defined(__MINGW32__)
+
+  char ProcessAbsolutePath[PATH_MAX];
+
+  len = GetModuleFileName(NULL, ProcessAbsolutePath,
+                          sizeof(ProcessAbsolutePath) - 1);
+  if (len <= 0) {
+    llvm::errs() << "Internal error: GetModuleFileName() failed with "
+                 << strerror(errno) << "\n";
+    exit(1);
+  }
+
+  ProcessAbsolutePath[len] = 0;
+  InstalledPath = ProcessAbsolutePath;
+
+#else
+
   // Do a PATH lookup, if there are no directory components.
   if (llvm::sys::path::filename(InstalledPath) == InstalledPath)
     if (llvm::ErrorOr<std::string> Tmp = llvm::sys::findProgramByName(
@@ -341,6 +400,8 @@ static void SetInstallDir(SmallVectorImpl<const char *> &argv,
   if (CanonicalPrefixes)
     llvm::sys::fs::make_absolute(InstalledPath);
 
+#endif
+
   StringRef InstalledPathParent(llvm::sys::path::parent_path(InstalledPath));
   if (llvm::sys::fs::exists(InstalledPathParent))
     TheDriver.setInstalledDir(InstalledPathParent);

@ilg-ul ilg-ul changed the title Use platform specific calls to get the executable absolute path [clang][driver] Use platform specific calls to get the executable absolute path Oct 3, 2023
@Endilll Endilll added clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' and removed clang Clang issues not falling into any other category labels Oct 3, 2023
@Endilll
Copy link
Contributor

Endilll commented Oct 3, 2023

You have failed tests on Linux CI:

Failed Tests (4):
  Clang :: Driver/mingw-sysroot.cpp
  Clang :: Driver/no-canonical-prefixes.c
  Clang :: Driver/program-path-priority.c
  Clang :: Driver/rocm-detect.hip

Windows CI has another test in Clang interpreter failing, but that might be not related to you.
You should be able to reproduce those test failures with clang --build . -t check-clang-driver (or ninja check-clang-driver).

@@ -331,6 +340,56 @@ static void SetInstallDir(SmallVectorImpl<const char *> &argv,
// path being a symlink.
SmallString<128> InstalledPath(argv[0]);

#if defined(__linux__)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use getMainExecutable() instead?

(This sort of system-specific logic should live in LLVM's Support directory so that we don't sprinkle host-specific preprocessor conditionals all over, but better still if we can use our existing facilities rather than invent new ones.)

Copy link
Collaborator

@bjope bjope Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was kind of thinking the same. And then started to wonder about why SetInstallDir says "We do this manually, because we want to support that path being a symlink.". I don't really understand if that indicates that the installed dir should be set based on the symlinks path or not.

Afaict Driver::ClangExecutable already is computed via getMainExecutable().
And the Driver c'tor is setting both Driver::Dir and Driver::InstalledDir based on the path of the clang executable.

So is the problem here is that SetInstallDir is called after the above (having contructed TheDriver). Resulting in InstalledDir being changed into being relative to the symlink?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use getMainExecutable() instead?

Good point!

So is the problem here that SetInstallDir is called after the above (having constructed TheDriver). Resulting in InstalledDir being changed into being relative to the symlink?

When I first encountered this problem I tried to run a debug session to understand how things work, but got confused and in the end I patched SetInstallDir and for my needs it was ok, but since getMainExecutable() already does the job I'll reconsider and try to understand where the problem comes from.

@mstorsjo
Copy link
Member

mstorsjo commented Oct 3, 2023

FWIW, I'd be quite wary about modifying the logic here (whatever it is - I haven't checked closely); if we used to resolve to one point of a symlink before, and now resolve to the other end, I can almost guarantee you that there are dozens of existing users that rely on the current behaviour - that would be broken if it is changed.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 3, 2023

I haven't checked closely

Hi Martin, please check the #66704 bug report, the current behaviour is plainly wrong, clang does not pick the correct libraries and silently uses the system libraries, which leads to very subtle and hard to debug errors if the new version is several versions apart from the system.

I'm open to any suggestions to make clang correctly identify the InstalledDir when invoked via a symbolic link.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Oct 3, 2023
@mstorsjo
Copy link
Member

mstorsjo commented Oct 3, 2023

I haven't checked closely

Hi Martin, please check the #66704 bug report, the current behaviour is plainly wrong,

I would kind of agree with that in general - resolving this to the actual clang binary would indeed seem like the right thing to do. But it makes me wonder if someone actually is relying on the current behaviour, or is this a case that hasn't been encountered before (symlinks primarily being used within the same bin directory)?

clang does not pick the correct libraries and silently uses the system libraries, which leads to very subtle and hard to debug errors if the new version is several versions apart from the system.

Well whatever the consequences are of using a different install dir is kinda of irrelevant here IMO; there should be a well defined logic for how that's calculated (and I agree that it ideally should be based on the actual executable), and based on that, each target toolchain has various fallbacks for finding the headers to use if it's not colocated with the compiler.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 3, 2023

resolving this to the actual clang binary would indeed seem like the right thing to do.

Great! The new simplified patch does exactly this.

it makes me wonder if someone actually is relying on the current behaviour

To rephrase your question, you ask if someone is using a configuration with a folder where various custom libraries are and bin folder with a link to the actual clang, and expects for clang to pick the custom libraries instead of the distributed ones? I obviously can't tell, but I would say that the chances are slim.

And, in case someone wants such a configuration, the current patch, if -no-canonical-prefixes is used, reverts to the previous behaviour, when the symlink is not followed.

whatever the consequences are of using a different install dir is kinda of irrelevant here

Well, given how much time I spent trying to understand some weird errors caused by this, I would not call them irrelevant :-(

there should be a well defined logic for how that's calculated

Right, but I would say that the current logic based on the folder where the clang executable is located worked fine till now, and I see no reasons to change it, just that till now there were no use cases with symlinks from different folders (a configuration specific to the npm/xpm ecosystem).

@mstorsjo
Copy link
Member

mstorsjo commented Oct 3, 2023

it makes me wonder if someone actually is relying on the current behaviour

To rephrase your question, you ask if someone is using a configuration with a folder where various custom libraries are and bin folder with a link to the actual clang, and expects for clang to pick the custom libraries instead of the distributed ones? I obviously can't tell, but I would say that the chances are slim.

To me, it actually sounds entirely possible for someone to want to set up a toolchain that way. (I even had a user look into setting up something similar to that recently; building user-local installed llvm-mingw toolchains around a clang binary provided by a linux distribution.)

And, in case someone wants such a configuration, the current patch, if -no-canonical-prefixes is used, reverts to the previous behaviour, when the symlink is not followed.

whatever the consequences are of using a different install dir is kinda of irrelevant here

Well, given how much time I spent trying to understand some weird errors caused by this, I would not call them irrelevant :-(

I'm not diminishing your debugging effort - I'm just saying that how annoying something was to debug (until you realized that the detected installation dir was wrong) doesn't really affect whether clang should change its definition for how the install directory is determined. Yes, having it use the wrong install directory will of course make things not work as they should.

there should be a well defined logic for how that's calculated

Right, but I would say that the current logic based on the folder where the clang executable is located worked fine till now, and I see no reasons to change it,

With well defined logic, I mean with respect to how symlinks are handled as well. Without symlinks involved, the logic for determining the clang install directory is near-trivial.

just that till now there were no use cases with symlinks from different folders (a configuration specific to the npm/xpm ecosystem).

You make it sound like nobody else might ever have used such symlinks before - isn't that quite a big assumption?

Anyway, TL;DR, I agree that the current interpretation of symlinks here does seem weird and that it probably would be right to change it. I'm just saying that it's plausible that such a change might disrupt something, if the previous logic has been in place for as long as Clang has existed. And that I don't entirely agree with all the arguments presented. But the end goal seems reasonable.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 3, 2023

You make it sound like nobody else might ever have used such symlinks before - isn't that quite a big assumption?

Ah, sorry for the confusion, I did not intend to make it sound like a big assumption.

I'm convinced that such links were occasionally used before, just that the result was not necessarily an error, I would say that in most cases using the system libraries is functional, and this explains why such cases were not reported till now.

However, in the npm/xpm ecosystem, by design, instead of adding lots of paths to the PATH, the installer add symlinks into a local .bin folder, and all binaries are invoked via these links (on windows things are more complicated, .cmd shims are used). This is why this issue hit me to the point of searching for a fix. And to make things worse, the xPack build environment (used to build all my binary tools) uses the latest compilers, but runs them on slightly older build machines (for example clang 16 on macOS 10.13), in other words the distance between the two clang versions is large, and builds fail with the system headers, although clang 16 has much newer headers.

So, I don't want to minimise others experiences, I just wanted to explain how I got into this.

I agree that the current interpretation of symlinks here does seem weird and that it probably would be right to change it.

Ok, thank you.

@mstorsjo
Copy link
Member

mstorsjo commented Oct 4, 2023

I'm convinced that such links were occasionally used before, just that the result was not necessarily an error, I would say that in most cases using the system libraries is functional, and this explains why such cases were not reported till now.

Not necessarily; this depends entirely on what target you're using. If you're cross compiling, if you pick the wrong install dir it probably won't find any usable sysroot at all. The case you've run into, of finding and using the system installed toolchain transparently, is only one of many different outcomes in such a case.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

I took some time to analyse the failed tests. I managed to reproduce them (except rocm-detect.hip) on my macOS, so I had more freedom to experiment.

The failing tests are:

  • Clang::Driver/mingw-sysroot.cpp
  • Clang::Driver/no-canonical-prefixes.c
  • Clang::Driver/program-path-priority.c

They all create folders where they place various symlinks to clang, then perform some checks on InstalledDir or other output lines.

The question is whether the reliance of these tests on the path where the symlink is located is a feature that must be checked by the tests, or not.

As Martin noticed, without a clear definition of the correct way to handle these symlinks, this is a tricky issue.

Without any other reference, I checked the Apple clang, on my macOS:

ilg@wksi ~ % /usr/bin/clang -v
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin


ilg@wksi ~ % mkdir tmp/a 
ilg@wksi ~ % ln -s /usr/bin/clang tmp/a

ilg@wksi ~ % tmp/a/clang -v
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

So Apple decided to follow the links, and, in my opinion, this is the less ambiguous and expected behaviour.

I think that we first need an agreement from the clang maintainers that this is the correct behaviour, then find a way to fix the current tests, and possibly improve the implementation (for example i'm not sure if InstalledDir should be affected by -no-canonical-prefixes).

@ilg-ul ilg-ul changed the title [clang][driver] Use platform specific calls to get the executable absolute path [clang][driver] Use TheDriver.ClangExecutable in SetInstallDir to point to the actual install folder Oct 4, 2023
@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

@jansvoboda11, since the Apple clang already fixed this bug, could you suggest how to proceed? Could you backport the Apple patch to upstream?

Please also take a look at the original bug report #66704, since it is related to macOS.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

Another way to rephrase the question: the current implementation that preserves the symlinks, was it an explicitly required feature, or rather a side effect that came into existence by accident when implementing the current tests?

@mstorsjo
Copy link
Member

mstorsjo commented Oct 4, 2023

Without any other reference, I checked the Apple clang, on my macOS:

ilg@wksi ~ % /usr/bin/clang -v
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin


ilg@wksi ~ % mkdir tmp/a 
ilg@wksi ~ % ln -s /usr/bin/clang tmp/a

ilg@wksi ~ % tmp/a/clang -v
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

So Apple decided to follow the links, and, in my opinion, this is the less ambiguous and expected behaviour.

No, this is a side effect of something else. Note, /usr/bin/clang isn't a symlink to /Library/Developer/CommandLineTools/usr/bin/clang, but it is a wrapper executable that executes the latter (after setting some env variables, iirc, and after locating where it is installed - it can also be in an Xcode.app bundle somewhere).

Let's recreate your experiment like this:

% mkdir -p tmp/a
% ln -s /Library/Developer/CommandLineTools/usr/bin/clang tmp/a
% tmp/a/clang -v
Apple clang version 12.0.5 (clang-1205.0.22.9)
Target: arm64-apple-darwin21.6.0
Thread model: posix
InstalledDir: /Users/martin/tmp/a

There you see that Apple's clang also behaves exactly the same, if you disregard the external wrapper executable.

Another way to rephrase the question: the current implementation that preserves the symlinks, was it an explicitly required feature, or rather a side effect that came into existence by accident when implementing the current tests?

Regardless of whether it was intentional or not, due to Hyrum's law, it's quite possible that someone has ended up depending on the previous defacto behaviour.

We did a similar cleanup for llvm-rc not long ago, and llvm-rc has much much fewer users than clang. In 8c6a0c8 we changed llvm-rc to explicitly look up the executable path (instead of looking for what would be found in $PATH, essentially finding the source end of symlinks - just like what clang does right now). But this broke Google's internal uses of the tool, so in e4eb8d9 we reverted to checking both paths, essentially testing both directories.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

it's quite possible that someone has ended up depending on the previous de facto behaviour.

Then we have to be super creative and find a better solution.

How about changing the logic, and where Driver.InstalledDir is used, if the desired file/folder is not found (like header or library), search again in Driver.Dir (which is the parent of ClangExecutable)?

I don't know how complicated the implementation might be, but, at first sight, it should maintain compatibility with curent use cases that create a full toolchain folder and also prevent silently falling back to system headers/libraries (like in my use case).

@ilg-ul ilg-ul changed the title [clang][driver] Use TheDriver.ClangExecutable in SetInstallDir to point to the actual install folder [clang][driver] Prevent clang pick the system headers/libraries when started via a symlink Oct 4, 2023
@ilg-ul ilg-ul changed the title [clang][driver] Prevent clang pick the system headers/libraries when started via a symlink [clang][driver] Prevent clang picking the system headers/libraries when started via a symlink Oct 4, 2023
@mstorsjo
Copy link
Member

mstorsjo commented Oct 4, 2023

it's quite possible that someone has ended up depending on the previous de facto behaviour.

Then we have to be super creative and find a better solution.

How about changing the logic, and where Driver.InstalledDir is used, if the desired file/folder is not found (like header or library), search again in Driver.Dir (which is the parent of ClangExecutable)?

I don't know how complicated the implementation might be, but, at first sight, it should maintain compatibility with curent use cases that create a full toolchain folder and also prevent silently falling back to system headers/libraries (like in my use case).

Yes, something like that would be possible. I wonder if this would need to be somewhat target specific. Different targets have different logic for fallback when things don't exist in the most expected directory. But perhaps it would be enough to check for <executable>/../lib/clang/<version> or something like that, to distinguish between the two locations - that's at least quite target independent.

On the other hand; I agree that it's kinda implausible that this directory would exist at all anywhere else than next to the actual executable. In that case, perhaps those testcases are the only cases that would need to be handled after all?

@nolange - in your efforts in mstorsjo/llvm-mingw#362 - do you rely on making a symlink to the system installed clang executable and locating sysroots and the clang <base>/lib/clang directory next to the symlink?

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

I wonder if this would need to be somewhat target specific.

I took a first look, and it seems target specific.

The strategy was to search for InstalledDir and see how it is used.

It looks like in many places both locations are used, which means things are not that bad:

  getProgramPaths().push_back(getDriver().getInstalledDir());
  if (getDriver().getInstalledDir() != D.Dir)
    getProgramPaths().push_back(D.Dir);

At a closer look, my specific issue is related only to picking the system headers, the libraries seem to be identified correctly. This now clearly explains the errors I encountered, compiling with the system headers and linking with the clang own libraries cannot work if the versions are too far apart.

For example the code identifying the C++ headers is in Darwin.cpp, where only InstalledDir is checked:

    // Get from '<install>/bin' to '<install>/include/c++/v1'.
    // Note that InstallBin can be relative, so we use '..' instead of
    // parent_path.
    llvm::SmallString<128> InstallBin =
        llvm::StringRef(getDriver().getInstalledDir()); // <install>/bin
    llvm::sys::path::append(InstallBin, "..", "include", "c++", "v1");
    if (getVFS().exists(InstallBin)) {
      addSystemInclude(DriverArgs, CC1Args, InstallBin);
      return;
    } else if (DriverArgs.hasArg(options::OPT_v)) {
      llvm::errs() << "ignoring nonexistent directory \"" << InstallBin
                   << "\"\n";
    }

I'll extend the code to also check Dir and this should fix my problem.

I'll take a look at the other targets too, but I'm not sure that I'll be able to provide fixes for all, since I don't know them, and I don't want to mess anything.

@nolange
Copy link

nolange commented Oct 4, 2023

On the other hand; I agree that it's kinda implausible that this directory would exist at all anywhere else than next to the actual executable. In that case, perhaps those testcases are the only cases that would need to be handled after all?

If it exists this might cause someone unimaginable pain ;)

@nolange - in your efforts in mstorsjo/llvm-mingw#362 - do you rely on making a symlink to the system installed clang executable and locating sysroots and the clang <base>/lib/clang directory next to the symlink?

Nope, I use a script as I need to set -resource-dir=.

Btw. I understand this is only a bug on MacOS? Because the title suggests that you want to pickup the headers from the symlink and not the clang-installation, and I hope that is exactly the opposite of what you want to archive.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

@nolange, English is not my first language, and I know sometimes I'm very confusing. In my understanding, the title suggest that if clang is not invoked directly but via a symlink, it does not behave correctly, i.e. instead of picking the headers from the folders where clang is installed, it picks the system headers. This is documented with examples in the bug report.

If this is not what the title reads, please suggest a better phrasing.

@ilg-ul ilg-ul changed the title [clang][driver] Prevent clang picking the system headers/libraries when started via a symlink [clang][driver] Prevent clang picking the system headers when started via a symlink Oct 4, 2023
@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 4, 2023

... I understand this is only a bug on MacOS?

I'm using clang only to compile my binary packages on macOS (the Linux binaries are compiled with gcc), so I got bitten by this bug only on macOS. The other platforms/targets may or may not be affected, I don't know; but my initial search for InstalledDir showed multiple references not paired with references to Dir.

@MaskRay
Copy link
Member

MaskRay commented Oct 5, 2023

(Out of town and I do not have enough time to investigate the behavior.)

This patch fixes #66704.

The description (first comment, editable) should be self-containing so that readers don't need to read the linked issue. You can summarize the issue and give a short description (if the patch is accepted, the final commit message). A simplified example is needed to help understand the issue.

-canonical-prefixes currently affects 2 places in driver.cpp and making the second use case reuse the computed path from the first one seems an improvement to me.

I think we should figure out how to remove the following fixme as well:

  // FIXME: We don't actually canonicalize this, we just make it absolute.
  if (CanonicalPrefixes)
    llvm::sys::fs::make_absolute(InstalledPath);

@mstorsjo
Copy link
Member

mstorsjo commented Oct 5, 2023

... I understand this is only a bug on MacOS?

I'm using clang only to compile my binary packages on macOS (the Linux binaries are compiled with gcc), so I got bitten by this bug only on macOS. The other platforms/targets may or may not be affected, I don't know; but my initial search for InstalledDir showed multiple references not paired with references to Dir.

This is one of the issues I have with mainly with the problem formulation - as there are two levels of logic involved here. The generic code which sets the install dir, and the target specific files in clang/lib/Driver/ToolChains, which do various things based on the install dir that has been detected elsewhere, plus target specific heuristics for falling back on other ways to detecting things that are needed.

If the installation dir is detected as the wrong thing, this would be fatal for all targets, but the exact behaviour you'd see would be quite target dependent. That's why explanations like "silently using system headers from a different version is hard to debug here" are out of scope here; using the wrong install dir is fatal of course.

If the darwin specific fallback handling could be improved, that'd of course be great.

On the other hand - in principle I kinda agree that the fix itself, locating things based on the actual executable, sounds like the right thing to do in any case. We're early in the 18.x cycle, and if we could get the tests fixed to work despite that, we'd have lots of time to see if someone is affected negatively by it, before 18.x is released.

ilg-ul added 3 commits October 5, 2023 12:26
On macOS, when clang is invoked via a symlink, since the InstalledDir is
where the link is located, the C++ headers were not identified and the
default system headers were used.

The fix add a second check using the folder where the executable is
located.
@nolange
Copy link

nolange commented Oct 5, 2023

@nolange, English is not my first language, and I know sometimes I'm very confusing. In my understanding, the title suggest that if clang is not invoked directly but via a symlink, it does not behave correctly, i.e. instead of picking the headers from the folders where clang is installed, it picks the system headers. This is documented with examples in the bug report.

Yes, my problem with the title is, that it sound like you want to change the way how clang should behave when using symlinks.

If this is not what the title reads, please suggest a better phrasing.

Here is how I would phrase it:
"Fix an issue where clang does not correctly resolve the system header if invoked via symlink" (on MacOS)

Main point being that you fix a bug and don't want to change the usual behavior (like gcc, like it works on Linux),
and that you don't prevent clang picking up the system headers but fix picking up the system headers .
This is how I understand the Bug Report and the discussion.

@ilg-ul ilg-ul changed the title [clang][driver] Prevent clang picking the system headers when started via a symlink [clang][driver] Fix an issue where clang does not correctly resolve the system header if invoked via symlink (on MacOS) Oct 5, 2023
@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 5, 2023

@nolange, thank you for rephrasing the title.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 5, 2023

The latest patch fixes the issue #66704 without changing the meaning of InstalledDir and possibly breaking compatibility.

It is specific to macOS headers, the libraries were ok and do not need fixes.


I also checked the other places where installedDir is used, but I'm not familiar with them (Hexagon.cpp, MipsLinux.cpp, OHOS.cpp, WebAssembly.cpp).

There are also several references in MinGW.cpp, but on Windows starting clang via a link is highly unlikely, so I'll leave them to Martin, to decide if they need any changes.

One other file where checking for the path where the executable is located might be in Gnu.cpp, around line 2132, where the list of prefixes is calculated, but I don't know the logic well enough to suggest a change:

    // Then look for gcc installed alongside clang.
    Prefixes.push_back(D.InstalledDir + "/..");

in principle I kinda agree that the fix itself, locating things based on the actual executable, sounds like the right thing to do in any case. We're early in the 18.x cycle, and if we could get the tests fixed to work despite that, we'd have lots of time to see if someone is affected negatively by it, before 18.x is released.

Fully agree, but I suggest we first fix this specific issue in this PR, and do the rest in a separate PR.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 5, 2023

The Linux tests were fine, but the Windows tests failed in Interpreter/const.cpp, which does not seem to be related to my patch.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 5, 2023

Any suggestions how to proceed from here?

@jansvoboda11
Copy link
Contributor

@jansvoboda11, since the Apple clang already fixed this bug, could you suggest how to proceed? Could you backport the Apple patch to upstream?

I'm not familiar with this, but @yln might.

@ilg-ul
Copy link
Contributor Author

ilg-ul commented Oct 5, 2023

@jansvoboda11, as Martin showed before, Apple clang did not fix this, so this patch applies to it too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug in getInstalledDir() prevents picking up the correct headers when clang is started via a link
9 participants