[clang-include-cleaner] Fix incorrect directory issue for writing files #111375

Merged
merged 9 commits into from
Oct 17, 2024
10 changes: 10 additions & 0 deletions clang-tools-extra/include-cleaner/test/tool.cpp
@@ -48,3 +48,13 @@ int x = foo();
// RUN: clang-include-cleaner -edit --ignore-headers="foobar\.h,foo\.h" %t.cpp -- -I%S/Inputs/
// RUN: FileCheck --match-full-lines --check-prefix=EDIT2 %s < %t.cpp
// EDIT2-NOT: {{^}}#include "foo.h"{{$}}

+ // RUN: mkdir -p $(dirname %t)/out
Member

you can use %T instead of dirname %t; the full list of substitutions is at https://llvm.org/docs/CommandGuide/lit.html#substitutions. same below.

also, it's safer to run rm -rf %T beforehand to make sure we're starting clean.

Contributor Author

It turns out %T is also deprecated. I replaced my shell invocation with %t.dir, the recommended method described in https://reviews.llvm.org/D69572. I also slightly modified the test invocation so that it works on Windows as well. (Tested on a local Windows machine.)

+ // RUN: cp %s %t.cpp
+ // RUN: echo "[{\"directory\":\"$(dirname %t)/out\",\"file\":\"../$(basename %t).cpp\",\"command\":\":clang++ -I%S/Inputs/ ../$(basename %t).cpp\"}]" > $(dirname %t)/out/compile_commands.json
+ // RUN: pushd $(dirname %t)
+ // RUN: clang-include-cleaner -p out -edit $(basename %t).cpp
+ // RUN: popd
+ // RUN: FileCheck --match-full-lines --check-prefix=EDIT3 %s < %t.cpp
+ // EDIT3: #include "foo.h"
+ // EDIT3-NOT: {{^}}#include "foobar.h"{{$}}
96 changes: 89 additions & 7 deletions clang-tools-extra/include-cleaner/tool/IncludeCleaner.cpp
@@ -173,9 +173,11 @@ class Action : public clang::ASTFrontendAction {
if (!HTMLReportPath.empty())
writeHTML();

- llvm::StringRef Path =
- SM.getFileEntryRefForID(SM.getMainFileID())->getName();
- assert(!Path.empty() && "Main file path not known?");
+ // Source File's path of compiler invocation, converted to absolute path.
+ llvm::SmallString<256> AbsPath(
+ SM.getFileEntryRefForID(SM.getMainFileID())->getName());
+ assert(!AbsPath.empty() && "Main file path not known?");
+ SM.getFileManager().makeAbsolutePath(AbsPath);
llvm::StringRef Code = SM.getBufferData(SM.getMainFileID());

auto Results =
@@ -185,7 +187,7 @@ class Action : public clang::ASTFrontendAction {
Results.Missing.clear();
if (!Remove)
Results.Unused.clear();
- std::string Final = fixIncludes(Results, Path, Code, getStyle(Path));
+ std::string Final = fixIncludes(Results, AbsPath, Code, getStyle(AbsPath));

if (Print.getNumOccurrences()) {
switch (Print) {
@@ -202,7 +204,7 @@ class Action : public clang::ASTFrontendAction {
}

if (!Results.Missing.empty() || !Results.Unused.empty())
- EditedFiles.try_emplace(Path, Final);
+ EditedFiles.try_emplace(AbsPath, Final);
}

void writeHTML() {
@@ -305,8 +307,84 @@ int main(int argc, const char **argv) {
}
}

- clang::tooling::ClangTool Tool(OptionsParser->getCompilations(),
- OptionsParser->getSourcePathList());
+ auto VFS = llvm::vfs::getRealFileSystem();
+ auto &CDB = OptionsParser->getCompilations();
+ // CDBToAbsPaths is a map from the path in the compilation database to the
+ // writable absolute path of the file.
+ std::map<std::string, std::string> CDBToAbsPaths;
+ if (Edit) {
Member

nit: I don't think it saves much to do this only when Edit is on; parsing C++ is way slower than anything we can do here over a couple of filenames.

Contributor Author

Done. Removed the check for Edit mode.

+ // if Edit is enabled, `Factory.editedFiles()` will contain the final code,
+ // along with the path given in the compilation database. That path can be
+ // absolute or relative, and if it is relative, it is relative to the
+ // "Directory" field in the compilation database. We need to make it
+ // absolute to write the final code to the correct path.
+ // There are several cases to consider:
+ // 1. The "Directory" field isn't same as the current working directory.
+ // 2. The file path resolved from the "Directory" field is not writable.
+ // For these cases, we need to find a writable path for the file.
+ // To effectively handle these cases, we only need to consider
+ // the files from `getSourcePathList()` that are present in the compilation
+ // database.
+ for (auto &Source : OptionsParser->getSourcePathList()) {
+ llvm::SmallString<256> AbsPath(Source);
+ if (auto Err = VFS->makeAbsolute(AbsPath)) {
+ llvm::errs() << "Failed to get absolute path for " << Source << " : "
+ << Err.message() << '\n';
+ return 1;
+ }
+ std::vector<clang::tooling::CompileCommand> Cmds =
+ CDB.getCompileCommands(AbsPath);
+ if (Cmds.empty()) {
+ // Try with the original path.
+ Cmds = CDB.getCompileCommands(Source);
+ if (Cmds.empty()) {
+ continue;
+ }
+ }
Member

this fallback isn't necessary: the clang invocation underneath will also use AbsPath as-is, and it'll skip the file if it can't find any compile flags for it.

I think it's better to just fail early in this case, similar to above.

Contributor Author

Done. I checked, and even if we don't specify a compilation database, clang will create one using the FixedCompilationDatabase class. So I now emit an error and return early if the compilation database can't be found.

+ // We only need the first command to get the directory.
+ auto Cmd = Cmds[0];
Member

again, the underlying clang invocation will run the compiler for every compile command we receive. hence it isn't enough to do this for the first command only, as each command can have a different (Directory, Filename) combination. also, it doesn't hurt even if we get duplicates.

Contributor Author

Done.
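
The reviewer's point about handling every compile command can be sketched as follows. This is only an illustration, not the code that landed in the PR: `CompileCommand` here is a hypothetical two-field stand-in for `clang::tooling::CompileCommand`, `mapAllCommands` is an invented helper name, and `std::filesystem` is used in place of LLVM's path utilities. POSIX-style paths are assumed.

```cpp
#include <filesystem>
#include <map>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for clang::tooling::CompileCommand; only the two
// fields discussed in the review are modelled.
struct CompileCommand {
  std::string Directory;
  std::string Filename;
};

// Record one entry per compile command instead of looking at Cmds[0] only.
// Each command may carry its own (Directory, Filename) pair; a duplicate
// pair simply overwrites an identical entry, so it is harmless.
std::map<std::string, std::string>
mapAllCommands(const std::vector<CompileCommand> &Cmds,
               const std::string &WritablePath) {
  std::map<std::string, std::string> CDBToAbsPaths;
  for (const CompileCommand &Cmd : Cmds) {
    // operator/ yields Filename unchanged when it is already absolute.
    fs::path Key = fs::path(Cmd.Directory) / Cmd.Filename;
    CDBToAbsPaths[Key.string()] = WritablePath;
  }
  return CDBToAbsPaths;
}
```

With three commands of which two share the same (Directory, Filename) pair, the map ends up with two entries, both pointing at the same writable path.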

+ llvm::SmallString<256> CDBPath(Cmd.Filename);
+ std::string Directory(Cmd.Directory);
+
+ if (llvm::sys::path::is_absolute(CDBPath)) {
+ // If the path in the compilation database is already absolute, we don't
+ // need to do anything.
+ CDBToAbsPaths[static_cast<std::string>(CDBPath)] =
Member

we prefer std::string(CDBPath) or CDBPath.str().str() to static_cast. (also for other occurrences of static_cast)

Contributor Author

Done.

+ static_cast<std::string>(AbsPath);
+ } else {
+ auto Sept = llvm::sys::path::get_separator();
+ // First, try to find the file based on the compilation database.
+ llvm::Twine FilePathTwine = Directory + Sept + CDBPath;
+ llvm::SmallString<256> FilePath;
+ FilePathTwine.toVector(FilePath);
+ // Check if it is writable.
+ if (llvm::sys::fs::access(FilePath, llvm::sys::fs::AccessMode::Write) !=
+ std::error_code()) {
Member

i don't see why we need to complicate the logic by checking for this. is there any reason not to always map to the file path relative to the current process?

i am pretty convinced that the user intention is to almost always edit the file path as referred to during invocation, e.g. cd /foo && clang-include-cleaner path/to/foo.cc is always meant to edit /foo/path/to/foo.cc no matter what file path we get from CDB.

i can see how this might also work, but I prefer to maintain code that has as few special cases as possible. so unless something is indeed breaking with this simplification, can you please get rid of this case?

Contributor Author

It turns out we don't need to check for this. I removed those checks.
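
Once every compilation-database path maps straight to the path the user named on the command line, the write step reduces to a single lookup. A minimal sketch, assuming a plain `std::map` in place of the tool's `CDBToAbsPaths` and a hypothetical helper name `pathToWrite` (neither the helper nor its signature comes from the PR):

```cpp
#include <map>
#include <string>

// Hypothetical helper: choose the path to write for a file the tool edited.
// If the name reported for the edited file has an entry in the map, write to
// the mapped path (the file as referred to during invocation); otherwise
// fall back to the reported name unchanged.
std::string pathToWrite(const std::map<std::string, std::string> &CDBToAbsPaths,
                        const std::string &EditedName) {
  auto It = CDBToAbsPaths.find(EditedName);
  return It != CDBToAbsPaths.end() ? It->second : EditedName;
}
```

No writability probing is involved: for `cd /foo && clang-include-cleaner path/to/foo.cc`, the mapped value is expected to be `/foo/path/to/foo.cc` whatever path the compilation database reports.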

+ // If not, try to find the file based on the current working
+ // directory, as some Bazel invocations may not set the compilation
+ // invocation's filesystem as non-writable. In such cases, we can
+ // find the file based on the current working directory.
+ FilePath = Source;
+ if (auto EC = VFS->makeAbsolute(CDBPath)) {
+ llvm::errs() << "Failed to get absolute path for " << CDBPath
+ << " : " << EC.message() << '\n';
+ return 1;
+ }
+ if (llvm::sys::fs::access(FilePath,
+ llvm::sys::fs::AccessMode::Write) !=
+ std::error_code()) {
+ llvm::errs() << "Failed to find a writable path for " << Source
+ << '\n';
+ return 1;
+ }
+ }
+ CDBToAbsPaths[static_cast<std::string>(CDBPath)] =
+ static_cast<std::string>(FilePath);
+ }
+ }
+ }
Member

llvm::sys::fs::make_absolute(Cmd.Directory, CDBPath); already performs the concatenation only if CDBPath is not an absolute path, so you don't need to re-implement all of this.

Contributor Author

Done.
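
The behaviour the reviewer describes (prefix the directory only when the path is relative) can be illustrated without LLVM. The sketch below uses `std::filesystem` as a stand-in and a hypothetical helper name `resolveAgainst`; it is not the LLVM implementation, and it assumes POSIX-style paths.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: resolve Path against Directory.
// An absolute Path is returned as-is; a relative Path is appended to
// Directory. No normalization is performed, so ".." components are kept.
std::string resolveAgainst(const std::string &Directory,
                           const std::string &Path) {
  fs::path P(Path);
  if (P.is_absolute())
    return P.string();
  return (fs::path(Directory) / P).string();
}
```

This replaces the manual `is_absolute` branch and separator concatenation in the hunk above with one call per compile command.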


+ clang::tooling::ClangTool Tool(CDB, OptionsParser->getSourcePathList());

auto HeaderFilter = headerFilter();
if (!HeaderFilter)
@@ -316,6 +394,10 @@ int main(int argc, const char **argv) {
if (Edit) {
for (const auto &NameAndContent : Factory.editedFiles()) {
llvm::StringRef FileName = NameAndContent.first();
+ if (auto It = CDBToAbsPaths.find(FileName.str());
+ It != CDBToAbsPaths.end())
+ FileName = It->second;

const std::string &FinalCode = NameAndContent.second;
if (auto Err = llvm::writeToOutput(
FileName, [&](llvm::raw_ostream &OS) -> llvm::Error {