Skip to content

[clang][deps] Support single-file mode for all formats #88764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 17, 2024

Conversation

jansvoboda11
Copy link
Contributor

@jansvoboda11 jansvoboda11 commented Apr 15, 2024

The clang-scan-deps tool can be used for fast scanning of batches of compilation commands passed in via the -compilation-database option. This gets awkward in our tests where we have to resort to using .in/.template JSON files and running them through sed in order to embed LIT's %t variable into them. However, most of our tests only need to pass single compilation command, so this dance is entirely unnecessary.

This patch makes sure the existing "per-file" mode (where the compilation command is passed in-line after the -- argument) works for all output formats, not only P1689.

@jansvoboda11 jansvoboda11 requested a review from Bigcheese April 15, 2024 17:38
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Apr 15, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2024

@llvm/pr-subscribers-clang

Author: Jan Svoboda (jansvoboda11)

Changes

The clang-scan-deps tool can be used for fast scanning of batches of compilation commands passed in via the -compilation-database option. This gets awkward in our tests where we have to resort to using .in/.template JSON files and run them through sed in order to embed LIT's %t variable into them. However, most of our tests only need to pass single compilation command, so this dance is entirely unnecessary.

This patch makes sure the existing "per-file" mode (where the compilation command is passed in-line after the -- argument) works for all output formats, not only P1689.


Full diff: https://github.com/llvm/llvm-project/pull/88764.diff

2 Files Affected:

  • (modified) clang/test/ClangScanDeps/error.cpp (+2-16)
  • (modified) clang/tools/clang-scan-deps/ClangScanDeps.cpp (+21-36)
diff --git a/clang/test/ClangScanDeps/error.cpp b/clang/test/ClangScanDeps/error.cpp
index 0095a6c900c3b3..99cfdb0b070892 100644
--- a/clang/test/ClangScanDeps/error.cpp
+++ b/clang/test/ClangScanDeps/error.cpp
@@ -1,23 +1,10 @@
 // RUN: rm -rf %t
 // RUN: split-file %s %t
 
-//--- missing_tu.json.in
-[{
-  "directory": "DIR",
-  "command": "clang -fsyntax-only DIR/missing_tu.c",
-  "file": "DIR/missing_tu.c"
-}]
-//--- missing_header.json.in
-[{
-  "directory": "DIR",
-  "command": "clang -fsyntax-only DIR/missing_header.c",
-  "file": "DIR/missing_header.c"
-}]
 //--- missing_header.c
 #include "missing.h"
 
-// RUN: sed -e "s|DIR|%/t|g" %t/missing_tu.json.in > %t/missing_tu.json
-// RUN: not clang-scan-deps -compilation-database %t/missing_tu.json 2>%t/missing_tu.errs
+// RUN: not clang-scan-deps -- "clang" %t/missing_tu.c 2>%t/missing_tu.errs
 // RUN: echo EOF >> %t/missing_tu.errs
 // RUN: cat %t/missing_tu.errs | sed 's:\\\\\?:/:g' | FileCheck %s --check-prefix=CHECK-MISSING-TU -DPREFIX=%/t
 // CHECK-MISSING-TU: Error while scanning dependencies for [[PREFIX]]/missing_tu.c
@@ -26,8 +13,7 @@
 // CHECK-MISSING-TU-NEXT: error:
 // CHECK-MISSING-TU-NEXT: EOF
 
-// RUN: sed -e "s|DIR|%/t|g" %t/missing_header.json.in > %t/missing_header.json
-// RUN: not clang-scan-deps -compilation-database %t/missing_header.json 2>%t/missing_header.errs
+// RUN: not clang-scan-deps -- "clang" %t/missing_header.c 2>%t/missing_header.errs
 // RUN: echo EOF >> %t/missing_header.errs
 // RUN: cat %t/missing_header.errs | sed 's:\\\\\?:/:g' | FileCheck %s --check-prefix=CHECK-MISSING-HEADER -DPREFIX=%/t
 // CHECK-MISSING-HEADER: Error while scanning dependencies for [[PREFIX]]/missing_header.c
diff --git a/clang/tools/clang-scan-deps/ClangScanDeps.cpp b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
index eaa76dd43e41dd..94510515cd4403 100644
--- a/clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -98,8 +98,8 @@ static bool RoundTripArgs = DoRoundTripDefault;
 static void ParseArgs(int argc, char **argv) {
   ScanDepsOptTable Tbl;
   llvm::StringRef ToolName = argv[0];
-  llvm::BumpPtrAllocator A;
-  llvm::StringSaver Saver{A};
+  llvm::BumpPtrAllocator Alloc;
+  llvm::StringSaver Saver{Alloc};
   llvm::opt::InputArgList Args =
       Tbl.parseArgs(argc, argv, OPT_UNKNOWN, Saver, [&](StringRef Msg) {
         llvm::errs() << Msg << '\n';
@@ -186,14 +186,8 @@ static void ParseArgs(int argc, char **argv) {
     }
   }
 
-  if (const llvm::opt::Arg *A = Args.getLastArg(OPT_compilation_database_EQ)) {
+  if (const llvm::opt::Arg *A = Args.getLastArg(OPT_compilation_database_EQ))
     CompilationDB = A->getValue();
-  } else if (Format != ScanningOutputFormat::P1689) {
-    llvm::errs() << ToolName
-                 << ": for the --compiilation-database option: must be "
-                    "specified at least once!";
-    std::exit(1);
-  }
 
   if (const llvm::opt::Arg *A = Args.getLastArg(OPT_module_name_EQ))
     ModuleName = A->getValue();
@@ -225,9 +219,8 @@ static void ParseArgs(int argc, char **argv) {
 
   RoundTripArgs = Args.hasArg(OPT_round_trip_args);
 
-  if (auto *A = Args.getLastArgNoClaim(OPT_DASH_DASH))
-    CommandLine.insert(CommandLine.end(), A->getValues().begin(),
-                       A->getValues().end());
+  if (const llvm::opt::Arg *A = Args.getLastArgNoClaim(OPT_DASH_DASH))
+    CommandLine.assign(A->getValues().begin(), A->getValues().end());
 }
 
 class SharedStream {
@@ -694,38 +687,28 @@ static std::string getModuleCachePath(ArrayRef<std::string> Args) {
   return std::string(Path);
 }
 
-// getCompilationDataBase - If -compilation-database is set, load the
-// compilation database from the specified file. Otherwise if the we're
-// generating P1689 format, trying to generate the compilation database
-// form specified command line after the positional parameter "--".
+/// Attempts to construct the compilation database from '-compilation-database'
+/// or from the arguments following the positional '--'.
 static std::unique_ptr<tooling::CompilationDatabase>
-getCompilationDataBase(int argc, char **argv, std::string &ErrorMessage) {
+getCompilationDatabase(int argc, char **argv, std::string &ErrorMessage) {
   ParseArgs(argc, argv);
 
+  if (!(CommandLine.empty() ^ CompilationDB.empty())) {
+    llvm::errs() << "The compilation command line must be provided either via "
+                    "'-compilation-database' or after '--'.";
+    return nullptr;
+  }
+
   if (!CompilationDB.empty())
     return tooling::JSONCompilationDatabase::loadFromFile(
         CompilationDB, ErrorMessage,
         tooling::JSONCommandLineSyntax::AutoDetect);
 
-  if (Format != ScanningOutputFormat::P1689) {
-    llvm::errs() << "the --compilation-database option: must be specified at "
-                    "least once!";
-    return nullptr;
-  }
-
-  // Trying to get the input file, the output file and the command line options
-  // from the positional parameter "--".
-  char **DoubleDash = std::find(argv, argv + argc, StringRef("--"));
-  if (DoubleDash == argv + argc) {
-    llvm::errs() << "The command line arguments is required after '--' in "
-                    "P1689 per file mode.";
-    return nullptr;
-  }
-
   llvm::IntrusiveRefCntPtr<DiagnosticsEngine> Diags =
       CompilerInstance::createDiagnostics(new DiagnosticOptions);
   driver::Driver TheDriver(CommandLine[0], llvm::sys::getDefaultTargetTriple(),
                            *Diags);
+  TheDriver.setCheckInputsExist(false);
   std::unique_ptr<driver::Compilation> C(
       TheDriver.BuildCompilation(CommandLine));
   if (!C || C->getJobs().empty())
@@ -740,7 +723,8 @@ getCompilationDataBase(int argc, char **argv, std::string &ErrorMessage) {
 
   FrontendOptions &FEOpts = CI->getFrontendOpts();
   if (FEOpts.Inputs.size() != 1) {
-    llvm::errs() << "Only one input file is allowed in P1689 per file mode.";
+    llvm::errs()
+        << "Exactly one input file is required in the per-file mode ('--').\n";
     return nullptr;
   }
 
@@ -749,8 +733,9 @@ getCompilationDataBase(int argc, char **argv, std::string &ErrorMessage) {
   auto LastCmd = C->getJobs().end();
   LastCmd--;
   if (LastCmd->getOutputFilenames().size() != 1) {
-    llvm::errs() << "The command line should provide exactly one output file "
-                    "in P1689 per file mode.\n";
+    llvm::errs()
+        << "Exactly one output file is required in the per-file mode ('--').\n";
+    return nullptr;
   }
   StringRef OutputFile = LastCmd->getOutputFilenames().front();
 
@@ -790,7 +775,7 @@ getCompilationDataBase(int argc, char **argv, std::string &ErrorMessage) {
 int clang_scan_deps_main(int argc, char **argv, const llvm::ToolContext &) {
   std::string ErrorMessage;
   std::unique_ptr<tooling::CompilationDatabase> Compilations =
-      getCompilationDataBase(argc, argv, ErrorMessage);
+      getCompilationDatabase(argc, argv, ErrorMessage);
   if (!Compilations) {
     llvm::errs() << ErrorMessage << "\n";
     return 1;

llvm::IntrusiveRefCntPtr<DiagnosticsEngine> Diags =
CompilerInstance::createDiagnostics(new DiagnosticOptions);
driver::Driver TheDriver(CommandLine[0], llvm::sys::getDefaultTargetTriple(),
*Diags);
TheDriver.setCheckInputsExist(false);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is here because some tests rely on the fact that we fail later and print a specific error message if the main input file doesn't exist.

//--- missing_header.c
#include "missing.h"

// RUN: sed -e "s|DIR|%/t|g" %t/missing_tu.json.in > %t/missing_tu.json
// RUN: not clang-scan-deps -compilation-database %t/missing_tu.json 2>%t/missing_tu.errs
// RUN: not clang-scan-deps -- "clang" %t/missing_tu.c 2>%t/missing_tu.errs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's up with the quoting around "clang"?

Also, why was -fsyntax-only removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIT complains about unquoted clang in attempt to nudge people towards using %clang. That, however, adds a bunch of default flags to the invocation and I'm not sure I want those in our scanner tests (especially when testing the search path optimization for example).

The "-fsyntax-only" flag was removed because it's unnecessary (it was originally used to prevent the scanner actually generating compilation output files I think) and because it implies no output file (which the per-file mode currently requires). Since the scanner is intended to be run on actual compilation jobs, I think it's even desirable removing stuff like "-fsyntax-only".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the scanner is intended to be run on actual compilation jobs, I think it's even desirable removing stuff like "-fsyntax-only".

Changing -fsyntax-only to -c would make sense then, but it's weird to me to have linker steps in scanning unless that's specifically what you're testing. Anyway, since this one fails before it gets there I guess it doesn't matter, but I wouldn't want to see tests defaulting to linking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried double-checking what %clang expands to and I was wrong, it just expands to the Clang binary in the build directory, so I think I can remove the quote workaround.

However, both "clang" and %clang expand to -cc1 invocations that refer to system directories: -isystem /usr/local/include -internal-externc-isystem /usr/include. It would be nice to have something that cleans these out (like %clang_cc1 does), but I consider that out of scope for this patch. (Not functional change from in this aspect anyways.)

Copy link
Contributor

@Bigcheese Bigcheese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Good to get rid of more unneeded differences between ways of doing modules.

@jansvoboda11 jansvoboda11 merged commit eafd515 into llvm:main Apr 17, 2024
@jansvoboda11 jansvoboda11 deleted the clang-scan-deps-dash-dash branch April 17, 2024 02:47
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Apr 17, 2024
The `clang-scan-deps` tool can be used for fast scanning of batches of
compilation commands passed in via the `-compilation-database` option.
This gets awkward in our tests where we have to resort to using
`.in`/`.template` JSON files and running them through `sed` in order to
embed LIT's `%t` variable into them. However, most of our tests only
need to pass single compilation command, so this dance is entirely
unnecessary.

This patch makes sure the existing "per-file" mode (where the
compilation command is passed in-line after the `--` argument) works for
all output formats, not only `P1689`.

(cherry picked from commit eafd515)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants