-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Wrap Command Line Arguments in a Response File if System Limits are Exceeded #16362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap Command Line Arguments in a Response File if System Limits are Exceeded #16362
Conversation
@swift-ci please smoke test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for continuing to push this forward, Austin. Some comments on what's there but I like this better than the original version.
include/swift/Driver/Action.h
Outdated
@@ -321,6 +325,8 @@ class LinkJobAction : public JobAction { | |||
static bool classof(const Action *A) { | |||
return A->getKind() == Action::Kind::LinkJob; | |||
} | |||
|
|||
virtual bool allowsResponseFiles() const { return true; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this is a toolchain-dependent decision, and even without that it doesn't really make sense at the Action level. Action is supposed to be independent of the executable you use to accomplish that action.
include/swift/Driver/Job.h
Outdated
/// Argument vector containing a single argument pointing to the response file | ||
/// path with the '@' prefix. | ||
/// The argument string must be kept alive as long as the Job is alive. | ||
llvm::opt::ArgStringList ResponseFileArg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like overkill. Can't we just keep a std::string
around?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was created to address the following issues:
- To use the response file, we need to pass a different argument list to
TQ->addTask
that lasts as long as theJob
object. - The
Job
object isconst
shortly after its creation, so we can't modify its existing argument list when it is added to the Task Queue.
I would have preferred to create the response file ArgStringList
directly in addPendingJobToTaskQueue
and then populate it with the name of the response file as a std::string
, retrieved from the Job
. However, I could not find a way to make sure the ArgStringList
would persist as long as necessary for the Job
to be executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't respond to this at the time, so clearly I understood it then, but I still don't see why that wouldn't be a std::string
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TaskQueue::addTask
takes ArrayRef<const char *> Args
as its second argument and not a std::string
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A std::string
carries a const char *
inside it at a constant address (if not modified), which is all you need to build an ArrayRef.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I use a std::string
, its memory gets cleaned up when the Job
goes out of scope, and this is before the task actually gets executed. The compiler then tries to execute something like this:
swift-UH��H��0H�}�H�u�H�U�H�}����
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the string needs to live on the llvm::opt::ArgList
that ends up being owned by the Compilation. Okay, but a const char *
would work then, too, right? No need for a whole vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We ended up having to modify Job::getResponseFileArg()
to return a const char * const&
, which takes advantage of the one-element constructor for ArrayRef
when calling addTask
. The const &
is necessary to ensure that it persists as long as the Job
object. Without it, we read junk memory.
lib/Driver/Job.cpp
Outdated
@@ -352,7 +352,11 @@ void Job::printCommandLineAndEnvironment(raw_ostream &Stream, | |||
void Job::printCommandLine(raw_ostream &os, StringRef Terminator) const { | |||
escapeAndPrintString(os, Executable); | |||
os << ' '; | |||
printArguments(os, Arguments); | |||
if (hasResponseFile()) { | |||
printArguments(os, ResponseFileArg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong; we'd want to dump the regular command line, but note that the job will be using a response file (similar to how we do environment variables).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not exactly sure how the output should be formatted (I don't see environment variables printed when I use -v
or -debug-crash-immediately
). As written, it would look similar to:
/usr/bin/swift @/tmp/arguments-70c499.resp
A possible alternative would be:
[Arguments: @/tmp/arguments-70c499.resp]
/usr/bin/swift -frontend -c -primary-file main.swift test.swift -target x86_64-unknown-linux-gnu -disable-objc-interop -color-diagnostics -module-name test -o /tmp/main-6d68e0.o
Also, wouldn't you run into the same issue when filelists are used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, but filelists hold much less information. (Though it's still important enough that the frontend jumps through hoops to show them in the PrettyStackTrace.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, is my suggested alternative sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry. The idea is, but we still want one command per line. I could see either
/usr/bin/swift @/tmp/arguments-70c499.resp # -frontend -c -primary-file main.swift …
or
/usr/bin/swift -frontend -c -primary-file main.swift … # [Arguments: @/tmp/arguments-70c499.resp]
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the first of your two options since it is literally what is being executed. Plus, a user could actually execute the first one, whereas the second would likely fail due to too many arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am looking at how to implement this for the stack trace during a crash, and ran into some issues. Printing out the program and its arguments during a crash is handled by llvm::PrettyStackTrace
, and it just loops over the argv
. The problem is that the argv
is initialized in the first line of main
in tools/driver/driver.cpp
. At this point, the executable has no knowledge of the command line arguments if a response file is being used. I would either 1) move PROGRAM_START
after the response files are expanded, or 2) modify the internals of PrettyStackTrace
to expand any response files. It looks like clang
initializes the stack trace before expanding response files: https://github.com/apple/swift-clang/blob/stable/tools/driver/driver.cpp#L326
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer (1). AFAIK Clang never uses response files to communicate between the driver and frontend, and those are the most important crash logs. @davidungar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should point out that if I move PROGRAM_START
, the response file name and path will not be printed, since it is consumed when expanded.
test/Driver/response-file.swift
Outdated
|
||
// RUN: python -c 'for i in range(500001): print "-DTEST7_" + str(i)' > %t.9.resp | ||
// RUN: not %target-swiftc_driver %s @%t.9.resp -Xfrontend -debug-crash-immediately 2>&1 | %FileCheck %s -check-prefix=CRASH | ||
// CRASH: @{{[^ ]*}}arguments-{{[0-9a-zA-Z]+}}.resp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely don't want this behavior. The crash log should have the expanded form of the response file, not the before-expansion form. Otherwise we have no chance of reproducing the crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should still be possible to reproduce the crash in this situation. Temporary files are automatically saved during a crash, correct? The intermediate response files are registered as ordinary temporary files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true locally, but not very useful when a user sends us a crash log and we want to see what they were doing.
lib/Driver/Job.cpp
Outdated
@@ -403,6 +407,30 @@ void Job::printSummary(raw_ostream &os) const { | |||
os << "}"; | |||
} | |||
|
|||
static void | |||
writeResponseFile(llvm::raw_ostream &OS, const llvm::opt::ArgStringList &Arguments) { | |||
// wrap arguments in quotes to ensure compatibility with Unix and Windows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Please use complete sentences in comments, including capitalization and a period at the end.
lib/Driver/Job.cpp
Outdated
OS << '"'; | ||
for (; *arg != '\0'; arg++) { | ||
if (*arg == '\"' || *arg == '\\') { | ||
OS << '\\'; // escape quote and backslash characters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep looking around for an existing function that does this. The closest is escapeAndPrintString
; any chance we can combine these?
lib/Driver/Job.cpp
Outdated
|
||
void Job::writeArgsToResponseFile() const { | ||
std::string responseContents; | ||
llvm::raw_string_ostream SS(responseContents); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use raw_fd_ostream
here and just write directly to the file. No need to write to an in-memory buffer and then export it.
Also, you should probably have a way of handling failure here. (What if the temporary directory isn't writable?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of the right way to handle a potential error since I don't appear to have access to the Diagnostic Engine within Job
. There is an example in Compilation.cpp
where raw_fd_ostream
is used, but there is a FIXME
for the error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could just returning a bool or an llvm::Error from the function for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that writeFileWithEncoding
can fail just as much as raw_fd_ostream
can.)
lib/Driver/ToolChain.cpp
Outdated
llvm::opt::ArgStringList responseFileArg = {}; | ||
if (!llvm::sys::commandLineFitsWithinSystemLimits(executablePath, | ||
invocationInfo.Arguments) | ||
&& JA.allowsResponseFiles()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: You should probably test these two in the opposite order, so we don't waste time measuring arguments if we can't use a response file anyway.
test/Driver/response-file.swift
Outdated
|
||
// RUN: python -c 'for i in range(500001): print "-DTEST6_" + str(i)' > %t.8.resp | ||
// RUN: %empty-directory(%t/tmp) | ||
// RUN: env TMPDIR=%t/tmp/ %target-swiftc_driver %s @%t.8.resp -save-temps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include something for TEST6 to test, so that we know the arguments are actually getting read and there isn't just some kind of silent success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TEST6 is specifically testing that the temporary response files are being saved with -save-temps
. TEST5 uses an input identical to that of TEST6, and it does test that the flags are being read. That being said, does TEST6 need to test that the flags are being read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, that wasn't obvious to me. What if you grouped this RUN line with the previous ones and reused TEST5's response file?
(Also noticing that the response files and the compilation conditions should be using the same numbering to avoid confusion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty reasonable, but I'd like to see some changes to help the next person better understand what's going on.
include/swift/Driver/Job.h
Outdated
/// Argument vector containing a single argument pointing to the response file | ||
/// path with the '@' prefix. | ||
/// The argument string must be kept alive as long as the Job is alive. | ||
llvm::opt::ArgStringList ResponseFileArg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about grouping ResponseFilePath
and ResponseFileArg
into a substructure? I think the extra structure makes the Job class more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new struct or class could have a nice comment describing it's purpose and use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense for the instance variable in Job holding the new structure to be optional?
include/swift/Driver/ToolChain.h
Outdated
@@ -121,6 +121,7 @@ class ToolChain { | |||
llvm::opt::ArgStringList Arguments; | |||
std::vector<std::pair<const char *, const char *>> ExtraEnvironment; | |||
std::vector<FilelistInfo> FilelistInfos; | |||
bool allowsResponseFiles = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this variable needs a comment explaining why it would ever be true or false.
Something like: "Some platforms limit argument length and those use response files..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just a per-platform decision, so it's not that, but I agree that a comment would be nice.
lib/Driver/Compilation.cpp
Outdated
@@ -291,8 +291,14 @@ namespace driver { | |||
"not implemented for compilations with multiple jobs"); | |||
if (Comp.ShowJobLifecycle) | |||
llvm::outs() << "Added to TaskQueue: " << LogJob(Cmd) << "\n"; | |||
TQ->addTask(Cmd->getExecutable(), Cmd->getArguments(), llvm::None, | |||
(void *)Cmd); | |||
if (Cmd->hasResponseFile()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer a style that puts the shorter arm of the if-then-else first, even though it is the do-less arm. I feel that such a choice reduces the max depth of the reader's mental stack. But this is just a suggestion.
lib/Driver/Compilation.cpp
Outdated
} else { | ||
TQ->addTask(Cmd->getExecutable(), Cmd->getArguments(), llvm::None, | ||
(void *)Cmd); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, as I read further down, I noticed I had to stare at the two addTask
call lines to see the difference.
Therefore I propose factoring into a small, static function that gets the arguments, or else pushing this logic into the Cmd->getArguments
function. There should be only one line that calls TQ->addTask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cmd->getArguments
may be used elsewhere with the expectation that it returns the vector of arguments executed by this job, regardless of whether or not a response file is used. I felt it would be safer to have a separate method specifically for returning the response file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, but I still want the factoring. How about defining a new function, called getArgumentsOrResponseFileName
or getArgumentsAccountingForResponseFile
or getArgumentsForTask
? Or call the new one getArguments
and rename the old one to getArgumentsIgnoringResponseFile
. What IMO is going in is that the very notion of response file muddies the notion of arguments, as you appositely point out. So the best solution would be two, more verbose names for the existing and new getArguments
that make the distinction clear. For instance getDirectArguments
and getArgumentsWithResponseFIleIndirection
(or maybe getPossiblyPackagedArguments
). (I bet you can think of two better names!)
lib/Driver/Job.cpp
Outdated
printArguments(os, ResponseFileArg); | ||
} else { | ||
printArguments(os, Arguments); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the above is right, then I think it should be:
printArguments(os, hasResponseFile() ? ResponseFileArg : Arguments);
rather than the if.
Why? Because the code is only doing one printArguments
operation, and it is exactly the same, except for that second argument.
@@ -247,6 +247,7 @@ ToolChain::constructInvocation(const CompileJobAction &job, | |||
const JobContext &context) const { | |||
InvocationInfo II{SWIFT_EXECUTABLE_NAME}; | |||
ArgStringList &Arguments = II.Arguments; | |||
II.allowsResponseFiles = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it true in this case? Comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remain puzzled by that line. Please add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, see below.
@@ -325,7 +325,10 @@ toolchains::GenericUnix::constructInvocation(const LinkJobAction &job, | |||
Arguments.push_back( | |||
context.Args.MakeArgString(context.Output.getPrimaryOutputFilename())); | |||
|
|||
return {Clang, Arguments}; | |||
InvocationInfo II{Clang, Arguments}; | |||
II.allowsResponseFiles = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why true? Comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are a bunch of similar lines, could you please factor them into an intention-revealing method? Or add a comment to each?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, see below.
@@ -168,5 +168,8 @@ toolchains::Windows::constructInvocation(const LinkJobAction &job, | |||
Arguments.push_back( | |||
context.Args.MakeArgString(context.Output.getPrimaryOutputFilename())); | |||
|
|||
return {Clang, Arguments}; | |||
InvocationInfo II{Clang, Arguments}; | |||
II.allowsResponseFiles = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is the third time I'm seeing the setting to true. Is there something in common that the code should be reifying?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. It's all based on what other tool is getting invoked, which is not necessarily the same across each toolchain and certainly not the same across each job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is, this is the customization point, so that the rest of the code can treat everything the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but how about comments explaining why allowsResponseFiles
is true in each case? Why might it be false vs true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment would read: // clang supports response files
. Given the way in which the code is written, such a comment would be redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still with Austin on this. If the executable specified in the InvocationInfo constructor supports response files, you can put true
here. I guess the doc comment for allowsResponseFiles
could say that a little more clearly, but it doesn't belong at the use sites.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the comment in include/swift/Driver/ToolChain.h
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the comment. In thinking about this item, the comment you added is good and it is sufficient. I'm not asking for the following, but if I were to try to make it even clearer, I would use a one-line intention-revealing function. But, never mind.
tools/driver/driver.cpp
Outdated
@@ -134,6 +132,10 @@ int main(int argc_, const char **argv_) { | |||
llvm::cl::TokenizeGNUCommandLine, | |||
argv); | |||
|
|||
// Initialize the stack trace using the parsed argument vector with expanded | |||
// response files. | |||
PROGRAM_START(argv.size(), argv.data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this can be affected by subsequent operations on argv
. Possible options are to push PROGRAM_START
even further down in main
, or to make a copy of argv
at this location for the sole purpose of initializing the stack trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. Maybe we can do it the other way, and make sure that no one else changes argv after this point? The case where that does happen, shouldRunAsSubcommand
, can be responsible for making the copy. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure I understand your suggestion, the signature for shouldRunAsSubcommand
would look like:
static bool shouldRunAsSubcommand(
StringRef ExecName,
SmallString<256> &SubcommandName,
const SmallVectorImpl<const char *> &InputArgs,
SmallVectorImpl<const char *> &OutputArgs) { ... }
InputArgs
comes from what is currently argv
. OutputArgs
would initially point to InputArgs
, but would be replaced with a brand new vector in the event of a modification. OutputArgs
is what would be used in the rest of main
. This wouldn't necessarily prevent someone from modifying args
/InputArgs
, however.
I'm not sure if there is a stronger way to ensure that the argument vector doesn't get modified, aside from:
const SmallVector<const char *, 256> trace_args(argv);
PROGRAM_START(trace_args.size(), trace_args.data());
// continue as normal with 'argv'
Unfortunately, I don't think we can just make argv
a const
since it must be operated on twice before calling PROGRAM_START
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't realize that shouldRunAsSubcommand
was already modifying the input arguments. I feel like I'd change that to just taking the input arguments as an immutable ArrayRef, and then making the copy inside the body of the if
:
if (shouldRunAsSubcommand(ExecName, SubcommandName, argv)) {
SmallVector<const char *, 256> subCommandArgs(argv.begin(), argv.end());
subCommandArgs.erase(1);
// …
What do you think for that?
(Sorry for the delayed responses. The run-up to WWDC has been slamming a bunch of us!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jrose-apple About WWDC: no worries!
I attempted to implement your suggestion, and created a commit for it. It was necessary to factor out the lower half of main
in order to properly handle the repl
case. If the sub-command is repl
, shouldRunAsSubcommand
modifies argv
and it returns false
, so a modified version of argv
is used outside of the if
block.
@dabelknap I'm checking all the PRs I'm involved in. Do you need anything more from me at this time? |
@jrose-apple I just need input on the handling of the arguments for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this may be the least-bad version. @davidungar, what do you think about the driver.cpp code now?
tools/driver/driver.cpp
Outdated
} | ||
|
||
static int run_driver(StringRef ExecName, | ||
const SmallVectorImpl<const char *> &argv) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: ArrayRef is a handy wrapper type for any array-like structure; please use it here and in shouldRunAsSubcommand
.
tools/driver/driver.cpp
Outdated
|
||
// If the subcommand is the "built-in" 'repl', then use the | ||
// normal driver. | ||
if (Subcommand == "repl") | ||
if (Subcommand == "repl") { | ||
isRepl=true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: spacing.
@jrose-apple Just checking to see if anything else needs to be done for this PR. |
I was still waiting for @davidungar to sign off as well, since I know he wants to make sure the additional complexity is as small as possible. |
tools/driver/driver.cpp
Outdated
@@ -71,7 +71,8 @@ extern int swift_format_main(ArrayRef<const char *> Args, const char *Argv0, | |||
/// \returns True if running as a subcommand. | |||
static bool shouldRunAsSubcommand(StringRef ExecName, | |||
SmallString<256> &SubcommandName, | |||
SmallVectorImpl<const char *> &Args) { | |||
const ArrayRef<const char *> &Args, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick (here and below): ArrayRef is intended to be passed around by value; it's just a pointer/length pair.
@jrose-apple Also, for this to be fully implemented, |
Yikes, thanks. I'll look into cherry-picking that into the Swift 4.2 LLVM branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started looking over the changes and my comments again, and AFAICT, @dabelknap just hasn't had a chance to address them yet. Let me know when you want me to look again, thanks!
lib/Driver/Compilation.cpp
Outdated
} else { | ||
TQ->addTask(Cmd->getExecutable(), Cmd->getArguments(), llvm::None, | ||
(void *)Cmd); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, but I still want the factoring. How about defining a new function, called getArgumentsOrResponseFileName
or getArgumentsAccountingForResponseFile
or getArgumentsForTask
? Or call the new one getArguments
and rename the old one to getArgumentsIgnoringResponseFile
. What IMO is going in is that the very notion of response file muddies the notion of arguments, as you appositely point out. So the best solution would be two, more verbose names for the existing and new getArguments
that make the distinction clear. For instance getDirectArguments
and getArgumentsWithResponseFIleIndirection
(or maybe getPossiblyPackagedArguments
). (I bet you can think of two better names!)
@@ -168,5 +168,8 @@ toolchains::Windows::constructInvocation(const LinkJobAction &job, | |||
Arguments.push_back( | |||
context.Args.MakeArgString(context.Output.getPrimaryOutputFilename())); | |||
|
|||
return {Clang, Arguments}; | |||
InvocationInfo II{Clang, Arguments}; | |||
II.allowsResponseFiles = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but how about comments explaining why allowsResponseFiles
is true in each case? Why might it be false vs true?
include/swift/Driver/Job.h
Outdated
/// Response file path | ||
const char *ResponseFilePath; | ||
|
||
/// Argument vector containing a single argument pointing to the response file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, I think the comment is inaccurate. It's not an argument vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that property is no longer a vector
. I'll fix the comment.
@@ -271,6 +271,14 @@ class Job { | |||
/// Whether the job wants a list of input or output files created. | |||
std::vector<FilelistInfo> FilelistFileInfos; | |||
|
|||
/// Response file path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this comment. It is completely redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of this comment? @jrose-apple Is there some coding standard I don't know about? Why not remove it entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This purpose of the comment is to remain consistent with the rest of the file. A few lines above reads:
/// The output of this command.
std::unique_ptr<CommandOutput> Output;
/// The executable to run.
const char *Executable;
for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. Since I believe that this sort of comment degrades maintainability (yes, it's a tiny cut but there are virtually thousands of them), and since I try to enhance the maintainabilty of the code with each review, I asked for its deletion. However, a consistent-but-redundant comment is only a mere bagatelle.
@@ -271,6 +271,14 @@ class Job { | |||
/// Whether the job wants a list of input or output files created. | |||
std::vector<FilelistInfo> FilelistFileInfos; | |||
|
|||
/// Response file path | |||
const char *ResponseFilePath; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of a nitpick, but I would prefer it with another "const" after the star (const char * const
) in order to signal that this variable is never assigned to (assuming that is the case). I'm assuming it is initialized in the constructor and never changed thereafter. (Also applies in other cases, such as ResponseFileArg
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still like the const after the star here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const char *
is what is conventionally used in the rest of the class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I count about 100 lines of code in methods of that class. If I had to maintain this code and needed to know if that instance variable ever changed, I would have to read them. Adding the "cost" would save me that work. Therefore, I would not approve this PR without this "const". (Of course you don't need my approval to land it & that's OK. I would just feel derelict in my duty to favor consistency over maintainability in this case.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meta: At this point you've expressed a strong enough opinion that Austin does need your approval to land the PR. If you want to rescind that, you'd have to signify that, perhaps by "dismissing" your X review.
Personally, I'm with Austin on this one. Being inconsistent can imply that the inconsistency is deliberate, and therefore it is better to be consistent with existing code in a PR that's adding functionality. (That's especially true for someone who's a relative newcomer.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meta: @jrose-apple Please help me out here. Since the merge box is green, I figured he did not need my approval to land the PR. (I seem to recall reading verbiage to that effect on github, too.) Is there some policy (or implicit understanding) that I've missed? Where are you thinking that I express such a strong enough opinion? Was the need expressed, or implied? I don't understand.
You have a fair point about inconsistency, but I weigh that point against the benefits of this particular proposed inconsistency. In this case I would argue (on your side) that if the const is added to one variable, a future reader might infer that the other variables are not const. However, the counter argument would be that it would be clear for anyone with a passing familiarity with the code base that const is often not used when it could be. Thus, it seems to me that the consistency benefit is outweighed by the not-having-to-read-100-lines-of-code benefit of the const.
My experience as a novitiate to this code base has been that such consts would save me significant time. Ironically, it was my experience with adopting Swift for my code that convinced me of the significance of the benefits of write-once variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the merge box is green, I figured he did not need my approval to land the PR. (I seem to recall reading verbiage to that effect on github, too.) Is there some policy (or implicit understanding) that I've missed? Where are you thinking that I express such a strong enough opinion? Was the need expressed, or implied?
We aren't using GitHub's built-in gating for reviewers, true. But only a little above the green merge button is the part where "@davidungar requested changes", with a red X. Even without that, though, I think we want to take expressed concerns seriously unless the reviewer explicitly says something like "this is fine as is (but I think it could be better in these ways)".
include/swift/Driver/Job.h
Outdated
@@ -296,6 +308,7 @@ class Job { | |||
|
|||
const char *getExecutable() const { return Executable; } | |||
const llvm::opt::ArgStringList &getArguments() const { return Arguments; } | |||
const char * const &getResponseFileArg() const { return ResponseFileArg; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a reference involved here? Why not just const char * getResponseFileArg
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Job
class needs to own the reference so that it remains in scope sufficiently long to execute. Otherwise the task queue will read junk memory and try to execute a nonsense command. This change was necessary to avoid keeping around a vector for the purpose of storing a single argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning a pointer wouldn't actually break any of that, but I guess using a const reference for a value that won't change signifies that the lifetime is tied to something else a little more strongly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did try just returning a const char *
, but doing so would cause it to read junk memory. Myself, @allevato, and @harlanhaskins looked at this issue, and this is what we came up with that worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TaskQueue::addTask
takes an ArrayRef<char *>
. If we return the value of getResponseFileArg()
as a pointer, then the returned pointer will get promoted to an ArrayRef
through ArrayRef::ArrayRef(T &OneElt)
, and the reference passed in is a temporary stack slot storing the returned address, not the actual address inside the Job
struct.
Returning a const
reference here ensures that the ArrayRef<char *>
that is passed to the TaskQueue
is actually tied to the Job
itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's horrible; nice work. Can you add that as a comment so that a later refactorer like me doesn't try to change it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternately I suppose you could return an ArrayRef here even though the backing store is the one const char *
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the ArrayRef
does appear to work actually; I went ahead and used that. Do you think it still needs a comment? The code is more straightforward now, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think this version is okay. Though I personally would have kept using the accessor methods in the other function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and replaced them.
@CodaFi can we start another round of smoke tests? |
@jrose-apple Of course—my intention wasn't to imply that we should just dismiss feedback from anyone, so my apologies if that's the impression that I gave. I merely want to make sure that the review for a bug fix that has risen greatly in severity for us lately is making good forward progress and to understand what we should prioritize (e.g., consistency with prevailing style or reviewer feedback, particularly when they conflict) for this and future reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, we're almost there & thank you for your patience.
I know I originally said it was a bit of a nitpick, but I find myself feeling strongly about that const
after the *
for the ResponseFilePath
. I'm ready to approve as soon as that's added.
@allevato Add that one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After an offline conversation with @jrose-apple, I still would really like that "const" in there, but I realize that my "Request Changes" checkbox may actually keep you from merging. Since I intend to express a strong opinion but not to prevent you from disregarding my opinion, I am changing the box to "Comment".
After an offline conversation with @jrose-apple, I still would really like that "const" in there, but I realize that my "Request Changes" checkbox may actually keep you from merging. Since I intend to express a strong opinion but not to prevent you from disregarding my opinion, I am changing the box to "Comment".
@davidungar Thank you for your time and effort in reviewing this PR. I do agree that adding the extra @jrose-apple, is there anything else that needs to be addressed before merging? |
I think we can take it! Thanks for working with us throughout the review process. I'll take care of cherry-picking it to the 4.2 branch, too. |
Wrap Command Line Arguments in a Response File if System Limits are Exceeded https://bugs.swift.org/browse/SR-4517 (cherry picked from commit 7d8e40b)
Wrap Command Line Arguments in a Response File if System Limits are Exceeded https://bugs.swift.org/browse/SR-4517 (cherry picked from commit 7d8e40b)
On the master-next branch, the InitLLVM class used by the PROGRAM_START macro can modify the argc/argv values (currently only on Windows). Expanding response files before initializing the stack trace also modifies the arguments so use a separate SmallVector for each copy of the argument vector.
[master-next] Fix up the driver's use of PROGRAM_START after PR #16362
Hello. I wanted to know is this now enabled by default on XCode 10? We have hit this argument limit wall, and builds are failing. So is there a straightforward way to enable this? |
This is for communication between the different commands that the Swift compiler runs internally. We know there's still an issue with long command lines for Xcode running the top-level Swift command. That'll require an Xcode-side change that didn't make it into Xcode 10 (or 10.1). |
Hi there, |
The same thing I said above still applies, although the Xcode-side change is not in 10.2 either. |
Can I ask if this made XCode 10.2.1? |
"The same thing I said above still applies, although the Xcode-side change is not in 10.2.1 either." Please stop asking about Xcode changes on the Swift repository. |
Thanks so much for the reply. Can you recommend where we should ask? |
I…don't think Apple has a place to ask about future plans. But this seems like the sort of thing that should end up in the Xcode release notes when it changes. |
Sorry to add more comments about Xcode here, but just to bring a bit of clarification, an engineer at the WWDC labs checked it and told us it may be fixed by Xcode 11 beta 3, so hoping for that 🤞 Otherwise, we may need to keep duplicating rdar://50886131 on the new Feedback Assistant. |
(just to be clear, not locking in response to @Lascorbe's comment, but rather to keep this from happening again in the future) |
This is a follow-up to PR 15853, which allows the Swift compiler to accept response files. When the driver attempts to submit frontend jobs, the jobs will fail if the command line argument length exceeds the system limit. This PR automatically wraps the command line arguments in a temporary response file in the event the system limits are exceeded.
We use an implementation pattern similar to
filelists
. The temporary response file is reserved when theJob
object is created, but only when the argument vector exceeds the system limits. The arguments are not written to the temporary response file until theJob
is added to theTaskQueue
.-filelist
and-temporary-filelist
are no longer needed. They can be removed in this PR, or in a separate one if desired.-save-temps
flag does preserve the temporary response files, if any are created. Otherwise, they are removed when the compiler cleans up temporary files.The related Swift Forum discussion is here.
Resolves SR-4517.