Skip to content

Rdar 40526328 emit sigint for cancelled batch constituents #17167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

graydon
Copy link
Contributor

@graydon graydon commented Jun 13, 2018

This detects "batch constituents that had one of their neighbours emit an error" as signalled by #17131 and propagates that information to parseable output by writing a "signalled" message (pseudo-SIGINT) for each such pseudo-task.

This is backward-compatible with previous semantics, while giving consumers the opportunity to more-fully suppress such cases if they like (SIGINT is after all a user-generated interrupt signal; there's likely no point reporting it to the user as an error in any case.)

I also included a commit in here that supports adjusting batch size independently from -j / parallelism size. This was useful for testing this change, and has come up repeatedly as "something that would help test a hypothesis" during the past few months of testing batch mode.

rdar://40526328

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fixes! As it stands, it is above threshold for committing. I don't see any correctness issues. I have pointed out some places where I think a bit of factoring could improve clarity for the next person who has to pick up this code. Of course, such things are partially subjective, so feel free to take or leave them.

@@ -110,6 +110,9 @@ def driver_filelist_threshold_EQ : Joined<["-"], "driver-filelist-threshold=">,
def driver_batch_seed : Separate<["-"], "driver-batch-seed">,
InternalDebugOpt,
HelpText<"Use the given seed value to randomize batch-mode partitions">;
def driver_batch_size : Separate<["-"], "driver-batch-size">,
InternalDebugOpt,
HelpText<"Use the given number of batch-mode partitions, rather than default parallelism level">;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!! Great that you added this; we have needed it for a while.

@@ -174,6 +174,9 @@ class Compilation {
/// Provides a randomization seed to batch-mode partitioning, for debugging.
const unsigned BatchSeed;

/// Overrides parallelism level as count of batches, if in batch-mode.
const Optional<unsigned> BatchSize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the use of an Optional here.

@@ -669,6 +669,15 @@ Driver::buildCompilation(const ToolChain &TC,
A->getAsString(*ArgList), A->getValue());
}
}
if (const Arg *A = ArgList->getLastArg(options::OPT_driver_batch_size)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have factored this out into a static function, in order to keep a large routine (buildCompilation) from getting larger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. also localized the flag and two others to the function, as they were previously (pointlessly) members of Driver only to be passed off to members of Compilation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

const Job *J) const {
auto DiaPath =
J->getOutput().getAnyOutputForType(file_types::TY_SerializedDiagnostics);
if (!DiaPath.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formulation requires the reader to read many lines before understanding what happens if the DiaPath is empty. I'd be inclined to reverse the sense of the if, and use an early return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

auto DiaPath =
J->getOutput().getAnyOutputForType(file_types::TY_SerializedDiagnostics);
if (!DiaPath.empty()) {
if (llvm::sys::fs::is_regular_file(DiaPath)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above-- I'd be inclined to collapse and reverse this and prior into an or:
"if ( || ) return false;"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

if (!DiaPath.empty()) {
if (llvm::sys::fs::is_regular_file(DiaPath)) {
uint64_t Size;
auto EC = llvm::sys::fs::file_size(DiaPath, Size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along the same handle-errors-first style (I picked up long ago doing OS-oidal coding),
my own style would be moving the error case handling first. Then the whole routine would read better (to my eyes), because it would be series of error tests, each returning false, with the return Size == 0 last. But these are suggestions, not requirements as far as I am concerned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love how this turned out.

ReturnCode, Output, ProcInfo);
if (ReturnCode != 0 &&
isBatchJob(FinishedCmd) &&
batchConstituentJobProducedZeroLengthSerializedDiagnostics(J)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three conjuncts in this if deserve to be factored out. There is an intension-crossing here: as I read it, the if is really testing something like wasJobIncomplete, so I would factor out the predicate into a (static?) function testing the three values and returning a bool. Then there could be a nice Doxygen comment for that function explaining a bit.
I would also be tempted (in addition) to factor out the whole paragraph from lines 536 to 547 into an parseable_output static function called something like emitAppropriateMessage in order to reflect the intentionality here, as well as to keep the enclosing context from growing any larger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

// Simulate SIGINT-interruption to parseable-output consumer for any
// constituent of a failing batch job that produced no errors of its own.
parseable_output::emitSignalledMessage(llvm::errs(), *J, P,
"cancelled batch constituent",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A string constant ("cancelled batch constituent") probably belongs in a header file somewhere, in case it needs to be used somewhere else in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, nothing ought to sniff it and it's just a diagnostic hint for someone reading the parseable output; we have other diagnostic string constants sprinkled around. The only string constants I see in headers are builtin symbols / module names / filename extensions. I think this is ok as-is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, fine to leave it. (Just a general feeling I have about string constants, about what may or may not happen in the future.)

@graydon graydon force-pushed the rdar-40526328-emit-SIGINT-for-cancelled-batch-constituents branch from b94af50 to 72b3640 Compare June 14, 2018 05:03
@graydon
Copy link
Contributor Author

graydon commented Jun 14, 2018

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 72b3640d1ad16d2f860ec64e6f2b6bf909c6673b

@graydon graydon force-pushed the rdar-40526328-emit-SIGINT-for-cancelled-batch-constituents branch from 72b3640 to 1a39b71 Compare June 14, 2018 06:12
@graydon
Copy link
Contributor Author

graydon commented Jun 14, 2018

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 72b3640d1ad16d2f860ec64e6f2b6bf909c6673b

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 72b3640d1ad16d2f860ec64e6f2b6bf909c6673b

}
DriverForceOneBatchRepartition =
unsigned DriverBatchSeed = getDriverBatchSeed(ArgList, Diags);
Optional<unsigned> DriverBatchCount = getDriverBatchCount(ArgList, Diags);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for factoring; I think it helps readability. When I come back to #16762 (waiting for @jrose-apple ), I will move these computations closer to their uses and constify it.

/// Check to see if a job produced a zero-length serialized diagnostics
/// file, which is used to indicate batch-constituents that were batched
/// together with a failing constituent but did not, themselves, produce any
/// errors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if the function were named based on that purpose, rather than on how it's planning to check. jobWasBatchedWithFailingJobs or something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I wish I had caught that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

if (EC)
return false;
return Size == 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rearranging. I like the way this came out, hope you do, too.

isBatchJob(ContainerJob) &&
jobProducedZeroLengthSerializedDiagnostics(ConstituentJob);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice encapsulation!

StringRef Output,
const Job *FinishedCmd,
TaskProcessInformation ProcInfo) {
FinishedCmd->forEachContainedJobAndPID(Pid, [&](const Job *J,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but there is a small potential for confusion remaining: the function is called emitParseableOutputForFinishedJob yet it includes an iterator: forEachContainedJobAndPID. The latter suggests that it is emitting parseable output for many jobs, but the former implies emitting output for one job. Is the problem an ambiguity in "job"? Are there better names? Or perhaps just a comment?? Or am I splitting hairs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does say "for each contained job", so I don't think it's wrong, per se. The correct parseable output for a batch job includes the parseable output for all the sub-jobs. But I see what you mean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's hugely ambiguous but I get the point; renamed to emitParseableOutputForEachFinishedJob, which at least puts some multiplicity / each-ness in the name.

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I did spot one minor potential for confusion that could be remedied by a different name or an explanatory comment. Thank you.

Copy link
Contributor

@jrose-apple jrose-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a few comments from me, nothing critical. I'm always glad when something like this turns out to be straightforward.

@@ -643,6 +643,35 @@ static bool getFilelistThreshold(DerivedArgList &Args, size_t &FilelistThreshold
return false;
}

static unsigned
getDriverBatchSeed(std::unique_ptr<llvm::opt::InputArgList> &ArgList,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Taking a reference to an owning pointer is sending some really weird ownership signals. I would have expected the parameter type to be const llvm::opt::InputArgList &, since there's no ownership transfer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I need to pay more attention to ownership when I review PRs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

/// Check to see if a job produced a zero-length serialized diagnostics
/// file, which is used to indicate batch-constituents that were batched
/// together with a failing constituent but did not, themselves, produce any
/// errors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if the function were named based on that purpose, rather than on how it's planning to check. jobWasBatchedWithFailingJobs or something?

StringRef Output,
const Job *FinishedCmd,
TaskProcessInformation ProcInfo) {
FinishedCmd->forEachContainedJobAndPID(Pid, [&](const Job *J,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does say "for each contained job", so I don't think it's wrong, per se. The correct parseable output for a batch job includes the parseable output for all the sub-jobs. But I see what you mean.

@graydon graydon force-pushed the rdar-40526328-emit-SIGINT-for-cancelled-batch-constituents branch from 1a39b71 to dcc4373 Compare June 14, 2018 19:05
@graydon
Copy link
Contributor Author

graydon commented Jun 14, 2018

@swift-ci please test and merge

1 similar comment
@graydon
Copy link
Contributor Author

graydon commented Jun 14, 2018

@swift-ci please test and merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants