-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Driver] Added process information to emitted task messages #16444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Driver] Added process information to emitted task messages #16444
Conversation
7d23272
to
48f7340
Compare
@swift-ci please smoke test |
@BenchR267 I'm not sure about this; we already gather and emit this information in JSON in the statistics-reporting subsystem (i.e. when you run with |
We definitely don't want Xcode or other external tools depending on the format from |
An advantage of having this information in the driver output is also that it's bundled to the finish message which is emitted when the sub-process finishes, not the whole driver invocation. |
^ That part's less useful until we have a way to identify which "jobs" were actually from the same process. |
What do you mean by that? The measured and emitted information in this PR are always bound to a process. Do you mean that a process could fulfill multiple jobs? |
@BenchR267 The callbacks that come out of the task queue are process events. Those processes no longer correspond 1:1 to (This is a compromise around how best to represent the sub-jobs within a batch process that is compiling multiple files. We had to choose something, and this seems like the most congruent with the way Xcode thinks about subtasks of the driver.) |
Ah, I see. So if a process is processing 5 different Swift files, I get 5 begin and 5 end messages (if all succeed) but each one has all input and output files in it. The only difference is that only one of them has a real PID (positive) and all others have quasi-PIDs? |
@BenchR267 No, none of them has a real PID. We only send messages about the quasi-PIDs, in order to avoid confusing consumers who think that the message stream is one-process-per-output. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice start, but ydou'll want to check with @graydon about overlap of functionality. He's already added much of this functionality in a different way.
include/swift/Basic/TaskQueue.h
Outdated
@@ -39,6 +41,43 @@ enum class TaskFinishedResponse { | |||
StopExecution, | |||
}; | |||
|
|||
/// Define basic metrics about tasks that ran | |||
struct TaskFinishedMetrics { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see a bit in this comment about what the metrics will be used for, what code will consume them, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also (nitpick, in the Jordan sense), the name TaskFinishedMetrics
seems not quite write in two respects:
- These are only times, and
- These are not about the finishing, they are about the whole task.
So, how aboutTaskTimes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! Sorry, I just changed the structure a bit to react to @graydon s recent changes. This does now also contains the real pid (which is not emitted if jobs are batched otherwise). The Metrics struct does not only contain times (maxrss is also part of it) but I agree that another name might be a better fit.
What about Stats
or Usage
or ResourceUsage
?
include/swift/Basic/TaskQueue.h
Outdated
@@ -81,13 +120,15 @@ class TaskQueue { | |||
/// \param Errors the errors from the task which finished execution, if | |||
/// available and SeparateErrors was true. (This may not be available on all | |||
/// platforms.) | |||
/// \param Metrics define basic metrics about the task that ran (has no value | |||
/// if no separate process) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: "define" seems like the wrong word here. The parameter isn't defining which metrics are provided, rather it is supplying the values for the metrics.
The parenthetical comment deserves more prominence.
Something like: "If the task ran as a separate process, Metrics supplies the values for the times in took."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I changed the description.
include/swift/Basic/TaskQueue.h
Outdated
@@ -100,16 +141,19 @@ class TaskQueue { | |||
/// \param Errors the errors from the task which exited abnormally, if | |||
/// available and SeparateErrors was true. (This may not be available on all | |||
/// platforms.) | |||
/// \param Metrics define basic metrics about the task that ran (has no value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seem comment above.
lib/Basic/TaskQueue.cpp
Outdated
@@ -85,8 +85,8 @@ bool DummyTaskQueue::execute(TaskQueue::TaskBeganCallback Began, | |||
std::string Output = "Output placeholder\n"; | |||
std::string Errors = | |||
P.second->SeparateErrors ? "Error placeholder\n" : ""; | |||
if (Finished(P.first, 0, Output, Errors, P.second->Context) == | |||
TaskFinishedResponse::StopExecution) | |||
if (Finished(P.first, 0, Output, Errors, Optional<TaskFinishedMetrics>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will None
work instead of Optional<TaskFinishedMetrics>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, None
is much better. I just learned about llvm::Optional
, wanted to use a nullptr before 🙈 Btw: is there a map operation on Optional
?
lib/Basic/Unix/TaskQueue.inc
Outdated
static Optional<int> waitForPid(const pid_t pidToWaitFor); | ||
|
||
/** | ||
Wait for a process with a given pid to finish. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: should be "Wait for the process"
lib/Basic/Unix/TaskQueue.inc
Outdated
@@ -555,34 +562,36 @@ static bool | |||
cleanUpAHungUpTask(Task &T, | |||
const TaskQueue::TaskFinishedCallback FinishedCallback, | |||
const TaskQueue::TaskSignalledCallback SignalledCallback) { | |||
const Optional<int> StatusIfOK = waitForPid(T.getPid()); | |||
if (!StatusIfOK) | |||
const auto StatusIfOK = waitForPid(T.getPid()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename StatusIfOK
to StatusAndTimes
(assuming you change the Metrics class name to Times). Or rename to StatusAndMetrics
.
lib/Basic/Unix/TaskQueue.inc
Outdated
: false /* Can this case ever happen? */; | ||
} | ||
|
||
static Optional<int> waitForPid(const pid_t pidToWaitFor) { | ||
static std::pair<Optional<int>, Optional<TaskFinishedMetrics>> waitForPid(const pid_t pidToWaitFor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, you have waitForPid returning two optionals, but only two of the four possible cases can ever occur. If I'm right, it should be returning an optional pair, not a pair of optionals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was an optional status before, that's why I chose to keep the status optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better!
lib/Driver/Driver.cpp
Outdated
OI.SDKPath = output.str(); | ||
} | ||
return sys::TaskFinishedResponse::ContinueExecution; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lines 1429-1440 seem to show that they were reindented. Is this what git-clang-format did? (Folks here like to use git-clang-format to keep formatting stable across changes.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used git-clang-format over all changes, so probably it wasn't stable before ¯\(ツ)/¯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(But I still reverted the change to keep it as small as possible)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sigh... Thanks.
lib/Driver/ParseableOutput.cpp
Outdated
@@ -180,24 +181,29 @@ class BeganMessage : public DetailedCommandBasedMessage { | |||
|
|||
class TaskOutputMessage : public TaskBasedMessage { | |||
std::string Output; | |||
Optional<sys::TaskFinishedMetrics> Metrics; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll want to check with @graydon about overlap of functionality. He's already added much of this functionality in a different way.
7112672
to
1ca1aac
Compare
@graydon @davidungar Thanks for your input! I reworked on this patch and changed it so the output does contain the "real" pid as well so the client could know which jobs ran in one batch/process. I added this information to begin, finished and signalled messages while only finished and signalled may contain resource usage (thanks for the lead to a better name @davidungar). |
docs/DriverParseableOutput.rst
Outdated
|
||
Example:: | ||
|
||
{ | ||
"kind": "began", | ||
"name": "compile", | ||
"pid": 12345, | ||
"process": { | ||
"real_pid": 38732 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this impossible? From reading the documentation I inferred that the following must hold:
pid == process.real_pid || pid < 0
If so, please update the example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sure, you're right! I should also add that this will match the pid
value if pid
> 0 to the documentation!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for finding this! 👍🏻
docs/DriverParseableOutput.rst
Outdated
"exit-status": 0 | ||
"exit-status": 0, | ||
"process": { | ||
"real_pid": 31232, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
docs/DriverParseableOutput.rst
Outdated
"signal": 4 | ||
"signal": 4, | ||
"process": { | ||
"real_pid": 31232, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again
13c285d
to
18c4c14
Compare
579c9b1
to
5133e5a
Compare
@swift-ci please smoke test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks basically ok, modulo a bunch of changes for portability and such (noted inline). I would also ask that you test it with some older Xcodes (at least 9.3) in both new and old build systems, to be sure that it doesn't break them.
include/swift/Basic/TaskQueue.h
Outdated
uint64_t(Usage.ru_utime.tv_usec), | ||
uint64_t(Usage.ru_stime.tv_sec) * 1000000 + | ||
uint64_t(Usage.ru_stime.tv_usec), | ||
Usage.ru_maxrss) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is, unfortunately, a count of bytes rather than KB on macOS. See https://github.com/apple/swift/blob/master/lib/Basic/Statistic.cpp#L43-L60 for conditional form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh noes 🤦♂️
include/swift/Basic/TaskQueue.h
Outdated
/// \param Context the context which was passed when the task was added | ||
/// | ||
/// \returns true if further execution of tasks should stop, | ||
/// false if execution should continue | ||
using TaskFinishedCallback = std::function<TaskFinishedResponse( | ||
ProcessId Pid, int ReturnCode, StringRef Output, StringRef Errors, | ||
void *Context)>; | ||
TaskProcessInformation Process, void *Context)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor preference for calling this variable ProcInfo or something; it's not a process in the sense of being a handle or something, just auxiliary information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense 👍🏻 ProcInfo is a good idea
include/swift/Basic/TaskQueue.h
Outdated
@@ -21,6 +22,7 @@ | |||
#include <functional> | |||
#include <memory> | |||
#include <queue> | |||
#include <sys/resource.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be in include/swift/Basic/TaskQueue.h
; that file covers all platform task queues, including those non-unix / without-rusage. Instead, it should be in the lib/Basic/Unix/TaskQueue.inc
unix-specific include fragment, and the code that includes this should be guarded by #ifdef HAVE_SYS_RESOURCE_H
, and the code that instantiates struct rusage should be guarded by #if defined(HAVE_GETRUSAGE) && !defined(__HAIKU__)
See the rusage-extracting code in lib/Basic/Statistic.cpp
for example (also for the apple-specific bytes-vs-kb bit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the import isn't needed at all. It's at least building (incremental and clean) without it.
lib/Basic/Unix/TaskQueue.inc
Outdated
for (;;) { | ||
int Status = 0; | ||
const pid_t pidFromWait = waitpid(pidToWaitFor, &Status, 0); | ||
struct rusage Usage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The wait4
call isn't POSIX, it's mostly-standard on anything BSD-compat but it should probably be conditionalized. I suspect it's safe to hang this on the HAVE_GETRUSAGE
probe already done over in LLVM, but it's a bit indirect; maybe add a specific check_symbol_exists(wait4, "sys/wait.h", HAVE_WAIT4)
probe to swift's CMakeLists.txt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's a good idea. If wait4 is not available, I just fallback to waitpid and return None usage. Just need to find out how that cmake definitions work :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried different configurations but it didn't worked out like I expected. I added check_symbol_exists(wait4 "sys/wait.h" HAVE_WAIT4)
to the CMakeLists.txt and expected that I can check with #if defined(HAVE_WAIT4)
in the code but it didn't work (it evaluated to false while wait4 should be there on my system.
Could you please give me a hint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your help offline, I added the check-symbol to the cmakelist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"is there a map operation on Optional?" Good question! I don't know about one. Let me know what you find out!
lib/Driver/ParseableOutput.cpp
Outdated
int ExitStatus, StringRef Output) { | ||
FinishedMessage msg(Cmd, Pid, Output, ExitStatus); | ||
void parseable_output::emitFinishedMessage( | ||
raw_ostream &os, const Job &Cmd, int64_t Pid, int ExitStatus, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this git-clang-format indenting?
lib/Driver/ParseableOutput.cpp
Outdated
ExitStatus(ExitStatus) {} | ||
Optional<sys::TaskFinishedMetrics> Metrics, int ExitStatus) | ||
: TaskOutputMessage("finished", Cmd, Pid, Output, Metrics), | ||
ExitStatus(ExitStatus) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check formatting.
|
||
Example:: | ||
|
||
{ | ||
"kind": "began", | ||
"name": "compile", | ||
"pid": 12345, | ||
"process": { | ||
"real_pid": 12345 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pid
and real_pid
have some potential to create confusion, but I don't see answer, and it's not your fault, anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem here is that 'pid' is re-used, but I also don't see another solution without breaking infrastructure that relies on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sigh.
@@ -29,15 +29,19 @@ class Job; | |||
namespace parseable_output { | |||
|
|||
/// \brief Emits a "began" message to the given stream. | |||
void emitBeganMessage(raw_ostream &os, const Job &Cmd, int64_t Pid); | |||
void emitBeganMessage(raw_ostream &os, const Job &Cmd, int64_t Pid, | |||
sys::TaskProcessInformation Process); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Echoing Graydon, Process
is the wrong name here. I'd be fine with TPI
(pi
could be interpreted as 3.14) or processInfo
.
Also it would be nice to be consistent (locally) on matters of casing. In other words, either OS
instead of os
, or cmd
and pid
and tpi
. (Sigh, the casing is a nitpick, I know.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed it to ProcessInfo already, just want to fix the other concerns as well before pushing it :) Interpreting pi
as the number pi
is an interesting thought though :D It could also be interpreted as 🍰.
I didn't create the signature of the function but it seems to be convention in llvm to write variables in UpperCamelCase, os
however refers better to output stream than OS
which could indicate operating system. It's not ideal but I would like to keep it like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've seen OS
for output stream, and I keep thinking operating system, yes. Feel free to keep it as is.
include/swift/Basic/TaskQueue.h
Outdated
@@ -39,6 +41,57 @@ enum class TaskFinishedResponse { | |||
StopExecution, | |||
}; | |||
|
|||
/// TaskProcessInformation contains information about the process a task was | |||
/// associated with. | |||
struct TaskProcessInformation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still looking for more info concerning how this information will or might be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also TaskProcessInformation
is OK, if a bit awkward. "Information" could be anything, including the execution arguments. Is there a more specific term? Right now, its "times". Are you planning to generalize? Or would "TaskExecutionStatistics" cover it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that this contains the real_pid I see it as a container for all 'information' about the OS process that ran. The times/usage are one part of it, real_pid is the other (for now). I don't plan to add more to this but it could contain other information like the user who started it (which is pretty useless in this case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better! I think the main thing I'm missing now is a comment explaining how the new info will be used (unless it's there & I missed it). Think about someone porting this code to a platform where there's a different facility than rusage that gives different info. What would he have to know about the info that is really needed and in what form? Or what would break if all the times were zero?
2d71325
to
0f6ea7d
Compare
@swift-ci please smoke test |
@graydon @davidungar could you please take another look? Thanks :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a good offline discussion. LGTM!
/// tasks ran in the same process (in multifile-mode or when WMO is activated | ||
/// e.g.). If available, it also contains information about the usage of | ||
/// resources like CPU time or memory the process used in the system. How ever, | ||
/// this could differ from platform to platform and is therefore optional. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two thoughts:
- (nitpick) If the comment says "task resource usage", why is the name not
TaskResourceUsage
? - The "If available" sentence is redundant and (redundantly) not specific enough. Why are you making this change? Is there some specific use you have in mind? For instance, "When available, an Xcode extension will provide a changing desktop color indicating how fast the compilations are running." In other words, I want to know what might break if I as a maintainer, mess this up in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is explicitly to not about task resource usage but about the usage of the process that ran this task. But it might be worth mentioning that this is the case. The difference between the two is already becoming very confusing!
A process can run multiple tasks (e.g. a process that compiles 5 swift files) but still all tasks are separately outputted. The usage relates always to the process so if someone would like to sum all usages up, (s)he should unique by real-pid
first. I'll add this to the comment since it's very essential!
The 'if available' relates to the fact that this is depending on the rusage
and wait4
symbols. If they are not available on your system, you also don't get any usage information. You could also run the driver by stating skip-execution
and you would also get no usage information. I'll also add this to the comment!
Thanks for the talk, really appreciate your feedback! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it sure can be confusing.Your comment will be great! My pleasure, I enjoyed it.
285dc0f
to
e49d954
Compare
@swift-ci please smoke test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok to me, though would like to see the cpp chatter moved to 0th column.
lib/Basic/Unix/TaskQueue.inc
Outdated
#ifndef __APPLE__ | ||
// Apple systems report bytes; everything else appears to report KB. | ||
this->ProcessUsage.getValue().Maxrss <<= 10; | ||
#endif // __APPLE__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad indentation on the endif here.
lib/Basic/Unix/TaskQueue.inc
Outdated
for (;;) { | ||
int Status = 0; | ||
|
||
#if defined(HAVE_GETRUSAGE) && !defined(__HAIKU__) && defined(HAVE_WAIT4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put preprocessor directives in column 0.
When providing the -parseable-output flag to the swift compiler, it will provide json formatted messages about tasks that run. I added some optional usage information in form of user time, system time and maxrss to the output. This can be used by other tools using the compiler to get some insights about time and memory usage. Since the output does not longer match processes run (in batch mode), I also added a real_pid field so the client could reason about jobs that belong together if needed. rdar://39798231
e49d954
to
ac10fb3
Compare
@swift-ci please smoke test |
When providing the -parseable-output flag to the swift compiler, it will provide json formatted messages about tasks that run.
I added some optional metrics in form of user time, system time and maxrss to the output. This can be used by other tools using the compiler to get some insights about time and memory usage.
rdar://39798231