Skip to content

Wire up FileSpecificDiagnosticConsumer for serialized diagnostics #14900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 3, 2018

Conversation

jrose-apple
Copy link
Contributor

Now that most of @davidungar's batch mode work has landed in the frontend, I can actually use the RangeSpecificDiagnosticConsumer from #14844 to get us outputting multiple .dia files in batch mode. We want diagnostics that are definitely associated with a particular file to be emitted in that file's .dia output, while diagnostics that may be associated with other files (such as those that come from Clang) get emitted in all .dia outputs to make sure they show up everywhere they're needed.

This deserves some extra documentation somewhere about why this is safe and correct, but hey, it appears to work!

@jrose-apple
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

swift-ci commented Mar 1, 2018

Build failed
Swift Test Linux Platform
Git Sha - 3d345a8260cee4834abef9c88020a7e7e47ebc33

@swift-ci
Copy link
Contributor

swift-ci commented Mar 1, 2018

Build failed
Swift Test OS X Platform
Git Sha - 3d345a8260cee4834abef9c88020a7e7e47ebc33

/// \brief DiagnosticConsumer that funnels diagnostics in certain files to
/// particular sub-consumers.
///
/// Diagnostics that are not in one of the special files are emitted into every
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a sentence or two to explain what a "certain" or "special" file is would help a novice to understand what's going on here. What's the basic problem? Something like: Xcode needs serialized diagnostic files with the diagnostic routed to the right file. [Why?] This bigger picture is the sort of thing one cannot get from just reading the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was doing that somewhat intentionally, since this is in theory a reusable component. But you're right, a few lines on the example, intended use would make sense.

/// position can be done using binary search.
///
/// \see #consumerForLocation
SmallVector<std::pair<CharSourceRange, unsigned>, 4> LocationToConsumerMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you could call it something like ConsumersOrderedByLocation which is a tad more specific.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to read consumerForLocation to figure out that LocationToConsumerMap maps not directly to a consumer but to an index in the SubConsumers vector. Suggest adding a comment to point out or at least imply the invariant that you cannot later change the code to mess with the order in that vector without also adjusting the map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the second comment: I should probably just make the SubConsumers vector const.

To the first: Hm, I'll think about it. I don't quite like that name either but it is better than what I have, now that you've pointed it out.

/// that was most recently emitted.
///
/// If null, Note diagnostics are sent to every consumer.
DiagnosticConsumer *ConsumerForSubsequentNotes = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice name!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I realized I was being too clever. I think at some point we're going to want to emit notes before a particular error or warning (which Clang already does in some cases, IIRC), and then we're going to need the notion of an explicit "diagnostic group", rather than the implicit thing we have now. But we can change that when it comes to that.

public:
/// Takes ownership of the DiagnosticConsumers specified in \p consumers.
///
/// There must be no duplicate files, but an empty file name can be used to
Copy link
Contributor

@davidungar davidungar Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this sentence uses both "duplicate files" and "file name" it could potentially confuse a reader. How about "File paths must be unique, but an empty file path ..."? Crazy question: Can an empty file path be duplicated? Does an empty string really stand for no-path? This convention has confused me before: I think the code base uses empty strings in many places to mean "no string". My preference would be to explicitly call such a convention out in a comment, even though my code has neglected to do so, because it seemed like common practice.

FileSpecificDiagnosticConsumer::FileSpecificDiagnosticConsumer(
SmallVectorImpl<ConsumerPair> &consumers)
: SubConsumers(std::move(consumers)) {
#ifndef NDEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to stare at this for a minute to figure out that the only purpose for the following code here is to check for duplicate files. If it had been factored out into a function called something like checkForDuplicateFiles it would have sped up my comprehension.

#ifndef NDEBUG
llvm::StringSet<> seenFiles;
for (const ConsumerPair &consumerPair : SubConsumers) {
if (consumerPair.first.empty())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would the first element ever be empty? Is it worth a comment somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the "empty files represent no affinity" thing from the doc comment of the constructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... Maybe reify that into a (static) function that returns a bool, and takes a ConsumerPair argument, and call the function hasAffinity.

if (consumerPair.first.empty())
continue;
bool isUnique = seenFiles.insert(consumerPair.first).second;
assert(isUnique &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would duplicates be an error instead of just something to ignore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the way things are set up only one of the two consumers will get the diagnostics from that file, rather than both of them, and I don't think it's even guaranteed which (because of std::sort). We could change that if it ever mattered, but it's slightly harder to implement and we don't need it today.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add what you wrote above as a comment for the assert?

}

void
FileSpecificDiagnosticConsumer::buildLocationToConsumerMap(SourceManager &SM) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factoring this out really helps clarify the code for me.

@@ -111,6 +112,55 @@ class NullDiagnosticConsumer : public DiagnosticConsumer {
ArrayRef<DiagnosticArgument> FormatArgs,
const DiagnosticInfo &Info) override;
};

/// \brief DiagnosticConsumer that funnels diagnostics in certain files to
/// particular sub-consumers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the funneling needed? (I know I already commented about a bigger picture comment, which could help.)
Inheritance, or aggregation? Is it a special kind of DiagnosticConsumer that funnels, or is it an aggregation or ordinary DiagnosticConsumers, a multiplexer in front of a bunch of them? Just asking, don't know the best answer.

getRawLoc(right.first.getEnd()).getPointer());
});

// Check that the ranges are non-overlapping.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What part of the compiler guarantees non-overlap? What would be the harm? Would this code fall over and die, or just possibly misfile a diagnostic? If in the future, someone introduces a bug that produces overlapping ranges, do we want the compiler to crash or to merely whine?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misfiled diagnostics, but yes, I think we still want the compiler to crash in such a case when assertions are enabled.


// This map is generated on first use and cached, to allow the
// FileSpecificDiagnosticConsumer to be set up before the source files are
// actually loaded.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great comment!

// actually loaded.
if (LocationToConsumerMap.empty()) {
auto *mutableThis = const_cast<FileSpecificDiagnosticConsumer*>(this);
mutableThis->buildLocationToConsumerMap(SM);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the use of mutableThis!

}

// "Find the first range whose end location is greater than or equal to
// 'loc'."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the quotes around the comment a Doxygen thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to imply that it was a sort of "translation" of the following invocation, rather than a higher-level operation. (Because as a higher-level operation, this would be a terrible comment. It doesn't tell you why it's looking for something >=.)

That didn't come across, so I'll think about rephrasing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to do that could be to use a verbose name for possiblyContainingRangeIter such as locatorForEncompassingRange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the same name, no? The iterator for the range that might contain the location.

getRawLoc(loc).getPointer());
});

if (possiblyContainingRangeIter != LocationToConsumerMap.end() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My style would be to factor out this predicate into a function called actuallyContains or some such.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it takes all the context of the current method (both the LocationToConsumerMap and the location we're searching for), I don't think that pulls its weight. Doubly so because iterators are weird annoying types to spell out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it clear that I meant merely factoring out possiblyContainingRangeIter != LocationToConsumerMap.end() & possiblyContainingRangeIter->first.contains(loc)? Let's see... you would have to pass in possiblyContainingRangeIterandLocationToConsumerMap.endandloc. I would still factor it out, but I see your point! If it doesn't pull its weight, how about a comment for the if` like: "Does it really contain the location?".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new comment at the top of the lower_bound covers this now.

StringRef FormatString, ArrayRef<DiagnosticArgument> FormatArgs,
const DiagnosticInfo &Info) {

DiagnosticConsumer *specificConsumer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would factor the switch out into a function that returned the specificConsumer (or null). The specificConsumer variable becomes local to the first three cases that way, so it's less code to trace through in a reader's head.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think that's worth it, especially since there's an update to ConsumerForSubsequentNotes. Lots of small functions are also hard to reason about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's a bit of Smalltalk-oidal transformative experience going on on my part here. Here's how I would push back: In this case, it would be a function called findSpecificConsumerIfAny and I could just read the name and know what's going on in the caller. Without the separate function, I need to read the body of the switch to see that it is setting the variable, and then also read further to see that the variable is never changed again. (Swift is way better:)

I realize that the uninitialized variable followed by conditional code that sets it is a common idiom in C(++), but my other beef is that the compiler does nothing to guarantee it is not modified below. When functions get longer, I always have to grep for the variable to ensure it doesn't get modified in the middle.

This particular case is not that bad, but I feel that there is something like "the death of a thousand cuts", where enough not-too-tough cases have a cumulative effect on the effort required to dive into an unfamiliar code base.

// behavior.
bool hadError = false;
for (auto &subConsumer : SubConsumers)
hadError |= subConsumer.second->finishProcessing();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen this idiom before in our code, but I worry about the bool/int conflation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's always safe with C++ bool, so I don't feel too bad about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

// FIXME: Do we need one SerializedConsumer per primary? Owned by the
// SourceFile?
std::unique_ptr<DiagnosticConsumer> SerializedConsumer;
std::unique_ptr<DiagnosticConsumer> SerializedConsumerMultiplexer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SerializedConsumerMultiplexer gives me pause. Thinking aloud... It directs SerializedDiagnostics to the appropriate consumer. In a way, it's a demultiplexer, taking a single stream of multiplexed diagnostics and separating them out into separate streams, each directed to a specific consumer. Feels like there could be a better name here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, right. I always get the mux/demux order backwards.

I don't actually care what this name is, and neither does the code; this is only here to keep the consumer alive. I had TopSerializedConsumer before but that was really even more vacuous. Any suggestions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SerializedDiagnosticConsumerSorter? SerializedDiagnosticDistributor?? What's this thing doing? Directing diagnostics to an ultimate consumer?? DiagnosticPachinko? (No, that would be random.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pavel came up with "dispatcher", which I thought was fine. "Distributor" isn't bad either.

std::string SerializedDiagnosticsPath = Input.serializedDiagnosticsPath();
if (SerializedDiagnosticsPath.empty()) {
// If one input does not have a serialized diagnostic output path, none
// of them will.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this checked somewhere with a diagnostic issued? Does it need to be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For noobs could add the following to the comment: "because serialized diagnostics are only used when Swift is invoked for Xcode, and in that case each primary input must communicate its diagnostics back to the IDE."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, hm. I was assuming this was guaranteed by the driver, not the IDE. I suppose there's no reason to build in that limitation here, though. I'll just return false instead of true to keep going.

Invocation.getSerializedDiagnosticsPathForAtMostOnePrimary();
if (!SerializedDiagnosticsPath.empty()) {
SerializedConsumer.reset(
SmallVector<FileSpecificDiagnosticConsumer::ConsumerPair, 4>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 4? There must be something about the implementation of SmallVector. If I had to guess, I would bet that the common cases are 1 and many, FWIW.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading it, I see that the following code produces a vector of ConsumerPairs. Suggest adding either a comment to that effect or factoring into a subroutine. In any event, it would help to add a comment explaining why such a vector is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right about "1 and many", but there are lots of projects where "many" is still going to be <= 4. (On an eight-core machine, that supports up to 32 files in a clean build.) You're right that I just made the number up, though.

I don't understand what you mean about the following code. The vector is built up and then passed to the constructor that uses it. The code that does the building up says what it does. What would the subroutine be called?

(I could see factoring out the whole creation of the FileSpecificDiagnosticConsumer, including building this vector, but just the vector part doesn't seem to buy us anything.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the 4, how about a comment explaining why 4?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding factoring the vector-building: lines 1627 - 1653 are all computing one thing, the SerializedConsumerMultiplexer. But I had to read it carefully to see that that was the only thing it was doing, and that the multiplexer is not modified thereafter. Likewise for the 20-ish lines to compute the vector. It feels like there are functions being computed, but it could be easier to see that than what is apparent at a glance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: vector-building: Again, I'd be more inclined to pull out the whole multiplexer part (now "dispatcher" thanks to Pavel). I can do that. If I wanted to pin down why the vector computation isn't worth pulling out separately, I think it's because it's not reusable in any interesting way and it can only be described in terms of "this will then be used to create a FileSpecificDiagnosticConsumer".

Re: 4: I don't know. We have SmallVectors all over the place. Most of them have arbitrary initial sizes. 4 sounds good to me, but is it better than 8? I have no idea. I don't think it's worth my time to look into, but maybe it is.

SerializedConsumer.reset(
SmallVector<FileSpecificDiagnosticConsumer::ConsumerPair, 4>
SerializedConsumers;
Invocation.getFrontendOptions().InputsAndOutputs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it good practice to add a (void) in front of this line to show that the code explicitly throws away the result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Maybe? I feel like this is one of those cases where the result was always incidental to begin with, like the result of Dictionary.remove in Swift. Some situations care about it, some don't, but ignoring it is intended much more often than it's a mistake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Whatever the common practice is for our team is OK with me.

return false;
});

if (!SerializedConsumers.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to execute this block even if SerializedConsumers is empty? Would that just use a bit more time? It could potentially make the compiler a scintilla more reliable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would waste time on every diagnostic. I'd rather not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a conceivable situation where the time difference would be noticeable? I had assumed that the extra time for a diagnostic (when there are no serialized consumers) would be unobservable, given the time required to generate the diagnostic, but maybe I'm wrong. It would be good for me to correct my intuition.

If the if-test is merely an optimization, and not needed for correctness, it would help to add a comment: "Save time when there are no SerializedConsumers" so that future maintainers know it is just an optimization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably is negligible, but that just feels like the wrong attitude to me. The test is free and doesn't really add any complexity for a human, and it makes the implementation of FileSpecificDiagnosticConsumer have one less case to worry about. (Right now it would try to build the sorted list of consumers over and over again because I happen to be using "empty" as a sentinel state for "not yet computed".)

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good! I've noted places where it might be possible to increase the clarity of the code.

@jrose-apple
Copy link
Contributor Author

New version, with some but not all of David's suggestions.

@swift-ci Please test

@swift-ci
Copy link
Contributor

swift-ci commented Mar 2, 2018

Build failed
Swift Test OS X Platform
Git Sha - 3d345a8260cee4834abef9c88020a7e7e47ebc33

@swift-ci
Copy link
Contributor

swift-ci commented Mar 2, 2018

Build failed
Swift Test Linux Platform
Git Sha - 3d345a8260cee4834abef9c88020a7e7e47ebc33

}
}

// FIXME: Do the fix-its need to be multiplexed like the serialized
// diagnostics?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked to Xi and the answer is "yes", but I'll do that in a separate PR.

Copy link
Contributor

@davidungar davidungar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I don't have anything more to contribute, and the substance is fine. Land it or polish it, either is OK.

}

// "Find the first range whose end location is greater than or equal to
// 'loc'."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to do that could be to use a verbose name for possiblyContainingRangeIter such as locatorForEncompassingRange.

getRawLoc(loc).getPointer());
});

if (possiblyContainingRangeIter != LocationToConsumerMap.end() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it clear that I meant merely factoring out possiblyContainingRangeIter != LocationToConsumerMap.end() & possiblyContainingRangeIter->first.contains(loc)? Let's see... you would have to pass in possiblyContainingRangeIterandLocationToConsumerMap.endandloc. I would still factor it out, but I see your point! If it doesn't pull its weight, how about a comment for the if` like: "Does it really contain the location?".

StringRef FormatString, ArrayRef<DiagnosticArgument> FormatArgs,
const DiagnosticInfo &Info) {

DiagnosticConsumer *specificConsumer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's a bit of Smalltalk-oidal transformative experience going on on my part here. Here's how I would push back: In this case, it would be a function called findSpecificConsumerIfAny and I could just read the name and know what's going on in the caller. Without the separate function, I need to read the body of the switch to see that it is setting the variable, and then also read further to see that the variable is never changed again. (Swift is way better:)

I realize that the uninitialized variable followed by conditional code that sets it is a common idiom in C(++), but my other beef is that the compiler does nothing to guarantee it is not modified below. When functions get longer, I always have to grep for the variable to ensure it doesn't get modified in the middle.

This particular case is not that bad, but I feel that there is something like "the death of a thousand cuts", where enough not-too-tough cases have a cumulative effect on the effort required to dive into an unfamiliar code base.

// behavior.
bool hadError = false;
for (auto &subConsumer : SubConsumers)
hadError |= subConsumer.second->finishProcessing();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

// FIXME: Do we need one SerializedConsumer per primary? Owned by the
// SourceFile?
std::unique_ptr<DiagnosticConsumer> SerializedConsumer;
std::unique_ptr<DiagnosticConsumer> SerializedConsumerMultiplexer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SerializedDiagnosticConsumerSorter? SerializedDiagnosticDistributor?? What's this thing doing? Directing diagnostics to an ultimate consumer?? DiagnosticPachinko? (No, that would be random.)

Invocation.getSerializedDiagnosticsPathForAtMostOnePrimary();
if (!SerializedDiagnosticsPath.empty()) {
SerializedConsumer.reset(
SmallVector<FileSpecificDiagnosticConsumer::ConsumerPair, 4>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the 4, how about a comment explaining why 4?

Invocation.getSerializedDiagnosticsPathForAtMostOnePrimary();
if (!SerializedDiagnosticsPath.empty()) {
SerializedConsumer.reset(
SmallVector<FileSpecificDiagnosticConsumer::ConsumerPair, 4>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding factoring the vector-building: lines 1627 - 1653 are all computing one thing, the SerializedConsumerMultiplexer. But I had to read it carefully to see that that was the only thing it was doing, and that the multiplexer is not modified thereafter. Likewise for the 20-ish lines to compute the vector. It feels like there are functions being computed, but it could be easier to see that than what is apparent at a glance.

SerializedConsumer.reset(
SmallVector<FileSpecificDiagnosticConsumer::ConsumerPair, 4>
SerializedConsumers;
Invocation.getFrontendOptions().InputsAndOutputs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Whatever the common practice is for our team is OK with me.

return false;
});

if (!SerializedConsumers.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a conceivable situation where the time difference would be noticeable? I had assumed that the extra time for a diagnostic (when there are no serialized consumers) would be unobservable, given the time required to generate the diagnostic, but maybe I'm wrong. It would be good for me to correct my intuition.

If the if-test is merely an optimization, and not needed for correctness, it would help to add a comment: "Save time when there are no SerializedConsumers" so that future maintainers know it is just an optimization.

If a top-level diagnostic is in a particular source range, the
RangeSpecificDiagnosticConsumer will funnel it and any attached notes
to a particular "sub-consumer" designated for that range (intended to
be used with whole files). If it's not in a range designated for any
sub-consumer, the diagnostic is passed to all registered
sub-consumers.

This is intended to be used for batch mode, so that diagnostics that
are definitely associated with a particular file can be emitted in
that file's .dia output, while diagnostics that may be associated with
other files (such as those that come from Clang) will still get
presented to the user.
It turns out that we need to have the diagnostic consumers set up
before we've actually opened the input files, which makes sense
because we might want to emit diagnostics about not being able to open
an input file. Switch to using file names instead, and mapping those
over to source ranges only once we actually need to handle a
diagnostic with a valid source location.
@jrose-apple
Copy link
Contributor Author

Last update incorporating some parts of David's feedback and not other parts. ;-) Per his comment above I'm going to go with this version unless there are serious objections.

@swift-ci Please smoke test

Copy link
Contributor

@graydon graydon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a solid improvement to me.

@jrose-apple
Copy link
Contributor Author

Bot got stuck, trying again.

@swift-ci Please smoke test Linux

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants