Skip to content

Add support for collection, recording and analysis of SIL optimizer counters #11813

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 11, 2017

Conversation

swiftix
Copy link
Contributor

@swiftix swiftix commented Sep 8, 2017

This PR implements collection and dumping of statistics about SILModules, SILFunctions and memory consumption during the execution of SIL optimization pipelines.

The following statistics can be collected:

  • For SILFunctions: the number of SIL basic blocks, the number of SIL instructions, the number of SIL instructions of a specific kind, duration of a pass
  • For SILModules: the number of SIL basic blocks, the number of SIL instructions, the number of SIL instructions of a specific kind, the number of SILFunctions, the amount of memory used by the compiler, duration of a pass

By default, any collection of statistics is disabled to avoid affecting compile times.

One can enable the collection of statistics and dumping of these statistics for the whole SILModule and/or for SILFunctions.

To reduce the amount of produced data, one can set thresholds in such a way that changes in the statistics are only reported if the delta between the old and the new values are at least X%. The deltas are computed using the following formula:

Delta = 100 * (NewValue - OldValue) / OldValue

Thresholds provide a simple way to perform a simple filtering of the collected statistics during the compilation. But if there is a need for a more complex analysis of collected data (e.g. aggregation by a pipeline stage or by the type of a transformation), it is often better to dump as much data as possible into a file using e.g. -sil-stats-dump-all -sil-stats-modules -sil-stats-functions and then e.g. use the helper scripts to store the collected data into a database and then perform complex queries on it. Many kinds of analysis can be then formulated pretty easily as SQL queries.

Copy link
Contributor

@gottesmm gottesmm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments I forgot to send yesterday. I am going back through it again though.

@@ -214,6 +214,9 @@ class SILPassManager {
// Sets the name of the current optimization stage used for debugging.
void setStageName(llvm::StringRef NextStage = "");

// Get the name of the current optimization stage used for debugging.
std::string getStageName() const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a std::string? Why not return a StringRef?

@@ -338,8 +341,8 @@ void SILPassManager::runPassOnFunction(SILFunctionTransform *SFT,
assert(analysesUnlocked() && "Expected all analyses to be unlocked!");
Mod->removeDeleteNotificationHandler(SFT);

auto Delta = (std::chrono::system_clock::now() - StartTime).count();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this conditional on running with the given option enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you afraid of std::chrono::system_clock::now() being an expensive call?

BTW, the matching start time computation, which is a couple of lines above, is always performed unconditionally:

llvm::sys::TimePoint<> StartTime = std::chrono::system_clock::now();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

Copy link
Contributor

@gottesmm gottesmm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of small comments. A bigger ask though:

Can you add a lit *.sil that shows all of this working. This will ensure that this doesn't break. I mean something that shows the optimizer producing the appropriate counters.

@@ -0,0 +1,258 @@
## Collecting and analyzing optimizer counters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a ToC?

@@ -0,0 +1,258 @@
## Collecting and analyzing optimizer counters

This document describes how you collect and analyze the counters produced by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be made clearer. It almost sounds like the counters are always produced by the compiler. Also, you sort of just go straight into talking about optimizer counters, without saying "what an optimizer counter is".

CounterValue, Duration, Symbol`

where:
* `Kind` is one of `function`, `module`, `function_history`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you messed up the formatting here.

@@ -214,6 +214,9 @@ class SILPassManager {
// Sets the name of the current optimization stage used for debugging.
void setStageName(llvm::StringRef NextStage = "");

// Get the name of the current optimization stage used for debugging.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a doxygen comment here: I.e.:

/// Get the name of the current optimization stage.
///
/// This is useful for debugging. Please keep this out of line so that lldb is able to always find it.

@@ -338,8 +341,8 @@ void SILPassManager::runPassOnFunction(SILFunctionTransform *SFT,
assert(analysesUnlocked() && "Expected all analyses to be unlocked!");
Mod->removeDeleteNotificationHandler(SFT);

auto Delta = (std::chrono::system_clock::now() - StartTime).count();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

if (!SILStatsOutputFile.empty()) {
// Try to open the file.
std::error_code EC;
llvm::raw_fd_ostream *fd_stream = new llvm::raw_fd_ostream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::unique_ptr + move?

// Does the name contain a user-defined substring?
if (!StatsOnlyFunctionsNamePattern.empty()) {
StringRef Pattern(StatsOnlyFunctionsNamePattern);
return FuncName.find(Pattern) != StringRef::npos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StringRef::contains?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, isn't StatsOnlyFunctionNamesPattern a String as well?


/// Compute the delta between the old and new values.
/// Return it as a percent value.
double computeDelta(int Old, int New) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is inside an anonymous namespace anyways. So, it cannot be referenced from outside.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

SILInstructionKind Kind = SILInstructionKind(i);
if (!StatsOnlyInstructionsOptLoc.shouldComputeInstCount(Kind))
continue;
std::string CounterName = "inst_";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llvm::SmallString<64>? (I forgot if 32 or 64 is the limit), I doubt any name of an instruction is more than 20 characters).

}
}

NewModStat.addMemoryStat();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not duplicate this into the if statement and return early. The else statement is pretty long here and you could then get rid of the indentation. I don't have a strong opinion on this though.

@gottesmm
Copy link
Contributor

gottesmm commented Sep 8, 2017 via email

The database uses a very simple self-explaining schema:

```sql
CREATE TABLE Counters(Id INTEGER PRIMARY KEY AUTOINCREMENT, Stage TEXT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add indentation here to make this more readable?

@swiftix
Copy link
Contributor Author

swiftix commented Sep 8, 2017

@swift-ci please smoke test

3 similar comments
@swiftix
Copy link
Contributor Author

swiftix commented Sep 8, 2017

@swift-ci please smoke test

@swiftix
Copy link
Contributor Author

swiftix commented Sep 8, 2017

@swift-ci please smoke test

@swiftix
Copy link
Contributor Author

swiftix commented Sep 8, 2017

@swift-ci please smoke test

@swiftix
Copy link
Contributor Author

swiftix commented Sep 9, 2017

@swift-ci please test

1 similar comment
@swiftix
Copy link
Contributor Author

swiftix commented Sep 9, 2017

@swift-ci please test

@swift-ci
Copy link
Contributor

swift-ci commented Sep 9, 2017

Build failed
Swift Test Linux Platform
Git Sha - 6b9929096d6f00f43330a4a2e8a12323d305dc10

@swiftix
Copy link
Contributor Author

swiftix commented Sep 9, 2017

@swift-ci please test

@swift-ci
Copy link
Contributor

swift-ci commented Sep 9, 2017

Build failed
Swift Test Linux Platform
Git Sha - 2f559451ff30af0caed6e5bb25388921c41b1af6

@swiftix
Copy link
Contributor Author

swiftix commented Sep 9, 2017

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - c3717e28dcaa1df38b33d7548c5c402583e4d099

@swiftix
Copy link
Contributor Author

swiftix commented Sep 10, 2017

@swift-ci please smoke test Linux

1 similar comment
@swiftix
Copy link
Contributor Author

swiftix commented Sep 10, 2017

@swift-ci please smoke test Linux

@swiftix
Copy link
Contributor Author

swiftix commented Sep 10, 2017

@swift-ci please smoke test

1 similar comment
@swiftix
Copy link
Contributor Author

swiftix commented Sep 10, 2017

@swift-ci please smoke test

@swiftix
Copy link
Contributor Author

swiftix commented Sep 10, 2017

Linux lib dispatch errors seem to be unrelated:

linux-x86_64/src/.libs/libdispatch.so: undefined symbol: _T0s19_emptyStringStorages6UInt32Vvp
FAIL dispatch_plusplus (exit status: 127)

Does anyone has an idea what it could be? Is it a known issue?

@swiftix
Copy link
Contributor Author

swiftix commented Sep 10, 2017

@swift-ci please smoke test Linux

This patch implements collection and dumping of statistics about SILModules, SILFunctions and memory consumption during the execution of SIL optimization pipelines.

The following statistics can be collected:
  *  For SILFunctions: the number of SIL basic blocks, the number of SIL instructions, the number of SIL instructions of a specific kind, duration of a pass
  *  For SILModules: the number of SIL basic blocks, the number of SIL instructions, the number of SIL instructions of a specific kind, the number of SILFunctions, the amount of memory used by the compiler, duration of a pass

By default, any collection of statistics is disabled to avoid affecting compile times.

One can enable the collection of statistics and dumping of these statistics for the whole SILModule and/or for SILFunctions.

To reduce the amount of produced data, one can set thresholds in such a way that changes in the statistics are only reported if the delta between the old and the new values are at least X%. The deltas are computed as using the following formula:

   Delta = (NewValue - OldValue) / OldValue

Thresholds provide a simple way to perform a simple filtering of the collected statistics during the compilation. But if there is a need for a more complex analysis of collected data (e.g. aggregation by a pipeline stage or by the type of a transformation), it is often better to dump as much data as possible into a file using e.g. -sil-stats-dump-all -sil-stats-modules -sil-stats-functions and then e.g. use the helper scripts to store the collected data into a database and then perform complex queries on it. Many kinds of analysis can be then formulated pretty easily as SQL queries.
…SILValue/SILInstruction and its mnemonic name

Use it for the stats collection, but also for SIL printing.
It described the overall idea behind the optimizer counters, how to collect and record them and how to analyze them
@swiftix
Copy link
Contributor Author

swiftix commented Sep 11, 2017

@swift-ci please smoke test Linux

1 similar comment
@swiftix
Copy link
Contributor Author

swiftix commented Sep 11, 2017

@swift-ci please smoke test Linux

@swiftix
Copy link
Contributor Author

swiftix commented Sep 11, 2017

@swift-ci please smoke test

@swiftix swiftix merged commit 12841a9 into swiftlang:master Sep 11, 2017
@eeckstein
Copy link
Contributor

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants