[mlir] Enhance TimingManager Printing Flexibility #85821

Merged
merged 1 commit into from
Apr 3, 2024
102 changes: 78 additions & 24 deletions mlir/docs/PassManagement.md
@@ -1124,17 +1124,44 @@ pipeline. This display mode is available in mlir-opt via
$ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='builtin.module(func.func(cse,canonicalize),convert-func-to-llvm)' -mlir-timing -mlir-timing-display=list

===-------------------------------------------------------------------------===
... Pass execution timing report ...
... Execution time report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0203 seconds

---Wall Time--- --- Name ---
0.0047 ( 55.9%) Canonicalizer
0.0019 ( 22.2%) VerifierPass
0.0016 ( 18.5%) LLVMLoweringPass
0.0003 ( 3.4%) CSE
0.0002 ( 1.9%) (A) DominanceInfo
0.0084 (100.0%) Total
Total Execution Time: 0.0135 seconds

----Wall Time---- ----Name----
0.0135 (100.0%) root
0.0041 ( 30.1%) Parser
0.0018 ( 13.3%) ConvertFuncToLLVMPass
0.0011 ( 8.2%) Output
0.0007 ( 5.2%) Pipeline Collection : ['func.func']
0.0006 ( 4.6%) 'func.func' Pipeline
0.0005 ( 3.5%) Canonicalizer
0.0001 ( 0.9%) CSE
0.0001 ( 0.5%) (A) DataLayoutAnalysis
0.0000 ( 0.1%) (A) DominanceInfo
0.0058 ( 43.2%) Rest
0.0135 (100.0%) Total
```

The results can be displayed in JSON format via `-mlir-output-format=json`.

```shell
$ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='builtin.module(func.func(cse,canonicalize),convert-func-to-llvm)' -mlir-timing -mlir-timing-display=list -mlir-output-format=json

[
{"wall": {"duration": 0.0135, "percentage": 100.0}, "name": "root"},
{"wall": {"duration": 0.0041, "percentage": 30.1}, "name": "Parser"},
{"wall": {"duration": 0.0018, "percentage": 13.3}, "name": "ConvertFuncToLLVMPass"},
{"wall": {"duration": 0.0011, "percentage": 8.2}, "name": "Output"},
{"wall": {"duration": 0.0007, "percentage": 5.2}, "name": "Pipeline Collection : ['func.func']"},
{"wall": {"duration": 0.0006, "percentage": 4.6}, "name": "'func.func' Pipeline"},
{"wall": {"duration": 0.0005, "percentage": 3.5}, "name": "Canonicalizer"},
{"wall": {"duration": 0.0001, "percentage": 0.9}, "name": "CSE"},
{"wall": {"duration": 0.0001, "percentage": 0.5}, "name": "(A) DataLayoutAnalysis"},
{"wall": {"duration": 0.0000, "percentage": 0.1}, "name": "(A) DominanceInfo"},
{"wall": {"duration": 0.0058, "percentage": 43.2}, "name": "Rest"},
{"wall": {"duration": 0.0135, "percentage": 100.0}, "name": "Total"}
]
```
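
As a reader's aside, the shape of the list-mode JSON above can be reproduced with a few lines of standalone C++. This is only a sketch under stated assumptions, not the code added by this PR: `renderListJson` is a hypothetical helper, names are not JSON-escaped, and the number formatting only approximates the sample output.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MLIR): renders (name, wall seconds) pairs
// as a JSON array shaped like the list-mode sample above.
// Assumption: names contain no characters that need JSON escaping.
std::string renderListJson(
    const std::vector<std::pair<std::string, double>> &entries, double total) {
  std::string out = "[\n";
  for (std::size_t i = 0; i < entries.size(); ++i) {
    char buf[96];
    // Percentage is the entry's wall time relative to the total wall time.
    std::snprintf(buf, sizeof(buf),
                  "{\"wall\": {\"duration\": %.4f, \"percentage\": %.1f}",
                  entries[i].second, 100.0 * entries[i].second / total);
    out += buf;
    out += ", \"name\": \"" + entries[i].first + "\"}";
    // Every entry except the last is followed by a comma.
    bool lastEntry = (i + 1 == entries.size());
    out += lastEntry ? "\n" : ",\n";
  }
  out += "]\n";
  return out;
}
```

The `lastEntry` flag mirrors the parameter of the same name on `printListEntry` in `Timing.h`: the caller has to say which entry is final so that no trailing comma is emitted.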

##### Tree Display Mode
@@ -1149,21 +1176,48 @@ invalidated and recomputed. This is the default display mode.
$ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='builtin.module(func.func(cse,canonicalize),convert-func-to-llvm)' -mlir-timing

===-------------------------------------------------------------------------===
... Pass execution timing report ...
... Execution time report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.0249 seconds

---Wall Time--- --- Name ---
0.0058 ( 70.8%) 'func.func' Pipeline
0.0004 ( 4.3%) CSE
0.0002 ( 2.6%) (A) DominanceInfo
0.0004 ( 4.8%) VerifierPass
0.0046 ( 55.4%) Canonicalizer
0.0005 ( 6.2%) VerifierPass
0.0005 ( 5.8%) VerifierPass
0.0014 ( 17.2%) LLVMLoweringPass
0.0005 ( 6.2%) VerifierPass
0.0082 (100.0%) Total
Total Execution Time: 0.0127 seconds

----Wall Time---- ----Name----
0.0038 ( 30.2%) Parser
0.0006 ( 4.8%) 'func.func' Pipeline
0.0001 ( 0.9%) CSE
0.0000 ( 0.1%) (A) DominanceInfo
0.0005 ( 3.7%) Canonicalizer
0.0017 ( 13.7%) ConvertFuncToLLVMPass
0.0001 ( 0.6%) (A) DataLayoutAnalysis
0.0010 ( 8.2%) Output
0.0054 ( 42.5%) Rest
0.0127 (100.0%) Total
```

The results can be displayed in JSON format via `-mlir-output-format=json`.

```shell
$ mlir-opt foo.mlir -mlir-disable-threading -pass-pipeline='builtin.module(func.func(cse,canonicalize),convert-func-to-llvm)' -mlir-timing -mlir-output-format=json

[
{"wall": {"duration": 0.0038, "percentage": 30.2}, "name": "Parser", "passes": [
{}]},
{"wall": {"duration": 0.0006, "percentage": 4.8}, "name": "'func.func' Pipeline", "passes": [
{"wall": {"duration": 0.0001, "percentage": 0.9}, "name": "CSE", "passes": [
{"wall": {"duration": 0.0000, "percentage": 0.1}, "name": "(A) DominanceInfo", "passes": [
{}]},
{}]},
{"wall": {"duration": 0.0005, "percentage": 3.7}, "name": "Canonicalizer", "passes": [
{}]},
{}]},
{"wall": {"duration": 0.0017, "percentage": 13.7}, "name": "ConvertFuncToLLVMPass", "passes": [
{"wall": {"duration": 0.0001, "percentage": 0.6}, "name": "(A) DataLayoutAnalysis", "passes": [
{}]},
{}]},
{"wall": {"duration": 0.0010, "percentage": 8.2}, "name": "Output", "passes": [
{}]},
{"wall": {"duration": 0.0054, "percentage": 42.5}, "name": "Rest"},
{"wall": {"duration": 0.0127, "percentage": 100.0}, "name": "Total"}
]
```
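
The nested output above closes every `"passes"` array with an empty `{}` object. Judging from the sample, each child entry is always followed by a comma, and the `{}` placeholder absorbs that trailing comma so the array stays valid JSON. The sketch below illustrates that pattern in standalone C++. It is an assumption-laden illustration rather than the PR's implementation: `TimerNode`, `wallJson`, `printTree`, and `renderTreeExample` are hypothetical names, and the timings are made up.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical timer node used only for this sketch.
struct TimerNode {
  std::string name;
  double wall;
  std::vector<TimerNode> children;
};

// Format the "wall" object for one entry.
static std::string wallJson(double wall, double total) {
  char buf[96];
  std::snprintf(buf, sizeof(buf),
                "\"wall\": {\"duration\": %.4f, \"percentage\": %.1f}", wall,
                100.0 * wall / total);
  return buf;
}

// Open the node, recurse into its children (each followed by a comma), then
// close the "passes" array with the `{}` placeholder.
static void printTree(std::ostream &os, const TimerNode &node, double total,
                      unsigned indent, bool last) {
  os << std::string(indent, ' ') << '{' << wallJson(node.wall, total)
     << ", \"name\": \"" << node.name << "\", \"passes\": [\n";
  for (const TimerNode &child : node.children)
    printTree(os, child, total, indent + 2, /*last=*/false);
  os << std::string(indent, ' ') << "{}]}" << (last ? "" : ",") << '\n';
}

// Render a two-level tree with made-up timings.
std::string renderTreeExample() {
  TimerNode cse{"CSE", 0.0010, {}};
  TimerNode pipeline{"'func.func' Pipeline", 0.0050, {cse}};
  std::ostringstream os;
  os << "[\n";
  printTree(os, pipeline, /*total=*/0.0100, /*indent=*/0, /*last=*/true);
  os << "]\n";
  return os.str();
}
```

The open and close halves of `printTree` correspond roughly to the `printTreeEntry` and `printTreeEntryEnd` hooks declared in `Timing.h`, which bracket the entries nested beneath each timer.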

##### Multi-threaded Pass Timing
62 changes: 58 additions & 4 deletions mlir/include/mlir/Support/Timing.h
@@ -320,6 +320,53 @@ class TimingScope {
Timer timer;
};

//===----------------------------------------------------------------------===//
// OutputStrategy
//===----------------------------------------------------------------------===//

/// Simple record class to record timing information.
struct TimeRecord {
TimeRecord(double wall = 0.0, double user = 0.0) : wall(wall), user(user) {}

TimeRecord &operator+=(const TimeRecord &other) {
wall += other.wall;
user += other.user;
return *this;
}

TimeRecord &operator-=(const TimeRecord &other) {
wall -= other.wall;
user -= other.user;
return *this;
}

double wall, user;
};

/// Facilities for printing timing reports to various output formats.
///
/// This is an abstract class that serves as the foundation for printing.
/// Users can implement additional output formats by extending this abstract
/// class.
class OutputStrategy {
public:
OutputStrategy(raw_ostream &os) : os(os) {}
virtual ~OutputStrategy() = default;

virtual void printHeader(const TimeRecord &total) = 0;
virtual void printFooter() = 0;
virtual void printTime(const TimeRecord &time, const TimeRecord &total) = 0;
virtual void printListEntry(StringRef name, const TimeRecord &time,
const TimeRecord &total,
bool lastEntry = false) = 0;
virtual void printTreeEntry(unsigned indent, StringRef name,
const TimeRecord &time,
const TimeRecord &total) = 0;
virtual void printTreeEntryEnd(unsigned indent, bool lastEntry = false) = 0;

raw_ostream &os;
};

//===----------------------------------------------------------------------===//
// DefaultTimingManager
//===----------------------------------------------------------------------===//
@@ -351,6 +398,15 @@ class DefaultTimingManager : public TimingManager {
Tree,
};

/// The different output formats for printing the timers.
enum class OutputFormat {
Contributor

This enum seems like a poor basis for extensibility. It seems to me a more powerful API would be to directly expose the `OutputStrategy` abstract class and allow users to inject their own `std::unique_ptr<OutputStrategy>` implementation. We can ship Text and Json as "batteries" for it.

Contributor Author

Thanks for your suggestions.

I kept the enum for the command-line option settings and moved `OutputStrategy` into `Timing.h`. Users can call `void setOutput(std::unique_ptr<OutputStrategy> output);` to install their own `OutputStrategy` implementation.
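
For readers following the thread, here is a rough sketch of what a user-defined strategy could look like. It is written under loud assumptions so that it compiles without MLIR: `std::ostream` and `std::string_view` stand in for `raw_ostream` and `StringRef`, the `TimeRecord` and `OutputStrategy` declarations are trimmed copies of the ones in the diff, and `CsvOutputStrategy` and `renderCsvExample` are hypothetical names that do not appear in this PR.

```cpp
#include <cassert>
#include <iomanip>
#include <memory>
#include <sstream>
#include <string>
#include <string_view>

// Stand-ins so the sketch builds without MLIR. The real header uses
// llvm::raw_ostream and llvm::StringRef here.
struct TimeRecord {
  TimeRecord(double wall = 0.0, double user = 0.0) : wall(wall), user(user) {}
  double wall, user;
};

class OutputStrategy {
public:
  explicit OutputStrategy(std::ostream &os) : os(os) {}
  virtual ~OutputStrategy() = default;

  virtual void printHeader(const TimeRecord &total) = 0;
  virtual void printFooter() = 0;
  virtual void printTime(const TimeRecord &time, const TimeRecord &total) = 0;
  virtual void printListEntry(std::string_view name, const TimeRecord &time,
                              const TimeRecord &total,
                              bool lastEntry = false) = 0;
  virtual void printTreeEntry(unsigned indent, std::string_view name,
                              const TimeRecord &time,
                              const TimeRecord &total) = 0;
  virtual void printTreeEntryEnd(unsigned indent, bool lastEntry = false) = 0;

  std::ostream &os;
};

// Hypothetical strategy (not part of the PR): one CSV row per timer.
class CsvOutputStrategy : public OutputStrategy {
public:
  using OutputStrategy::OutputStrategy;

  void printHeader(const TimeRecord &) override {
    os << "wall_seconds,wall_percent,indent,name\n";
  }
  void printFooter() override { os.flush(); }
  void printTime(const TimeRecord &time, const TimeRecord &total) override {
    double pct = total.wall > 0.0 ? 100.0 * time.wall / total.wall : 0.0;
    os << std::fixed << std::setprecision(4) << time.wall << ','
       << std::setprecision(1) << pct;
  }
  void printListEntry(std::string_view name, const TimeRecord &time,
                      const TimeRecord &total, bool) override {
    printTime(time, total);
    os << ",0," << name << '\n';
  }
  void printTreeEntry(unsigned indent, std::string_view name,
                      const TimeRecord &time,
                      const TimeRecord &total) override {
    printTime(time, total);
    os << ',' << indent << ',' << name << '\n';
  }
  // CSV rows need no closing marker, so the end hook does nothing.
  void printTreeEntryEnd(unsigned, bool) override {}
};

// Drive the strategy by hand with made-up timings and capture the text.
std::string renderCsvExample() {
  std::ostringstream buffer;
  std::unique_ptr<OutputStrategy> out =
      std::make_unique<CsvOutputStrategy>(buffer);
  TimeRecord total(0.0200);
  out->printHeader(total);
  out->printListEntry("Parser", TimeRecord(0.0050), total);
  out->printListEntry("Total", total, total, /*lastEntry=*/true);
  out->printFooter();
  return buffer.str();
}
```

Against the real header, the equivalent class would presumably take a `raw_ostream &` such as `llvm::errs()` and be handed to the timing manager through the `setOutput(std::unique_ptr<OutputStrategy>)` overload quoted above.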

Contributor Author

Ping.

/// In this format the results are displayed in text format.
Text,

/// In this format the results are displayed in JSON format.
Json,
};

DefaultTimingManager();
DefaultTimingManager(DefaultTimingManager &&rhs);
~DefaultTimingManager() override;
@@ -372,10 +428,7 @@ class DefaultTimingManager : public TimingManager {
DisplayMode getDisplayMode() const;

/// Change the stream where the output will be printed to.
void setOutput(raw_ostream &os);

/// Return the current output stream where the output will be printed to.
raw_ostream &getOutput() const;
void setOutput(std::unique_ptr<OutputStrategy> output);

/// Print and clear the timing results. Only call this when there are no more
/// references to nested timers around, as printing post-processes and clears
@@ -408,6 +461,7 @@ class DefaultTimingManager : public TimingManager {

private:
const std::unique_ptr<detail::DefaultTimingManagerImpl> impl;
std::unique_ptr<OutputStrategy> out;
};

/// Register a set of useful command-line options that can be used to configure