-
Notifications
You must be signed in to change notification settings - Fork 156
Add command for profiling and comparing toolchains #1028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Couple nits really but this looks great, thanks! |
anyhow::bail!("failed to annotate cachegrind output"); | ||
} | ||
|
||
fs::write(path, output.stdout).context("failed to write `cg_annotate` output")?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also typically rustfilt the input to cg_annotate -- seems like a nice thing to add? I'm not sure it's necessary without symbol-mangling-version = v0, but I think that valgrind doesn't yet support that (at least not on all platforms).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been using valgrind-3.17.0 which does have demangling support builtin based on libiberty (I think?). The only limitation is lack of support for symbols renamed during LTO, those with dot suffix, i.e., .llvm.*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ubuntu 20.04 (latest LTS) only has 3.15, so I think we should try to avoid relying on the built-in demangling, I suspect it doesn't work. (Admittedly, I've been using a locally compiled version from git, so who knows...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, we can do this later if it becomes a problem for folks.
Current implementation allows using an arbitrary profiler, but the intended use case is profiling with cachegrind. When cachegrind is selected, the command will automatically compare results with `cg_diff` and post process output with `cg_annotate`.
Note: it would be great for us to update the collector README with these changes so they're not so hidden. At some point, I think we should make a book on profiling and measuring perf of rustc (that can be linked to from the rustc-dev-guide). |
Current implementation allows using an arbitrary profiler, but the
intended use case is profiling with cachegrind. When cachegrind is
selected, the command will automatically compare results with
cg_diff
and post process output with
cg_annotate
.For example to compare two builds on ctfe-stress-4-doc benchmark: