
Commit fec2896

shunting314 authored and facebook-github-bot committed
metric table (#109245)
Summary:
In dynamo/inductor, it sometimes helps to gather metrics/statistics for each model at different levels: model level, graph level, kernel level, or the level of a pair of fusion nodes. This would be very easy to do with Scuba, but we only have Scuba in fbcode. This PR builds metric tables to solve part of the problem.

Q: Why not log to stdout/stderr directly?
A: Sometimes we need more structured data. E.g., it would be helpful to gather all the stats in a CSV and then do post-processing (like calculating a geomean). The metric table also tags each row with the model name, which is helpful.

Q: What's the difference from speedup_inductor.csv?
A: speedup_inductor.csv is a special case that gathers statistics at the model level, i.e., it has one row per model. Recording statistics at a finer-grained level, such as per graph, is also helpful.

Example use cases:
- As a follow-up to the benchmark fusion PR, I want to gather all the 'slow' fusions and analyze them. With the metric table, I can easily log the slow fusions for each model into a CSV file. Here is the log gathered for huggingface: https://gist.github.com/shunting314/964e73cc98368b301414ec7b7ad4c702 .
- To help understand the effect of the 'loop ordering after fusion' PR, it would be helpful to gather stats like how many fusions happen for each graph. Previously we logged this metric to stderr directly, but logging these metrics in a structured way is useful.
- Gather the number of registers, register spills, and shared memory usage for each kernel in each model, with runnable kernel code logged.

X-link: pytorch/pytorch#109245
Approved by: https://github.com/jansel, https://github.com/mlazos
Reviewed By: ZainRizvi
Differential Revision: D50873155
Pulled By: shunting314
fbshipit-source-id: f9bf01564e5d7f079cd54cd062b62acda10a67d6
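To make the idea in the summary concrete, here is a minimal sketch of a metric table: a CSV file with a fixed set of columns in which every row is tagged with the model name, followed by a geomean computed in post-processing. This is an illustration only, not the code added to torch._inductor.metrics. The class `MetricTable` below, its methods, the file naming scheme, and the model and kernel names are all hypothetical.

```python
import csv
import math
import os
import tempfile


class MetricTable:
    """Hypothetical stand-in for a metric table; not the torch._inductor API."""

    def __init__(self, name, columns, out_dir):
        self.name = name
        self.columns = list(columns)
        # Assumed naming scheme: one CSV file per table.
        self.path = os.path.join(out_dir, f"metric_table_{name}.csv")

    def purge(self):
        # Start from a clean file that contains only the header row,
        # so rows from an earlier run do not leak into this one.
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerow(["model_name"] + self.columns)

    def add_row(self, model_name, row):
        # Reject rows whose keys do not match the declared columns.
        if set(row) != set(self.columns):
            raise ValueError(f"expected columns {self.columns}, got {sorted(row)}")
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([model_name] + [row[c] for c in self.columns])

    def read_rows(self):
        with open(self.path, newline="") as f:
            return list(csv.DictReader(f))


def geomean(values):
    # Geometric mean via logs; all values must be positive.
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def demo():
    with tempfile.TemporaryDirectory() as out_dir:
        table = MetricTable("slow_fusion", ["kernel", "speedup"], out_dir)
        table.purge()
        # Made-up rows, for illustration only.
        table.add_row("BertForMaskedLM", {"kernel": "fused_add_mul", "speedup": 0.5})
        table.add_row("BertForMaskedLM", {"kernel": "fused_sum", "speedup": 2.0})
        rows = table.read_rows()
    return rows, geomean(float(r["speedup"]) for r in rows)


rows, gm = demo()
print(rows[0]["model_name"], round(gm, 6))
```

Because each row carries the model name, rows from many models can live in one file and be grouped or filtered later, which is hard to do reliably with free-form stderr output.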
1 parent b7d90f7 commit fec2896

1 file changed: +2 -1 lines changed


userbenchmark/dynamo/dynamobench/common.py

Lines changed: 2 additions & 1 deletion
@@ -68,7 +68,7 @@
 except ImportError:
     from _dynamo.utils import clone_inputs, graph_break_reasons
 from torch._functorch.aot_autograd import set_model_name
-from torch._inductor import config as inductor_config
+from torch._inductor import config as inductor_config, metrics
 from torch._subclasses.fake_tensor import FakeTensorMode
 
 from torch.utils import _pytree as pytree
@@ -3882,6 +3882,7 @@ def detect_and_mark_batch(t):
                 ],
             )
     else:
+        metrics.purge_old_log_files()
         if output_filename and os.path.exists(output_filename):
             os.unlink(output_filename)
         if original_dir:
