scripts/benchmarks/README.md (5 additions, 4 deletions)
@@ -27,21 +27,22 @@ You can also include additional benchmark parameters, such as environment variables
Once all the required information is entered, click the "Run workflow" button to initiate a new workflow run. This will execute the benchmarks and then post the results as a comment on the specified Pull Request.
-By default, all benchmark runs are compared against `baseline`, which is a well-established set of the latest data.
+It is recommended to compare all benchmark runs against `baseline` by passing `--compare baseline` in the benchmark parameters. `baseline` is a well-established set of the latest data.
You must be a member of the `oneapi-src` organization to access these features.
## Comparing results
By default, the benchmark results are not stored. To store them, use the option `--save <name>`. This will make the results available for comparison during the next benchmark runs.
-To compare a benchmark run with a previously stored result, use the option `--compare <name>`. You can compare with more than one result.
-
-If no `--compare` option is specified, the benchmark run is compared against a previously stored `baseline`.
+You can compare benchmark results using the `--compare` option. The comparison is presented in a markdown output file (see below). To calculate the relative performance of the new results against previously saved data, use `--compare <previously_saved_data>` (e.g. `--compare baseline`). To compare only stored data, without generating new results, use `--dry-run --compare <name1> --compare <name2> --relative-perf <name1>`, where `<name1>` indicates the baseline for the relative performance calculation and `--dry-run` prevents the script from running benchmarks. Listing more than two `--compare` options results in displaying only execution times, without statistical analysis.
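The flag combinations described above can be sketched with a small stand-in parser. This is only an illustration of how such invocations parse, not the project's actual script; the result name `my_run` is hypothetical, and the option declarations are assumed to match those shown later in this diff.

```python
import argparse

# Stand-in parser (not the real benchmark script) with just the options
# discussed above, to show how the documented invocations parse.
parser = argparse.ArgumentParser()
parser.add_argument("--save", type=str)
parser.add_argument("--compare", type=str, action="append")
parser.add_argument("--dry-run", action="store_true", default=False)
parser.add_argument("--relative-perf", type=str)

# Run benchmarks, save them as "my_run" (hypothetical name), and compare
# against the stored "baseline".
args = parser.parse_args(["--save", "my_run", "--compare", "baseline"])
print(args.save, args.compare)               # my_run ['baseline']

# Compare two previously stored results without running new benchmarks;
# "my_run" is the reference for the relative-performance calculation.
args = parser.parse_args([
    "--dry-run",
    "--compare", "my_run",
    "--compare", "baseline",
    "--relative-perf", "my_run",
])
print(args.dry_run, args.compare, args.relative_perf)
# True ['my_run', 'baseline'] my_run
```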
Baseline, as well as baseline-v2 (for the level-zero adapter v2), is updated automatically during a nightly job. The results
are stored [here](https://oneapi-src.github.io/unified-runtime/benchmark_results.html).
+## Output formats
+You can display the results as an HTML file by using `--output-html` and as a markdown file by using `--output-markdown`. Due to character limits for posting PR comments, the final content of the markdown file might be reduced. To obtain the full markdown output, use `--output-markdown full`.
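As a rough sketch of how the size-limited versus full markdown output can be selected, the snippet below uses a throwaway parser with `nargs='?'` and a `const` fallback, mirroring the `--output-markdown` definition from the benchmark script's argument parser shown further below; the concrete `const` and `default` values here are assumptions for illustration only.

```python
import argparse

# Illustrative parser: with nargs='?', a bare `--output-markdown` falls
# back to the `const` value (here a placeholder meaning "size-limited
# output suitable for a PR comment"), while `--output-markdown full`
# requests the complete report.
parser = argparse.ArgumentParser()
parser.add_argument("--output-markdown", nargs="?", const="limited", default=None)

print(parser.parse_args([]).output_markdown)                             # None
print(parser.parse_args(["--output-markdown"]).output_markdown)          # limited
print(parser.parse_args(["--output-markdown", "full"]).output_markdown)  # full
```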
parser.add_argument("--no-rebuild", help='Do not rebuild the benchmarks from scratch.', action="store_true")
parser.add_argument("--env", type=str, help='Use env variable for a benchmark run.', action="append", default=[])
parser.add_argument("--save", type=str, help='Save the results for comparison under a specified name.')
-parser.add_argument("--compare", type=str, help='Compare results against previously saved data.', action="append", default=["baseline"])
+parser.add_argument("--compare", type=str, help='Compare results against previously saved data.', action="append")
parser.add_argument("--iterations", type=int, help='Number of times to run each benchmark to select a median value.', default=options.iterations)
parser.add_argument("--stddev-threshold", type=float, help='If stddev pct is above this threshold, rerun all iterations', default=options.stddev_threshold)
parser.add_argument("--timeout", type=int, help='Timeout for individual benchmarks in seconds.', default=options.timeout)
parser.add_argument("--exit-on-failure", help='Exit on first failure.', action="store_true")
parser.add_argument("--compare-type", type=str, choices=[e.valueforeinCompare], help='Compare results against previously saved data.', default=Compare.LATEST.value)
parser.add_argument("--compare-max", type=int, help='How many results to read for comparisions', default=options.compare_max)
+parser.add_argument("--output-markdown", nargs='?', const=options.output_markdown, help='Specify whether markdown output should fit the content size limit for request validation')
parser.add_argument("--output-html", help='Create HTML output', action="store_true", default=False)
parser.add_argument("--dry-run", help='Do not run any actual benchmarks', action="store_true", default=False)
parser.add_argument("--compute-runtime", nargs='?', const=options.compute_runtime_tag, help="Fetch and build compute runtime")
parser.add_argument("--iterations-stddev", type=int, help="Max number of iterations of the loop calculating stddev after completed benchmark runs", default=options.iterations_stddev)
parser.add_argument("--build-igc", help="Build IGC from source instead of using the OS-installed version", action="store_true", default=options.build_igc)
+parser.add_argument("--relative-perf", type=str, help="The name of the results which should be used as a baseline for metrics calculation", default=options.current_run_name)