New perf. metrics, stability and other improvements #184
Conversation
Huge pile of changes that I only skimmed, but I don't see any obvious issues. Would you like comments or opinions on specific changes, or is there anything that needs to be double-checked?

BTW: I don't think the execution with `shell=True` is a problem here, as the command is hardcoded and can't be changed, but it would still be nice to work around it. Did you try running without `shell=True`?
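For reference, a minimal sketch of dropping `shell=True` by splitting the command string into an argument list (the command below is a hypothetical placeholder, not the actual one used in the benchmarks):

```python
import shlex
import subprocess

# Hypothetical hardcoded command; with shell=True it would be passed as one string.
command = "python -m some_benchmark --config config.json"

# Without shell=True, subprocess expects a list of arguments;
# shlex.split tokenizes the string the way a POSIX shell would.
result = subprocess.run(shlex.split(command), capture_output=True, text=True, check=True)
print(result.stdout)
```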
- Custom loaders for named datasets
- User-provided datasets in compatible format

Kaggle API keys and acceptance of competition rules are required for the following dataset:
I think it requires additional config files placed under specific folders too.
The only available Kaggle dataset is used just twice, and only in the weekly scope, so there is little to no sense in separating it further.
@Alexsandruss It's not about separating it, but about making the instructions cover what's necessary for running cases with this benchmark.
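As a minimal sketch of what such instructions could cover, the official Kaggle client resolves credentials roughly like this (the helper below is hypothetical):

```python
import os
from pathlib import Path

def kaggle_credentials_present() -> bool:
    """Hypothetical helper: check whether Kaggle API credentials are configured.

    The Kaggle client looks for kaggle.json under ~/.kaggle
    (or $KAGGLE_CONFIG_DIR), or for the KAGGLE_USERNAME and
    KAGGLE_KEY environment variables.
    """
    config_dir = Path(os.environ.get("KAGGLE_CONFIG_DIR", Path.home() / ".kaggle"))
    if (config_dir / "kaggle.json").exists():
        return True
    return "KAGGLE_USERNAME" in os.environ and "KAGGLE_KEY" in os.environ
```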
@@ -137,7 +137,7 @@ def split_and_transform_data(bench_case, data, data_description):
     device = get_bench_case_value(bench_case, "algorithm:device", None)
     common_data_format = get_bench_case_value(bench_case, "data:format", "pandas")
     common_data_order = get_bench_case_value(bench_case, "data:order", "F")
-    common_data_dtype = get_bench_case_value(bench_case, "data:dtype", "float64")
+    common_data_dtype = get_bench_case_value(bench_case, "data:dtype", "float32")
I would venture to guess that usage of float32 is much, much less common in both sklearn and sklearnex than float64.
float32 makes it easier to run high-memory cases on GPUs and low-RAM machines.
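A quick illustration of the memory difference (the array shape is chosen arbitrarily):

```python
import numpy as np

# float64 uses 8 bytes per element, float32 uses 4: halving the dtype
# halves the dataset's memory footprint.
x64 = np.zeros((1_000_000, 100), dtype=np.float64)
x32 = x64.astype(np.float32)
print(x64.nbytes / 1e9)  # 0.8 GB
print(x32.nbytes / 1e9)  # 0.4 GB
```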
- `INCLUDE` - Other configuration files whose parameter sets should be included
- `PARAMETERS_SETS` - Benchmark parameters within each named set
- `TEMPLATES` - Different setups combining parameter sets with template-specific parameters
- `SETS` - The parameter sets to include in the template
It could explain what a "set" is.
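For instance, something along these lines (shown as a Python dict purely for illustration; all set names and parameter values are hypothetical):

```python
config = {
    "INCLUDE": ["common_sets.json"],  # hypothetical file with shared parameter sets
    "PARAMETERS_SETS": {
        # a "set" is a named group of benchmark parameters that templates can mix in
        "small data": {"data": {"n_samples": 10_000, "dtype": "float32"}},
        "knn algorithm": {"algorithm": {"estimator": "KNeighborsClassifier"}},
    },
    "TEMPLATES": {
        "knn on small data": {
            # SETS lists the parameter sets merged into this template
            "SETS": ["small data", "knn algorithm"],
            # template-specific parameters override values from the sets
            "algorithm": {"device": "cpu"},
        },
    },
}
```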
|:---------------|:--------------|:--------|:------------|
|<h3>Benchmark workflow parameters</h3>||||
| `bench`:`taskset` | None | | Value for the `-c` argument of the `taskset` utility applied to the benchmark subcommand. |
| `bench`:`vtune_profiling` | None | | Analysis type for the `collect` argument of the Intel(R) VTune* Profiler tool. Linux* OS only. |
The VTune functionality is quite undocumented. It could be expanded on, with examples and screenshots added.
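For example, a hedged sketch of how a `bench`:`taskset` value might wrap the benchmark subcommand (the benchmark command itself is a hypothetical placeholder):

```python
import subprocess

taskset_cpus = "0-3"  # value passed to taskset's -c argument
cmd = ["python", "-m", "sklbench", "--config", "config.json"]  # hypothetical command

# Pin the benchmark subcommand to the given CPUs by prefixing it with taskset.
if taskset_cpus is not None:
    cmd = ["taskset", "-c", taskset_cpus] + cmd
subprocess.run(cmd, check=True)
```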
@@ -10,9 +10,13 @@ Data handling steps:
Existing data sources:
- Synthetic data from sklearn
- OpenML datasets
- Kaggle competition datasets
Unrelated to these changes, but this repository would be a lot easier to use if it could avoid pulling data from Kaggle.
Description

Changes:
- `SKLBENCH_DATA_CACHE` env variable as the first default location for the datasets cache for convenience (`$PWD/data_cache` still works if the env variable is not set); see the resolution sketch after this list
- `cost` metrics counted in `microdollars` (the most readable order of magnitude for usual-case computation time)
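A minimal sketch of the cache-directory resolution described above (the function name is hypothetical):

```python
import os

def resolve_data_cache_dir() -> str:
    # Prefer the SKLBENCH_DATA_CACHE env variable; fall back to $PWD/data_cache.
    return os.environ.get("SKLBENCH_DATA_CACHE", os.path.join(os.getcwd(), "data_cache"))
```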
A PR should start as a draft, then move to the ready-for-review state after CI passes and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking about standard requirements.
You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing