[CI] Tune nightly benchmarking job for better reliability #17122
Changes from all commits: 8f45038, 0a8b0d2, e672cda, 684f94e, 57e530e, aaa19f0, bdf68d3, b495154, 384c4d6, dd3f861, 0ee8a9a, 5cab659, e556368, 84eda44, 565bfbd, 247ed16, 04d5220, 7637810, 98a9b3d, 13f86ec, fedb018, 26180cd, 302d6b0, fdfbb9a, 08fb37a
```diff
@@ -247,7 +247,7 @@ jobs:
       sycl_cts_artifact: sycl_cts_bin

   aggregate_benchmark_results:
-    if: always() && !cancelled()
+    if: github.repository == 'intel/llvm' && !cancelled()
     name: Aggregate benchmark results and produce historical averages
     uses: ./.github/workflows/sycl-benchmark-aggregate.yml
     secrets:
```
```diff
@@ -262,13 +262,8 @@ jobs:
       fail-fast: false
       matrix:
         include:
-          - name: Run compute-benchmarks on L0 Gen12
-            runner: '["Linux", "gen12"]'
-            image_options: -u 1001 --device=/dev/dri -v /dev/dri/by-path:/dev/dri/by-path --privileged --cap-add SYS_ADMIN
-            target_devices: level_zero:gpu
-            reset_intel_gpu: true
           - name: Run compute-benchmarks on L0 PVC
-            runner: '["Linux", "pvc"]'
+            runner: '["PVC_PERF"]'
             image_options: -u 1001 --device=/dev/dri -v /dev/dri/by-path:/dev/dri/by-path --privileged --cap-add SYS_ADMIN
             target_devices: level_zero:gpu
             reset_intel_gpu: true
```

Review thread on the `runner: '["PVC_PERF"]'` change:

- IMO we should include the OS here.
- We'll need to add a "Linux" tag to the UR runner then, although it's worth noting that the UR folks would prefer not to have other jobs run on their runners, so we'll need to make sure that whatever tags we add to that runner do not result in other workflows picking it up. For now, I'll add "Linux" somewhere in the name.
- Oh wow, the runner doesn't have the Linux tag automatically; that's weird.
- I think that was intentional, though. I would check with @pbalcer.
- @lukaszstolarczuk was the one who set it up ;-) I'm not an expert on runners. But yeah, the original idea behind this system was that it'd have just one runner script instance, used exclusively for the benchmarks. But right now this system isn't very busy, so...
- I explicitly used […]. On a general note - what Piotr said - we wanted this runner to be used mostly for performance, as it's also used for measuring UMF perf, and too much traffic may influence the results.
- +1. Yeah, I also wanted to reserve this runner exclusively for performance benchmarking. @lukaszstolarczuk, there's only one GHA process running on this runner, right? If so, we can be sure that only one benchmarking job will be executing at a given time.
- At the moment it is not one runner. We want to move UMF into the intel org - only then can we have a single, shared runner. For now we have two runners (one for SYCL, one for UMF), each bound to a different NUMA node.

ianayl marked this conversation as resolved.
```diff
@@ -46,6 +46,27 @@ runs:
         echo "# This workflow is not guaranteed to work with other backends."
         echo "#" ;;
       esac
+    - name: Compute CPU core range to run benchmarks on
+      shell: bash
+      run: |
+        # Taken from ur-benchmark-reusable.yml:
+
+        # Compute the core range for the first NUMA node; second node is used by
+        # UMF. Skip the first 4 cores as the kernel is likely to schedule more
+        # work on these.
+        CORES="$(lscpu | awk '
+          /NUMA node0 CPU|On-line CPU/ {line=$0}
+          END {
+            split(line, a, " ")
+            split(a[4], b, ",")
+            sub(/^0/, "4", b[1])
+            print b[1]
+          }')"
+        echo "CPU core range to use: $CORES"
+        echo "CORES=$CORES" >> $GITHUB_ENV
+
+        ZE_AFFINITY_MASK=0
+        echo "ZE_AFFINITY_MASK=$ZE_AFFINITY_MASK" >> $GITHUB_ENV
     - name: Run compute-benchmarks
       shell: bash
       run: |
```

Review comments on this step:

- Should we make it a shared action in […]?
- When we combine both CIs, there should only be one invocation of this, but that's when the dust all settles.
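The inlined awk program is fairly dense. The sketch below runs the same program against a hypothetical `lscpu` sample (the NUMA layout shown is made up for illustration; real output varies by machine) to show how the core range is derived:

```shell
#!/bin/sh
# Hypothetical lscpu output for a two-node machine; real output varies by host.
lscpu_sample='NUMA node0 CPU(s):   0-55,112-167
NUMA node1 CPU(s):   56-111,168-223'

# Same awk program as in the action: remember the last line matching either
# pattern, take its 4th whitespace-separated field, keep only the first
# comma-separated range, and rewrite a leading "0" to "4" so the first four
# cores are skipped.
CORES="$(printf '%s\n' "$lscpu_sample" | awk '
  /NUMA node0 CPU|On-line CPU/ {line=$0}
  END {
    split(line, a, " ")
    split(a[4], b, ",")
    sub(/^0/, "4", b[1])
    print b[1]
  }')"
echo "$CORES"   # prints: 4-55
```

Note that `sub(/^0/, "4", ...)` only rewrites a range that starts exactly at core 0; a node0 range starting at any other core would pass through unchanged.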
```diff
@@ -69,7 +90,7 @@ runs:
         echo "-----"
         sycl-ls
         echo "-----"
-        ./devops/scripts/benchmarking/benchmark.sh -n '${{ runner.name }}' -s || exit 1
+        taskset -c "$CORES" ./devops/scripts/benchmarking/benchmark.sh -n '${{ runner.name }}' -s || exit 1
     - name: Push compute-benchmarks results
       if: always()
       shell: bash
```
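`taskset -c "$CORES"` restricts the benchmark script, and every child process it spawns, to the given core list, which is what keeps it off the cores reserved for UMF. A minimal sketch of the mechanism, assuming a Linux host with util-linux's `taskset` installed; the single-core `"0"` is a stand-in for the range computed in the earlier step:

```shell
#!/bin/sh
# Pin a child shell to core 0 (stand-in for the computed $CORES range) and
# read back its effective affinity from /proc. Requires Linux + util-linux.
CORES="0"
taskset -c "$CORES" sh -c 'grep Cpus_allowed_list /proc/self/status'
```

The kernel reports the pinned set via `Cpus_allowed_list`, so a mis-specified range is easy to spot in the job log.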
```diff
@@ -10,9 +10,9 @@
 ; Compute-benchmark compile/run options
 [compute_bench]
 ; Value for -j during compilation of compute-benchmarks
-compile_jobs = 2
+compile_jobs = 40
 ; Number of iterations to run compute-benchmark tests
-iterations = 100
+iterations = 5000

 ; Options for benchmark result metrics (to record/compare against)
 [metrics]
```

Review comments on the iteration count:

- How does this affect the Nightly? I.e., how much overhead?
- The run itself takes about 45 minutes, but since it's on PVC_PERF it should run in parallel with the other E2E tests, so I suspect it shouldn't be much. However, I've started a nightly run to confirm this.

```diff
@@ -23,15 +23,15 @@ recorded = Median,StdDev
 ; the historical average. Metrics not included here are not compared against
 ; when passing/failing benchmark results.
 ; Format: comma-separated list of <metric>:<deviation percentage in decimals>
-tolerances = Median:0.5
+tolerances = Median:0.08

 ; Options for computing historical averages
 [average]
 ; Number of days (from today) to look back for results when computing historical
 ; average
 cutoff_range = 7
 ; Minimum number of samples required to compute a historical average
-min_threshold = 3
+min_threshold = 10

 ; ONEAPI_DEVICE_SELECTOR linting/options
 [device_selector]
```
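The tightened tolerance means a benchmark now fails when its median deviates more than 8% from the historical average, and that average is only computed once at least 10 samples exist within the 7-day window. A hypothetical sketch of that pass/fail comparison (the real check lives in `devops/scripts/benchmarking` and may differ in detail):

```shell
#!/bin/sh
# Hypothetical sketch of the comparison implied by "tolerances = Median:0.08":
# fail when |median - historical_avg| / historical_avg exceeds the tolerance.
check_median() {
  # $1 = current median, $2 = historical average, $3 = allowed deviation (decimal)
  awk -v m="$1" -v avg="$2" -v tol="$3" 'BEGIN {
    dev = (m - avg) / avg
    if (dev < 0) dev = -dev
    exit (dev > tol) ? 1 : 0
  }'
}

check_median 105 100 0.08 && echo "PASS: within 8% of the historical average"
check_median 120 100 0.08 || echo "FAIL: beyond the 8% tolerance"
```

Relative (rather than absolute) deviation keeps one tolerance setting meaningful across benchmarks whose medians differ by orders of magnitude.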