-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lit] add --max-retries-per-test execution option #141851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this should be an openmp specific option. Lit should provide a generic option to set the (default) number of retry attempts, which is available for all test suites.
Fair enough. Let me make that change then. |
122ecfe
to
5a41841
Compare
✅ With the latest revision this PR passed the Python code formatter. |
5a41841
to
02bb64f
Compare
02bb64f
to
4590902
Compare
@llvm/pr-subscribers-testing-tools Author: Konrad Kleine (kwk) ChangesWhen packaging LLVM we've seen arbitrary tests fail. The tests themselves were always different and we didn't This change allows us to set Please note, that this only adds the possibility to re-run Also note, that one can still use Any option in the list below overrules its predecessor:
Downstream PR to make use of the Full diff: https://github.com/llvm/llvm-project/pull/141851.diff 13 Files Affected:
diff --git a/llvm/utils/lit/lit/LitConfig.py b/llvm/utils/lit/lit/LitConfig.py
index 5dc712ae28370..cb4aef6f72a87 100644
--- a/llvm/utils/lit/lit/LitConfig.py
+++ b/llvm/utils/lit/lit/LitConfig.py
@@ -35,6 +35,7 @@ def __init__(
params,
config_prefix=None,
maxIndividualTestTime=0,
+ maxRetriesPerTest=None,
parallelism_groups={},
per_test_coverage=False,
gtest_sharding=True,
@@ -86,6 +87,7 @@ def __init__(
self.valgrindArgs.extend(self.valgrindUserArgs)
self.maxIndividualTestTime = maxIndividualTestTime
+ self.maxRetriesPerTest = maxRetriesPerTest
self.parallelism_groups = parallelism_groups
self.per_test_coverage = per_test_coverage
self.gtest_sharding = bool(gtest_sharding)
diff --git a/llvm/utils/lit/lit/TestingConfig.py b/llvm/utils/lit/lit/TestingConfig.py
index c063851b89526..1f6ad497aa97e 100644
--- a/llvm/utils/lit/lit/TestingConfig.py
+++ b/llvm/utils/lit/lit/TestingConfig.py
@@ -235,6 +235,8 @@ def finish(self, litConfig):
# files. Should we distinguish them?
self.test_source_root = str(self.test_source_root)
self.excludes = set(self.excludes)
+ if litConfig.maxRetriesPerTest is not None and getattr(self, 'test_retry_attempts', None) is None:
+ self.test_retry_attempts = litConfig.maxRetriesPerTest
@property
def root(self):
diff --git a/llvm/utils/lit/lit/cl_arguments.py b/llvm/utils/lit/lit/cl_arguments.py
index 1d776e0216a1e..30160a7bd5622 100644
--- a/llvm/utils/lit/lit/cl_arguments.py
+++ b/llvm/utils/lit/lit/cl_arguments.py
@@ -199,6 +199,13 @@ def parse_args():
"0 means no time limit. [Default: 0]",
type=_non_negative_int,
)
+ execution_group.add_argument(
+ "--max-retries-per-test",
+ dest="maxRetriesPerTest",
+ metavar="N",
+ help="Maximum number of allowed retry attempts per test (NOTE: ALLOWED_RETRIES keyword always takes precedence)",
+ type=_positive_int,
+ )
execution_group.add_argument(
"--max-failures",
help="Stop execution after the given number of failures.",
diff --git a/llvm/utils/lit/lit/main.py b/llvm/utils/lit/lit/main.py
index ba80330d22400..0939838b78ceb 100755
--- a/llvm/utils/lit/lit/main.py
+++ b/llvm/utils/lit/lit/main.py
@@ -42,6 +42,7 @@ def main(builtin_params={}):
config_prefix=opts.configPrefix,
per_test_coverage=opts.per_test_coverage,
gtest_sharding=opts.gtest_sharding,
+ maxRetriesPerTest=opts.maxRetriesPerTest,
)
discovered_tests = lit.discovery.find_tests_for_inputs(
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..2f16e95dbe196
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/lit.cfg
@@ -0,0 +1,10 @@
+import lit.formats
+
+config.name = "allow-retries-no-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..f2333a7de455a
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
@@ -0,0 +1,23 @@
+# ALLOW_RETRIES: 3
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+ with open(counter_file, "w") as counter:
+ counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+ num = int(counter.read())
+ if num == 4:
+ sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+ counter.write(str(num + 1))
+ sys.exit(1)
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..97e4edb2dfded
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
@@ -0,0 +1,12 @@
+import lit.formats
+
+config.name = "allow-retries-no-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
+
+config.test_retry_attempts = 2
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..f2333a7de455a
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
@@ -0,0 +1,23 @@
+# ALLOW_RETRIES: 3
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+ with open(counter_file, "w") as counter:
+ counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+ num = int(counter.read())
+ if num == 4:
+ sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+ counter.write(str(num + 1))
+ sys.exit(1)
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..f851ba08002a0
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/lit.cfg
@@ -0,0 +1,10 @@
+import lit.formats
+
+config.name = "no-allow-retries-no-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..a139976cc49ec
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py
@@ -0,0 +1,22 @@
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+ with open(counter_file, "w") as counter:
+ counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+ num = int(counter.read())
+ if num == 4:
+ sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+ counter.write(str(num + 1))
+ sys.exit(1)
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
new file mode 100644
index 0000000000000..2279d293849a8
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/lit.cfg
@@ -0,0 +1,12 @@
+import lit.formats
+
+config.name = "no-allow-retries-test_retry_attempts"
+config.suffixes = [".py"]
+config.test_format = lit.formats.ShTest()
+config.test_source_root = None
+config.test_exec_root = None
+
+config.substitutions.append(("%python", lit_config.params.get("python", "")))
+config.substitutions.append(("%counter", lit_config.params.get("counter", "")))
+
+config.test_retry_attempts = 3
\ No newline at end of file
diff --git a/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py
new file mode 100644
index 0000000000000..a139976cc49ec
--- /dev/null
+++ b/llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py
@@ -0,0 +1,22 @@
+# RUN: "%python" "%s" "%counter"
+
+import sys
+import os
+
+counter_file = sys.argv[1]
+
+# The first time the test is run, initialize the counter to 1.
+if not os.path.exists(counter_file):
+ with open(counter_file, "w") as counter:
+ counter.write("1")
+
+# Succeed if this is the fourth time we're being run.
+with open(counter_file, "r") as counter:
+ num = int(counter.read())
+ if num == 4:
+ sys.exit(0)
+
+# Otherwise, increment the counter and fail
+with open(counter_file, "w") as counter:
+ counter.write(str(num + 1))
+ sys.exit(1)
diff --git a/llvm/utils/lit/tests/allow-retries.py b/llvm/utils/lit/tests/allow-retries.py
index 45610fb70d348..ba95d34a09a4c 100644
--- a/llvm/utils/lit/tests/allow-retries.py
+++ b/llvm/utils/lit/tests/allow-retries.py
@@ -70,3 +70,51 @@
# CHECK-TEST7: # executed command: export LLVM_PROFILE_FILE=
# CHECK-TEST7-NOT: # executed command: export LLVM_PROFILE_FILE=
# CHECK-TEST7: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+# * The "--max-retries-per-test" is specified high enough (3).
+# * No ALLOW_RETRIES keyword is used in the test script.
+# * No config.test_retry_attempts is adjusted in the test suite config file.
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-no-test_retry_attempts/test.py \
+# RUN: --max-retries-per-test=3 \
+# RUN: -Dcounter=%t.counter \
+# RUN: -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST8 %s
+# CHECK-TEST8: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+# * The "--max-retries-per-test" is specified too low (2).
+# * ALLOW_RETRIES is specified high enough (3)
+# * No config.test_retry_attempts is adjusted in the test suite config file.
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py \
+# RUN: --max-retries-per-test=2 \
+# RUN: -Dcounter=%t.counter \
+# RUN: -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST9 %s
+# CHECK-TEST9: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+# * The "--max-retries-per-test" is specified too low (2).
+# * No ALLOW_RETRIES keyword is used in the test script.
+# * config.test_retry_attempts is set high enough (3).
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py \
+# RUN: --max-retries-per-test=2 \
+# RUN: -Dcounter=%t.counter \
+# RUN: -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST10 %s
+# CHECK-TEST10: Passed With Retry: 1
+
+# This test only passes on the 4th try. Here we check that a test can be re-run when:
+# * The "--max-retries-per-test" is specified too low (1).
+# * ALLOW_RETRIES keyword set high enough (3).
+# * config.test_retry_attempts is set too low enough (2).
+# RUN: rm -f %t.counter
+# RUN: %{lit} %{inputs}/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py \
+# RUN: --max-retries-per-test=1 \
+# RUN: -Dcounter=%t.counter \
+# RUN: -Dpython=%{python} \
+# RUN: | FileCheck --check-prefix=CHECK-TEST11 %s
+# CHECK-TEST11: Passed With Retry: 1
|
When packaging LLVM we've seen arbitrary tests fail. It happened sporadically and most of the times the test do work if they are run a second time on the next day. The tests themselves were always different and we didn't know ahead of time which ones we wanted to re-run. That's we filter-out a lot of `libomp` and `libarcher` tests [1]. This change allows us to set `LIT_OPTS="--max-retries-per-test=12"` when running any "check-XXX" build target. Then any lit test will at most be re-run 12 times, unless there's an `ALLOW_RETRIES:` in one of the test scripts that's specifying a different value than `12`. `12` is just an example here, any positive integer will work. Please note, that this only adds the possibility to re-run lit tests. It does not actually do it until the caller specifies `--max-retries-per-test=<POSITIVE_INT>` either on a call to `lit` or in `LIT_OPTS`. Also note, that one can still use `ALLOW_RETRIES:` in test scripts and it will always rule over `--max-retries-per-test`. When `--max-retries-per-test` is set too low, but the `config.test_retry_attempts` is high enough, it works as well. Any option in the list below overrules its predecessor: * `--max-retries-per-test` * `config.test_retry_attempts` * `ALLOW_RETRIES` keyword [1]: https://src.fedoraproject.org/rpms/llvm/blob/rawhide/f/llvm.spec#_2326 Downstream PR to make use of the `--max-retries-per-test` option: https://src.fedoraproject.org/rpms/llvm/pull-request/434 Downstream ticket: https://issues.redhat.com/browse/LLVM-145
4590902
to
1e07072
Compare
llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-no-test_retry_attempts/test.py
Show resolved
Hide resolved
llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/lit.cfg
Outdated
Show resolved
Hide resolved
llvm/utils/lit/tests/Inputs/max-retries-per-test/allow-retries-test_retry_attempts/test.py
Show resolved
Hide resolved
llvm/utils/lit/tests/Inputs/max-retries-per-test/no-allow-retries-test_retry_attempts/test.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also document the option in https://github.com/llvm/llvm-project/blob/main/llvm/docs/CommandGuide/lit.rst?
Done in 009ce5e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for what it's worth.
This is a fixup for llvm#141851
This is a fixup for llvm#141851 and consistently applies the "=" to all options with additional arguments. Before 14 out of 22 options with arguments used "=" and 7 didn't.
This introduces `--show-attempts-count` output option to `lit`. It shows the number of attempts needed for a test to pass. It also shows the maximum number of attempts that were allowed for this test. This option is useful when you want to understand how many attempts took for a flaky test (``FLAKYPASS``) to pass. This is how the test output looks like *without* `--show-attempts-count`: ``` PASS: your-test-suite :: your-first-test.py (1 of 3) FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) ``` This is the output *with* `--show-attempts-count`: ``` PASS: your-test-suite :: your-first-test.py (1 of 2) FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) [attempts=3,max_allowed_attempts=4] ``` In this case `your-second-test.py` was executed three times and it succeeded after the third time (`attempts=3`). Technically another run would have been possible (`max_allowed_attempts=4`). We will only append the extra information when a test was allowed more than one attempt to succeed (i.e. see :option:`--max-retries-per-test`). NOTE: Additionally this is a fixup for llvm#141851 where the tests were not quite right. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.
This is a fixup for llvm#141851 and removes `=` from all options with additional arguments. Before 14 out of 22 options with arguments used "=" and 7 didn't.
This introduces `--show-attempts-count` output option to `lit`. It shows the number of attempts needed for a test to pass. It also shows the maximum number of attempts that were allowed for this test. This option is useful when you want to understand how many attempts took for a flaky test (``FLAKYPASS``) to pass. This is how the test output looks like *without* `--show-attempts-count`: ``` PASS: your-test-suite :: your-first-test.py (1 of 3) FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) ``` This is the output *with* `--show-attempts-count`: ``` PASS: your-test-suite :: your-first-test.py (1 of 2) FLAKYPASS: your-test-suite :: your-second-test.py (2 of 2) [attempts=3,max_allowed_attempts=4] ``` In this case `your-second-test.py` was executed three times and it succeeded after the third time (`attempts=3`). Technically another run would have been possible (`max_allowed_attempts=4`). We will only append the extra information when a test was allowed more than one attempt to succeed (i.e. see :option:`--max-retries-per-test`). NOTE: Additionally this is a fixup for llvm#141851 where the tests were not quite right. `max-retries-per-test/allow-retries-test_retry_attempts/test.py` was added but never used there. Now we're calling it. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.
This is a fixup for llvm#141851 and removes `=` from all options with additional arguments. Before 14 out of 22 options with arguments used "=" and 7 didn't.
This is a fixup for #141851 and removes `=` from all options with additional arguments. Before 14 out of 22 options with arguments used "=" and 7 didn't.
If a test took more than one attempt to complete, show the number of attempts and the maximum allowed attempts as `2 of 4 attempts` inside the `<progress info>` (see [TEST RUN OUTPUT FORMAT](https://llvm.org/docs/CommandGuide/lit.html#test-run-output-format)). NOTE: Additionally this is a fixup for #141851 where the tests were not quite right. `max-retries-per-test/allow-retries-test_retry_attempts/test.py` was added but never used there. Now we're calling it. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.
This is a fixup for llvm#141851 and removes `=` from all options with additional arguments. Before 14 out of 22 options with arguments used "=" and 7 didn't.
If a test took more than one attempt to complete, show the number of attempts and the maximum allowed attempts as `2 of 4 attempts` inside the `<progress info>` (see [TEST RUN OUTPUT FORMAT](https://llvm.org/docs/CommandGuide/lit.html#test-run-output-format)). NOTE: Additionally this is a fixup for llvm#141851 where the tests were not quite right. `max-retries-per-test/allow-retries-test_retry_attempts/test.py` was added but never used there. Now we're calling it. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.
This is a fixup for llvm#141851 and removes `=` from all options with additional arguments. Before 14 out of 22 options with arguments used "=" and 7 didn't.
If a test took more than one attempt to complete, show the number of attempts and the maximum allowed attempts as `2 of 4 attempts` inside the `<progress info>` (see [TEST RUN OUTPUT FORMAT](https://llvm.org/docs/CommandGuide/lit.html#test-run-output-format)). NOTE: Additionally this is a fixup for llvm#141851 where the tests were not quite right. `max-retries-per-test/allow-retries-test_retry_attempts/test.py` was added but never used there. Now we're calling it. To correlate better between the test output and the test script I've used higher numbers of max allowed retries.
When packaging LLVM we've seen arbitrary tests fail.
It happened sporadically and most of the times the test
do work if they are run a second time on the next day.
The tests themselves were always different and we didn't
know ahead of time which ones we wanted to re-run.
That's we filter-out a lot of
libomp
andlibarcher
tests 1.This change allows us to set
LIT_OPTS="--max-retries-per-test=12"
when running any "check-XXX" build target. Then any lit test
will at most be re-run 12 times, unless there's an
ALLOW_RETRIES:
in one of the test scripts that's specifying a different value
than
12
.12
is just an example here, any positive integerwill work.
Please note, that this only adds the possibility to re-run
lit tests. It does not actually do it until the caller specifies
--max-retries-per-test=<POSITIVE_INT>
either on a call tolit
orin
LIT_OPTS
.Also note, that one can still use
ALLOW_RETRIES:
in test scriptsand it will always rule over
--max-retries-per-test
. When--max-retries-per-test
is set too low, but theconfig.test_retry_attempts
is high enough, it works as well.
Any option in the list below overrules its predecessor:
--max-retries-per-test
config.test_retry_attempts
ALLOW_RETRIES
keywordFrom the above options to re-run tests,
--max-retries-per-test
is the only one that doesn't require a change in the test scripts or the test config.Downstream PR to make use of the
--max-retries-per-test
option: https://src.fedoraproject.org/rpms/llvm/pull-request/434Downstream ticket: https://issues.redhat.com/browse/LLVM-145