[SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison #885

lsatanov · 2022-03-02T20:45:09Z

…g involding floating point computations (due to deviations in GPU vs host-based FP).

lsatanov · 2022-03-04T13:09:05Z

SYCL/ESIMD/esimd_test_utils.hpp

@@ -99,26 +100,74 @@ bool write_binary_file(const char *fname, const std::vector<T> &vec,
 }

 template <typename T>
-bool cmp_binary_files(const char *fname1, const char *fname2, T tolerance) {
+bool cmp_binary_files(const char *fname1, const char *fname2, const T tolerance,


Discussion: can either reuse this function template, or make an overload.

… (e.g. for cases resulted from deviations in GPU vs host-based FP computations).

lsatanov · 2022-03-07T09:57:50Z

SYCL/ESIMD/esimd_test_utils.hpp

+          std::cerr << std::endl;
+          return false;
+        } else {
+          std::cerr << ". Current mismatch rate: " << std::setprecision(8)


Question: is it ok to output mismatches (up to the threshold) or output only in case of failure?

I'd say, always output mismatch summary, even if they are "tolerated" and test passes.

SYCL/ESIMD/esimd_test_utils.hpp

kbobrovs · 2022-03-15T17:58:59Z

SYCL/ESIMD/esimd_test_utils.hpp

-bool cmp_binary_files(const char *fname1, const char *fname2, T tolerance) {
+bool cmp_binary_files(const char *fname1, const char *fname2, const T tolerance,
+                      const double toleratedMismatchRate = 0,
+                      const int mismatchReportThrottleLimit = 5) {


what is mismatchReportThrottleLimit? please add comment. what "throttle" in the name signifies?

"Threshold" might be a better name.

sounds good

kbobrovs · 2022-03-15T18:08:40Z

SYCL/ESIMD/esimd_test_utils.hpp

-        std::cerr << (int)vec1[i] << " vs " << (int)vec2[i] << std::endl;
-      } else {
-        std::cerr << vec1[i] << " vs " << vec2[i] << std::endl;
+      if (!toleratedMismatchRate ||


if toleratedMismatchRate == 0 and there are errors in every vector element, there will be size lines of output, which is bad. I understand that this is how it was before, but since you are refactoring this anyway, please output at most mismatchReportThrottleLimit errors (also consider changing the name to something simpler - mismatchReportLimit)

if toleratedMismatchRate == 0 and there are errors in every vector element, there will be size lines of output, which is bad.

Correction: there will be one line of output and return:

if (!toleratedMismatchRate) {
std::cerr << std::endl;
return false;
}

// Yes, this is return from inside a loop which is hard to notice.

I see. ok then

kbobrovs · 2022-03-15T18:11:11Z

SYCL/ESIMD/esimd_test_utils.hpp

+          std::cerr << std::endl;
+          return false;
+        } else {
+          std::cerr << ". Current mismatch rate: " << std::setprecision(8)


I'd say, always output mismatch summary, even if they are "tolerated" and test passes.

kbobrovs · 2022-03-15T18:12:12Z

SYCL/ESIMD/esimd_test_utils.hpp

+          return false;
+        } else {
+          std::cerr << ". Current mismatch rate: " << std::setprecision(8)
+                    << std::fixed << totalMismatches / size << std::endl;


Suggested change

<< std::fixed << totalMismatches / size << std::endl;

<< std::fixed << totalMismatches / (double)size << std::endl;

accompanies the above suggestion to make size of size_t type.

TBH, I was relying on implicit conversions.
Yes, implicit conversions can be evil (int promotion is just one to name among those, not related here
however).

If you explicitly oppose any implicit conversions, I can change. If not, let's leave it as is, not to make more code.

if size becomes size_t as I suggested above, then totalMismatches / size would be 0

I know where confusion comes from:
yes it would be 0 if totalMismatches was of integral type, BUT totalMismatches is of type double, so totalMismatches / size will not be 0 unless totalMismatches is 0.0.
"size" will be implicitly converted to double in this expression.

totalMismatches should be integral as well then. I did not even expect it to be double, very counter-intuitive. Please make them both integral and do casting where needed.

The type of totalMismatches conforms to its use case (also, intuition is subjective).

please change totalMismatches and size to integer type to avoid hard-to-detect conversion errors when double can't accurately represent these integer (by nature) values.

For double after reaching 2^53 we'd stop incrementing the value if incrementing by 1, that's true.
If there'll ever be a > petabyte file checked in llvm-test-suite, well, yes, that could be a problem.
Another argument against double could be performance.

Anyways.

kbobrovs · 2022-03-15T18:15:49Z

SYCL/ESIMD/esimd_test_utils.hpp

+        }
+
+      } else if (totalMismatches == mismatchReportThrottleLimit) {
+        std::cerr << "Mismatch output throttled ... " << std::endl;


I don't understand meaning of "throttled" here. "truncated", "stopped" ?

Yes, stopped.

ok, please reword then

kbobrovs · 2022-03-15T18:18:59Z

SYCL/ESIMD/esimd_test_utils.hpp

      return false;
    }
+
+    std::cerr << "Total mismatch rate is " << totalMismatchRate << std::endl;


With introduction of tolerated mismatched rate, we need to also introduce and report maximum detected relative mismatch (in %). I would say if this is more than 10-20% in any element, that would indicate a problem, and test test developer should pay attention.

if this is more than 10-20% in any element, that would indicate a problem

If I understand correctly, you mean checking for absolute difference (distance) between reference vs tested value of each data element (and expressing it as a percentage).

That's subtle: interpretation of this distance depends on generator semantics, e.g. in case of overflow allowed during generation for uint8_t what's the distance between 255 and 0? It may be considered as 255 in one direction, but also as 1+N*(255+1) if we go the other way over the circle, where N is from 0 to +inf.

And the same goes for Mandelbrot set image generator: color value overflows and we are not certain about what's the distance: ambiguity we can't interpret

No, I mean relative, not absolute, just as I wrote. abs((expected-val)/expected)*100.
So, please report this info rather than silently returning "pass".

Note that I did not propose return "fail", irrespective of the max relative mismatch. I would like test developers or people who run tests see it and decide themselves if it is OK or not.

No, I mean relative, ... abs((expected-val)/expected)*100

Konstantin, I understand, but formally it's derived from absolute difference [ abs((expected-val) ... ]
and it doesn't matter if later it's expressed as a percentage from the expected value or anything else: it gives no practically useful information.

Example. Let's look at some full error log:
...
byte 1: got 1 instead of 255
byte 2: got 10 instead of 9
byte 3: got 5 instead of 4
byte 100: got 12 instead of 9
byte 101: got 11 instead of 8
byte 102: got 6 instead of 3
...

With this info we can do at least something: observe the whole picture, try to infer a pattern.
But if we report max relative divergency for a single byte (byte 1 in this case) it will give no useful information by itself.

The fact that it does not give practically useful information in this particular case with mandelbrot does not imply it won't in other cases. Since this is a library function to be used by other cases, we should care about other use cases.

Konstantin,
It's not applicable in case of toleratedMismatchRate == 0 as we fail right away there.
In case of toleratedMismatchRate != 0 log contains plenty of records that allow a user to make conclusions about deviations or increase print limit to observe the full log.

There's practically no use case for the suggested data.

I could understand adding something if there's a use case, but, I'm sorry, I can't understand adding more code if there's no foreseeable use case (taking into account all mentioned above).

I outlined the use case above - please take a look.
Having user scan through all tolerated mismatches in case there are many of these is error prone.

NB: in current implementation arguments fname1, fname2 are not assigned to any particular role of being reference or not.
Adding the requested printout will imply assigning these roles.

Co-authored-by: kbobrovs <[email protected]>

…es comparison (for cases resulted from deviations in GPU vs host-based FP computations).

… tolerated-mismatch-rate

…/llvm-test-suite into tolerated-mismatch-rate

…es comparison (for cases resulted from deviations in GPU vs host-based FP computations).

lsatanov · 2022-03-22T09:25:22Z

SYCL/ESIMD/esimd_test_utils.hpp

+bool cmp_binary_files(const char *testOutFile, const char *referenceFile,
+                      const T tolerance = 0,
+                      const double mismatchRateTolerance = 0,
+                      const int mismatchReportLimit = 9) {


@kbobrovs , maybe it's better to make it overridable via environment for convenience.

Sounds good. Could be a separate patch

lsatanov · 2022-03-22T09:26:31Z

SYCL/ESIMD/esimd_test_utils.hpp

+          typename std::enable_if<std::is_integral<T>::value ||
+                                      std::is_floating_point<T>::value,
+                                  int>::type = 0>
+bool cmp_binary_files(const char *testOutFile, const char *referenceFile,


Args renamed to reflect roles of test output vs reference source.

lsatanov

# command stderr:
Tolerated mismatch rate set to 0.0027
Mismatch at 135942 133 vs 118. Current mismatch rate: 0.00000000
Mismatch at 135943 45 vs 38. Current mismatch rate: 0.00000069
Mismatch at 135944 129 vs 126. Current mismatch rate: 0.00000139
Mismatch at 183876 146 vs 236. Current mismatch rate: 0.00000208
Mismatch at 183877 34 vs 76. Current mismatch rate: 0.00000278
Mismatch at 183878 234 vs 252. Current mismatch rate: 0.00000347
Mismatch at 183918 11 vs 252. Current mismatch rate: 0.00000417
Mismatch at 183919 227 vs 220. Current mismatch rate: 0.00000486
Mismatch at 183920 207 vs 204. Current mismatch rate: 0.00000556
Mismatch output stopped.
Total mismatch rate is 0.00267497 with max relative difference of 0.95634921
PASSED

… tolerated-mismatch-rate

…intel#885) * [SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison (e.g. for cases resulted from deviations in GPU vs host-based FP computations).

…intel/llvm-test-suite#885) * [SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison (e.g. for cases resulted from deviations in GPU vs host-based FP computations).

lsatanov force-pushed the tolerated-mismatch-rate branch 8 times, most recently from 51b5647 to 7cb7c0e Compare March 3, 2022 14:49

lsatanov commented Mar 4, 2022

View reviewed changes

[SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison…

49bd916

… (e.g. for cases resulted from deviations in GPU vs host-based FP computations).

lsatanov force-pushed the tolerated-mismatch-rate branch from 7cb7c0e to 49bd916 Compare March 7, 2022 09:10

lsatanov changed the title ~~DRAFT [SYCL][ESIMD][EMU]~~ DRAFT [SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison Mar 7, 2022

lsatanov changed the title ~~DRAFT [SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison~~ [SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison Mar 7, 2022

lsatanov commented Mar 7, 2022

View reviewed changes

lsatanov requested review from dongkyunahn-intel and kbobrovs March 7, 2022 11:46

lsatanov marked this pull request as ready for review March 7, 2022 11:47

lsatanov requested a review from a team as a code owner March 7, 2022 11:47

dongkyunahn-intel requested a review from kychendev March 7, 2022 20:26

kbobrovs suggested changes Mar 15, 2022

View reviewed changes

lsatanov and others added 5 commits March 16, 2022 13:02

Update SYCL/ESIMD/esimd_test_utils.hpp

ad7921c

Co-authored-by: kbobrovs <[email protected]>

[SYCL][ESIMD][EMU] addendum to mismatch rate tolerance for binary fil…

6cf0701

…es comparison (for cases resulted from deviations in GPU vs host-based FP computations).

Merge branch 'intel' of https://github.com/intel/llvm-test-suite into…

959d546

… tolerated-mismatch-rate

Merge branch 'tolerated-mismatch-rate' of https://github.com/lsatanov…

780f77f

…/llvm-test-suite into tolerated-mismatch-rate

[SYCL][ESIMD][EMU] addendum to mismatch rate tolerance for binary fil…

5fefebe

…es comparison (for cases resulted from deviations in GPU vs host-based FP computations).

lsatanov requested review from sndmitriev and vladimirlaz March 18, 2022 10:53

[SYCL][ESIMD][EMU] addendum to mismatch rate tolerance for binary fil…

de9449f

…es comparison (for cases resulted from deviations in GPU vs host-based FP computations).

lsatanov commented Mar 22, 2022

View reviewed changes

lsatanov added 3 commits March 22, 2022 12:34

Merge branch 'intel' of https://github.com/intel/llvm-test-suite into…

9a7d3ea

… tolerated-mismatch-rate

[SYCL][ESIMD][EMU] addendum to mismatch rate tolerance ...

04ec087

[SYCL][ESIMD][EMU] addendum to mismatch rate tolerance ...

0d204e9

kbobrovs approved these changes Mar 24, 2022

View reviewed changes

kbobrovs merged commit b63c248 into intel:intel Mar 24, 2022

	<< std::fixed << totalMismatches / size << std::endl;
	<< std::fixed << totalMismatches / (double)size << std::endl;

[SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison #885

[SYCL][ESIMD][EMU] tolerated mismatch rate in binary files comparison #885

Uh oh!

Conversation

lsatanov commented Mar 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lsatanov Mar 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 17, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov Mar 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lsatanov commented Mar 2, 2022 •

edited

Loading

lsatanov Mar 4, 2022 •

edited

Loading

lsatanov Mar 7, 2022 •

edited

Loading

lsatanov Mar 17, 2022 •

edited

Loading

lsatanov Mar 18, 2022 •

edited

Loading

lsatanov Mar 22, 2022 •

edited

Loading

lsatanov Mar 16, 2022 •

edited

Loading

lsatanov Mar 16, 2022 •

edited

Loading

lsatanov Mar 18, 2022 •

edited

Loading

lsatanov left a comment •

edited

Loading