[stdlib] Fix exact floating point to integer cast overflow detection #16960
Conversation
@mundaym Thanks for looking at this — do you have tests that show failures under the existing code? We have tests for these kinds of conversions in a few different places, plus parallel ones in
This is a step in the right direction, but the detail about bit-widths should be addressed, and we should have some tests added to exercise these cases.
% if srcBits >= SignificandBitCount:
guard let roundTrip = ${That}(exactly: self),
roundTrip == value else {
% if srcBits > SignificandBitCount:
This check isn't quite as fine-grained as it should be, though it doesn't matter for Float or Double and any of the usual integer sizes. Binary floating-point can represent any integer with <= SignificandBitCount + 1 bits (+1 because of the implicit bit), and srcBits doesn't account for signed/unsigned.
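For concreteness, the boundary this comment describes is observable through the public API. A small sketch (the specific values are illustrative, not from the PR):

```swift
// Double has a 52-bit stored significand, so it can represent every
// integer of up to 53 bits (52 stored + 1 implicit) exactly.
let exact = Double(exactly: (1 as Int64) << 53)          // 2^53 is representable.
let inexact = Double(exactly: ((1 as Int64) << 53) + 1)  // 2^53 + 1 is not.
print(exact as Any, inexact as Any)
```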
Thanks. Should I also make the code resilient to short exponents that cannot encode the full range of the integer type?
If it's useful for reference, I implemented some of the routines for generic conversion in the other direction (integer to floating-point) as well as floating-point to floating-point:
// The magnitude of the integer must be representable by the significand.
// Omit trailing zeros as they will be encoded by the exponent.
let absv = value.magnitude
let mask: ${That}.Magnitude = ~((1 << (${SignificandBitCount}+1)) - 1)
I'm mainly concerned about the point I called out above in light of this; specifically, I'm worried about the case where SignificandBitCount + 1 == srcBits.
Force-pushed from c6d1492 to 53a97e7
Thanks for the comments. After reviewing this PR again I don't think it is related to SR-6322 like I originally thought. In fact, the example given in that issue produces the correct result on Ubuntu 16.04 on IBM Z and on a Skylake macOS machine, so it is possible that the issue as stated is already fixed in master (I haven't got an Ubuntu x86 machine set up to try it there yet). Given that this PR just fixes test failures on IBM Z, I don't think it needs any extra tests anymore (a test would need to check that we weren't relying on undefined behavior, and I'm not sure that's possible). The float to int conversion change is enough to fix the issues we are seeing, but I've also included the int to float code anyway, since it is slightly more robust than the original and no longer requires a round trip. I can drop those changes from this PR if you'd prefer, though.
Excellent effort on this. It's not trivial to actually test for some of these issues because there aren't concrete builtin types that exhibit the necessary properties, but I think the algorithm you've written is incorrect in several cases. Please take a look and see what you think.
self._value = Builtin.fpto${u}i_FPIEEE${FloatBits}_${BuiltinName}(source._value)
if ${FloatType}(self) != source {
if _slowPath(source <= ${str(lower)}.0 || source >= ${str(upper)}.0) {
// out of bounds (including infinities)
Nit: please capitalize and punctuate all comments.
Ack
@@ -3296,10 +3296,15 @@ public struct ${Self}
/// - Parameter source: A floating-point value to convert to an integer.
@_transparent
public init?(exactly source: ${FloatType}) {
self._value = Builtin.fpto${u}i_FPIEEE${FloatBits}_${BuiltinName}(source._value)
if ${FloatType}(self) != source {
if _slowPath(source <= ${str(lower)}.0 || source >= ${str(upper)}.0) {
Q: why is this <= and >= here, whereas it's < and > respectively in the preceding method? (In other words, why can't a value equal to ${str(lower)}.0 be converted exactly?)
These bounds are exclusive so the bounds themselves are not valid integers. For example, the lower bound of an unsigned integer is -1. The code in the preceding method is performing the opposite check so the comparisons are inverted (i.e. > instead of <= and < instead of >=).
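A quick illustration of the exclusive-bounds behavior, sketched with the concrete UInt8 initializer (the values are illustrative):

```swift
// For UInt8 the exclusive bounds are -1.0 and 256.0: the bounds
// themselves are never valid results, but every integral value
// strictly between them converts exactly.
assert(UInt8(exactly: -1.0 as Float) == nil)
assert(UInt8(exactly: 0.0 as Float) == 0)
assert(UInt8(exactly: 255.0 as Float) == 255)
assert(UInt8(exactly: 256.0 as Float) == nil)
```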
% # representable in floating point (assuming a large enough exponent)
% # because only one bit is set. All other possible magnitudes can be
% # represented using one bit less than the original source type.
% dstBits += 1
Can we simplify this to clarify the logic here? It's not dstBits that gains one if the source type is signed; rather, the maximum representable positive source value uses one fewer bit if the source is signed. Also, the implicit leading one can remain implicit if we continue to use SignificandBitCount explicitly:
% srcBits = self_ty.bits # No change.
% srcBitsExcludingSign = srcBits - 1 if self_ty.is_signed else srcBits
% # No need to define `dstBits`.
/* ... */
We can also tighten up the logic so as to check, rather than assume, a sufficient number of exponent bits:
% if srcBitsExcludingSign < SignificandBitCount and srcBits < 2 ** (ExponentBitCount - 1):
@available(*, message: "Converting ${That} to ${Self} will always succeed.")
/* ... */
I'd like to keep the implicit bit in the count. It makes the mental juggling easier in my opinion. You're right that it shouldn't really be called dstBits though. I'll try renaming it to something like maxBitLength that makes the intent a bit clearer.
I'll add the exponent size check to the conditions.
@@ -1599,17 +1608,26 @@ extension ${Self} {
/// can't be represented exactly, the result is `nil`.
///
/// - Parameter value: The integer to represent as a floating-point value.
% if srcBits < SignificandBitCount:
% if srcBits <= dstBits:
This change doesn't seem right; at least, it's not behavior-preserving even for unsigned values. srcBits <= dstBits is equivalent to srcBits <= SignificandBitCount + 1, which is equivalent to srcBits - 1 <= SignificandBitCount, which isn't the same as srcBits < SignificandBitCount.
Yes, this is deliberate. The old code was doing unnecessary checks. By making this condition correct we can simplify the exactness checking code (by avoiding the possible overflow).
For example, the maximum value that can be represented with an unsigned 24-bit integer (or a signed 25-bit integer) is 16777215 (0xffffff). This value can be encoded exactly by a Float32 which has a 23-bit significand (24-bit including the implicit bit). Therefore we can represent any possible value of an unsigned integer exactly if the length of the integer is less than or equal to the number of significand bits plus one. Assuming the exponent is large enough of course.
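The 24-bit boundary described above can be checked directly against Float's public initializer (a sketch using the example value from the comment):

```swift
// 0xFFFFFF (24 significant bits) fits in Float's 23 stored bits
// plus the implicit leading bit.
assert(Float(exactly: 16_777_215 as UInt32) != nil)
// 2^24 has only one significant bit, so it is exact too.
assert(Float(exactly: 16_777_216 as UInt32) != nil)
// 2^24 + 1 needs 25 significant bits and cannot be represented.
assert(Float(exactly: 16_777_217 as UInt32) == nil)
```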
// The magnitude of the integer must be representable by the significand.
// Omit trailing zeros that can be encoded by the exponent.
let absv = value.magnitude
let mask: ${That}.Magnitude = ~((1 << (${SignificandBitCount} + 1)) - 1)
I'm concerned as to what happens here if srcBits == SignificandBitCount (assuming you meant to preserve the original condition srcBits >= SignificandBitCount modulo refactoring). It appears this calculation would not be correct because it'd overflow. Rather, I think you'd want to construct the mask by right-shifting ${That}.Magnitude.max the appropriate number of bits. (Note that we now have the guarantee that Magnitude is unsigned as of a recent commit.)
The check above excludes the srcBits == SignificandBitCount case, so overflow can't occur here.
% if srcBits > dstBits:
// The magnitude of the integer must be representable by the significand.
// Omit trailing zeros that can be encoded by the exponent.
let absv = value.magnitude
Nit: It doesn't really matter for local variables, but typical style is to avoid uncommon abbreviations and to use camelCase. Maybe absoluteValue or magnitude_?
Ack
return nil
}
// Check that the required exponent is in range.
let maxexp = (1 << (${ExponentBitCount} - 1)) - 1
It'd be nice if we could tighten the GYB condition not to assume that the maximum exponent is representable as an integer. If this overflows, we'll erroneously return nil when in fact the maximum exponent is more than enough to accommodate the integer.
}
// Check that the required exponent is in range.
let maxexp = (1 << (${ExponentBitCount} - 1)) - 1
if _slowPath(absv >> maxexp > 1) {
This isn't quite right, I don't think. If a value has fewer ones than SignificandBitCount, but exactly the number of trailing zeros so that absv >> maxexp == 1, then this algorithm will return an "exact" value that's inexact.
I'm not sure what you mean here. absv hasn't been shifted by the number of trailing zeros, so this check should be enough to verify that it is in range.
For example, the largest finite value a 32-bit Float can represent is 0xffffff << 104. (0xffffff << 104) >> 127 == 1 and so passes the check. Higher values (e.g. 0x1000000 << 104) are equal to 2 or more when shifted right by 127, or do not have enough trailing zeros and will be caught by the earlier check.
That said, I do think we need to hoist this out of its current position so we can do the exponent check separately.
I think, alternatively, if you'd like to address the underlying bug only without essentially writing all of this code from scratch, it'd suffice to check after roundtripping that:
let m = value.magnitude
let requiredBits = ${That}.Magnitude.bitWidth
  &- m.leadingZeroBitCount &- m.trailingZeroBitCount
guard requiredBits == significandWidth + 1 else { return nil }
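The bit-counting idea in that suggestion can be exercised standalone. A sketch with a hypothetical magnitude (note this sketch uses <= against SignificandBitCount + 1 as the exactness condition, which is an interpretation, not a quote from the suggestion above):

```swift
// Count the significant bits between the leading and trailing zeros.
let m: UInt32 = 0xFFFF_F000  // Hypothetical magnitude: 20 significant bits.
let requiredBits = UInt32.bitWidth - m.leadingZeroBitCount - m.trailingZeroBitCount
// Exact iff those bits fit in the stored significand plus the implicit bit.
let fitsExactly = requiredBits <= Float.significandBitCount + 1
assert(fitsExactly)
assert(Float(exactly: m) != nil)  // Agrees with the stdlib.
```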
Thanks for the review, I'll make the adjustments. Out of interest, what is the preferred Swift style with regards to
Force-pushed from 53a97e7 to 7ac96f7
Early return is more clearly spelled out with
Incidentally, I'll give further comments later -- other responsibilities for the moment :)
LLVM specifies that the results of fptosi and fptoui instructions are undefined if the target type cannot represent the value exactly. On IBM Z (s390x) these instructions currently saturate when overflow occurs. This means that the round trip used to detect overflow succeeds in situations where the conversion is not actually exact. For example, casting Int32.max to a Float32 via a sitofp instruction results in Int32.max + 1, which is inexact. However, if we then convert back to an Int32 via an fptosi instruction, the result is clamped to Int32.max, and so the round trip has produced the original value. We therefore cannot portably rely on round trips alone to verify the exactness of this cast. This commit modifies the conversion routines so that they do not rely on undefined behavior and avoid using round trips in general.
Force-pushed from 7ac96f7 to 66b751a
Thanks @xwu. I've modified the PR to use
@swift-ci please smoke test
The round trip approach for detecting exact conversions has two problems. Firstly, rounding during the conversions might mean that a round trip succeeds even if the floating point value and integer value are different. For example, Int32.max - 1 cannot be represented by a Float, but the conversion succeeds because the rounding happens to be symmetric. Secondly, LLVM specifies that the result of a sitofp or uitofp instruction is undefined if it overflows. This means that the result is platform-dependent, and therefore whether or not the round trip succeeds is also platform-dependent.

This commit uses integer arithmetic to calculate whether the conversion is exact or not in a platform-independent manner.

This should resolve SR-6322. This commit should probably add some additional tests. Any ideas where best to add them?
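The fixed behavior the commit message describes can be checked from user code. A sketch using the values mentioned above:

```swift
// Int32.max and Int32.max - 1 each need more significant bits than
// Float's 24 (23 stored + 1 implicit), so an exact conversion must
// fail rather than rounding up to 2^31.
assert(Float(exactly: Int32.max) == nil)
assert(Float(exactly: Int32.max - 1) == nil)
// Int32.min is -2^31, a single significant bit, so it converts exactly.
assert(Float(exactly: Int32.min) == -2_147_483_648.0)
```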