[TF] TF-496: Conditionally conform SIMD types to Differentiable #24786

Conversation

bartchr808
Contributor

@bartchr808 bartchr808 commented May 14, 2019

Implements this JIRA ticket (TF-496), which makes SIMD vectors differentiable.

Also, here is a link to the Swift Forums thread where some of this was discussed. See the forum post for context on the changes.

There is also the change to the -=/+= operators. As background, we need SIMD to conform to the AdditiveArithmetic protocol, which already provides a conditional default implementation of -= and +=. So to avoid the clash of two protocols each supplying a default implementation of the same operators, I removed the default implementation from SIMD for now on our branch; we will fix this properly when we pitch the change to get it merged into master.
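
A minimal sketch of the clash, using hypothetical protocols P and Q rather than the real SIMD and AdditiveArithmetic declarations: when two protocols both supply a default implementation of the same operator requirement, a type conforming to both has no unambiguous candidate.

protocol P { static func += (lhs: inout Self, rhs: Self) }
protocol Q { static func += (lhs: inout Self, rhs: Self) }

// Each protocol supplies its own default implementation.
extension P { static func += (lhs: inout Self, rhs: Self) { /* default 1 */ } }
extension Q { static func += (lhs: inout Self, rhs: Self) { /* default 2 */ } }

struct S: P, Q {}
var s = S()
s += S()  // error: the compiler reports an ambiguity, since both defaults apply equally

Removing one of the two defaults, as this PR does for SIMD, leaves a single candidate and resolves the ambiguity.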

Also, I had to remove @_alwaysEmitIntoClient from sum() due to a bug (filed as TF-545).

And finally, there are custom VJPs defined here that don't actually need to exist, as the bodies of the original methods only use methods that I already marked as differentiable. But the bug TF-547 is blocking me from removing them.
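
As a hedged illustration of why those custom VJPs should be removable (a hypothetical function, not code from this PR): when a body consists only of operations that are already differentiable, the compiler can derive the derivative by itself.

// `*` on Float is already differentiable on the tensorflow branch,
// so no hand-written VJP (e.g. a _vjpTriple) is needed here.
@differentiable
func triple(_ x: Float) -> Float {
  return x * 3
}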

@bartchr808 bartchr808 requested review from stephentyrone and rxwei May 14, 2019 22:49
@bartchr808
Contributor Author

bartchr808 commented May 14, 2019

Forgot to remove the zero property, as mentioned in the forum... will push a change soon.

@dan-zheng dan-zheng added the tensorflow This is for "tensorflow" branch PRs. label May 14, 2019
@bartchr808 bartchr808 changed the title TF-496: Conditionally conform SIMD types to Differentiable [TF] TF-496: Conditionally conform SIMD types to Differentiable May 14, 2019
@bartchr808
Contributor Author

bartchr808 commented May 15, 2019

@stephentyrone I'm not sure whether you and the Swift Core team consider this a bug fix or a new feature that would require a proposal, but I just realized that we could create a separate PR for just the -=/+= part and merge that into master.

Additionally, for the +=/-= fix in my PR, which of these do you think would be better (or feel free to suggest an alternative):

  • My current PR, where I removed the operators from the SIMD protocol, keeping the ones in AdditiveArithmetic
  • The reverse, where we keep the conditional definition in the SIMD protocol and remove the conditional definition in AdditiveArithmetic

In either case the SIMD implementation would be preferred; it just depends on whether it lives in the protocol or in the various structs.

@bartchr808 bartchr808 force-pushed the TF-496-conform-SIMD-types-differentiable branch from 32e963a to 0268fc6 on May 16, 2019 05:15
where Scalar : Differentiable & BinaryFloatingPoint,
      Scalar.CotangentVector : BinaryFloatingPoint {
  public func _vjpSubscript(index: Int) ->
    (Scalar, (Scalar.CotangentVector) -> CotangentVector)
Contributor

Indent by 2 and put the -> on this line.

Suggested change
(Scalar, (Scalar.CotangentVector) -> CotangentVector)
  -> (Scalar, (Scalar.CotangentVector) -> CotangentVector)

Contributor Author

To verify:

public func _vjpSubscript(index: Int)
  -> (Scalar, (Scalar.CotangentVector) -> CotangentVector)

Contributor

Yes. VJP functions should not be public, though. When the original function is public, its VJP should be internal. For that internal VJP: if the original function is @inlinable, the VJP should also be @inlinable; otherwise it should be @usableFromInline.

Contributor Author

Ahh okay. But take _vjpSum, for example: the original function (in the protocol) is marked public and @inlinable. How should _vjpSum be marked? Looking at Tensor, its _vjpSum is just @inlinable in a regular extension.

Contributor

If the original function is @inlinable, the VJP should also be @inlinable. In all cases where the original function is public, the VJP should be internal.

Contributor

But for SIMD.sum(), it's not @inlinable, so the VJP should be @usableFromInline.
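
A hedged restatement of the convention above as code (hypothetical free functions, not the actual SIMD methods):

// Original is public and @inlinable -> the VJP is internal and @inlinable.
@inlinable
public func square(_ x: Float) -> Float { return x * x }

@inlinable
internal func _vjpSquare(_ x: Float) -> (Float, (Float) -> Float) {
  return (x * x, { v in 2 * x * v })
}

// Original is public but not @inlinable (like SIMD.sum() here)
// -> the VJP is internal and @usableFromInline.
public func double(_ x: Float) -> Float { return 2 * x }

@usableFromInline
internal func _vjpDouble(_ x: Float) -> (Float, (Float) -> Float) {
  return (2 * x, { v in 2 * v })
}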

Contributor Author

Ahh okay thanks!

public func _vjpSubscript(index: Int) ->
  (Scalar, (Scalar.CotangentVector) -> CotangentVector)
{
  return (self[index], { (v: Scalar.CotangentVector) in
Contributor

I don't think the type signature is needed here. Did the compiler suggest otherwise?

Suggested change
return (self[index], { (v: Scalar.CotangentVector) in
return (self[index], { v in

Contributor Author

At some point the compiler was asking me for it, either here in the struct or in the SIMD protocol, where it gave me an error along the lines of cannot convert return expression of type '(Self, (_) -> _)' to return type '(Self, (Self.CotangentVector) -> (Self.Scalar, Self.CotangentVector))'. But I think this ended up being fixed when I corrected the protocol refinements, so I can actually remove all of these now!
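
A toy example (not code from this PR) of why the annotation becomes droppable: once the enclosing declaration pins down the closure's expected type, the parameter type is inferred.

// The declared return type fixes the closure's shape,
// so `v` is inferred as Float and needs no annotation.
func makePullback() -> (Float) -> Float {
  return { v in v * 2 }
}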

@bartchr808 bartchr808 removed the request for review from stephentyrone May 16, 2019 21:43
@bartchr808
Contributor Author

bartchr808 commented May 16, 2019

Seems like this may have a bug; filed it here.

TODO:
- Determine whether sum() needs to be differentiated
- See what other inits I can differentiate
- Remove some of the custom VJPs now that init(repeating:) is differentiable
@bartchr808
Contributor Author

In case you see that I pushed commits to this PR: ignore re-reviewing it for now while I clean up some stuff. I pushed so the branch is accessible from both my desktop and my laptop. Still need to do three things:

  • Determine whether sum() needs to be differentiated
  • See what other inits I can differentiate
  • Remove some of the custom VJPs now that init(repeating:) is differentiable

@@ -843,6 +874,13 @@ extension SIMD where Scalar: FloatingPoint {

/// Returns the sum of the scalars in the vector.
@_alwaysEmitIntoClient
Contributor Author

Need to look into this, but I think this attribute is what is preventing me from differentiating sum().

public static func /(lhs: Self, rhs: Scalar) -> Self {
  return lhs / Self(repeating: rhs)
}

@_transparent
public static func +=(lhs: inout Self, rhs: Self) {
Contributor Author

In order to make SIMD conform to AdditiveArithmetic, I had to remove both of these implementations, as two protocols cannot both provide default implementations of the same requirement without the choice of witness becoming ambiguous. Let me know if this is okay, or if there is something else we should do.

@bartchr808 bartchr808 force-pushed the TF-496-conform-SIMD-types-differentiable branch 2 times, most recently from d387338 to b6b8fe8 on June 3, 2019 20:04
@bartchr808
Contributor Author

@swift-ci please test tensorflow

1 similar comment
@bartchr808
Contributor Author

@swift-ci please test tensorflow

@bartchr808
Contributor Author

TODO: getting an assertion failure when I try to eliminate the redundant custom VJPs: https://bugs.swift.org/browse/TF-547

@bartchr808 bartchr808 force-pushed the TF-496-conform-SIMD-types-differentiable branch from b6b8fe8 to 8693e01 on June 4, 2019 17:58
@bartchr808 bartchr808 force-pushed the TF-496-conform-SIMD-types-differentiable branch from 8693e01 to 2821b8f on June 4, 2019 17:59
@bartchr808 bartchr808 force-pushed the TF-496-conform-SIMD-types-differentiable branch from d7c29b5 to 8d495dd on June 13, 2019 15:47
@bartchr808 bartchr808 force-pushed the TF-496-conform-SIMD-types-differentiable branch from 8d495dd to 7abb1c3 on June 13, 2019 15:51
@bartchr808 bartchr808 requested review from rxwei and dan-zheng June 13, 2019 15:57
Contributor

@dan-zheng dan-zheng left a comment

LGTM!

-> (Scalar, (Scalar.TangentVector) -> TangentVector) {
  return (self[index], { v in
    var zeros = Self.zero
    zeros[index] = Scalar(v)
Contributor

Is the Scalar initializer call here necessary?

Suggested change
zeros[index] = Scalar(v)
zeros[index] = v

Contributor Author

@bartchr808 bartchr808 Jun 13, 2019

v is of type Scalar.TangentVector while zeros is of type SIMD3<Scalar>, and since the only constraint on Scalar.TangentVector is Scalar.TangentVector : BinaryFloatingPoint, which Scalar also conforms to, I have to convert it or else I get the error cannot assign value of type 'Scalar.TangentVector' to type 'Scalar'. I could add the more constraining requirement Scalar.TangentVector == Scalar, but I think doing this conversion is better. Let me know what you think!
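
For context, the conversion in isolation (a hypothetical generic function, not code from this PR):

// BinaryFloatingPoint supplies init<Source: BinaryFloatingPoint>(_:),
// so two such generic types interconvert without requiring T == U.
func convert<T: BinaryFloatingPoint, U: BinaryFloatingPoint>(_ v: T) -> U {
  return U(v)
}

let d: Double = convert(Float(1.5))  // d == 1.5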

Contributor

cc @rxwei regarding whether to add Scalar.TangentVector == Scalar constraint.
I agree that not adding the constraint is better.

@bartchr808
Contributor Author

@swift-ci please test tensorflow

Contributor

@dan-zheng dan-zheng left a comment

Generic SIMD differentiation tests look great!

@bartchr808
Contributor Author

@dan-zheng The macOS tests are still running 16 hours later. Is this a known problem currently? Or is the passing Linux run good enough to merge?

Contributor

@rxwei rxwei left a comment

Great work Bart!

@bartchr808
Contributor Author

Thanks @dan-zheng and @rxwei ! 😄

@bartchr808 bartchr808 merged commit 3417a08 into swiftlang:tensorflow Jun 14, 2019
@bartchr808 bartchr808 deleted the TF-496-conform-SIMD-types-differentiable branch June 14, 2019 21:05