Change momentum and epsilon properties to scalars. #525
Conversation
I will trigger a full Kokoro build to check if other repositories (tensorflow/swift-models, fastai/swiftai) are broken.
`Scalar` is a generic parameter. It will turn `BatchNorm` into an address-only type, which will be lowered as a SIL address type. Meanwhile, `BatchNorm.TangentVector` is loadable and will be lowered to a SIL object type. This will break the current assumptions in the differentiation transform.
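For illustration, here is a minimal sketch of the distinction (a hypothetical `MiniNorm`, not code from this patch; the address-only/loadable split follows from SIL type lowering of generic stored properties):

```swift
import TensorFlow

// Hypothetical layer mirroring the patched `BatchNorm`. Storing the generic
// `Scalar` directly makes the struct's layout depend on the generic parameter,
// so SIL lowers `MiniNorm` as an address-only type. The synthesized
// `TangentVector` contains only `Tensor` properties and stays loadable.
struct MiniNorm<Scalar: TensorFlowFloatingPoint>: Layer {
    var scale: Tensor<Scalar>          // differentiable; loadable
    @noDerivative let epsilon: Scalar  // raw generic scalar; forces address-only lowering

    @differentiable
    func callAsFunction(_ input: Tensor<Scalar>) -> Tensor<Scalar> {
        scale * input + Tensor(epsilon)
    }
}
```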
I thought the same, but I couldn't reproduce a crash after trying a few tests.

```swift
import TensorFlow

// With modified `BatchNorm` from this patch.
// (`T: TensorFlowFloatingPoint` is required by `BatchNorm`'s generic signature.)
func foo<T: TensorFlowFloatingPoint>(_ layer: BatchNorm<T>, _ x: Tensor<T>) -> Tensor<T> {
    layer(x)
}
func bar<T: TensorFlowFloatingPoint>(_ layer: BatchNorm<T>, x: Tensor<T>) {
    _ = pullback(at: layer, x, in: foo)
}
```
I don't yet have a reproducer, but I'll look at the swift-models and fast.ai notebook test results.

I'd be wary of turning this on just because no easy reproducer has been found. It is technically quite clear that lowering isn't able to handle such cases.
CI failed because tensorflow/swift-models requires changes:

Not pursuing further for now due to low priority. swiftlang/swift#27298 will fix TF-625.
One possible reproducer at the SIL level is the following:

```sil
%f = function_ref @BatchNorm.callAsFunction(_:)
%f' = differentiable_function %f : $(*BatchNorm<Scalar>, Tensor<Scalar>) -> Tensor<Scalar>
%f_vjp = differentiable_function_extract [vjp] %f'
```

After canonicalization:

```sil
%f = function_ref @BatchNorm.callAsFunction(_:)
%f' = differentiable_function %f : $(@in_guaranteed BatchNorm<Scalar>, @guaranteed Tensor<Scalar>) -> @owned Tensor<Scalar> with {
  %f_jvp : $(@in BatchNorm<Scalar>, @guaranteed Tensor<Scalar>) -> (@owned Tensor<Scalar>, (@guaranteed BatchNorm<Scalar>.TangentVector, @guaranteed Tensor<Scalar>) -> Tensor<Scalar>),
  %f_vjp : $(@in_guaranteed BatchNorm<Scalar>, @guaranteed Tensor<Scalar>) -> (@owned Tensor<Scalar>, (@guaranteed Tensor<Scalar>) -> (@owned BatchNorm<Scalar>.TangentVector, @owned Tensor<Scalar>))
}
%f_vjp = differentiable_function_extract [vjp] %f'
```

It's possible that the current transform happens to handle this case despite the formal type mismatches. For example, the transform sometimes checks the types of the actual SIL values instead of the formal types, which is very suboptimal. If pullback-based tests are not failing, it's possible that a differential-based test will trigger this.
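Such a differential-based test might look like the sketch below (forward-mode AD was experimental at the time and gated behind a compiler flag; the two-argument `valueWithDifferential(at:_:in:)` entry point is an assumption here):

```swift
import TensorFlow

// Hypothetical differential-based analogue of the pullback test above.
// Assumes the experimental forward-mode `valueWithDifferential(at:_:in:)`
// API is available.
func testDifferential<T: TensorFlowFloatingPoint>(_ layer: BatchNorm<T>, _ x: Tensor<T>) {
    let (value, differential) = valueWithDifferential(at: layer, x) { layer, x in
        layer(x)
    }
    _ = (value, differential)
}
```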
Notes:
Change the following properties from `Tensor<Scalar>` to `Scalar`:

- `BatchNorm.momentum`
- `BatchNorm.epsilon`
- `LayerNorm.epsilon`

Semantically, these properties are always scalars.

Note: this will be an API-breaking change in Swift for TensorFlow 0.6. Deprecating the old `BatchNorm` and `LayerNorm` initializers is tricky because it causes ambiguity problems.
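For context, a minimal sketch of the ambiguity (a toy type, not the actual swift-apis initializers; the problem hinges on `Tensor`'s float-literal conformance):

```swift
import TensorFlow

// Toy type illustrating why keeping a deprecated `Tensor<Scalar>` overload
// alongside the new `Scalar` one is problematic. Not the real initializers.
struct Toy<Scalar: TensorFlowFloatingPoint> {
    var epsilon: Tensor<Scalar>

    @available(*, deprecated, message: "Use init(epsilon: Scalar) instead.")
    init(epsilon: Tensor<Scalar>) { self.epsilon = epsilon }

    init(epsilon: Scalar) { self.epsilon = Tensor(epsilon) }
}

// A float literal satisfies both `Scalar` and `Tensor<Scalar>` (which is
// `ExpressibleByFloatLiteral`), so a call like the following is reported
// as ambiguous; deprecation does not affect overload ranking:
// let t = Toy<Float>(epsilon: 0.001)
```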