Add `variance()` and `variance(squeezingAxes:)`. #23811
Conversation
// TODO: Fix variance derivative calculation.
// PyTorch has: `expected == [[-0.75, -0.25], [ 0.25, 0.75]]`.
let input: Tensor<Float> = [[1, 2], [3, 4]]
let expected = Tensor<Float>(repeating: -0.75, shape: [2, 2])
Our `variance` ops are implemented using `mean` and `squared` - no custom derivatives for `variance`.
However, the derivatives don't seem to match PyTorch:
import torch
import torch.nn.functional as F
x = torch.tensor([[1, 2], [3, 4]], dtype=torch.float)
x.requires_grad = True
y = x.var(unbiased=False)
y.backward()
print(x)
# tensor([[1., 2.],
# [3., 4.]], requires_grad=True)
print(y)
# tensor(1.2500, grad_fn=<VarBackward0>)
print(x.grad)
# tensor([[-0.7500, -0.2500],
# [ 0.2500, 0.7500]])
# We are not consistent with this.
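The Swift for TensorFlow side of the same check would be something along these lines (a sketch using the `variance()` added in this PR):

```swift
let x: Tensor<Float> = [[1, 2], [3, 4]]
let xGrad = gradient(at: x) { $0.variance() }
// PyTorch gives [[-0.75, -0.25], [0.25, 0.75]];
// the test above currently expects Tensor(repeating: -0.75, shape: [2, 2]).
```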
Any ideas on fixing this? Is a custom `variance` derivative necessary?
(I would find it strange if a custom derivative is necessary for correctness.)
Seeking help: cc @jekbradbury @jph00 @sgugger
PyTorch has a custom `var_backward` implementation:
Tensor var_backward(const Tensor & grad, const Tensor & self, bool unbiased) {
  return (2.0 / (self.numel() - unbiased)) * grad * (self - self.mean());
}
Tensor var_backward(Tensor grad, const Tensor & self, IntArrayRef dim, bool unbiased, bool keepdim) {
  if (self.dim() == 0) {
    return var_backward(grad, self, unbiased);
  }
  if (!keepdim && self.dim() > 1) {
    grad = unsqueeze_multiple(grad, dim, self.sizes().size());
  }
  return (2.0 / (_safe_size(self.sizes(), dim) - unbiased)) * grad * (self - self.mean(dim, true));
}
I'll try to port it to Swift.
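For reference, a rough Swift sketch of the scalar-reduction case (the name `varianceGradient` is illustrative, not an existing API, and it assumes the biased/unbiased convention from the C++ above):

```swift
import TensorFlow

// Illustrative sketch of PyTorch's scalar-case `var_backward` in Swift.
// For the biased variance, d/dx_i variance(x) == (2 / n) * (x_i - mean(x)).
func varianceGradient(
    _ upstream: Tensor<Float>,
    _ x: Tensor<Float>,
    unbiased: Bool = false
) -> Tensor<Float> {
    let correction: Float = unbiased ? 1 : 0
    let count = Float(x.scalarCount) - correction
    return (2 / count) * upstream * (x - x.mean())
}
```

The axis-reducing overload would additionally need to re-expand `upstream` along the reduced axes (the `unsqueeze_multiple` step above) before broadcasting against `x - x.mean(alongAxes:)`.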
I'm still concerned about derivative correctness - a custom derivative shouldn't be necessary for correctness, especially because `variance` isn't super complicated (it just calls `mean` and `squared`).
Sorry, it's not something that @sgugger or I have looked at before, so we don't have any particular insights here.
Why don't we hand-calculate and verify a simple case? It'd be highly preferable to not make it a primitive differentiable function.
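For the 2×2 example above (n = 4, mean μ = 2.5), the hand calculation is:

∂/∂x_i [ (1/n) Σ_j (x_j − μ)² ] = (2/n) [ (x_i − μ) − (1/n) Σ_j (x_j − μ) ] = (2/n)(x_i − μ) = 0.5 · (x_i − 2.5),

using Σ_j (x_j − μ) = 0. That gives [[-0.75, -0.25], [0.25, 0.75]], which matches PyTorch, so the compositional derivative through `mean` and `squared` should already be correct; no custom `variance` derivative should be needed for correctness.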
PyTorch is correct, and doing
let xGrad = gradient(at: x) { x in (x - x.mean()).squared().mean() }
or
let xGrad = gradient(at: x) { x in x.variance(alongAxes: 0, 1) }
gives me the same result as PyTorch.
The PyTorch gradient is correct. This Swift version also gives the right results:
@differentiable
func variance(_ x: Tensor<Float>) -> Tensor<Float> {
    let mean = x.mean()
    let squaredDiff = (x - mean).squared()
    return squaredDiff.mean()
}
            x.product(squeezingAxes: 0).toHost(shape: []).array)
expectEqual(ShapedArray(shape: [1, 5], scalars: [1, 4, 9, 16, 25]),
            x.product(alongAxes: 0).toHost(shape: []).array)
expectEqual(Tensor(30), x.sum().toHost(shape: []))
Also, the arguments to `toHost(shape:)` don't seem quite right (it shouldn't always be `[]`).
Update: I'll remove the `toHost` calls to force us to re-evaluate this bug.
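For example, the visible assertions above might then read (a sketch that only drops the `toHost(shape:)` calls and assumes `Tensor.array` plus the existing `expectEqual` helper):

```swift
expectEqual(ShapedArray(shape: [1, 5], scalars: [1, 4, 9, 16, 25]),
            x.product(alongAxes: 0).array)
expectEqual(Tensor(30), x.sum())
```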
Reorganize reduction ops: `sum`, `product`, `mean`, `variance`.
Todo: fix variance derivatives. They are currently inconsistent with PyTorch.
Force-pushed from 2964e58 to d8aeafc.
@swift-ci Please test tensorflow
* Add `variance()` and `variance(squeezingAxes:)`, following the `squeezingAxes:` versions of `sum` and `mean`.
* Reorganize reduction ops: `sum`, `product`, `mean`, `variance`.
* Todos: fix the derivatives of the `variance` ops.