Skip to content

[TF] Fix gradients in sum(squeezinAxes:) and mean(squeezinAxes:) #24164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 19, 2019

Conversation

sgugger
Copy link

@sgugger sgugger commented Apr 19, 2019

sum(squeezinAxes:) and mean(squeezinAxes:) were throwing an error during the backward pass because the gradients weren't unsqueezed before being broadcast, this PR fixes it.

Note that this could be refactored nicely if we had a function that took a list of ints for expandingShape.
Second note: I may be wrong, but it seems like _vjpMean(squeezingAxes axes: [Int]) is never used and only the Tensor version is.

sum(squeezinAxes:) and mean(squeezinAxes:) were throwing an error during the bawckward pass because the gradients weren't unsqueezed before being broadcast.
Note that this could be refactored nicely if we had a function that took a list of ints for `expandingShape`.
Second note: I may be wrong, but it seems like `_vjpMean(squeezingAxes axes: [Int])` is never used and only the Tensor<Int32> version is.
@dan-zheng
Copy link
Contributor

Reviewing now.

Copy link
Contributor

@dan-zheng dan-zheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add tests and an expandingShape(at shapeIndices: [Int]) API, but those needn't block progress.

@dan-zheng
Copy link
Contributor

@swift-ci Please test tensorflow

@sgugger
Copy link
Author

sgugger commented Apr 19, 2019

I can work on an expandingShape(at shapeIndices: [Int]). Not sure where to add tests, but I can look.

@dan-zheng
Copy link
Contributor

dan-zheng commented Apr 19, 2019

I can work on an expandingShape(at shapeIndices: [Int]).

That would be great, if you have time!

Not sure where to add tests, but I can look.

Please add tests to test/TensorFlowRuntime/tensor_autodiff_runtime.swift.

Copy link
Contributor

@rxwei rxwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rxwei
Copy link
Contributor

rxwei commented Apr 19, 2019

I can work on an expandingShape(at shapeIndices: [Int]). Not sure where to add tests, but I can look.

That'd be great. You can check it into tensorflow/swift-apis if that's easier!

@dan-zheng
Copy link
Contributor

@swift-ci Please test tensorflow

@sgugger
Copy link
Author

sgugger commented Apr 19, 2019

There are actually commented tests for all squeezing things in the file you mentioned Dan. Is there any reason for that?

@dan-zheng
Copy link
Contributor

There are actually commented tests for all squeezing things in the file you mentioned Dan. Is there any reason for that?

Great catch! squeezingAxes: derivative tests were commented in #22112, exactly because the behavior was not correct/well-defined.

This PR fixes those concerns. Are you in a position to uncomment the tests and run tests locally? If not, I can do so. Please let me know.

@sgugger
Copy link
Author

sgugger commented Apr 19, 2019

I can update with the uncommented tests, just not sure how to run them locally.

@dan-zheng
Copy link
Contributor

I can update with the uncommented tests, just not sure how to run them locally.

Okay. Please update the tests, and I will verify that they pass locally, then trigger CI. Thanks!

@sgugger
Copy link
Author

sgugger commented Apr 19, 2019

Done, when you have a bit of time, please tell me how you can quickly test one of those files so that I don't bother you next time ;)

@dan-zheng
Copy link
Contributor

Done, when you have a bit of time, please tell me how you can quickly test one of those files so that I don't bother you next time ;)

Thanks! Running tests requires a local build of the Swift compiler - if you don't have a local build, quickly testing is difficult.

If you do have a local build, you can run tests using lit.py (the LLVM Integrated Tester). Here's a detailed guide from the Swift forums.

To run this particular test, I run the following (you may need to replace macosx-x86_64):

cd swift-dev # cd into directory containing `build` and `swift`
llvm/utils/lit/lit.py -vvv --param swift_test_mode=optimize build/Ninja-ReleaseAssert+stdlib-Release/swift-macosx-x86_64/test-macosx-x86_64/TensorFlowRuntime/tensor_autodiff_runtime.swift

@dan-zheng
Copy link
Contributor

Verified that tests pass locally. Triggering CI.

@dan-zheng
Copy link
Contributor

@swift-ci Please test tensorflow

@dan-zheng
Copy link
Contributor

@swift-ci Please test tensorflow linux

@rxwei rxwei merged commit 9e089ac into swiftlang:tensorflow Apr 19, 2019
@rxwei
Copy link
Contributor

rxwei commented Apr 19, 2019

Thanks all!

rxwei pushed a commit to rxwei/swift that referenced this pull request May 11, 2019
…ftlang#24164)

* [TF] Fix gradients in sum(squeezinAxes:) and mean(squeezinAxes:)

sum(squeezinAxes:) and mean(squeezinAxes:) were throwing an error during the bawckward pass because the gradients weren't unsqueezed before being broadcast.
Note that this could be refactored nicely if we had a function that took a list of ints for `expandingShape`.
Second note: I may be wrong, but it seems like `_vjpMean(squeezingAxes axes: [Int])` is never used and only the Tensor<Int32> version is.

* Remove unused `_vjpMean` function.

* Update Gradients.swift

* Add test

* Minor edit for consistency.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants