
[AutoDiff] Destroy all pullback indirect results after adjoint accumulation. #27711

Merged
merged 2 commits into from
Oct 16, 2019

Conversation

rxwei
Contributor

@rxwei rxwei commented Oct 16, 2019

When we differentiate a function (example below) with respect to a proper subset of its indirect parameters, and the function only has a derivative with respect to a proper superset of those parameters, the pullback returns more indirect results than we need. The unneeded indirect results are never destroyed, which causes a memory lifetime verification failure. This patch fixes the bug by destroying all pullback indirect results after adjoint accumulation, instead of only the ones needed to calculate the derivative.

@differentiable(wrt: x)
func foo<T: Differentiable>(_ x: T, _ y: T, apply: @differentiable (T, T) -> T) -> T {
  return apply(x, y)
}
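
The ownership rule behind the fix can be sketched in plain Swift, without any differentiable-programming features. This is only an illustrative model, not the compiler's implementation: `Counter`, `Tracked`, `pullback`, and `gradient` are hypothetical names, the pullback is assumed to be that of `x * y`, and `UnsafeMutablePointer` buffers stand in for indirect results. The point it shows is that every buffer the pullback initializes must be destroyed, including the ones the caller never reads.

```swift
// Hypothetical model: counts values that are initialized but not yet destroyed.
final class Counter { var live = 0 }

final class Tracked {
    let value: Double
    let counter: Counter
    init(_ value: Double, _ counter: Counter) {
        self.value = value
        self.counter = counter
        counter.live += 1
    }
    deinit { counter.live -= 1 }
}

// Assumed pullback of `apply(x, y) = x * y`. It always writes a gradient for
// BOTH parameters into caller-provided buffers (the "indirect results").
func pullback(seed: Double, x: Double, y: Double, counter: Counter,
              into results: [UnsafeMutablePointer<Tracked>]) {
    results[0].initialize(to: Tracked(seed * y, counter))  // with respect to x
    results[1].initialize(to: Tracked(seed * x, counter))  // with respect to y
}

// Differentiate with respect to `wanted`, a subset of the parameter indices.
func gradient(x: Double, y: Double, wanted: Set<Int>,
              counter: Counter) -> [Int: Double] {
    let buffers = (0..<2).map { _ in
        UnsafeMutablePointer<Tracked>.allocate(capacity: 1)
    }
    defer { buffers.forEach { $0.deallocate() } }

    pullback(seed: 1, x: x, y: y, counter: counter, into: buffers)

    // Adjoint accumulation: read only the results that were asked for.
    var adjoints: [Int: Double] = [:]
    for i in wanted.sorted() {
        adjoints[i] = buffers[i].pointee.value
    }

    // Destroy every result the pullback produced. Destroying only the
    // `wanted` ones would leave the other buffer initialized and leak it.
    for buffer in buffers {
        buffer.deinitialize(count: 1)
    }
    return adjoints
}
```

Calling `gradient(x: 3, y: 4, wanted: [0], counter: counter)` in this model reads only the first result, yet both buffers are destroyed, so `counter.live` returns to zero.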

This patch also uncomments a test in test/AutoDiff/superset_adjoint.swift, which now passes. This resolves a FIXME.

Resolves TF-914.

…lation.

When we differentiate a function (example below) with respect to a proper subset of its indirect parameters and when the function only has a derivative with respect to a superset of those indirect parameters, the pullback returns more indirect results than we need. However, unneeded indirect results are not destroyed, which causes a memory lifetime verification failure. This patch fixes this bug by releasing all pullback indirect results instead of just releasing the ones needed for calculating the derivative.

Resolves [TF-914](https://bugs.swift.org/browse/TF-914).
@rxwei rxwei added the tensorflow This is for "tensorflow" branch PRs. label Oct 16, 2019
@rxwei rxwei requested review from dan-zheng and marcrasi October 16, 2019 00:09
// FIXME: The expression `(+) as @differentiable (Float, @nondiff Float) -> Float)`
// forms a curry thunk of `Float.+` before conversion to @differentiable, and AD
// doesn't know how to differentiate the curry thunk, so it produces a
// "function is not differentiable" error.
// FIXME: Propagate wrt indices correctly so that this actually takes the
Contributor Author

This FIXME referred to lowering parameter indices to @nondiff parameters, which was done long ago.

@rxwei
Contributor Author

rxwei commented Oct 16, 2019

@swift-ci please test tensorflow

@rxwei rxwei merged commit 0d17ddf into swiftlang:tensorflow Oct 16, 2019
@rxwei rxwei deleted the TF-914 branch October 16, 2019 01:02
Labels
tensorflow This is for "tensorflow" branch PRs.