Skip to content

[AutoDiff] Fix autodiff_function_extract operand ownership kind and memory leaks. #27199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 16, 2019

Conversation

rxwei
Copy link
Contributor

@rxwei rxwei commented Sep 16, 2019

The autodiff_function_extract instruction behaves like tuple_extract, where it extracts some element from an aggregate. Its operand should have the same ownership kind as that of tuple_extract. That is, it should be defined as CONSTANT_OR_TRIVIAL_OWNERSHIP_INST(Guaranteed, AutoDiffFunctionInst) in ValueOwnershipKindClassifier.

However, this is currently defined wrongly as FORWARDING_OWNERSHIP_INST(AutoDiffFunctionExtract), which caused a bug in the differentiation transform to be uncaught: VJPEmitter and JVPEmitter in the differentiation transform is performing autodiff_function_extract on an @owned @differentiable function, which caused associated functions that are not extracted to be not released:

%f = autodiff_function %original
%f_vjp = autodiff_function_extract [vjp] %f
... // %f is not released, and not caught by ownership verification!

After we fix the operand ownership kind for autodiff_function_extract, all these cases are now caught by ownership verification. The reproducer in TF-795 and most differentiation tests are failing to compile because ownership verification caught the bug in AD-generated code. The existing AD test suite is serving as good test cases for this ownership error.

To fix this, VJPEmitter and JVPEmitter are now changed to emit borrows of @differentiable functions and copies of associated functions and property destroying the @differentiable function:

%f = autodiff_function %original
%f_borrowed = begin_borrow %f
%f_vjp_extracted = autodiff_function_extract [vjp] %f_borrowed
%f_vjp = copy_value %f_vjp_extracted
end_borrow %f_borrowed
destroy_value %f

Fixes TF-795.

The `autodiff_function_extract` instruction behaves like `tuple_extract`, where it extracts some element from an aggregate. Its operand should have the same ownership kind as that of `tuple_extract`. That is, it should be defined as `CONSTANT_OR_TRIVIAL_OWNERSHIP_INST(Guaranteed, AutoDiffFunctionInst)` in ValueOwnershipKindClassifier.

However, this is currently defined wrongly as `FORWARDING_OWNERSHIP_INST(AutoDiffFunctionExtract)`, which caused a bug in the differentiation transform to be uncaught: VJPEmitter and JVPEmitter in the differentiation transform is performing `autodiff_function_extract` on an `@owned` `@differentiable` function, which caused associated functions that are not extracted to be not released:

```
%f = autodiff_function %original
%f_vjp = autodiff_function_extract [vjp] %f
... // %f is not released, and not caught by ownership verification!
```

After we fix the operand ownership kind for `autodiff_function_extract`, all these cases are now caught by ownership verification. The reproducer in [TF-795](https://bugs.swift.org/browse/TF-795) and most differentiation tests are failing to compile because ownership verification caught the bug in AD-generated code. The existing AD test suite is serving as good test cases for this ownership error.

To fix this, VJPEmitter and JVPEmitter are now changed to emit borrows of `@differentiable` functions and copies of associated functions and property destroying the `@differentiable` function:

```
%f = autodiff_function %original
%f_borrowed = begin_borrow %f
%f_vjp_extracted = autodiff_function_extract [vjp] %f_borrowed
%f_vjp = copy_value %f_vjp_extracted
end_borrow %f_borrowed
destroy_value %f
```

Fixes [TF-795](https://bugs.swift.org/browse/TF-795).
@rxwei rxwei added the tensorflow This is for "tensorflow" branch PRs. label Sep 16, 2019
@rxwei rxwei requested a review from dan-zheng September 16, 2019 18:55
@rxwei
Copy link
Contributor Author

rxwei commented Sep 16, 2019

@swift-ci please test tensorflow

@rxwei rxwei merged commit 0641a22 into swiftlang:tensorflow Sep 16, 2019
@rxwei rxwei deleted the adfuncextract_ownership branch September 16, 2019 19:24
@saeta
Copy link
Contributor

saeta commented Sep 17, 2019

Can you please add a unit test?

@rxwei
Copy link
Contributor Author

rxwei commented Sep 17, 2019

@saeta As explained in the PR description, failures are now caught by ownership verification after operand ownership kind is fixed, so any leak would cause all AD tests to fail to compile. Since this is a closure heap allocation leak, a refcount-tracked type will not work on the TF-795 reproducer so it's not trivially unit-testable. It can only be caught by more general leak-checking-enabled CI. The substantial fix here is to define the right operand ownership kind.

rxwei added a commit to rxwei/swift that referenced this pull request Sep 17, 2019
… memory leaks. (swiftlang#27199)

The `autodiff_function_extract` instruction behaves like `tuple_extract`, where it extracts some element from an aggregate. Its operand should have the same ownership kind as that of `tuple_extract`. That is, it should be defined as `CONSTANT_OR_TRIVIAL_OWNERSHIP_INST(Guaranteed, AutoDiffFunctionInst)` in ValueOwnershipKindClassifier.

However, this is currently defined wrongly as `FORWARDING_OWNERSHIP_INST(AutoDiffFunctionExtract)`, which caused a bug in the differentiation transform to be uncaught: VJPEmitter and JVPEmitter in the differentiation transform is performing `autodiff_function_extract` on an `@owned` `@differentiable` function, which caused associated functions that are not extracted to be not released:

```
%f = autodiff_function %original
%f_vjp = autodiff_function_extract [vjp] %f
... // %f is not released, and not caught by ownership verification!
```

After we fix the operand ownership kind for `autodiff_function_extract`, all these cases are now caught by ownership verification. The reproducer in [TF-795](https://bugs.swift.org/browse/TF-795) and most differentiation tests are failing to compile because ownership verification caught the bug in AD-generated code. The existing AD test suite is serving as good test cases for this ownership error.

To fix this, VJPEmitter and JVPEmitter are now changed to emit borrows of `@differentiable` functions and copies of associated functions and property destroying the `@differentiable` function:

```
%f = autodiff_function %original
%f_borrowed = begin_borrow %f
%f_vjp_extracted = autodiff_function_extract [vjp] %f_borrowed
%f_vjp = copy_value %f_vjp_extracted
end_borrow %f_borrowed
destroy_value %f
```

Fixes [TF-795](https://bugs.swift.org/browse/TF-795).

(cherry picked from commit 0641a22)
rxwei added a commit that referenced this pull request Oct 12, 2019
…27638)

`ValueOwnershipKindClassifier` and `OperandOwnershipKindClassifier` should have the same classification for `autodiff_function_extract`. `ValueOwnershipKindClassifier`'s classification was fixed by #27199, which gave the correct ownership verification results. Now we fix it in `OperandOwnershipKindClassifier`.

`DifferentiableFunctionExtractOriginalExpr`'s SILGen is now corrected to borrowing the argument and emitting a copy for the extracted original function.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tensorflow This is for "tensorflow" branch PRs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants