Skip to content

[AutoDiff] Support differentiation of loops. #25558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 19, 2019

Conversation

dan-zheng
Copy link
Contributor

@dan-zheng dan-zheng commented Jun 18, 2019

  • Change predecessor enum generation to support loops.
    • Generate indirect predecessor enums for BBs in loops.
    • Handle boxed payloads of indirect enums.
  • Traverse basic blocks in post-order post-dominance order.
    • This is necessary for computational correctness.
  • Add loop tests (for-in, while, repeat-while, break/continue, nested).

Examples:

func for_loop(_ x: Float) -> Float {
  var result = x
  for _ in 1..<3 {
    result = result * x
  }
  return result
}
expectEqual((8, 12), valueWithGradient(at: 2, in: for_loop))
expectEqual((27, 27), valueWithGradient(at: 3, in: for_loop))
func nested_loop(_ x: Float, count: Int) -> Float {
  var outer = x
  for _ in 1..<count {
    outer = outer * x

    var inner = outer
    var i = 1
    while i < count {
      inner = inner + x
      i += 1
    }
    outer = inner
  }
  return outer
}
expectEqual((20, 22), valueWithGradient(at: 2, in: { x in nested_loop(x, count: 3) }))
expectEqual((104, 66), valueWithGradient(at: 4, in: { x in nested_loop(x, count: 3) }))

Verified loop differentiation correctness against PyTorch and Tangent.

Exposed TF-584: derivative computation bug for repeat-while loops.
Exposed TF-585: AllocBoxToStack crash with -O for nested loop AD.

@dan-zheng dan-zheng added the tensorflow This is for "tensorflow" branch PRs. label Jun 18, 2019
@dan-zheng dan-zheng requested a review from rxwei June 18, 2019 06:55
@dan-zheng dan-zheng force-pushed the autodiff-loops branch 3 times, most recently from 8ebed4a to eab1067 Compare June 18, 2019 07:06
- Change predecessor enum generation to support loops.
  - Generate indirect predecessor enums for BBs in loops.
  - Handle boxed payloads of indirect enums.
- Traverse basic blocks in post-order post-dominance order.
  - This is necessary for computational correctness.
- Add loop tests (`for-in`, `while`, nested).
if (!boxType)
return builder.createEnum(loc, pbStructVal, enumEltDecl, enumLoweredTy);
// Otherwise, box the pullback struct value and create an enum.
auto *allocBox = builder.createAllocBox(loc, boxType);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This alloc_box is not being deallocated and may cause leaks - to be verified.
Loop AD leak checking tests are added in test/AutoDiff/leakchecking.swift.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved - see this comment. Deallocation is handled automatically during release_value for pullback structs.

/// Mapping from original basic blocks to local adjoint values to be cleaned
/// up. This is populated when adjoint emission is run on one basic block and
/// cleaned before processing another basic block.
DenseMap<SILBasicBlock *, SmallVector<AdjointValue, 8>>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: changed blockLocalAdjointValues to a DenseMap for more principled/predictable behavior.
Previously, using SmallVector caused bugs when the original entry BB isn't the last visited BB during adjoint generation.

Copy link
Contributor

@rxwei rxwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any emission of dealloc_box and this is apparently leaking a lot more memory (all pullback structs in loops are kept alive). This needs to be fixed before merging.
Actually, this should be handled automatically when you release_value the pullback struct.

expectEqual((8, 12), valueWithGradient(at: 2, in: for_loop))
expectEqual((27, 27), valueWithGradient(at: 3, in: for_loop))

func while_loop(_ x: Float) -> Float {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also test a repeat-while loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests for repeat-while loops and continue/break statements in 32f6629.
Exposed TF-584: derivative computation bug for repeat-while loops. Thanks for pointing this out!

Copy link
Contributor

@rxwei rxwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch is pretty clean. LGTM!

Add tests for repeat-while loops and break/continue statements.
Expose TF-584: incorrect derivative computation for repeat-while loops.
@dan-zheng
Copy link
Contributor Author

@swift-ci Please test tensorflow

@dan-zheng
Copy link
Contributor Author

dan-zheng commented Jun 18, 2019

Two tests fail only with optimizations, which I hadn't tested locally:

Failing Tests (2):
    Swift(linux-x86_64) :: AutoDiff/leakchecking.swift
    Swift(linux-x86_64) :: AutoDiff/control_flow.swift
bool swift::StackNesting::solve(): Assertion `!(BI.Block->getTerminator()->isFunctionExiting() && Bits.any()) && "stack location is missing dealloc"' failed.

Perhaps there are some missing deallocations in generated code. Will investigate later.

EDIT: Filed TF-585 with details about the AllocBoxToStack crash.

@dan-zheng
Copy link
Contributor Author

@swift-ci Please test tensorflow

Generate `fix_lifetime` instruction on `alloc_box` to avoid crash.
@dan-zheng
Copy link
Contributor Author

@swift-ci Please test tensorflow

@dan-zheng dan-zheng merged commit 766a1de into swiftlang:tensorflow Jun 19, 2019
@dan-zheng dan-zheng deleted the autodiff-loops branch June 19, 2019 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tensorflow This is for "tensorflow" branch PRs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants