Skip to content

[AutoDiff] Revamp usefulness propagation in activity analysis. #28225

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

dan-zheng
Copy link
Contributor

Useful values are those that contribute to (specific) dependent variables,
i.e. function results.

For addresses: all projections of a useful address should be useful.
This has special support:
DifferentiableActivityInfo::propagateUsefulThroughAddress.

Previously:

  • Usefulness was propagated by iterating through all instructions in
    post-dominance order. This is not efficient because irrelevant instructions
    may be visited.
  • For useful addresses, propagateUsefulThroughAddress propagated
    usefulness one step to projections, but not recursively to users of the
    projections. This caused some values to incorrectly not be marked useful.

Now:

  • Usefulness is propagated by following use-def chains, starting from
    dependent variables (function results). This is handled by the
    following helpers:
    • setUsefulAndPropagateToOperands(SILValue, unsigned): marks a value as
      useful and recursively propagates usefulness through defining instruction
      operands and basic block argument incoming values.
    • propagateUseful(SILInstruction *inst, unsigned): propagates usefulness
      to the operands of the given instruction.
  • DifferentiableActivityInfo::propagateUsefulThroughAddress now calls
    propagateUseful to propagate usefulness recursively through users'
    operands.

Effects:

  • More values are now (correctly) marked as useful, affecting
    non-differentiability diagnostics for active enum values (TF-956) and for-in
    loops (TF-957). Both have room for improvement.

Resolves control flow differentiation correctness issue: TF-954.

@dan-zheng
Copy link
Contributor Author

Created PR as a draft because I feel there's room for improvement.
Suggestions/feedback are welcome!

@@ -1444,8 +1444,18 @@ class DifferentiableActivityInfo {
void setUseful(SILValue value, unsigned dependentVariableIndex);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: currently, setUseful has two users (setUsefulAndPropagateToOperands and propagateUsefulThroughAddress), so it hasn't been inlined.

auto *inst = value->getDefiningInstruction();
if (!inst)
return;
propagateUseful(inst, dependentVariableIndex);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: propagateUseful cannot be inlined because it has multiple users (called multiple times in propagateUsefulThroughAddress.

propagateUsefulThroughAddress(value, dependentVariableIndex);
return;
}
setUseful(value, dependentVariableIndex);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: both setUsefulAndPropagateToOperands and propagateUsefulThroughAddress call isUseful and setUseful, which seems suboptimal. I tried removing setUseful from one of them but ran into infinite loops.

Related: I think setUsefulAndPropagateToOperands should be the primary entry point for propagating usefulness, so I eliminated most direct calls to propagateUsefulThroughAddress.

@dan-zheng dan-zheng requested review from marcrasi and rxwei November 13, 2019 03:22
@dan-zheng dan-zheng added the tensorflow This is for "tensorflow" branch PRs. label Nov 13, 2019
@dan-zheng dan-zheng force-pushed the autodiff-fix-activity-analysis branch 2 times, most recently from 98cb4ec to 05e0b86 Compare November 13, 2019 04:51
Useful values are those that contribute to (specific) dependent variables,
i.e. function results.

For addresses: all projections of a useful address should be useful.
This has special support:
`DifferentiableActivityInfo::propagateUsefulThroughAddress`.

Previously:
- Usefulness was propagated by iterating through all instructions in
  post-dominance order. This is not efficient because irrelevant instructions
  may be visited.
- For useful addresses, `propagateUsefulThroughAddress` propagated
  usefulness one step to projections, but not recursively to users of the
  projections. This caused some values to incorrectly not be marked useful.

Now:
- Usefulness is propagated by following use-def chains, starting from
  dependent variables (function results). This is handled by the
  following helpers:
  - `setUsefulAndPropagateToOperands(SILValue, unsigned)`: marks a value as
    useful and recursively propagates usefulness through defining instruction
    operands and basic block argument incoming values.
  - `propagateUseful(SILInstruction *inst, unsigned)`: propagates usefulness
    to the operands of the given instruction.
- `DifferentiableActivityInfo::propagateUsefulThroughAddress` now calls
  `propagateUseful` to propagate usefulness recursively through users'
  operands.

Effects:
- More values are now (correctly) marked as useful, affecting
  non-differentiability diagnostics for active enum values (TF-956) and for-in
  loops (TF-957). Both have room for improvement.

Resolves control flow differentiation correctness issue: TF-954.
@dan-zheng dan-zheng force-pushed the autodiff-fix-activity-analysis branch from 05e0b86 to af915c0 Compare November 13, 2019 04:58
@dan-zheng dan-zheng marked this pull request as ready for review November 13, 2019 05:04
@dan-zheng
Copy link
Contributor Author

@swift-ci Please test tensorflow

@rxwei
Copy link
Contributor

rxwei commented Nov 13, 2019

@swift-ci Please test tensorflow

2 similar comments
@rxwei
Copy link
Contributor

rxwei commented Nov 13, 2019

@swift-ci Please test tensorflow

@rxwei
Copy link
Contributor

rxwei commented Nov 13, 2019

@swift-ci Please test tensorflow

@dan-zheng dan-zheng merged commit 414e029 into swiftlang:tensorflow Nov 13, 2019
@dan-zheng dan-zheng deleted the autodiff-fix-activity-analysis branch November 13, 2019 17:25
dan-zheng added a commit to dan-zheng/swift that referenced this pull request Nov 16, 2019
Hoist activity marking visited value set out of loop over original bbs.
This is safe because bbs directly start with dominator bbs's active values.

Visit bb arguments for activity marking. This was accidentally deleted in
swiftlang#28225. Re-adding the logic doesn't
seem to affect any tests.
dan-zheng added a commit that referenced this pull request Nov 16, 2019
Hoist activity marking visited value set out of loop over original bbs.
This is safe because bbs directly start with dominator bbs's active values.

Visit bb arguments for activity marking. This was accidentally deleted in
#28225. Re-adding the logic doesn't
seem to affect any tests.
bgogul pushed a commit that referenced this pull request Nov 19, 2019
Useful values are those that contribute to (specific) dependent variables,
i.e. function results.

For addresses: all projections of a useful address should be useful.
This has special support:
`DifferentiableActivityInfo::propagateUsefulThroughAddress`.

Previously:
- Usefulness was propagated by iterating through all instructions in
  post-dominance order. This is not efficient because irrelevant instructions
  may be visited.
- For useful addresses, `propagateUsefulThroughAddress` propagated
  usefulness one step to projections, but not recursively to users of the
  projections. This caused some values to incorrectly not be marked useful.

Now:
- Usefulness is propagated by following use-def chains, starting from
  dependent variables (function results). This is handled by the
  following helpers:
  - `setUsefulAndPropagateToOperands(SILValue, unsigned)`: marks a value as
    useful and recursively propagates usefulness through defining instruction
    operands and basic block argument incoming values.
  - `propagateUseful(SILInstruction *inst, unsigned)`: propagates usefulness
    to the operands of the given instruction.
- `DifferentiableActivityInfo::propagateUsefulThroughAddress` now calls
  `propagateUseful` to propagate usefulness recursively through users'
  operands.

Effects:
- More values are now (correctly) marked as useful, affecting
  non-differentiability diagnostics for active enum values (TF-956) and for-in
  loops (TF-957). Both have room for improvement.

Resolves control flow differentiation correctness issue: TF-954.
dan-zheng added a commit that referenced this pull request Nov 21, 2019
Hoist activity marking visited value set out of loop over original bbs.
This is safe because bbs directly start with dominator bbs's active values.

Visit bb arguments for activity marking. This was accidentally deleted in
#28225. Re-adding the logic doesn't
seem to affect any tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tensorflow This is for "tensorflow" branch PRs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants