[AutoDiff] Revamp usefulness propagation in activity analysis. #28225

dan-zheng · 2019-11-13T03:16:16Z

Useful values are those that contribute to (specific) dependent variables,
i.e. function results.

For addresses: all projections of a useful address should be useful.
This has special support:
DifferentiableActivityInfo::propagateUsefulThroughAddress.

Previously:

- Usefulness was propagated by iterating through all instructions in
  post-dominance order. This is not efficient because irrelevant instructions
  may be visited.
- For useful addresses, propagateUsefulThroughAddress propagated
  usefulness one step to projections, but not recursively to users of the
  projections. This caused some values to incorrectly not be marked useful.
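To illustrate why one-step propagation can miss values, here is a minimal sketch on a toy model. It is not the code in Differentiation.cpp: ToyAddress, ToyUser, markOneStep, markRecursive, and the integer value ids are all hypothetical names made up for this example. The assumption is that a projection can have users (e.g. a store) whose operands should also become useful.

```cpp
#include <set>
#include <vector>

// Hypothetical toy model (not SIL): a user instruction lists its operand
// values by integer id; an address has projections and user instructions.
struct ToyUser {
  std::vector<int> operands;
};

struct ToyAddress {
  int id = 0;
  std::vector<ToyAddress *> projections;
  std::vector<ToyUser *> users;
};

// One-step behaviour: mark the address and its direct projections only.
inline std::set<int> markOneStep(const ToyAddress &addr) {
  std::set<int> useful{addr.id};
  for (const ToyAddress *projection : addr.projections)
    useful.insert(projection->id);
  return useful;
}

// Recursive behaviour: also mark the operands of every user, and recurse
// into nested projections. The insert check stops repeated visits.
inline void markRecursive(const ToyAddress &addr, std::set<int> &useful) {
  if (!useful.insert(addr.id).second)
    return;
  for (const ToyUser *user : addr.users)
    for (int operand : user->operands)
      useful.insert(operand);
  for (const ToyAddress *projection : addr.projections)
    markRecursive(*projection, useful);
}

// Base address 0 has a field projection 1; a store-like user of the
// projection has operands {2 (stored value), 1 (destination)}.
inline bool oneStepMissesStoredValue() {
  ToyUser store;
  store.operands = {2, 1};
  ToyAddress field;
  field.id = 1;
  field.users = {&store};
  ToyAddress base;
  base.id = 0;
  base.projections = {&field};

  std::set<int> oneStep = markOneStep(base);
  std::set<int> recursive;
  markRecursive(base, recursive);
  return oneStep.count(2) == 0 && recursive.count(2) == 1;
}
```

In this toy setup the stored value (id 2) is only reached once the walk continues from the projection to its users.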

Now:

- Usefulness is propagated by following use-def chains, starting from
  dependent variables (function results). This is handled by the
  following helpers:
  - setUsefulAndPropagateToOperands(SILValue, unsigned): marks a value as
    useful and recursively propagates usefulness through defining instruction
    operands and basic block argument incoming values.
  - propagateUseful(SILInstruction *inst, unsigned): propagates usefulness
    to the operands of the given instruction.
- DifferentiableActivityInfo::propagateUsefulThroughAddress now calls
  propagateUseful to propagate usefulness recursively through users'
  operands.
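The use-def walk described above can be sketched on a toy value graph. This is a simplified illustration, not the actual implementation: ToyValue and ToyActivityInfo are hypothetical stand-ins for SILValue and DifferentiableActivityInfo, and here propagateUseful takes a value rather than a SILInstruction to keep the model small.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a SIL value. `operands` holds the operands of the
// defining instruction, or the incoming values of a basic block argument.
struct ToyValue {
  std::vector<ToyValue *> operands;
};

class ToyActivityInfo {
  // One set of useful values per dependent variable (function result).
  std::vector<std::unordered_set<const ToyValue *>> usefulSets;

public:
  explicit ToyActivityInfo(std::size_t numDependentVariables)
      : usefulSets(numDependentVariables) {}

  bool isUseful(const ToyValue *value, unsigned dependentVariableIndex) const {
    assert(dependentVariableIndex < usefulSets.size());
    return usefulSets[dependentVariableIndex].count(value) != 0;
  }

  // Marks `value` useful, then follows use-def edges to its operands.
  void setUsefulAndPropagateToOperands(ToyValue *value,
                                       unsigned dependentVariableIndex) {
    assert(dependentVariableIndex < usefulSets.size());
    // `insert` reports whether the value was newly added; values that are
    // already useful are skipped, so each value is expanded at most once.
    if (!usefulSets[dependentVariableIndex].insert(value).second)
      return;
    propagateUseful(value, dependentVariableIndex);
  }

  // Propagates usefulness to the operands that define `value`.
  void propagateUseful(ToyValue *value, unsigned dependentVariableIndex) {
    for (ToyValue *operand : value->operands)
      setUsefulAndPropagateToOperands(operand, dependentVariableIndex);
  }
};

// Builds result = f(x, g(y)) plus an unrelated value, then checks that only
// values reachable from the result are marked useful.
inline bool toyUseDefExample() {
  ToyValue x, y, unrelated, g, result;
  g.operands = {&y};
  result.operands = {&x, &g};

  ToyActivityInfo info(1);
  info.setUsefulAndPropagateToOperands(&result, 0);
  return info.isUseful(&result, 0) && info.isUseful(&x, 0) &&
         info.isUseful(&g, 0) && info.isUseful(&y, 0) &&
         !info.isUseful(&unrelated, 0);
}
```

Starting from the result means the unrelated value is never visited, which is the efficiency argument made above against iterating over all instructions.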

Effects:

More values are now (correctly) marked as useful, affecting
non-differentiability diagnostics for active enum values (TF-956) and for-in
loops (TF-957). Both have room for improvement.

Resolves control flow differentiation correctness issue: TF-954.

dan-zheng · 2019-11-13T03:17:55Z

Created PR as a draft because I feel there's room for improvement.
Suggestions/feedback are welcome!

dan-zheng · 2019-11-13T03:18:42Z

lib/SILOptimizer/Mandatory/Differentiation.cpp

@@ -1444,8 +1444,18 @@ class DifferentiableActivityInfo {
  void setUseful(SILValue value, unsigned dependentVariableIndex);


Note: currently, setUseful has two users (setUsefulAndPropagateToOperands and propagateUsefulThroughAddress), so it hasn't been inlined.

dan-zheng · 2019-11-13T03:20:09Z

lib/SILOptimizer/Mandatory/Differentiation.cpp

+  auto *inst = value->getDefiningInstruction();
+  if (!inst)
+    return;
+  propagateUseful(inst, dependentVariableIndex);


Note: propagateUseful cannot be inlined because it has multiple users (it is called multiple times in propagateUsefulThroughAddress).

dan-zheng · 2019-11-13T03:22:37Z

lib/SILOptimizer/Mandatory/Differentiation.cpp

+    propagateUsefulThroughAddress(value, dependentVariableIndex);
+    return;
+  }
+  setUseful(value, dependentVariableIndex);


Note: both setUsefulAndPropagateToOperands and propagateUsefulThroughAddress call isUseful and setUseful, which seems suboptimal. I tried removing setUseful from one of them but ran into infinite loops.

Related: I think setUsefulAndPropagateToOperands should be the primary entry point for propagating usefulness, so I eliminated most direct calls to propagateUsefulThroughAddress.
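The comment above does not say what caused the infinite loops. One plausible reading, stated here as an assumption, is that the value graph can contain cycles (for example through loop-carried basic block arguments), so a walk that recurses without first checking whether a value is already marked never terminates. The sketch below uses hypothetical names (ToyNode, ToyWalker, walkTerminates) and a step budget so the unguarded case stops in the demo.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical toy node: `operands` are the values it depends on.
struct ToyNode {
  std::vector<ToyNode *> operands;
};

class ToyWalker {
  std::unordered_set<const ToyNode *> useful;
  bool checkBeforeRecursing; // illustration-only switch for the guard
  std::size_t stepsLeft;     // budget so the unguarded walk stops in the demo
  bool exhausted = false;

public:
  ToyWalker(bool check, std::size_t budget)
      : checkBeforeRecursing(check), stepsLeft(budget) {}

  void visit(ToyNode *node) {
    if (stepsLeft == 0) {
      exhausted = true;
      return;
    }
    --stepsLeft;
    // The guard: stop if this node is already marked useful.
    if (checkBeforeRecursing && useful.count(node) != 0)
      return;
    useful.insert(node);
    for (ToyNode *operand : node->operands)
      visit(operand);
  }

  bool ranOutOfBudget() const { return exhausted; }
};

// Two nodes that depend on each other, like a loop-carried value.
inline bool walkTerminates(bool checkBeforeRecursing) {
  ToyNode a, b;
  a.operands = {&b};
  b.operands = {&a};
  ToyWalker walker(checkBeforeRecursing, 1000);
  walker.visit(&a);
  return !walker.ranOutOfBudget();
}
```

With the guard the walk stops after revisiting the first node; without it the walk only ends when the budget runs out. If the real code has two mutually recursive entry points, each one that can be reached first would need its own check, which may be why neither setUseful call could be dropped.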

test/AutoDiff/activity_analysis.swift


dan-zheng · 2019-11-13T05:30:29Z

@swift-ci Please test tensorflow

rxwei · 2019-11-13T08:49:28Z

@swift-ci Please test tensorflow

rxwei · 2019-11-13T08:49:32Z

@swift-ci Please test tensorflow

rxwei · 2019-11-13T08:49:37Z

@swift-ci Please test tensorflow

Hoist activity marking visited value set out of loop over original bbs. This is safe because bbs directly start with dominator bbs's active values. Visit bb arguments for activity marking. This was accidentally deleted in swiftlang#28225. Re-adding the logic doesn't seem to affect any tests.



dan-zheng requested review from marcrasi and rxwei November 13, 2019 03:22

rxwei reviewed Nov 13, 2019


dan-zheng added the tensorflow label (for "tensorflow" branch PRs) Nov 13, 2019

dan-zheng force-pushed the autodiff-fix-activity-analysis branch 2 times, most recently from 98cb4ec to 05e0b86 Compare November 13, 2019 04:51

dan-zheng force-pushed the autodiff-fix-activity-analysis branch from 05e0b86 to af915c0 Compare November 13, 2019 04:58

dan-zheng marked this pull request as ready for review November 13, 2019 05:04

rxwei approved these changes Nov 13, 2019


dan-zheng merged commit 414e029 into swiftlang:tensorflow Nov 13, 2019

dan-zheng deleted the autodiff-fix-activity-analysis branch November 13, 2019 17:25

dan-zheng mentioned this pull request Nov 16, 2019

[AutoDiff] Minor activity analysis changes. #28301

Merged

dan-zheng mentioned this pull request Nov 21, 2019

[AutoDiff] Minor activity analysis changes. #28409

Merged
