[opt-remark] When looking for debug_value users, look modulo RC Identity preserving users. #33754

gottesmm · 2020-09-02T06:27:21Z

A key concept in late ARC optimization is "RC Identity". In short, a result of
an instruction is rc-identical to an operand of the instruction if one can
safely move a retain (release) from before the instruction on the result to one
after on the operand without changing the program semantics. This creates a
simple model in which one can work on equivalence classes of rc-identical values
(generally using a dominating definition as the representative) and thus
optimize and pair retains and releases.
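
To make the equivalence-class idea concrete, here is a minimal sketch in Python rather than the compiler's own code. The `Value` model, the `rc_root` helper, and the set of instruction kinds treated as rc-identity preserving are all assumptions made for illustration; the real optimizer's rules are more nuanced.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical instruction kinds assumed to forward the reference count of
# their single operand. This list is illustrative, not the compiler's.
RC_IDENTITY_PRESERVING = {"upcast", "unchecked_ref_cast", "single_owner_struct"}


@dataclass(frozen=True)
class Value:
    name: str
    kind: str = "argument"             # kind of the defining instruction
    operand: Optional["Value"] = None  # the forwarded operand, if any


def rc_root(value):
    """Walk up through rc-identity preserving definitions to a representative."""
    while value.kind in RC_IDENTITY_PRESERVING and value.operand is not None:
        value = value.operand
    return value


def rc_identical(a, b):
    """Two values are in the same class if they share a representative."""
    return rc_root(a) is rc_root(b)


arg = Value("%0")
array = Value("%1", "single_owner_struct", arg)
unrelated = Value("%2")
print(rc_identical(arg, array), rc_identical(array, unrelated))
```

Using the topmost definition as the representative mirrors the "dominating definition" mentioned above, under the assumption that each forwarding instruction has exactly one relevant operand.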

When preparing for late ARC optimization, the optimizer normalizes aggregate
ARC operations (retain_value, release_value) into individual strong_retain and
strong_release operations on the non-trivial leaf fields of the aggregate. As an
example, a retain_value on a KlassPair would be canonicalized into two
strong_retain, one for the lhs and one for the rhs. When this is done, the
optimizer generally just creates new struct_extract instructions at the point
where the retain is. In such a case, the debug_value for the underlying value
may actually be on a re-formed aggregate whose underlying parts we are
retaining:

bb0(%0 : $Builtin.NativeObject):
  strong_retain %0
  %1 = struct $Array(%0 : $Builtin.NativeObject, ...)
  debug_value %1 : $Array, ...
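
The canonicalization described above can be sketched roughly as follows. This is an illustration in Python, not the optimizer's implementation: the `STRUCTS` and `TRIVIAL` tables and the `expand_retain_value` helper are hypothetical, and the struct layouts are borrowed from the example types used in this description.

```python
# Hypothetical type table: struct name -> list of (field, field type).
# Any type not listed here is treated as a leaf.
STRUCTS = {
    "KlassPair": [("lhs", "Klass"), ("rhs", "Klass")],
    "StateWithOwningPointer": [("state", "TrivialState"), ("owningPtr", "Klass")],
}
# Leaf types assumed to need no reference counting.
TRIVIAL = {"TrivialState"}


def expand_retain_value(type_name, path=()):
    """Return the access paths of the non-trivial leaves of an aggregate.

    Each returned path stands for one strong_retain that would replace the
    original retain_value on the whole aggregate.
    """
    if type_name in TRIVIAL:
        return []
    fields = STRUCTS.get(type_name)
    if fields is None:
        return [path]  # non-trivial leaf: retain it directly
    leaves = []
    for field_name, field_type in fields:
        leaves.extend(expand_retain_value(field_type, path + (field_name,)))
    return leaves


print(expand_retain_value("KlassPair"))
print(expand_retain_value("StateWithOwningPointer"))
```

A type with exactly one non-trivial leaf, like `StateWithOwningPointer` here, is the "single owning pointer" situation where retaining the leaf is equivalent to retaining the whole aggregate.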

By looking through RC identical uses, we can handle a large subset of these
cases without much effort: ones where there is a single owning pointer, like
Array. To handle more complex cases we would have to calculate the inverse
access path needed to get back to our value and somehow deal with all of the
complexity therein (I am sure we can do it; I just haven't thought through all
of the details).
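
As a rough sketch of what "looking through RC identical uses" means for the search, the Python below walks the users of a value, descending through users assumed to preserve rc-identity until it reaches a debug_value. The `Inst` model, the `find_debug_names` helper, the set of forwarding kinds, and the variable name "myArray" are all hypothetical and only stand in for the real def-use graph.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical kinds assumed to preserve rc-identity of their operand.
RC_IDENTITY_PRESERVING = {"single_owner_struct", "upcast"}


@dataclass
class Inst:
    kind: str
    name: str = ""                             # variable name (debug_value only)
    users: list = field(default_factory=list)  # users of this result


def find_debug_names(inst):
    """Return (name, via_rc_identity) for each debug_value reachable from inst.

    via_rc_identity is False for a direct debug_value user and True when the
    debug_value was found by looking through an rc-identity preserving user.
    """
    found = []
    seen = set()
    worklist = deque((user, False) for user in inst.users)
    while worklist:
        user, indirect = worklist.popleft()
        if id(user) in seen:
            continue
        seen.add(id(user))
        if user.kind == "debug_value":
            found.append((user.name, indirect))
        elif user.kind in RC_IDENTITY_PRESERVING:
            worklist.extend((u, True) for u in user.users)
    return found


# Mirrors the small SIL snippet above: %0 is retained, then wrapped in an
# Array-like struct that carries the debug_value.
arg = Inst("argument")
array = Inst("single_owner_struct", users=[Inst("debug_value", name="myArray")])
arg.users = [Inst("strong_retain"), array]
print(find_debug_names(arg))
```

The sketch only follows users it believes are rc-identity preserving, so an aggregate with several owning pointers would simply not be looked through, which matches the limitation described above.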

The only interesting behavior that results from this is that when we emit
diagnostics, we just use the name of the debug_value found on the rc-identical
transitive use, without a projection path. This is because the source location
associated with that debug_value belongs to a separate value that is
rc-identical to the actual value that we visited during our opt-remark traversal
up the def-use graph. Consider the following example, noting the comments in the
SIL itself that show what I attempted to explain above.

struct KlassPair {
  var lhs: Klass
  var rhs: Klass
}

struct StateWithOwningPointer {
  var state: TrivialState
  var owningPtr: Klass
}

sil @theFunction : $@convention(thin) () -> () {
bb0:
  %0 = apply %getKlassPair() : $@convention(thin) () -> @owned KlassPair
  // This debug_value's name can be combined...
  debug_value %0 : $KlassPair, name "myPair"
  // ... with the access path from the struct_extract here...
  %1 = struct_extract %0 : $KlassPair, #KlassPair.lhs
  // ... to emit a nice diagnostic that 'myPair.lhs' is being retained.
  strong_retain %1 : $Klass

  // In contrast, in the case below we rely on looking through rc-identity uses
  // to find the debug_value. Here the source info associated with the
  // debug_value (on %3) is no longer associated with the access path we have
  // been tracking upwards (%1a is in our access path list). Instead, we know
  // that the value with the debug_value is rc-identical to the value we were
  // originally tracking up (%1a), so the correct identifier to use is the
  // debug_value's name alone (without an access path), since that source
  // identifier must name some value in the source that is by itself
  // rc-identical to whatever is being manipulated. If we were to emit the
  // access path here for an rc-identical use, we would get
  // "myAdditionalState.owningPtr", which is misleading since
  // ArrayWrapperWithMoreState does not have a field named 'owningPtr'; its
  // subfield array does. Using the name alone is still correct, since
  // rc-identity means a retain_value on the value with the debug_value upon
  // it is equivalent to a retain of the access path value we found by walking
  // up the def-use graph from our strong_retain's operand.
  %0a = apply %getStateWithOwningPointer() : $@convention(thin) () -> @owned StateWithOwningPointer
  %1a = struct_extract %0a : $StateWithOwningPointer, #StateWithOwningPointer.owningPtr
  strong_retain %1a : $Klass
  %2 = struct $Array(%1a : $Klass, ...)
  %3 = struct $ArrayWrapperWithMoreState(%2 : $Array, %moreState : $MoreState)
  debug_value %3 : $ArrayWrapperWithMoreState, name "myAdditionalState"
}
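
The naming rule illustrated by the two cases in the SIL above can be summarized in a few lines of Python. This is only a sketch of the rule as described here; `describe_retained_value` and its parameters are hypothetical names, not the opt-remark generator's API.

```python
def describe_retained_value(debug_name, access_path, via_rc_identity):
    """Pick the identifier to show in the remark.

    debug_name:      name taken from the debug_value that was found
    access_path:     field names collected while walking up the def-use graph
    via_rc_identity: True if the debug_value was found by looking through an
                     rc-identity preserving use rather than directly
    """
    if via_rc_identity or not access_path:
        # The named value is by itself rc-identical to what is being
        # retained, so appending the tracked access path would be misleading.
        return debug_name
    return ".".join((debug_name, *access_path))


print(describe_retained_value("myPair", ("lhs",), False))
print(describe_retained_value("myAdditionalState", ("owningPtr",), True))
```

Dropping the path in the rc-identical case loses some precision, but it avoids naming a field that does not exist on the type the user actually wrote.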

@atrick @adrian-prantl this is actually adding something new and not just moving stuff around.


gottesmm · 2020-09-02T06:27:31Z

@swift-ci smoke test

gottesmm requested review from atrick and adrian-prantl September 2, 2020 06:27

gottesmm merged commit 646fcf6 into swiftlang:master Sep 4, 2020

gottesmm deleted the pr-d1a9d022618a039a807ff68c0d46d898e8fe5578 branch September 4, 2020 18:04
