[region-isolation] Fix semantics around assigning into projections #69619

gottesmm · 2023-11-02T19:25:12Z

This PR contains a few different changes:

I improved the logging so it is easier to read.
When classifying an instruction as an assign, if the result/operand of the instruction maps to the same element, then we do not emit an assign. There are certain instructions that /always/ do this so I refactored them to instead of using an assign (and never emit anything) to instead just assert that the input/output values are the same. This exposed a different issue where in certain cases if we diagnose an error (via a require) but can't find the transfer instruction we do not emit an error. This can be seen in the tuple tests that I had to disable. In a subsequent commit I am going to change the compiler to emit a "I don't understand error" in such a case and investigate why in the tuple case we can't identify the transfer instruction.
Instead of using AccessStorage::compute directly, I changed the way that we identified address roots to use our own AccessUseDefChainVisitor so that we can determine if we saw a projection on the way to a root (and thus must treat an assignment as a merge).
The actual interesting commit. If we assign into a field of a struct/tuple with multiple stored fields, we must merge instead of assign. I included the entire commit msg below.

[region-isolation] Assigns to var with structs/tuples with multiple fields should be merges for now.

The reason that this is being done is that since currently region based
isolation is not field sensitive, when we assign to the struct or tuple field of
the var, the new region relationship is set for the entire struct, not just a
specific field. This means that we effectively lose any region information from
the other fields. For example in the following at (1), given the current rules, we
lose that s.y was assigned to x:

struct S {
  var x: NonSendableKlass
  var y: NonSendableKlass
}

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.x = y                       (1)
  // Regions: [(x), (s, y)]
}

The best solution to this is to track such var like bindings in a field
sensitive manner where the regions of the aggregate are the union of the
individual fields. This would let us represent the regions of the above as
follows:

func foo() {
  var s = S()
  // Regions: [((s.x), (s.y))]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [((s.x), (s.y)), (x), (y)]
  s.y = x
  // Regions: [((s.x), (s.y, x)), (y)]
  s.x = y                       (1)
  // Regions: [((s.x, y), (s.y, x))]
}

We cannot do this today so to plug this hole, we instead treat these operations
as merges. This provides a conservative answer. Thus we would have:"

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.x = y                       (1)
  // Regions: [(s, x, y])
}

This is because we are treating the assignment into s.y and s.x as merges into
s, so we do not lose that y was assigned into s before we assigned y into
it. The unfortunate side-effect of this is that if a struct or tuple has
multiple fields, the merge makes it so that if we assign over the same field, we
do not lose the region of the old value:

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.y = y                    (1)
  // Regions: [(s, x, y])
}

If we were not to do this, then the s.y at (1) would result in s and x being in
different regions.

rdar://117273952

…ltiAssign, pass in an OptionSet. I am going to be adding a second flag to be passed into translateMultiAssign. Having multiple boolean flags can lead to confusion and mistakes when using the API, so before I do that I am using this as an NFC commit to convert the API to take an option set so the options are explicit.

Specifically: 1. I changed Partition::apply so that it has an emitLog flag. The reason why I did this is we run apply in a few different situations sometimes when we want to emit logging other times when we really don't. For instance, we want to emit logging when walking instructions and updating the entry partition. On the other hand, we do not want to emit logging if we apply a value to a partition while attempting to determine why an error needed to be emitted. 2. When we create an assign partition op and we see that our destination and source are the same representative, we do not create the actual assign. Before we did not log this so it looked like there was a logic error that was stopping us from emitting a partition op when visiting said instructions. Now, we emit a small logging message so it isn't possible to be confused. 3. Since I am adding another parameter to Partition::apply, I decided to refactor Partition::apply to be in a separate PartitionOpEvaluator data structure that contains the options that we used to pass into Partition::apply. This prevents any mistakes around configuring Partition::apply since the fields provide nice names/common sense default values.

…assign. Currently when we create an assign instruction, if we find that the result of the instruction and the operand of the instruction reduce to the same element representative, then we do not actually emit an assign. For certain instructions this makes sense, but this is misleading for instructions like copies (copy_value) and geps (struct_element_addr) that this is always true for. Instead of attempting to assign and just have the builder always clean this up... make it explicit with a new routine called translateSILLookThrough. When this is called, we just look up the value and assert.

… root, use our own UseDefChainVisitor. This gives me more control in getUnderlyingTrackedValue and also will allow me to define another function that can detect if we see a projection on an address so I can change it into merges in the next commit.

…ields should be merges for now. The reason that this is being done is that since currently region based isolation is not field sensitive, when we assign to the struct or tuple field of the var, the new region relationship is set for the entire struct, not just a specific field. This means that we effectively lose any region information from the other fields. For example in the following at (1), given the current rules, we lose that s.y was assigned to x: ```swift struct S { var x: NonSendableKlass var y: NonSendableKlass } func foo() { var s = S() // Regions: [s] let x = NonSendableKlass() let y = NonSendableKlass() // Regions: [(s), (x), (y)] s.y = x // Regions: [(s, x), (y)] s.x = y (1) // Regions: [(x), (s, y)] } ``` The best solution to this is to track such var like bindings in a field sensitive manner where the regions of the aggregate are the union of the individual fields. This would let us represent the regions of the above as follows: ```swift func foo() { var s = S() // Regions: [((s.x), (s.y))] let x = NonSendableKlass() let y = NonSendableKlass() // Regions: [((s.x), (s.y)), (x), (y)] s.y = x // Regions: [((s.x), (s.y, x)), (y)] s.x = y (1) // Regions: [((s.x, y), (s.y, x))] } ``` We cannot do this today so to plug this hole, we instead treat these operations as merges. This provides a conservative answer. Thus we would have:" ```swift func foo() { var s = S() // Regions: [s] let x = NonSendableKlass() let y = NonSendableKlass() // Regions: [(s), (x), (y)] s.y = x // Regions: [(s, x), (y)] s.x = y (1) // Regions: [(s, x, y]) } ``` This is because we are treating the assignment into s.y and s.x as merges into s, so we do not lose that y was assigned into s before we assigned y into it. The unfortunate side-effect of this is that if a struct or tuple has multiple fields, the merge makes it so that if we assign over the same field, we do not lose the region of the old value: ```swift func foo() { var s = S() // Regions: [s] let x = NonSendableKlass() let y = NonSendableKlass() // Regions: [(s), (x), (y)] s.y = x // Regions: [(s, x), (y)] s.y = y (1) // Regions: [(s, x, y]) } ``` If we were not to do this, then the s.y at (1) would result in s and x being in different regions. rdar://117273952

gottesmm · 2023-11-02T23:37:51Z

@swift-ci smoke test

gottesmm · 2023-11-03T19:30:59Z

Slava is going to do post-commit review.

gottesmm added 5 commits October 27, 2023 15:08

[region-isolation] Convert a static global to a constexpr.

069a65d

gottesmm requested review from ktoso and kavon as code owners November 2, 2023 19:25

gottesmm requested a review from slavapestov November 2, 2023 19:30

gottesmm force-pushed the rdar117273952 branch from 41308d2 to 345b4cc Compare November 2, 2023 23:37

gottesmm merged commit edb5d6a into swiftlang:main Nov 3, 2023

gottesmm deleted the rdar117273952 branch November 3, 2023 19:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[region-isolation] Fix semantics around assigning into projections #69619

[region-isolation] Fix semantics around assigning into projections #69619

Uh oh!

gottesmm commented Nov 2, 2023

Uh oh!

gottesmm commented Nov 2, 2023

Uh oh!

gottesmm commented Nov 3, 2023

Uh oh!

Uh oh!

[region-isolation] Fix semantics around assigning into projections #69619

[region-isolation] Fix semantics around assigning into projections #69619

Uh oh!

Conversation

gottesmm commented Nov 2, 2023

Uh oh!

gottesmm commented Nov 2, 2023

Uh oh!

gottesmm commented Nov 3, 2023

Uh oh!

Uh oh!