Skip to content

[region-isolation] Fix semantics around assigning into projections #69619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 3, 2023

Conversation

gottesmm
Copy link
Contributor

@gottesmm gottesmm commented Nov 2, 2023

This PR contains a few different changes:

  1. I improved the logging so it is easier to read.
  2. When classifying an instruction as an assign, if the result/operand of the instruction maps to the same element, then we do not emit an assign. There are certain instructions that /always/ do this so I refactored them to instead of using an assign (and never emit anything) to instead just assert that the input/output values are the same. This exposed a different issue where in certain cases if we diagnose an error (via a require) but can't find the transfer instruction we do not emit an error. This can be seen in the tuple tests that I had to disable. In a subsequent commit I am going to change the compiler to emit a "I don't understand error" in such a case and investigate why in the tuple case we can't identify the transfer instruction.
  3. Instead of using AccessStorage::compute directly, I changed the way that we identified address roots to use our own AccessUseDefChainVisitor so that we can determine if we saw a projection on the way to a root (and thus must treat an assignment as a merge).
  4. The actual interesting commit. If we assign into a field of a struct/tuple with multiple stored fields, we must merge instead of assign. I included the entire commit msg below.

[region-isolation] Assigns to var with structs/tuples with multiple fields should be merges for now.

The reason that this is being done is that since currently region based
isolation is not field sensitive, when we assign to the struct or tuple field of
the var, the new region relationship is set for the entire struct, not just a
specific field. This means that we effectively lose any region information from
the other fields. For example in the following at (1), given the current rules, we
lose that s.y was assigned to x:

struct S {
  var x: NonSendableKlass
  var y: NonSendableKlass
}

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.x = y                       (1)
  // Regions: [(x), (s, y)]
}

The best solution to this is to track such var like bindings in a field
sensitive manner where the regions of the aggregate are the union of the
individual fields. This would let us represent the regions of the above as
follows:

func foo() {
  var s = S()
  // Regions: [((s.x), (s.y))]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [((s.x), (s.y)), (x), (y)]
  s.y = x
  // Regions: [((s.x), (s.y, x)), (y)]
  s.x = y                       (1)
  // Regions: [((s.x, y), (s.y, x))]
}

We cannot do this today so to plug this hole, we instead treat these operations
as merges. This provides a conservative answer. Thus we would have:"

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.x = y                       (1)
  // Regions: [(s, x, y])
}

This is because we are treating the assignment into s.y and s.x as merges into
s, so we do not lose that y was assigned into s before we assigned y into
it. The unfortunate side-effect of this is that if a struct or tuple has
multiple fields, the merge makes it so that if we assign over the same field, we
do not lose the region of the old value:

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.y = y                    (1)
  // Regions: [(s, x, y])
}

If we were not to do this, then the s.y at (1) would result in s and x being in
different regions.

rdar://117273952

…ltiAssign, pass in an OptionSet.

I am going to be adding a second flag to be passed into translateMultiAssign.
Having multiple boolean flags can lead to confusion and mistakes when using the
API, so before I do that I am using this as an NFC commit to convert the API to
take an option set so the options are explicit.
Specifically:

1. I changed Partition::apply so that it has an emitLog flag. The reason why I
did this is we run apply in a few different situations sometimes when we want to
emit logging other times when we really don't. For instance, we want to emit
logging when walking instructions and updating the entry partition. On the other
hand, we do not want to emit logging if we apply a value to a partition while
attempting to determine why an error needed to be emitted.

2. When we create an assign partition op and we see that our destination and
source are the same representative, we do not create the actual assign. Before
we did not log this so it looked like there was a logic error that was stopping
us from emitting a partition op when visiting said instructions. Now, we emit a
small logging message so it isn't possible to be confused.

3. Since I am adding another parameter to Partition::apply, I decided to
refactor Partition::apply to be in a separate PartitionOpEvaluator data
structure that contains the options that we used to pass into Partition::apply.
This prevents any mistakes around configuring Partition::apply since the fields
provide nice names/common sense default values.
…assign.

Currently when we create an assign instruction, if we find that the result of
the instruction and the operand of the instruction reduce to the same element
representative, then we do not actually emit an assign.

For certain instructions this makes sense, but this is misleading for
instructions like copies (copy_value) and geps (struct_element_addr) that this
is always true for. Instead of attempting to assign and just have the builder
always clean this up... make it explicit with a new routine called
translateSILLookThrough. When this is called, we just look up the value and
assert.
… root, use our own UseDefChainVisitor.

This gives me more control in getUnderlyingTrackedValue and also will allow me
to define another function that can detect if we see a projection on an address
so I can change it into merges in the next commit.
@gottesmm gottesmm requested review from ktoso and kavon as code owners November 2, 2023 19:25
@gottesmm gottesmm requested a review from slavapestov November 2, 2023 19:30
…ields should be merges for now.

The reason that this is being done is that since currently region based
isolation is not field sensitive, when we assign to the struct or tuple field of
the var, the new region relationship is set for the entire struct, not just a
specific field. This means that we effectively lose any region information from
the other fields. For example in the following at (1), given the current rules, we
lose that s.y was assigned to x:

```swift
struct S {
  var x: NonSendableKlass
  var y: NonSendableKlass
}

func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.x = y                       (1)
  // Regions: [(x), (s, y)]
}
```

The best solution to this is to track such var like bindings in a field
sensitive manner where the regions of the aggregate are the union of the
individual fields. This would let us represent the regions of the above as
follows:

```swift
func foo() {
  var s = S()
  // Regions: [((s.x), (s.y))]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [((s.x), (s.y)), (x), (y)]
  s.y = x
  // Regions: [((s.x), (s.y, x)), (y)]
  s.x = y                       (1)
  // Regions: [((s.x, y), (s.y, x))]
}
```

We cannot do this today so to plug this hole, we instead treat these operations
as merges. This provides a conservative answer. Thus we would have:"

```swift
func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.x = y                       (1)
  // Regions: [(s, x, y])
}
```

This is because we are treating the assignment into s.y and s.x as merges into
s, so we do not lose that y was assigned into s before we assigned y into
it. The unfortunate side-effect of this is that if a struct or tuple has
multiple fields, the merge makes it so that if we assign over the same field, we
do not lose the region of the old value:

```swift
func foo() {
  var s = S()
  // Regions: [s]
  let x = NonSendableKlass()
  let y = NonSendableKlass()
  // Regions: [(s), (x), (y)]
  s.y = x
  // Regions: [(s, x), (y)]
  s.y = y                    (1)
  // Regions: [(s, x, y])
}
```

If we were not to do this, then the s.y at (1) would result in s and x being in
different regions.

rdar://117273952
@gottesmm
Copy link
Contributor Author

gottesmm commented Nov 2, 2023

@swift-ci smoke test

@gottesmm
Copy link
Contributor Author

gottesmm commented Nov 3, 2023

Slava is going to do post-commit review.

@gottesmm gottesmm merged commit edb5d6a into swiftlang:main Nov 3, 2023
@gottesmm gottesmm deleted the rdar117273952 branch November 3, 2023 19:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant