Copy-on-write representation in SIL: instructions and builtins #31728
@swift-ci smoke test
Finished reviewing the new instructions. The "requested change" flag is for the ARCAnalysis.cpp change.
Builtins mostly look good. I just have some questions and think some comments are warranted.
lib/AST/Builtins.cpp
@@ -2249,9 +2281,16 @@ ValueDecl *swift::getBuiltinValueDecl(ASTContext &Context, Identifier Id) {

  case BuiltinValueKind::IsUnique:
  case BuiltinValueKind::IsUnique_native:
  case BuiltinValueKind::BeginCOWmutation:
  case BuiltinValueKind::BeginCOWmutation_native:
I don't understand this since getIsUniqueOperation returns a single bool result.
[EDIT] Oh, now I get it. Builtin.isUnique could be treated exactly like the new builtin, except that we can't assume there's an eventual end mutation marker. Could we use begin_cow_mutation for Builtin.isUnique anyway and just not worry that there will not be an end_cow_mutation?
This is a little tricky so it would be helpful to have a short comment
Eventually we should convert all COW data structures in the stdlib to Begin/EndCOWMutation. So isUnique should go away anyway (except some uses in assert conditions).
I think there will still be a public isUniquelyReferenced API though. The question is whether it should have a different implementation from CoW mutation... just curious what the plan is so we can design for it.
SIL.rst docs look good
3f80301 to ac99ac6
@swift-ci smoke test
@atrick Thanks for reviewing! I pushed a new version.
* a new [immutable] attribute on ref_element_addr and ref_tail_addr
* new instructions: begin_cow_mutation and end_cow_mutation
These new instructions are intended to be used for the stdlib's COW containers, e.g. Array. They allow more aggressive optimizations, especially for Array.
* Builtin.COWBufferForReading -> ref_element_addr [immutable] / ref_tail_addr [immutable]
* Builtin.beginCOWmutation -> begin_cow_mutation
* Builtin.endCOWmutation -> end_cow_mutation
ac99ac6 to 8f26329
Awesome. Thanks!
@swift-ci smoke test and merge
This is the first PR for copy-on-write (COW) representation in SIL. It adds new SIL instructions, new instruction flags, and builtins that the stdlib can use to create the new SIL instructions.
It's still NFC (with regard to the generated code) because the new instructions and builtins are not used in the library yet.
The goal is to have a clean representation of COW data structures in SIL, most importantly for stdlib's Array. It helps the optimizer and avoids hacks in the code where we currently have to hard code the Array type.
For example, the optimizer then knows that an Array is immutable when this array is immutable in the source code. This is currently not the case.
Note that this work is purely about the SIL representation and not about a language feature. The builtins make it possible to use the COW representation in the stdlib. It's possible to create a nicer language feature around this, but that is not in the scope of this work.
In case anyone is interested, this is the remaining (not yet finished) work: #31730
I added a section in SIL.rst which gives an overview of the COW representation. I copy it here for convenience:
Copy-on-Write Representation
Copy-on-Write (COW) data structures are implemented by a reference to an object
which is copied on mutation in case it's not uniquely referenced.
A COW mutation sequence in SIL typically looks like:
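A minimal sketch of such a sequence follows. It is a fragment rather than a complete SIL function, and ownership qualifiers are omitted. The class BufferClass, its stored property Field of type ValueType, the value %value and the copy function %copy_buffer are hypothetical names chosen for illustration.

```sil
  // Enter the mutable region: yields a uniqueness flag and a new buffer reference.
  (%uniq, %buffer) = begin_cow_mutation %immutable_buffer : $BufferClass
  cond_br %uniq, bb_unique, bb_not_unique

bb_unique:
  // The buffer is uniquely referenced: it can be mutated in place.
  br bb_mutate(%buffer : $BufferClass)

bb_not_unique:
  // The buffer is shared: copy it first (%copy_buffer is a hypothetical function).
  %copied_buffer = apply %copy_buffer(%buffer) : $@convention(thin) (@owned BufferClass) -> @owned BufferClass
  br bb_mutate(%copied_buffer : $BufferClass)

bb_mutate(%mutable_buffer : $BufferClass):
  %field = ref_element_addr %mutable_buffer : $BufferClass, #BufferClass.Field
  store %value to %field : $*ValueType
  // Leave the mutable region: the result is the new immutable buffer reference.
  %new_immutable_buffer = end_cow_mutation %mutable_buffer : $BufferClass
```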
Loading from a COW data structure looks like:
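A minimal sketch of two such loads, again as a fragment with ownership qualifiers omitted. BufferClass, Field and FieldType are hypothetical names. The value names %immutable_buffer, %value1 and %value2 follow the surrounding text.

```sil
  %field1 = ref_element_addr [immutable] %immutable_buffer : $BufferClass, #BufferClass.Field
  %value1 = load %field1 : $*FieldType
  // ... arbitrary code, possibly including calls to unknown functions ...
  %field2 = ref_element_addr [immutable] %immutable_buffer : $BufferClass, #BufferClass.Field
  %value2 = load %field2 : $*FieldType
```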
The immutable attribute means that loads from ref_element_addr and ref_tail_addr instructions that have the same operand are equivalent. In other words, it's guaranteed that a buffer's properties are not mutated between two ref_element/tail_addr [immutable] instructions as long as they have the same buffer reference as operand. This is even true if e.g. the buffer 'escapes' to an unknown function.
In the example above, %value2 is equal to %value1 because the operand of both ref_element_addr instructions is the same %immutable_buffer. Conceptually, the content of a COW buffer object can be seen as part of the same static (immutable) SSA value as the buffer reference.
The lifetime of a COW value is strictly separated into mutable and immutable regions by begin_cow_mutation and end_cow_mutation instructions.
Both begin_cow_mutation and end_cow_mutation consume their operand and return the new buffer as an owned value.
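The alternation of mutable and immutable regions can be sketched as follows. This is a fragment only: BufferClass is a hypothetical class, the value names are illustrative, and the copy of a non-unique buffer is left out.

```sil
  %b1 = alloc_ref $BufferClass
  // %b1 is mutable: the freshly allocated buffer can be initialized here.
  %b2 = end_cow_mutation %b1 : $BufferClass
  // %b2 is immutable.
  (%u1, %b3) = begin_cow_mutation %b2 : $BufferClass
  // %b3 is mutable (after copying the buffer if %u1 is false).
  %b4 = end_cow_mutation %b3 : $BufferClass
  // %b4 is immutable.
```

Each instruction takes the previous buffer value as its operand and defines a new one, so every region has its own SSA name for the buffer.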
The begin_cow_mutation will compile down to a uniqueness check and end_cow_mutation will compile to a no-op.
Although the physical pointer value of the returned buffer reference is the same as the operand, it's important to generate a new buffer reference in SIL. It prevents the optimizer from moving buffer accesses from a mutable into an immutable region and vice versa.
Because the buffer content is conceptually part of the
buffer reference SSA value, there must be a new buffer reference every time
the buffer content is mutated.
To illustrate this, let's look at an example where a COW value is mutated in a loop. As with a scalar SSA value, mutating a COW buffer also forces a phi argument in the loop header block (for simplicity, the code for copying a non-unique buffer is not shown):
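A minimal sketch of such a loop, given as a fragment. The block names, BufferClass and %loop_cond are hypothetical, and the mutation itself is only indicated by a comment.

```sil
header_block(%b_phi : $BufferClass):
  // %b_phi is the immutable buffer from the loop entry or the previous iteration.
  (%u, %b_mutate) = begin_cow_mutation %b_phi : $BufferClass
  // ... store something to %b_mutate ...
  %b_immutable = end_cow_mutation %b_mutate : $BufferClass
  cond_br %loop_cond, exit_block, backedge_block

backedge_block:
  // Pass the new immutable buffer value back to the header as a phi argument.
  br header_block(%b_immutable : $BufferClass)

exit_block:
  // %b_immutable is the final immutable buffer value after the loop.
```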
Two adjacent begin_cow_mutation and end_cow_mutation instructions don't need to be in the same function.