-
Notifications
You must be signed in to change notification settings - Fork 10.5k
const evaluator: array values and initialization #22772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
/// This is the representation of a derived address. A derived address refers | ||
/// to a memory object along with an access path that drills into it. | ||
struct ArraySymbolicValue final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should be updated for arrays.
lib/SILOptimizer/Utils/ConstExpr.cpp
Outdated
if (auto *p2ai = dyn_cast<PointerToAddressInst>(value)) | ||
return getConstantValue(p2ai->getOperand()); | ||
|
||
// Indexing a pointer moves its deepest index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be elaborate a bit more? For instance, "Indexing a pointer moves the deepest index of the accesspath it represents within a memory object" E.g. If a pointer p
represents the access path [1, 2] within a memory object, p + 1 represents [1, 3]
"
// This function has this signature: | ||
// func _allocateUninitializedArray<Element>(_ builtinCount: Builtin.Word) | ||
// -> (Array<Element>, Builtin.RawPointer) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be an assertion on number of SIL results and parameters similar to other cases?
test/SILOptimizer/pound_assert.swift
Outdated
// Arrays | ||
// | ||
// TODO: The constant evaluator does not implement any array access operations, | ||
// so theses tests cannot test that the implemented array operations produce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
theses -> these
stdlib/public/core/Array.swift
Outdated
@@ -803,6 +803,7 @@ extension Array: RangeReplaceableCollection { | |||
/// // Prints "true" | |||
@inlinable | |||
@_semantics("array.init") | |||
@_semantics("array.init_empty") | |||
public init() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need a new semantics attribute here or instead check that the parameters of the initializer are empty and that the receiver object is Array, and return unknown if it is not.
@eeckstein This PR is adding support for Arrays to the interpreter. We would like to handle only Array.init()
, and will have use a semantics annotation to hardcode its semantics in the interpreter. There seems to be an "array.init" semantics attribute on Array initializers, but it is shared by empty and non-empty initializes as well as by other types: ArraySilce
and ContiguousArray
. So, to distinguish Array.init()
there are two options, either we add another semantics attribute to Array.init()
, or check the receiver type and the parameter count in the interpreter to identify this initializer from others. Just wondering what approach you would recommend here?
stdlib/public/core/ArrayShared.swift
Outdated
@@ -27,6 +27,7 @@ public struct _DependenceToken { | |||
/// - Precondition: `storage` is `_ContiguousArrayStorage`. | |||
@inlinable // FIXME(inline-always) | |||
@inline(__always) | |||
@_semantics("array.allocate_uninitialized") | |||
public // COMPILER_INTRINSIC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this being a compiler intrinsic and also something declared in KnownDecls.def
, we should not have a semantics attribute here but instead just compare to astContext.getAllocateUninitializedArray()
.
@eeckstein It would be very helpful if you can share your thoughts on this issue as well.
@marcrasi Sorry for the delay. I just made another PR which when merged will make the changes here simpler. Btw, I came across this function: |
d61165c
to
a6e94ec
Compare
I rebased this to master and addressed all your non- |
Great. The @_semantics PR just got merged. |
a6e94ec
to
3cfbce6
Compare
Updated! |
Cool. LGTM. @devincoughlin let me know if this looks good to you. |
@swift-ci Please test |
Build failed |
Build failed |
Ping on this. Are we waiting on a look from @devincoughlin ? |
@marcrasi Yes |
RK_Array, | ||
|
||
/// This represents an address of a memory object containing an array. | ||
RK_ArrayAddress, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcrasi Sorry for the delay in the responding to this PR. Based on discussions with @devincoughlin, we have a few questions/comments about this PR. We are trying to understand the motivation for the two representations for an array: RK_Array
and RK_ArrayAddress
. Swift's representation of an Array, in principle, is a struct
which has a pointer to a backing storage that contains the elements. It seems to us that RK_Array (i.e, ArraySymbolicValue) models the backing storage which stores the elements. RK_ArrayAddress is modeling the more accurate view of the array as a struct with a pointer to the backing storage namely RK_Array. It seems this distinction between an Array and its backing storage becomes important for modeling _allocateUninitializedArray
function which returns them separately and initializes the backing storage directly through the pointer. Our first question is, is our understanding right?
If that is the case, it seems to us that RK_ArrayAddress is possibly the more accurate view of an array and RK_Array can be seen as a more optimal representation when the distinction between the array and the backing storage can be safely discarded. Therefore, the name RK_ArrayAddress
seems a bit misleading as it is not representing an address of an Array, but rather a more accurate view of an Array. Would it be possible to give RK_Array, RK_ArrayAddress, ArraySymbolicValue another name and also add some documentation? For instance, what do you think about, SymbolicArrayBuffer
for ArraySymbolicValue, RK_ArrayBuffer
for RK_Array and RK_ArrayStruct
for RK_ArrayAddress.
Ideally, I think we can create a new symbolic value for Array that will manage the differences in the representation. But, I think I can do that in another PR, after this lands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection, I think SymbolicArrayStorage
and RK_ArrayStorage
are better alternative names for SymbolicArrayBuffer
and RK_ArrayBuffer
I proposed above. This is inline with _ContiguousArrayStorage
class used in by struct Array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like these name suggestions, thanks! Your comments and name suggestions are making it clear to me that these RKs are two fundamentally different things that should actually have different Kind
s.
Also, I'm pretty sure that the place below where you noticed the collapsing is a bug. (Probably not exposed by anything yet, but likely to become exposed when there are more array features.)
Finally, now that it's clear that RK_ArrayAddress is just a struct containing a pointer, why not make it a RK_Aggregate, like all the normal structs? This should simplify things!
So in summary, the total changes from the current state of this PR are:
RK_Array
=>RK_ArrayStorage
ArraySymbolicValue
=>SymbolicArrayStorage
RK_ArrayAddress
=> eliminated, replaced by the existingRK_Aggregate
I haven't tried this out yet so there may be some difficulty I'm not seeing. So I'll try it out soon and let you know how it goes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection, I think that it may be a good idea to have a special kind for "Array" aggregates (in addition to ArrayStorage
kind) and not treat them as general aggregates. In fact, I even think we should have special ArraySymbolicValue
(in addition to SymbolicArrayStorage
). The following are some reasons for that:
-
We don't want to make it appear that the interpreter supports arbitrary aggregates with pointers. Having a special "kind" for such an aggregate, makes this clear.
-
get/setIndexedElement
would need to handleaddresses
(which could now appear inside aggregates). Normally,setIndexedElement
would be expected to update the underlying memory object in place, as we want everything that aliases with the pointer field to also see the changes whensetIndexedElement
completes. But, arrays are COW and provide a value semantics. This means we have to special case theaddress
case for arrays, like creating a new memory object and not in-place updating the underlying memory object. This would appear confusing, as it is only correct for a type like Array that uses pointers underneath but provides a value semantics.
To address both of this, we may rather have a new symbolic value for "array" which just stores a single DirectAddressValue
or even just a SymbolicMemoryObject
containing a SymbolicArrayStorage
. This ArraySymbolicValue
may support operations like getAddressOfStorage()
which creates a direct address from the underlying SymbolicMemoryObject, and getStorage()
which returns the storage. setIndexedElement
can just retrieve and (immutably) update the array storage and create a new ArraySymbolicValue with the new storage and return it.
As usual, we can model mutable array update methods like append
and subscript
which take an address to an ArraySymbolicValue, by creating a new ArraySymbolicValue with the necessary changes and replacing it in the SymbolicMemoryObject corresponding to the address. So the ArraySymbolicValue will remain mostly immutable much like other Symbolic values (despite containing an address), and the inner address of the buffer is only exposed by getAddressOfStorage
which is used only during initialization.
for (auto elt : elts) | ||
results.push_back(elt.cloneInto(astContext)); | ||
return getArray(results, elementType, astContext); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems clone is collapsing the representation of RK_Array and RK_ArrayAddress. It ignores the differences. Is this intended? Can we add a line to document this behavior,
This PR: #27344 has addressed the comments posted here and has been merged. |
Adds array values, initialization operations, and some tests that initialize arrays but that do not check the contents of the initialized arrays.
This PR includes support for AllocateUninitializedArray, but only in the flow-sensitive case. When I tried completely removing AllocateUninitializedArray (per your suggestion in #21703), I realized that it was no longer possible to write tests exercising the
RK_ArrayAddress
representation and thegetIndexedElement
/setIndexedElement
changes. I think that those are important pieces of basic array support that should be included in the initial PR.AllocateUninitializedArray in the flow-sensitive case exercises those features, so I added it back. It's still a lot simpler than #21703, because most of the AllocateUninitializedArray complexity was in the top level case.