Skip to content

const evaluator: array values and initialization #22772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

marcrasi
Copy link

Adds array values, initialization operations, and some tests that initialize arrays but that do not check the contents of the initialized arrays.

This PR includes support for AllocateUninitializedArray, but only in the flow-sensitive case. When I tried completely removing AllocateUninitializedArray (per your suggestion in #21703), I realized that it was no longer possible to write tests exercising the RK_ArrayAddress representation and the getIndexedElement/setIndexedElement changes. I think that those are important pieces of basic array support that should be included in the initial PR.

AllocateUninitializedArray in the flow-sensitive case exercises those features, so I added it back. It's still a lot simpler than #21703, because most of the AllocateUninitializedArray complexity was in the top level case.


/// This is the representation of a derived address. A derived address refers
/// to a memory object along with an access path that drills into it.
struct ArraySymbolicValue final
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be updated for arrays.

if (auto *p2ai = dyn_cast<PointerToAddressInst>(value))
return getConstantValue(p2ai->getOperand());

// Indexing a pointer moves its deepest index.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be elaborate a bit more? For instance, "Indexing a pointer moves the deepest index of the accesspath it represents within a memory object" E.g. If a pointer p represents the access path [1, 2] within a memory object, p + 1 represents [1, 3]"

// This function has this signature:
// func _allocateUninitializedArray<Element>(_ builtinCount: Builtin.Word)
// -> (Array<Element>, Builtin.RawPointer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an assertion on number of SIL results and parameters similar to other cases?

// Arrays
//
// TODO: The constant evaluator does not implement any array access operations,
// so theses tests cannot test that the implemented array operations produce
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

theses -> these

@@ -803,6 +803,7 @@ extension Array: RangeReplaceableCollection {
/// // Prints "true"
@inlinable
@_semantics("array.init")
@_semantics("array.init_empty")
public init() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need a new semantics attribute here or instead check that the parameters of the initializer are empty and that the receiver object is Array, and return unknown if it is not.

@eeckstein This PR is adding support for Arrays to the interpreter. We would like to handle only Array.init(), and will have use a semantics annotation to hardcode its semantics in the interpreter. There seems to be an "array.init" semantics attribute on Array initializers, but it is shared by empty and non-empty initializes as well as by other types: ArraySilce and ContiguousArray. So, to distinguish Array.init() there are two options, either we add another semantics attribute to Array.init(), or check the receiver type and the parameter count in the interpreter to identify this initializer from others. Just wondering what approach you would recommend here?

@@ -27,6 +27,7 @@ public struct _DependenceToken {
/// - Precondition: `storage` is `_ContiguousArrayStorage`.
@inlinable // FIXME(inline-always)
@inline(__always)
@_semantics("array.allocate_uninitialized")
public // COMPILER_INTRINSIC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this being a compiler intrinsic and also something declared in KnownDecls.def, we should not have a semantics attribute here but instead just compare to astContext.getAllocateUninitializedArray().

@eeckstein It would be very helpful if you can share your thoughts on this issue as well.

@ravikandhadai
Copy link
Contributor

@marcrasi Sorry for the delay. I just made another PR which when merged will make the changes here simpler. Btw, I came across this function: ArrayAllocation::mapInitializationStores it seems very similar to the original _handleAllocateUninitialized or a similar helper function for handling top-level array literals

@marcrasi marcrasi force-pushed the marcrasi-constexpr-arrays branch from d61165c to a6e94ec Compare March 7, 2019 21:35
@marcrasi
Copy link
Author

marcrasi commented Mar 7, 2019

I rebased this to master and addressed all your non-_semantics comments! I'll rebase again and adjust the semantics stuff once your PR merges.

@ravikandhadai
Copy link
Contributor

Great. The @_semantics PR just got merged.

@marcrasi marcrasi force-pushed the marcrasi-constexpr-arrays branch from a6e94ec to 3cfbce6 Compare March 8, 2019 07:03
@marcrasi
Copy link
Author

marcrasi commented Mar 8, 2019

Updated!

@ravikandhadai
Copy link
Contributor

Cool. LGTM. @devincoughlin let me know if this looks good to you.

@ravikandhadai ravikandhadai self-requested a review March 8, 2019 19:44
@ravikandhadai
Copy link
Contributor

@swift-ci Please test

@swift-ci
Copy link
Contributor

swift-ci commented Mar 8, 2019

Build failed
Swift Test Linux Platform
Git Sha - d61165c

@swift-ci
Copy link
Contributor

swift-ci commented Mar 8, 2019

Build failed
Swift Test OS X Platform
Git Sha - d61165c

@marcrasi
Copy link
Author

marcrasi commented Mar 28, 2019

Ping on this. Are we waiting on a look from @devincoughlin ?

@ravikandhadai
Copy link
Contributor

@marcrasi Yes

RK_Array,

/// This represents an address of a memory object containing an array.
RK_ArrayAddress,
Copy link
Contributor

@ravikandhadai ravikandhadai Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi Sorry for the delay in the responding to this PR. Based on discussions with @devincoughlin, we have a few questions/comments about this PR. We are trying to understand the motivation for the two representations for an array: RK_Array and RK_ArrayAddress. Swift's representation of an Array, in principle, is a struct which has a pointer to a backing storage that contains the elements. It seems to us that RK_Array (i.e, ArraySymbolicValue) models the backing storage which stores the elements. RK_ArrayAddress is modeling the more accurate view of the array as a struct with a pointer to the backing storage namely RK_Array. It seems this distinction between an Array and its backing storage becomes important for modeling _allocateUninitializedArray function which returns them separately and initializes the backing storage directly through the pointer. Our first question is, is our understanding right?

If that is the case, it seems to us that RK_ArrayAddress is possibly the more accurate view of an array and RK_Array can be seen as a more optimal representation when the distinction between the array and the backing storage can be safely discarded. Therefore, the name RK_ArrayAddress seems a bit misleading as it is not representing an address of an Array, but rather a more accurate view of an Array. Would it be possible to give RK_Array, RK_ArrayAddress, ArraySymbolicValue another name and also add some documentation? For instance, what do you think about, SymbolicArrayBuffer for ArraySymbolicValue, RK_ArrayBuffer for RK_Array and RK_ArrayStruct for RK_ArrayAddress.

Ideally, I think we can create a new symbolic value for Array that will manage the differences in the representation. But, I think I can do that in another PR, after this lands.

Copy link
Contributor

@ravikandhadai ravikandhadai Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, I think SymbolicArrayStorage and RK_ArrayStorage are better alternative names for SymbolicArrayBuffer and RK_ArrayBuffer I proposed above. This is inline with _ContiguousArrayStorage class used in by struct Array.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these name suggestions, thanks! Your comments and name suggestions are making it clear to me that these RKs are two fundamentally different things that should actually have different Kinds.

Also, I'm pretty sure that the place below where you noticed the collapsing is a bug. (Probably not exposed by anything yet, but likely to become exposed when there are more array features.)

Finally, now that it's clear that RK_ArrayAddress is just a struct containing a pointer, why not make it a RK_Aggregate, like all the normal structs? This should simplify things!

So in summary, the total changes from the current state of this PR are:

  • RK_Array => RK_ArrayStorage
  • ArraySymbolicValue => SymbolicArrayStorage
  • RK_ArrayAddress => eliminated, replaced by the existing RK_Aggregate

I haven't tried this out yet so there may be some difficulty I'm not seeing. So I'll try it out soon and let you know how it goes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great. :-)

Copy link
Contributor

@ravikandhadai ravikandhadai Apr 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, I think that it may be a good idea to have a special kind for "Array" aggregates (in addition to ArrayStorage kind) and not treat them as general aggregates. In fact, I even think we should have special ArraySymbolicValue (in addition to SymbolicArrayStorage). The following are some reasons for that:

  • We don't want to make it appear that the interpreter supports arbitrary aggregates with pointers. Having a special "kind" for such an aggregate, makes this clear.

  • get/setIndexedElement would need to handle addresses (which could now appear inside aggregates). Normally, setIndexedElement would be expected to update the underlying memory object in place, as we want everything that aliases with the pointer field to also see the changes when setIndexedElement completes. But, arrays are COW and provide a value semantics. This means we have to special case the address case for arrays, like creating a new memory object and not in-place updating the underlying memory object. This would appear confusing, as it is only correct for a type like Array that uses pointers underneath but provides a value semantics.

To address both of this, we may rather have a new symbolic value for "array" which just stores a single DirectAddressValue or even just a SymbolicMemoryObject containing a SymbolicArrayStorage. This ArraySymbolicValue may support operations like getAddressOfStorage() which creates a direct address from the underlying SymbolicMemoryObject, and getStorage() which returns the storage. setIndexedElement can just retrieve and (immutably) update the array storage and create a new ArraySymbolicValue with the new storage and return it.

As usual, we can model mutable array update methods like append and subscript which take an address to an ArraySymbolicValue, by creating a new ArraySymbolicValue with the necessary changes and replacing it in the SymbolicMemoryObject corresponding to the address. So the ArraySymbolicValue will remain mostly immutable much like other Symbolic values (despite containing an address), and the inner address of the buffer is only exposed by getAddressOfStorage which is used only during initialization.

for (auto elt : elts)
results.push_back(elt.cloneInto(astContext));
return getArray(results, elementType, astContext);
}
}
Copy link
Contributor

@ravikandhadai ravikandhadai Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems clone is collapsing the representation of RK_Array and RK_ArrayAddress. It ignores the differences. Is this intended? Can we add a line to document this behavior,

@ravikandhadai
Copy link
Contributor

This PR: #27344 has addressed the comments posted here and has been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants