Skip to content

const evaluator: array values and initialization #22772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 29 additions & 0 deletions include/swift/SIL/SILConstants.h
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,12 @@ class SymbolicValue {

/// This represents an index *into* a memory object.
RK_DerivedAddress,

/// This represents an array value.
RK_Array,

/// This represents an address of a memory object containing an array.
RK_ArrayAddress,
Copy link
Contributor

@ravikandhadai ravikandhadai Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi Sorry for the delay in the responding to this PR. Based on discussions with @devincoughlin, we have a few questions/comments about this PR. We are trying to understand the motivation for the two representations for an array: RK_Array and RK_ArrayAddress. Swift's representation of an Array, in principle, is a struct which has a pointer to a backing storage that contains the elements. It seems to us that RK_Array (i.e, ArraySymbolicValue) models the backing storage which stores the elements. RK_ArrayAddress is modeling the more accurate view of the array as a struct with a pointer to the backing storage namely RK_Array. It seems this distinction between an Array and its backing storage becomes important for modeling _allocateUninitializedArray function which returns them separately and initializes the backing storage directly through the pointer. Our first question is, is our understanding right?

If that is the case, it seems to us that RK_ArrayAddress is possibly the more accurate view of an array and RK_Array can be seen as a more optimal representation when the distinction between the array and the backing storage can be safely discarded. Therefore, the name RK_ArrayAddress seems a bit misleading as it is not representing an address of an Array, but rather a more accurate view of an Array. Would it be possible to give RK_Array, RK_ArrayAddress, ArraySymbolicValue another name and also add some documentation? For instance, what do you think about, SymbolicArrayBuffer for ArraySymbolicValue, RK_ArrayBuffer for RK_Array and RK_ArrayStruct for RK_ArrayAddress.

Ideally, I think we can create a new symbolic value for Array that will manage the differences in the representation. But, I think I can do that in another PR, after this lands.

Copy link
Contributor

@ravikandhadai ravikandhadai Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, I think SymbolicArrayStorage and RK_ArrayStorage are better alternative names for SymbolicArrayBuffer and RK_ArrayBuffer I proposed above. This is inline with _ContiguousArrayStorage class used in by struct Array.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these name suggestions, thanks! Your comments and name suggestions are making it clear to me that these RKs are two fundamentally different things that should actually have different Kinds.

Also, I'm pretty sure that the place below where you noticed the collapsing is a bug. (Probably not exposed by anything yet, but likely to become exposed when there are more array features.)

Finally, now that it's clear that RK_ArrayAddress is just a struct containing a pointer, why not make it a RK_Aggregate, like all the normal structs? This should simplify things!

So in summary, the total changes from the current state of this PR are:

  • RK_Array => RK_ArrayStorage
  • ArraySymbolicValue => SymbolicArrayStorage
  • RK_ArrayAddress => eliminated, replaced by the existing RK_Aggregate

I haven't tried this out yet so there may be some difficulty I'm not seeing. So I'll try it out soon and let you know how it goes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great. :-)

Copy link
Contributor

@ravikandhadai ravikandhadai Apr 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection, I think that it may be a good idea to have a special kind for "Array" aggregates (in addition to ArrayStorage kind) and not treat them as general aggregates. In fact, I even think we should have special ArraySymbolicValue (in addition to SymbolicArrayStorage). The following are some reasons for that:

  • We don't want to make it appear that the interpreter supports arbitrary aggregates with pointers. Having a special "kind" for such an aggregate, makes this clear.

  • get/setIndexedElement would need to handle addresses (which could now appear inside aggregates). Normally, setIndexedElement would be expected to update the underlying memory object in place, as we want everything that aliases with the pointer field to also see the changes when setIndexedElement completes. But, arrays are COW and provide a value semantics. This means we have to special case the address case for arrays, like creating a new memory object and not in-place updating the underlying memory object. This would appear confusing, as it is only correct for a type like Array that uses pointers underneath but provides a value semantics.

To address both of this, we may rather have a new symbolic value for "array" which just stores a single DirectAddressValue or even just a SymbolicMemoryObject containing a SymbolicArrayStorage. This ArraySymbolicValue may support operations like getAddressOfStorage() which creates a direct address from the underlying SymbolicMemoryObject, and getStorage() which returns the storage. setIndexedElement can just retrieve and (immutably) update the array storage and create a new ArraySymbolicValue with the new storage and return it.

As usual, we can model mutable array update methods like append and subscript which take an address to an ArraySymbolicValue, by creating a new ArraySymbolicValue with the necessary changes and replacing it in the SymbolicMemoryObject corresponding to the address. So the ArraySymbolicValue will remain mostly immutable much like other Symbolic values (despite containing an address), and the inner address of the buffer is only exposed by getAddressOfStorage which is used only during initialization.

};

union {
Expand Down Expand Up @@ -157,6 +163,12 @@ class SymbolicValue {
/// When this SymbolicValue is of "DerivedAddress" kind, this pointer stores
/// information about the memory object and access path of the access.
DerivedAddressValue *derivedAddress;

/// For RK_Array, this is the elements of the array.
ArraySymbolicValue *array;

/// For RK_ArrayAddress, this is the memory object referenced.
SymbolicValueMemoryObject *arrayAddress;
} value;

RepresentationKind representationKind : 8;
Expand Down Expand Up @@ -209,6 +221,9 @@ class SymbolicValue {
/// This value represents the address of, or into, a memory object.
Address,

/// This represents an array value.
Array,

/// These values are generally only seen internally to the system, external
/// clients shouldn't have to deal with them.
UninitMemory
Expand Down Expand Up @@ -332,6 +347,20 @@ class SymbolicValue {
/// Return just the memory object for an address value.
SymbolicValueMemoryObject *getAddressValueMemoryObject() const;

/// Produce an array of elements.
static SymbolicValue getArray(ArrayRef<SymbolicValue> elements,
CanType elementType,
ASTContext &astContext);
static SymbolicValue
getArrayAddress(SymbolicValueMemoryObject *memoryObject) {
SymbolicValue result;
result.representationKind = RK_ArrayAddress;
result.value.directAddress = memoryObject;
return result;
}

ArrayRef<SymbolicValue> getArrayValue(CanType &elementType) const;

//===--------------------------------------------------------------------===//
// Helpers

Expand Down
176 changes: 155 additions & 21 deletions lib/SIL/SILConstants.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,27 @@ void SymbolicValue::print(llvm::raw_ostream &os, unsigned indent) const {
os << "\n";
break;
}
case RK_Array:
case RK_ArrayAddress: {
CanType elementType;
ArrayRef<SymbolicValue> elements = getArrayValue(elementType);
os << "array<" << elementType << ">: " << elements.size();
switch (elements.size()) {
case 0:
os << " elements []\n";
return;
case 1:
os << " elt: ";
elements[0].print(os, indent + 2);
return;
default:
os << " elements [\n";
for (auto elt : elements)
elt.print(os, indent + 2);
os.indent(indent) << "]\n";
return;
}
}
}
}

Expand Down Expand Up @@ -137,6 +158,9 @@ SymbolicValue::Kind SymbolicValue::getKind() const {
case RK_DirectAddress:
case RK_DerivedAddress:
return Address;
case RK_Array:
case RK_ArrayAddress:
return Array;
}
}

Expand Down Expand Up @@ -177,6 +201,16 @@ SymbolicValue::cloneInto(ASTContext &astContext) const {
memObject->getType(), memObject->getValue(), astContext);
return getAddress(newMemObject, accessPath, astContext);
}
case RK_Array:
case RK_ArrayAddress: {
CanType elementType;
auto elts = getArrayValue(elementType);
SmallVector<SymbolicValue, 4> results;
results.reserve(elts.size());
for (auto elt : elts)
results.push_back(elt.cloneInto(astContext));
return getArray(results, elementType, astContext);
}
}
Copy link
Contributor

@ravikandhadai ravikandhadai Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems clone is collapsing the representation of RK_Array and RK_ArrayAddress. It ignores the differences. Is this intended? Can we add a line to document this behavior,

}

Expand Down Expand Up @@ -520,6 +554,81 @@ SymbolicValueMemoryObject *SymbolicValue::getAddressValueMemoryObject() const {
return value.derivedAddress->memoryObject;
}

//===----------------------------------------------------------------------===//
// Arrays
//===----------------------------------------------------------------------===//

namespace swift {

/// This is the representation of an array.
struct ArraySymbolicValue final
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be updated for arrays.

: private llvm::TrailingObjects<ArraySymbolicValue, SymbolicValue> {
friend class llvm::TrailingObjects<ArraySymbolicValue, SymbolicValue>;

const CanType elementType;

/// This is the number of indices in the derived address.
const unsigned numElements;

static ArraySymbolicValue *create(ArrayRef<SymbolicValue> elements,
CanType elementType,
ASTContext &astContext) {
auto byteSize =
ArraySymbolicValue::totalSizeToAlloc<SymbolicValue>(elements.size());
auto rawMem = astContext.Allocate(byteSize, alignof(ArraySymbolicValue));

// Placement initialize the object.
auto asv = ::new (rawMem) ArraySymbolicValue(elementType, elements.size());
std::uninitialized_copy(elements.begin(), elements.end(),
asv->getTrailingObjects<SymbolicValue>());
return asv;
}

/// Return the element constants for this aggregate constant. These are
/// known to all be constants.
ArrayRef<SymbolicValue> getElements() const {
return {getTrailingObjects<SymbolicValue>(), numElements};
}

// This is used by the llvm::TrailingObjects base class.
size_t numTrailingObjects(OverloadToken<SymbolicValue>) const {
return numElements;
}

private:
ArraySymbolicValue() = delete;
ArraySymbolicValue(const ArraySymbolicValue &) = delete;
ArraySymbolicValue(CanType elementType, unsigned numElements)
: elementType(elementType), numElements(numElements) {}
};
} // end namespace swift

/// Produce an array of elements.
SymbolicValue SymbolicValue::getArray(ArrayRef<SymbolicValue> elements,
CanType elementType,
ASTContext &astContext) {
// TODO: Could compress the empty array representation if there were a reason
// to.
auto asv = ArraySymbolicValue::create(elements, elementType, astContext);
SymbolicValue result;
result.representationKind = RK_Array;
result.value.array = asv;
return result;
}

ArrayRef<SymbolicValue>
SymbolicValue::getArrayValue(CanType &elementType) const {
assert(getKind() == Array);
auto val = *this;
if (representationKind == RK_ArrayAddress)
val = value.arrayAddress->getValue();

assert(val.representationKind == RK_Array);

elementType = val.value.array->elementType;
return val.value.array->getElements();
}

//===----------------------------------------------------------------------===//
// Higher level code
//===----------------------------------------------------------------------===//
Expand Down Expand Up @@ -653,22 +762,32 @@ static SymbolicValue getIndexedElement(SymbolicValue aggregate,
if (aggregate.getKind() == SymbolicValue::UninitMemory)
return SymbolicValue::getUninitMemory();

assert(aggregate.getKind() == SymbolicValue::Aggregate &&
assert((aggregate.getKind() == SymbolicValue::Aggregate ||
aggregate.getKind() == SymbolicValue::Array) &&
"the accessPath is invalid for this type");

unsigned elementNo = accessPath.front();

SymbolicValue elt = aggregate.getAggregateValue()[elementNo];
SymbolicValue elt;
Type eltType;
if (auto *decl = type->getStructOrBoundGenericStruct()) {
auto it = decl->getStoredProperties().begin();
std::advance(it, elementNo);
eltType = (*it)->getType();
} else if (auto tuple = type->getAs<TupleType>()) {
assert(elementNo < tuple->getNumElements() && "invalid index");
eltType = tuple->getElement(elementNo).getType();

// We need to have an array, struct or a tuple type.
if (aggregate.getKind() == SymbolicValue::Array) {
CanType arrayEltTy;
elt = aggregate.getArrayValue(arrayEltTy)[elementNo];
eltType = arrayEltTy;
} else {
llvm_unreachable("the accessPath is invalid for this type");
elt = aggregate.getAggregateValue()[elementNo];
if (auto *decl = type->getStructOrBoundGenericStruct()) {
auto it = decl->getStoredProperties().begin();
std::advance(it, elementNo);
eltType = (*it)->getType();
} else if (auto tuple = type->getAs<TupleType>()) {
assert(elementNo < tuple->getNumElements() && "invalid index");
eltType = tuple->getElement(elementNo).getType();
} else {
llvm_unreachable("the accessPath is invalid for this type");
}
}

return getIndexedElement(elt, accessPath.drop_front(), eltType);
Expand Down Expand Up @@ -718,22 +837,33 @@ static SymbolicValue setIndexedElement(SymbolicValue aggregate,
aggregate = SymbolicValue::getAggregate(newElts, astCtx);
}

assert(aggregate.getKind() == SymbolicValue::Aggregate &&
assert((aggregate.getKind() == SymbolicValue::Aggregate ||
aggregate.getKind() == SymbolicValue::Array) &&
"the accessPath is invalid for this type");

unsigned elementNo = accessPath.front();

ArrayRef<SymbolicValue> oldElts = aggregate.getAggregateValue();
ArrayRef<SymbolicValue> oldElts;
Type eltType;
if (auto *decl = type->getStructOrBoundGenericStruct()) {
auto it = decl->getStoredProperties().begin();
std::advance(it, elementNo);
eltType = (*it)->getType();
} else if (auto tuple = type->getAs<TupleType>()) {
assert(elementNo < tuple->getNumElements() && "invalid index");
eltType = tuple->getElement(elementNo).getType();

// We need to have an array, struct or a tuple type.
if (aggregate.getKind() == SymbolicValue::Array) {
CanType arrayEltTy;
oldElts = aggregate.getArrayValue(arrayEltTy);
eltType = arrayEltTy;
} else {
llvm_unreachable("the accessPath is invalid for this type");
oldElts = aggregate.getAggregateValue();

if (auto *decl = type->getStructOrBoundGenericStruct()) {
auto it = decl->getStoredProperties().begin();
std::advance(it, elementNo);
eltType = (*it)->getType();
} else if (auto tuple = type->getAs<TupleType>()) {
assert(elementNo < tuple->getNumElements() && "invalid index");
eltType = tuple->getElement(elementNo).getType();
} else {
llvm_unreachable("the accessPath is invalid for this type");
}
}

// Update the indexed element of the aggregate.
Expand All @@ -742,7 +872,11 @@ static SymbolicValue setIndexedElement(SymbolicValue aggregate,
accessPath.drop_front(), newElement,
eltType, astCtx);

aggregate = SymbolicValue::getAggregate(newElts, astCtx);
if (aggregate.getKind() == SymbolicValue::Aggregate)
aggregate = SymbolicValue::getAggregate(newElts, astCtx);
else
aggregate = SymbolicValue::getArray(newElts, eltType->getCanonicalType(),
astCtx);

return aggregate;
}
Expand Down
Loading