Skip to content

constant interpreter: addresses and memory #19657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 15, 2018

Conversation

marcrasi
Copy link

@marcrasi marcrasi commented Oct 2, 2018

This makes the interpreter able to evaluate functions that do mutation, and this also paves the way for generics support, array support, and string support.

This is a part of the #19579 mega-patch. See there if you're interested in what comes next.

@@ -65,6 +67,10 @@ enum class UnknownReason {
class SymbolicValue {
private:
enum RepresentationKind {
/// This value is an alloc stack that is has not (yet) been initialized
/// by flow-sensitive analysis.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi "that is has" -> that has

@@ -88,6 +94,12 @@ class SymbolicValue {
/// This value is an array, struct, or tuple of constants. This is
/// tracked by the "aggregate" member of the value union.
RK_Aggregate,

/// This represents a direct reference to the address of a memory object.
RK_DirectAddress,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the phrase "direct reference to" needed in the comment? Could we instead say "This represents the address of a memory object". Is it missing something?

@@ -241,6 +293,29 @@ inline llvm::raw_ostream &operator<<(llvm::raw_ostream &os, SymbolicValue val) {
return os;
}

/// This is a representation of a memory object referred to be an address.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"referred to be an address" -> "referred to by an address"


/// Return the element constants for this aggregate constant. These are
/// known to all be constants.
ArrayRef<unsigned> getElements() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this comment be reworded and made a bit more detailed? I suppose 'getElements' returns the access path of this derived address, which is expected to be a sequence of constant unsigned ints. Perhaps, mentioning that would be better.

SymbolicValue createMemoryObject(SILValue addr, SymbolicValue initialValue) {
assert(!calculatedValues.count(addr));
auto type = simplifyType(addr->getType().getASTType());
auto *memObject = SymbolicValueMemoryObject::create(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a suggestion here. Not a must. Would it be possible to give simplifyType a more descriptive name e.g. like substituteGenericParamsAndSimplify or something like that?

// result of a call. Either way, we ask for the value of the pointer: in the
// former case this will be the latest value for this, in the later case, this
// must be a single-def value for us to analyze it.
if (auto li = dyn_cast<LoadInst>(value))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "analyze it", is it better to explain it a bit more?

E.g. "In the latter case, the call must be the only definition of the address so that the memory object of the address can be computed by recursively processing the alloc-stack and the call instructions, in a demand-driven fashion."

@ravikandhadai
Copy link
Contributor

@marcrasi Do you plan to bring in top-level alloc-stack handling via getSingleWriterAddressValue function in a different commit?

@ravikandhadai
Copy link
Contributor

Okay, I see that getSingleWriterAddressValue is in the commit that handles generics.

@marcrasi marcrasi force-pushed the marcrasi-const-evaluator-part-1 branch 3 times, most recently from 312d5ba to 14cc50c Compare October 31, 2018 23:47
@marcrasi marcrasi force-pushed the marcrasi-const-evaluator-part-1 branch from e1fe25c to f91f967 Compare November 9, 2018 18:40
@marcrasi marcrasi force-pushed the marcrasi-const-evaluator-part-1 branch from f91f967 to 390daea Compare November 27, 2018 01:25
@ravikandhadai
Copy link
Contributor

@marcrasi I guess this is the next one that should be merged. Can you rebase this on the new master?

@jckarter
Copy link
Contributor

It's cool that you were able to handle objects and shared memory locations, and it's awesome that that could let the standard runtime implementations of Array/String/Dictionary "just work". Practically speaking, though, would it be better to represent those common collection types in a more symbolic fashion for compile-time evaluation? Especially if the compile-time evaluator has a cap on number of operations or some other complexity cap, interpreting the standard library implementations at such a low level seems like it could easily blow that cap, or else just be painfully slow to interpret in practice.

@marcrasi
Copy link
Author

The evaluator does have places where we can hook stuff like that in, and in fact it's what we did for some initial support of arrays and strings. You can see it in the huge PR https://github.com/apple/swift/pull/19579/files, if you search WellKnownFunction in ConstExpr.cpp. It represents arrays and strings symbolically, and interprets calls to some stdlib functions by operating on those representations.

I haven't created a smaller split patch for that logic yet.

The array/string support is sill very basic. The future work to increase the support will probably add more WellKnownFunctions that operate on the symbolic representations.

@marcrasi marcrasi force-pushed the marcrasi-const-evaluator-part-2 branch from 3328daf to 30bda99 Compare November 30, 2018 21:45
@marcrasi marcrasi changed the base branch from marcrasi-const-evaluator-part-1 to master November 30, 2018 21:45
@marcrasi
Copy link
Author

@ravikandhadai okay, I rebased it and addressed your comments!

@ravikandhadai
Copy link
Contributor

@marcrasi Great thanks. Let us wait for @devincoughlin's review.

@ravikandhadai
Copy link
Contributor

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 3328daf

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 3328daf

@ravikandhadai
Copy link
Contributor

@swift-ci Please test

Copy link
Contributor

@devincoughlin devincoughlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is great to handling of address projections and mutation! This looks good to me with the comments inline addressed.

/// This is the representation of a derived address. A derived address refers
/// to a memory object along with an access path that drills into it.
struct DerivedAddressValue final
: private llvm::TrailingObjects<DerivedAddressValue, unsigned> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't need to happen in this PR, but would be better to use IndexTrieNode from SILOptimizer/Utils/IndexTrie for the access path instead. With that you can share common access paths between different values to save memory and copying.

///
static bool updateIndexedElement(SymbolicValue &aggregate,
ArrayRef<unsigned> indices,
SymbolicValue scalar, Type type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling this parameter 'scalar' is a bit misleading since it may not be a scalar. For example, in:

struct Foo {
  var x: Int 
  var y: Int

  init() {
   x = 7
   y = 8
  }
}

func foo() -> Bool {
  var h = Foo()
  return (h.x == 7)
}

func bar() {
  #assert(foo())
}

the store to h will pass an aggregate value as the scalar parameter.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to newElement

var myMs = ms
addOne(to: &myMs.x.0)
addOne(to: &myMs.y)
return myMs.x.0 + myMs.x.1 + myMs.y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to make these test a bit more comprehensive. In particular, it would be good to make sure there is not unexpected aliasing (for example, between myMs and ms above).

It would also be good to test the case where you are storing an aggregate to a projection. Something like:

struct Foo {
  var x: Int 
  var y: Int

  init() {
   x = 7
   y = 8
  }
}

struct HasFoo {
  var f = Foo()
}

func foo() -> Bool {
  var h = HasFoo()
  var hp = h
  h.f.x = 22
  hp.f = h.f
  return (hp.f.x == 7)
}

func bar() {
  #assert(foo())
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test for undesired aliasing, and a test for storing aggregate to projection.

auto *memoryObject = it->second.getAddressValue(accessPath);

// If this is a direct store to tracked memory object, just update it.
if (accessPath.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? It seems like the base case of updateIndexedElement() should already handle this, regardless of whether it is a derived address.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems removable. I removed it.

}

// If we have an uninit memory, then scalarize it into an aggregate to
// continue. This happens when memory objects are initialized piecewise.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to write test that fails when this scalarization is removed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I added a SIL test for it!

// value of the memory. In the latter case, the call must be the only
// store to the address so that the memory object can be computed by
// recursively processing the allocation and call instructions in a
// demand-driven fasion.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "fasion"


// If this is a deallocation of a memory object that we may be tracking,
// remove the memory from the set. We don't *have* to do this, but it seems
// useful for hygiene.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some kind of assertion that could be added to verify this hygiene? For example, at function exit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that my attention has been drawn to this code-comment, I realize that the code-comment and the code is wrong. It's left over from an earlier memory implementation where the memory values actually got stored in calculatedValues.

I have changed the code to do nothing here, and updated the comment to explain. We could do something a bit more elaborate, like add a new "freed" state to the memory and make it an assertion failure whenever you try to do any operations to "freed" memory.

}

// Copy addr is a load + store combination.
if (auto *copy = dyn_cast<CopyAddrInst>(inst)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add a test for this functionality. This should probably be a SIL-level test so that you can guarantee that the test continues to test the functionality even if SILGen changes its strategy about what is emitted as a CopyAddr.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a SIL test!

@ravikandhadai
Copy link
Contributor

@marcrasi I guess this PR is good to go, once Devin's comments are fixed.

SymbolicValueMemoryObject *
SymbolicValueMemoryObject::create(Type type, SymbolicValue value,
ASTContext &astContext) {
auto result = astContext.Allocate(sizeof(SymbolicValueMemoryObject),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi a minor nitpick I just noticed. It is better to use auto * here.

auto result = astContext.Allocate(sizeof(SymbolicValueMemoryObject),
alignof(SymbolicValueMemoryObject));
new (result) SymbolicValueMemoryObject(type, value);
return (SymbolicValueMemoryObject*)result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi space before '*'

@marcrasi marcrasi force-pushed the marcrasi-const-evaluator-part-2 branch 2 times, most recently from c721146 to 2b756b8 Compare December 12, 2018 04:44
@marcrasi
Copy link
Author

I have rebased this and addressed all the comments.

While addressing comments I also noticed that I was supposed to do this TODO in this PR, so I attempted to do it. The resulting assertion failed, because we actually do store addresses in calculatedValues. It turns out that the comment that led us to introduce this assertion was actually misleading. It meant to say that we don't store memory values in calculatedValues. So I deleted the assertion and clarified the comment.

Sorry about the slowness. I've been busy. I have everything that I need to start working on the next PR in the series (#19662), but I think I'll keep being busy until the end of the year. Is it okay if I delay working on the next parts until early Jan? I think it'll take a few weeks of back and forth to get the rest of constexpr merged once I start working on it again, so we'd be looking at finishing it in early Feb.

@ravikandhadai
Copy link
Contributor

@swift-ci Please smoke test

@ravikandhadai
Copy link
Contributor

@swift-ci Please test and merge

@ravikandhadai
Copy link
Contributor

@swift-ci Please Test and Merge

@marcrasi
Copy link
Author

Oh no, looks like some of the pound_assert.sil tests fail when Int is 32-bit. What is a good way to handle that?

@devincoughlin
Copy link
Contributor

@marcrasi You should be able to fix the 32-bit failures by explicitly making the tuple you create in piecewiseInit() to be Int64 instead of Int.

Adds memory objects and addresses to the constant interpreter, and
teaches the constant interpreter to interpret various instructions that
deal with memory and addresses.
@marcrasi marcrasi force-pushed the marcrasi-const-evaluator-part-2 branch from 2b756b8 to 8dcf89c Compare December 15, 2018 04:21
@marcrasi
Copy link
Author

Ah, that's a good simple idea, thanks. I have done it and uploaded the fixed commit.

@devincoughlin
Copy link
Contributor

@swift-ci please test and merge

1 similar comment
@devincoughlin
Copy link
Contributor

@swift-ci please test and merge

@swift-ci swift-ci merged commit 518f26a into master Dec 15, 2018
@compnerd compnerd deleted the marcrasi-const-evaluator-part-2 branch May 31, 2025 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants