Skip to content

Compile-time interpretation #19579

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

marcrasi
Copy link

This brings the compile-time interpreter from the tensorflow branch into master, so that everyone can start using it. To make the interpreter testable, this also brings the #assert expression into master, hidden under a -enable-experimental-static-assert flag.

Some possible issues that I noticed while bringing the code over:

  • This makes SILModule::getSILLoader() and getWitnessMethodSubstitutions public so that the interpreter can access them. Maybe we should do something more proper before merging this.
  • The #assert tests don't exercise all the interpreter functionality, because there are tensorflow tests in the tensorflow branch that exercise those parts. e.g. decodeAllocUninitializedArray, and possibly various other features that never got added to #assert tests. Could we leave these in without tests, and defer tests to when someone adds a compiler feature that exercises them?

Let me know if you want this split into smaller parts for easier reviewing. It would probably make sense in 3 parts:

  1. SILConstants, ~1,300 lines
  2. The actual interpreter, ~2,000 lines
  3. #assert, ~500 lines

/// pointer to the instruction whose value this holds. This is known to
/// be one of a closed set of constant instructions:
/// IntegerLiteralInst, FloatLiteralInst, StringLiteralInst
SingleValueInstruction *inst;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi Looks like this inst representation for literals is not used anymore. Is this dead code? (and also the declaration of SingleValueInstruction at the beginning of the file).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I forgot to remove it from the branch code. @marcrasi please feel free to remove.

//
//===----------------------------------------------------------------------===//

// SWIFT_ENABLE_TENSORFLOW
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi To be removed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will remove.

Identifier name;
SymbolicValue value;
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi Is this GraphOperationAttribute struct used anywhere in the interpreter? Can this be left out of the patch?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be removed. I moved it out of this file on the tensorflow branch in preparation for this, but I must have accidentally copied this file over before I did the move.

@ravikandhadai
Copy link
Contributor

ravikandhadai commented Sep 28, 2018

@marcrasi

The #assert tests don't exercise all the interpreter functionality, because there are tensorflow tests in the tensorflow branch that exercise those parts. e.g. decodeAllocUninitializedArray, and possibly various other features that never got added to #assert tests. Could we leave these in without tests, and defer tests to when someone adds a compiler feature that exercises them?

Here is one idea that occurred to me. As you say, I see that arrays/strings are not directly supported within #assert though some of those features exist. Does it make sense to add tests like the one shown where such values have to tracked by the interpreter but are not directly used within #asserts? E.g.

      struct Pair {
           var arr: [Int]
           var x: Int
      }

      func bar() -> Pair {
          return Pair(arr: [0, 1], x: 0)
      }

      func foo() {
        #assert(bar().x == 0) // assertion should succeed
      } 

An example with strings,

      struct Pair {
           var arr: [Int]
           var s: String
      }

      func bar() -> Pair {
          return Pair(s: "", x: 0)
      }

      func foo() {
        #assert(bar().x == 0) // assertion should succeed
      } 

I find that both these examples work with the interpreter. Btw, I did find a interpreter crash in the latter example with strings. I will make separate comment on the bug.

@marcrasi
Copy link
Author

marcrasi commented Sep 28, 2018

@ravikandhadai Very clever! I couldn't think of easy ways to write tests with arrays/strings because we haven't added support for any non-initialization operations (like equality) on them. Putting them in a struct and asserting something else about the struct is a nice way around that. I can add some tests that do this. The tests will at least prevent regressions that cause initialization to completely crash.

@ravikandhadai
Copy link
Contributor

@marcrasi Sounds great. Thanks a lot.

if (callee->isExternalDeclaration())
return computeOpaqueCallResult(apply, callee);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcrasi

Here is the compiler crash I found when using the example that uses string initializers. The interpreter runs fine but it results in a SIL verification failure in the later phases. The exact assertion failure is shown below, but at a high-level, this happens because the interpreter is pulling in the SIL of the stdlib function: String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:) in the lines 783 to 790 (see the above code). The initializer calls _StringGuts.init<A>(_large:) which is left as an external function (without a body). The SILVerifier doesn't like this for some reason and fails.

A possible solution

I found that one way to fix this is to move the code in lines: 783 to 790 that loads an external function after the switch-case that handles well-known functions (i.e, after line 871 in the code that follows). This means that the SIL of an external function will not be loaded for a WellKnownFunction that the interpreter will handle specially. Does it makes sense? It seems this will eliminate some unnecessary work of loading the SIL of a well known function, and will also fix this assertion failure.

Test case:

  struct Pair {
       var arr: [Int]
       var s: String
  }

  func bar() -> Pair {
      return Pair(s: "", x: 0)
  }

  func foo() {
    #assert(bar().x == 0) // assertion should succeed
  } 

Assertion failure

SIL verification failed: external declarations of SILFunctions with shared visibility is not allowed: SingleFunction || !hasSharedVisibility(RefF->getLinkage()) || RefF->hasForeignBody()

Verifying instruction:
-> // function_ref specialized _StringGuts.init<A>(_large:) %88 = function_ref @$ss11_StringGutsV6_largeABs010_UnmanagedA0VyxG_tcs17FixedWidthIntegerRzSURzlufCs5UInt8V_Tgq5 : $@convention(method) (_UnmanagedString<UInt8>, @thin _StringGuts.Type) -> @owned _StringGuts // user: %89
%89 = apply %88(%87, %52) : $@convention(method) (_UnmanagedString<UInt8>, @thin _StringGuts.Type) -> @owned _StringGuts // user: %90

In function: // String.init(_builtinStringLiteral:utf8CodeUnitCount:isASCII:)

Steps to reproduce

swiftc -frontend -emit-sil -verify -enable-experimental-static-assert <string_test_case.swift>

Copy link
Author

@marcrasi marcrasi Sep 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix makes sense.

It may not completely fix the problem -- this problem may also happen with non-WellKnownFunctions. All the non-WellKnownFunctions that we use in practice (e.g. stdlib integer arithmetic) are simple enough that they might not expose the problem. Once we start using more complicated external non-WellKnownFunctions, they might expose this problem again.

The fix sounds good anyways, because saving the loading work for WellKnownFunctions is good. I'll add this fix and a test, but I'll also try a bit to see if I can make the problem happen with a non-WellKnownFunction.

@ravikandhadai
Copy link
Contributor

@marcrasi

Let me know if you want this split into smaller parts for easier reviewing. It would probably make sense in 3 parts:
SILConstants, ~1,300 lines
The actual interpreter, ~2,000 lines
#assert, ~500 lines

Thanks for bringing this up. I checked with @devincoughlin and few others about this and we are of the opinion that it would be best to split this into smaller parts. It will make it much easier for others to review/understand the code (possibly in parallel), and also to request expert reviews of changes touching upon other components of the compiler. Can we split things up in the following fashion?

  1. Support for just #assert, without the backing interpreter. This shall include the parsing, Sema and AST changes and tests that check if #asserts can be parsed/type-checked correctly.

  2. The Interpreter for a basic language fragment e.g just the handling of integers. I believe this may require having support for (builtin) integers, aggregates, function-refs and application. This will probably bring in much of the interpreter but will eliminate some of the extended features: enums, arrays, strings, floats (and if possible witness method calls etc.). This will probably introduce new files: SILConstants.h/.cpp, ConstExpr.h/.cpp, but will have little/no changes to existing ones. This patch shall include tests for #assert on basic integer operations.

  3. Subsequently, strings, arrays, floats and enums can be brought in in separate commits.

One benefit of this split up is that in every commit there will be tests for the features that are introduced. Let me know your thoughts/suggestions on this.

@marcrasi
Copy link
Author

marcrasi commented Sep 28, 2018

Yeah, that split makes a lot of sense and sounds like it will get things thoroughly reviewed and tested!

I already have "1. Support for just #assert, without the backing interpreter." in its own commit, so I'll create a PR for that right now.

Next, I will send a PR to the tensorflow branch addressing all the comments that you have made on this PR, because I want to get them incorporated somewhere asap.

Then (probably Monday), I will start working on splitting the interpreter into the pieces you describe and I'll send PRs as they are ready.

@marcrasi
Copy link
Author

Part 1 uploaded at #19618

@rxwei
Copy link
Contributor

rxwei commented Sep 29, 2018

This makes SILModule::getSILLoader() and getWitnessMethodSubstitutions public so that the interpreter can access them. Maybe we should do something more proper before merging this.

The interpreter needs to deserialize witness tables. Maybe we can refactor tryDevirtualizeWitnessMethod in SILOptimizer/Utils/Devirtualize.h to take a WitnessMethodInst * instead of an ApplySite.

@marcrasi
Copy link
Author

I have addressed all of @ravikandhadai's comments in #19621. I'll propagate those changes into my next PRs to master.

@ravikandhadai
Copy link
Contributor

@marcrasi Sounds awesome. Thanks.

@marcrasi
Copy link
Author

marcrasi commented Oct 1, 2018

"constant interpreter: enough for basic integer operations" uploaded at #19655

@marcrasi
Copy link
Author

marcrasi commented Oct 2, 2018

"constant interpreter: addresses and memory" uploaded at #19657

@marcrasi
Copy link
Author

marcrasi commented Oct 2, 2018

"constant interpreter: support evaluating generic functions" uploaded at #19662

I'm going to stop for a while and wait for comments on the PRs that I have uploaded before uploading more. The remaining pieces that I have not yet uploaded are:

  • floats
  • enums
  • arrays
  • strings

@ravikandhadai
Copy link
Contributor

@marcrasi Hi Marc, I just wanted to share a couple of thoughts regarding the new PRs that you are preparing. I think it is best to add @semantics annotations for well-known functions (in cases where they are not already available), and match on those annotations instead of the mangled names.

@ravikandhadai
Copy link
Contributor

@marcrasi Also, I think it might just be easier to start off with string support, as, I guess, it adds little code but introduces well-known functions concept.

@marcrasi
Copy link
Author

marcrasi commented Jan 8, 2019

Whoops, I just finished doing arrays. It does seem like strings would have been a bit less code. I'll do strings next.

I can do strings off of master instead of off of the array branch, so that strings can get merged in first in case arrays take a while. The two features are orthogonal enough that the conflicts between them shouldn't be annoying to resolve.

@ravikandhadai
Copy link
Contributor

@marcrasi

I just finished doing arrays.

Awesome.

I can do strings off of master instead of off of the array branch

Sounds good to me. That would be great. That would be easier to merge first.

I will look into the array PR tomorrow.

@ravikandhadai
Copy link
Contributor

This feature has been merged with master through several other smaller PRs. This PR is no longer required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants