Skip to content

[Const evaluator] Add support to "skip" instructions in step-wise evaluation #24113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 2, 2019

Conversation

ravikandhadai
Copy link
Contributor

No description provided.

@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Apr 18, 2019

@marcrasi @devincoughlin This commit adds a new functionality to the stepwise constant evaluator which is "skipping" instructions without evaluating them while conservatively accounting for the effects of the skipped instructions on the interpreter state.

This enables a client to step through a sequence of instructions (possibly containing function calls) and decide to evaluate some and skip others, with a guarantee that the constant values found by the interpreter are sound. (That is, whenever constant values are inferred for variables during interpretation at a program point, those values match the runtime values of the variables if/whenever control reaches that program point.)

This functionality is only available at outer most level of interpretation, i.e, when a function call is evaluated it will be evaluated normally using flow-sensitive evaluation and does require all instructions in the function body to be interpretable and have constant values.

@marcrasi While this is a functionality aimed at optimization of the new os log APIs, I think this will also enable replacing/simplifying the existing top-level, backward evaluation mode of the interpreter used for extracting the arguments to #asserts as constants (e.g. functions like getSingleWriterAddressValue etc.), and have a more unified approach that uses only flow-sensitive evaluation mode. Furthermore, I think this will also enable eliminating the mutual recursion in the interpreter, which I believe mainly exists for accomplishing this backward evaluation.

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please smoke test

@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Apr 18, 2019

Btw, see only the latest commit in this PR, which contains the relevant changes.

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please smoke test

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please smoke test Linux Platform

@marcrasi
Copy link

The implementation makes sense to me, but I don't completely understand the bigger picture and the purpose of this change.

Specifically, something seems missing: Any client using the "skip" functionality will have to do a first pass through the instructions to determine which ones it wants to skip. For example, rewriting top-level evaluation in terms of this new functionality would look something like this:

let instructionsToEvaluate = transitiveInstructionsInfluencing(valueWeWant)
for inst in f {
  if inst in instructionsToEvaluate {
    evaluator.evaluate(inst)
  } else {
    evaluator.skip(inst)
  }
}
return evaluator.lookupConstValue(valueWeWant)

transitiveInstructionsInfluencing is going to be a bit complicated, and seems like something that the const-evaluation infrastructure should implement.

Is that something you're planning to implement later in the const-evaluation infrastructure? Or are you planning to implement something like it in your client? Or are you planning a client that uses skip in a completely different way that I haven't thought of?

@marcrasi
Copy link

Oh, and if the const-evaluation infrastructure does provide a correct transitiveInstructionsInfluencing function, then skip won't be necessary because evaluating all the instructions returned by transitiveInstructionsInfluencing is sufficient to get a sound value.

@ravikandhadai
Copy link
Contributor Author

For example, rewriting top-level evaluation in terms of this new functionality would look something like this:

That's exactly what I had in mind, when I mentioned it as a potential application. It might just be enough if transitiveInstructionsInfluencing pulls in the data and control dependences of arguments to #assert, as a purely syntactic analysis that uses use-def chains. In some sense, doing more evaluation will only affect running time of the interpreter and not the correctness because if the evaluation fails anywhere, we can skip the instruction (this is what the helper function tryEvaluateOrElseSkip does) and if we end up finding more constants than what is needed in the #assert, we can ignore it. My intuition is that if we look at only control/data-dependences of #assert argument, it should be quite sufficient for performance, or at least as performant as the backward propagation we have now, which also uses the use-def chain and attempt to evaluate the dependences (though it gives up in some cases like multiple writers).

I see two advantages of this compared to the existing backward traversal: (1) we can simplify the interpreter and not have different code paths for flow-sensitive and top-level evaluation, (2) we can start making the interpreter iterative (instead of recursive) and make the top-level evaluation as powerful as the flow-sensitive mode e.g. by getting rid of single-writer restriction etc.

Is that something you're planning to implement later in the const-evaluation infrastructure?

Not really. This is just a thought that crossed my mind and wanted to share it. The main goal of creating this is to use it in a controlled way in the os_log optimization client (see below).

Or are you planning a client that uses skip in a completely different way that I haven't thought of?

In the client, which I am now developing (see here) for a rough implementation), we explicitly annotate functions that need to be interpreted using @_semantics annotation. We know exactly what those functions are in this case. In fact, we know where to start interpreting and where to stop, and will only evaluate calls to functions marked with that attribute and skip the rest.

@ravikandhadai
Copy link
Contributor Author

Oh, and if the const-evaluation infrastructure does provide a correct transitiveInstructionsInfluencing function, then skip won't be necessary because evaluating all the instructions returned by transitiveInstructionsInfluencing is sufficient to get a sound value.

Yeah, but that requires precisely knowing all transitiveInstructionsInfluencing. This could be difficult if we have branches. If we have skip, we can allow over-approximating them using just a use-def chain. I am just mostly hand waving here. May be "skip" is not entirely necessary for this application. But, if we can eliminate the backward analysis mode and substitute it with flow-sensitive mode that seems useful (for simplifying the interpreter) without losing functionality.

@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Apr 19, 2019

An interesting example crossed my mind to illustrate my earlier thought that it is easier to have skip and a somewhat naive transitiveInstructionsInfluencing. (Interestingly, this is also the crux of the code we are trying to interpret in our client application).

    struct S {
        var const = 0
        var nonConst = 1
    } 

    func bar(x: Int) {
        var  s = S()
        var s2 = S()
        s2.nonConst += x
        s = s2
        #assert(s.const == 0)
   }

Currently, we would give up on this example as s has two writers. Furthermore, if we try to compute the transitiveInstructionsInfluencing (possibly using use-def chains) it will include s2.nonConst += x. If we only have evaluate and no skip, we will fail at s2.nonConst += x and give up. If we have skip, we can skip that instruction and continue (while being sound) and find that s.const is 0.

Edit: I changed the example as the earlier one had some errors.

Copy link

@marcrasi marcrasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining your use case! I'm understanding better now.

I noticed that this approach gives up immediately when there is nonconstant control flow, preventing it from finding later constant values. e.g.

func foo(_ x: Bool) {
  if x {
    print("hi")
  }
  let bla = 1
  #assert(bla == 1)
}

I think it would require some nontrivial extra analysis to extend this approach to handle situations like that.

Will that be a problem for your use case?

"branch depends on non-constant value obtained by skipping instructions",())
NOTE(constexpr_returned_by_skip,none, "return value of a skipped instruction "
"is not a constant", ())
NOTE(constexpr_mutated_by_skip,none, "value mutatable by a skipped instruction "

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/mutable/mutated/

"is not a constant", ())
NOTE(constexpr_mutated_by_skip,none, "value mutatable by a skipped instruction "
"is not a constant", ())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"skipped instruction" doesn't seem like a concept that should be exposed to the user.

In context of the os_log client, users might understand something like "result of an unrecognized operation" / "mutated by an unrecognized operation". What do you think about changing the diagnostic messages to those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That right. It is not a user concept. It added this note for use in tests. It is not a part of the os_log client. Ideally, I would like to move all diagnostics to clients (like #assert client, tester client etc.) and specialize them as needed. In some sense, they are a part of the client model on how they handle errors in evaluation, as the interpreter itself is not a user-level concept, as of now.

@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Apr 19, 2019

Thanks for explaining your use case! I'm understanding better now.

I noticed that this approach gives up immediately when there is nonconstant control flow, preventing it from finding later constant values. e.g.

func foo(_ x: Bool) {
  if x {
    print("hi")
  }
  let bla = 1
  #assert(bla == 1)
}

I think it would require some nontrivial extra analysis to extend this approach to handle situations like that.

First thanks for providing these examples and the review. I think the analysis for computing relevant instruction to interpret could be a simple control/data dependent instructions (or in other words, a intraprocedural, backwards slice). But we can make the "driver" that invokes the interpreter on these relevant instructions smarter and in fact simpler :-), e.g. I am thinking of something like this:

let instructionsToEvaluate = transitiveInstructionsInfluencing(valueWeWant)
for inst in f {
  if inst in instructionsToEvaluate {
    evaluator.tryEvaluateOrSkip(inst)
     // A bit of special handling for branches.
     if `inst` is a branch, if it doesn't evaluate to a constant value, error and break
     otherwise, remove instructions in the other arm of the branch from `instructionsToEvlauate`.
  } 
}
return evaluator.lookupConstValue(valueWeWant)

We literally ignore (not even "skip") all instructions not in the backward slice. This handles your example, as we won't even look at the if condition etc. For this, we only need a guarantee that transitiveInstructionsInfluencing will include all relevant instructions. It could over-approximate them but could not leave out anything. It seems like this will be strictly more powerful than what we have currently. WDYT?

Btw, after writing this example, I think "skip" would need a better name inline with what it is doing and it is not quite "ignore". Perhaps, something like "approximateEffects" would be better?

Btw, as you know, even now we are implicitly computing some parts of this transitiveInstructionsInfluencing in singleWriterAddressValue by looking at sources of stores and recursively computing their constant value. Separating that logic out into a function like transitiveInstructionsInfluencing would be good I guess.

Will that be a problem for your use case?

That's a good question. No, the code that needs to be interpreted (which is not user-provided) does not have non-constant branches in my use case. However, these are like implicit constraints. We need to have good diagnostics there to help someone who changes the code and possibly gets stuck in such aspects as a part of that client.

@marcrasi
Copy link

We literally ignore (not even "skip") all instructions not in the backward slice. This handles your example, as we won't even look at the if condition etc. For this, we only need a guarantee that transitiveInstructionsInfluencing will include all relevant instructions. It could over-approximate them but could not leave out anything. It seems like this will be strictly more powerful than what we have currently. WDYT?

Makes sense, and sounds like a pretty good approach. I think the hypothetical future constexpr client won't actually want all the extra power because we want to keep the programming model simple, and allowing things like mutation in top level code (as long as it's "constant evaluable mutation") will make it harder to understand when something is const evaluable. But the elimination of the (confusing) mutual recursion sounds really good and simplifying.

Btw, after writing this example, I think "skip" would need a better name inline with what it is doing and it is not quite "ignore". Perhaps, something like "approximateEffects" would be better?

Yeah, "approximateEffects" sounds better than "skip" to me. And "tryEvaluateOrElseApproximateEffects" is very clear.

continue;
}
auto constVal = constValOpt.getValue();
if (constVal.getKind() != SymbolicValue::Address) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again after having just thought about #22772, I realize that this will have to be careful about aggregates containing addresses (and also addresses pointing at aggregates containing addresses).

There is no such thing as an aggregate containing an address now. Even after arrays are merged, this won't have to handle the array aggregates containing an address, because they have value semantics. (I haven't thought this through super carefully yet, but I think it's true.)

So I'm reasonably sure that no action is required now.

But if/when the constant evaluator gets reference types, this will be a problem. (Perhaps it will never get reference types during top level evaluation, exactly because of problems like this.) Is there a way to make sure we don't forget?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I remembered that reference types show up as reference types in SIL, not as addresses. So there should never be any aggregates containing addresses (except for things like arrays that are value types but that contain addresses as an implementation detail). So this will not have to worry about addresses in aggregates.

The point that it will have to worry about reference types is still valid. But we don't have reference types now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are interesting observations. You are right that if the interpreter starts supporting reference types (or structs with reference types), this way of estimating effects is not right. We need to do a traversal of the reachable memory locations to correctly approximate all possible effects.

As you say Arrays and Strings are fine as they have value semantics. It does seem to me that we don't need any additional work for Arrays.

It doesn't look like "Structs with addresses" is a thing in SIL, unless we allow unsafe pointers, which can store addresses. However, allowing anything like that in the interpreted fragment does seem messy.

Is there a way to make sure we don't forget?

I think we can document code, add a test that checks that diagnostics are emitting for skip with reference types, and also add an assert to skip that will fail on any new kind of symbolic object that is not currently supported. This will force anyone who is adding a new kind of symbolic object to look at skip.

@ravikandhadai
Copy link
Contributor Author

I think the hypothetical future constexpr client won't actually want all the extra power because we want to keep the programming model simple, and allowing things like mutation in top level code (as long as it's "constant evaluable mutation") will make it harder to understand when something is const evaluable. But the elimination of the (confusing) mutual recursion sounds really good and simplifying.

I completely agree with you.

@ravikandhadai
Copy link
Contributor Author

Rebased and fixed the things we discussed above such as giving "skip" a better name, adding assertions and documentation to make sure that new symbolic values added in the future would require thinking about "skip" function.

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please smoke test

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 14ba86aa54f00452526be1a47a51828f8fd61dec

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test Linux Platform

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 10ab00728290e04a83c0ecbae4b19227fb87cb44

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test macOS Platform

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test macOS Platform

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - b2a56bff55de838076b66f1faf8c5d4721d4b657

instructions without evaluating them while conservatively accounting
for the effects of the skipped instructions on the interpreter state.
@ravikandhadai
Copy link
Contributor Author

ravikandhadai commented Apr 30, 2019

Adding a minor enhancement: while skipping calls which are passed addresses, the contents of the addresses that are passed @in_guaranteed or @in_constant need not be reset to unknown as they cannot be mutated by the call.

@ravikandhadai
Copy link
Contributor Author

@swift-ci Please test

@ravikandhadai ravikandhadai merged commit d912e33 into swiftlang:master May 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants