Skip to content

Pass PluginHandle to stream contexts. #167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 24, 2021

Conversation

mathetake
Copy link
Contributor

@mathetake mathetake commented Jun 3, 2021

VM and root contexts (i.e. plugins) must outlive stream contexts, so this commit passes PluginHandle to stream contexts.
Resolves istio/istio#33091

Signed-off-by: Takeshi Yoneda [email protected]

@mathetake
Copy link
Contributor Author

mathetake commented Jun 3, 2021

note: previously ContextBase only holds raw pointers to WasmBase, and this is the root cause (since WasmBase is deleted in response to plugin deletion).

@PiotrSikora
Copy link
Member

@mathetake I'm pretty sure that stream context cannot outlive Envoy filter and therefore the plugin itself.

Could you check if you can reproduce the same issue with Envoy v1.17.3 to make sure it's not a regression in different part of Envoy?

@mathetake
Copy link
Contributor Author

I guess this is something to do with ECDS related lifecycle issue.. but year let me double check later. In anyway, with this and Envyo patches, the original issue has gone.

@PiotrSikora
Copy link
Member

Could you check if you can reproduce the same issue with Envoy v1.17.3 to make sure it's not a regression in different part of Envoy?

Actually, nevermind, Istio v1.9.5 is based on Envoy v1.17.x, so the original report uses that version.

@mathetake
Copy link
Contributor Author

You can actually verify this by inserting assert(wasm_ != nullptr) https://github.com/proxy-wasm/proxy-wasm-cpp-host/blob/master/src/context.cc#L316 and elsewhere on the stream context callbacks.

@mathetake
Copy link
Contributor Author

and wasm_ is sometimes going nullptr after ecds updates.

@PiotrSikora
Copy link
Member

I guess this is something to do with ECDS related lifecycle issue.. but year let me double check later. In anyway, with this and Envyo patches, the original issue has gone.

Yeah, that's kind of my point. If the root cause is that extension lifecycle is different between ECDS and LDS, then we're not helping anybody by adding a Wasm-specific workaround for the issue, without fixing the actual problem.

Could you check if you can reproduce this when using LDS instead of ECDS?

At the very least, we should verify with @kyessenov if streams outliving filters is the exepected behavior when using ECDS.

@mathetake
Copy link
Contributor Author

Yeah, that's kind of my point. If the root cause is that extension lifecycle is different between ECDS and LDS, then we're not helping anybody by adding a Wasm-specific workaround for the issue, without fixing the actual problem.

OK agreed. but this fix itself is not bad to add since "plugins must outlive stream contexts" seems like an Envoy-specific assumption of this library to me.. wdyt?

@PiotrSikora
Copy link
Member

OK agreed. but this fix itself is not bad to add since "plugins must outlive stream contexts" seems like an Envoy-specific assumption of this library to me.. wdyt?

That's a valid point. I definitely agree that the lifecycle logic should be fully cointained in this library, but I don't think that's the case right now, and if we want to stop relying on those Envoy assumptions, then we might need a much bigger rewrite.

@mathetake
Copy link
Contributor Author

At the very least, we should verify with @kyessenov if streams outliving filters is the exepected behavior when using ECDS.

kindly ping @kyessenov

@kyessenov
Copy link
Collaborator

Yeah, streams should hold on to filter instances even if filter factories are removed by ECDS. That is how other filters operate and Wasm should be no different. I think we actually tested this exact scenario so not sure how we missed it.

@mathetake
Copy link
Contributor Author

IIUC it is Wasm http/network filters' factory that holds PluginHandle

so I see from what @kyessenov you mentioned that this is an expected behavior. And we should fix Wasm specificially..

@mathetake
Copy link
Contributor Author

that is, if factories are deleted on ECDS updates, then Wasm Handles are deleted regardless of existence of outstanding streams(Wasm contexts).

@mathetake
Copy link
Contributor Author

OK, added test cases for this bug, and passed in envoy envoyproxy/envoy#16795.

@mathetake mathetake marked this pull request as ready for review June 8, 2021 08:42
@mathetake mathetake requested a review from PiotrSikora June 8, 2021 08:42
@mathetake
Copy link
Contributor Author

kindly ping @PiotrSikora here and envoyproxy/envoy#16795.

Copy link
Member

@PiotrSikora PiotrSikora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, and sorry for the delay!

@PiotrSikora PiotrSikora merged commit b524f41 into proxy-wasm:master Jun 24, 2021
lizan pushed a commit to envoyproxy/envoy that referenced this pull request Jun 28, 2021
Previously, stream contexts (i.e. filter instances) are not holding shared ptrs to Wasm plugins which take ownership of Wasm VMs, and only holding raw pointers to Wasm VMs. If Wasm plugins (i.e. factory) die earlier than stream contexts during ECDS updates, then the left stream contexts will try using invalidated pointers to Wasm VM and this results in crashes.
So this commit passes the PluginHandle to all the stream contexts which the plugin generates, so the VM must outlive all stream contexts by which it is used.

Counter part in Proxy-Wasm C++ Host: proxy-wasm/proxy-wasm-cpp-host#167
Resolves istio/istio#33091

Signed-off-by: Takeshi Yoneda <[email protected]>
@mathetake mathetake deleted the pass-plugin-handle branch July 5, 2021 06:32
chrisxrepo pushed a commit to chrisxrepo/envoy that referenced this pull request Jul 8, 2021
Previously, stream contexts (i.e. filter instances) are not holding shared ptrs to Wasm plugins which take ownership of Wasm VMs, and only holding raw pointers to Wasm VMs. If Wasm plugins (i.e. factory) die earlier than stream contexts during ECDS updates, then the left stream contexts will try using invalidated pointers to Wasm VM and this results in crashes.
So this commit passes the PluginHandle to all the stream contexts which the plugin generates, so the VM must outlive all stream contexts by which it is used.

Counter part in Proxy-Wasm C++ Host: proxy-wasm/proxy-wasm-cpp-host#167
Resolves istio/istio#33091

Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: chris.xin <[email protected]>
leyao-daily pushed a commit to leyao-daily/envoy that referenced this pull request Sep 30, 2021
Previously, stream contexts (i.e. filter instances) are not holding shared ptrs to Wasm plugins which take ownership of Wasm VMs, and only holding raw pointers to Wasm VMs. If Wasm plugins (i.e. factory) die earlier than stream contexts during ECDS updates, then the left stream contexts will try using invalidated pointers to Wasm VM and this results in crashes.
So this commit passes the PluginHandle to all the stream contexts which the plugin generates, so the VM must outlive all stream contexts by which it is used.

Counter part in Proxy-Wasm C++ Host: proxy-wasm/proxy-wasm-cpp-host#167
Resolves istio/istio#33091

Signed-off-by: Takeshi Yoneda <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Envoy crashes when upgrading WASM modules under load
3 participants