[3.1.x] Fix dotnet.exe process recovery after abnormal exit in ANCM #17103

Merged
merged 9 commits into from
Nov 22, 2019

jkotalik
Contributor

Fixes #17063.

Description

We introduced a regression between 2.2 and 3.0/3.1: ANCM out-of-process would restart dotnet.exe after the first crash but not after any later one. The cause was a background thread hanging while trying to close a handle for stdout. This change also adds an opt-out switch that removes all redirection, so that if this code path still has issues, we can disable it.

Risk

Medium. Any change to ANCM out-of-process is risky, but providing an opt-out switch should reduce the risk.

User impact

On application crash, ANCM normally starts a new dotnet.exe process to restart the application. However, because a call to CloseHandle would hang, the static callback we registered was never overwritten. As a result, the callback did not fire on the next process exit, so all subsequent crashes went undetected.
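The failure mode described above can be illustrated with a small, portable model. This is only a sketch under stated assumptions, not ANCM's code: `ProcessMonitor`, `simulate_exit`, and `count_restarts` are hypothetical names, the exit notification is assumed to be one-shot, and the hanging CloseHandle call is modeled as an early return (a real hang would block the thread rather than return).

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Simplified, hypothetical model of a one-shot exit notification.
// None of these names come from ANCM.
class ProcessMonitor {
public:
    explicit ProcessMonitor(bool cleanup_completes)
        : cleanup_completes_(cleanup_completes) {}

    // Launch the first "process" and arm the exit callback for it.
    void start() { arm(); }

    // Simulate the current process exiting abnormally.
    void simulate_exit() {
        if (!on_exit_) {
            return;  // nothing armed: this exit goes unnoticed
        }
        // One-shot: take the callback out of the slot before invoking it.
        std::function<void()> callback = std::move(on_exit_);
        on_exit_ = nullptr;
        callback();
    }

    int restarts() const { return restarts_; }

private:
    void arm() {
        on_exit_ = [this] { handle_exit(); };
    }

    void handle_exit() {
        ++restarts_;  // a replacement process is launched
        if (!cleanup_completes_) {
            return;  // stands in for cleanup that never returns
        }
        arm();  // reached only when cleanup finishes
    }

    bool cleanup_completes_;
    int restarts_ = 0;
    std::function<void()> on_exit_;
};

// Runs `crashes` simulated exits and reports how many were followed by a restart.
int count_restarts(bool cleanup_completes, int crashes) {
    assert(crashes >= 0);
    ProcessMonitor monitor(cleanup_completes);
    monitor.start();
    for (int i = 0; i < crashes; ++i) {
        monitor.simulate_exit();
    }
    return monitor.restarts();
}
```

Under these assumptions, `count_restarts(true, 3)` returns 3, while `count_restarts(false, 3)` returns 1: the first exit is handled, but the callback is never re-armed, so later exits are missed.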

Workarounds

If people expect their applications to crash or exit abnormally, they can work around this by either using ANCMv1 or enabling stdout logging. Neither workaround is ideal.
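For the stdout logging workaround, ANCM logging is normally switched on through the `aspNetCore` element in the app's web.config. A minimal sketch, assuming an out-of-process app published as `MyApp.dll`; the app name and log path are placeholders, not values from this PR:

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.webServer>
    <handlers>
      <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
    </handlers>
    <!-- MyApp.dll and the log path are placeholders, not values from this PR. -->
    <aspNetCore processPath="dotnet"
                arguments=".\MyApp.dll"
                hostingModel="OutOfProcess"
                stdoutLogEnabled="true"
                stdoutLogFile=".\logs\stdout" />
  </system.webServer>
</configuration>
```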

cc @Pilchie

@jkotalik jkotalik changed the title from "Fix dotnet.exe process recovery when" to "Fix dotnet.exe process recovery after abnormal exit in ANCM" Nov 14, 2019
@analogrelay analogrelay added this to the 3.1.x milestone Nov 16, 2019
@analogrelay
Contributor

Please apply servicing-consider when this is ready for merging.

@analogrelay analogrelay changed the title from "Fix dotnet.exe process recovery after abnormal exit in ANCM" to "[3.1.x servicing] Fix dotnet.exe process recovery after abnormal exit in ANCM" Nov 16, 2019
@Tratcher
Member

We'll also patch this in the 3.0 branch/sdk?

@jkotalik
Contributor Author

@Tratcher from what I heard, no.

for (var i = 0; i < 10; i++)
{
    // ANCM should eventually recover from being shut down multiple times.
    response = await deploymentResult.HttpClient.GetAsync("/HelloWorld");
Member

Still not a fan 😆

Member

What's the problem here?

Contributor Author

The test is a bit... ugly.

Member

@halter73 halter73 left a comment

Out of curiosity, what's the other bool config that defaults to true?

@jkotalik
Contributor Author

Out of curiosity, what's the other bool config that defaults to true?

SetCurrentDirectory and CallStartupHook both default to true.

https://github.com/aspnet/AspNetCore/blob/3be11f6544485948ba17646ffa7b4242c2c5339a/src/Servers/IIS/AspNetCoreModuleV2/InProcessRequestHandler/InProcessOptions.cpp#L60-L61
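Options like these are supplied as handler settings in web.config. A sketch of overriding both defaults, assuming the keys are spelled `setCurrentDirectory` and `callStartupHook` (the comment above gives only the option names, so the exact key spelling is an assumption) and using a placeholder app name:

```xml
<aspNetCore processPath="dotnet" arguments=".\MyApp.dll" hostingModel="InProcess">
  <handlerSettings>
    <!-- Key names are assumed spellings; both options default to true. -->
    <handlerSetting name="setCurrentDirectory" value="false" />
    <handlerSetting name="callStartupHook" value="false" />
  </handlerSettings>
</aspNetCore>
```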

@analogrelay analogrelay added the Servicing-consider (Shiproom approval is required for the issue) label Nov 20, 2019
@analogrelay analogrelay changed the title from "[3.1.x servicing] Fix dotnet.exe process recovery after abnormal exit in ANCM" to "[3.1.x] Fix dotnet.exe process recovery after abnormal exit in ANCM" Nov 21, 2019
@leecow leecow added the Servicing-approved (Shiproom has approved the issue) label and removed the Servicing-consider (Shiproom approval is required for the issue) label Nov 21, 2019
@leecow leecow modified the milestones: 3.1.x, 3.1.1 Nov 21, 2019
@analogrelay
Contributor

Do we have to wait for branding or can we merge these now? I believe our deadline for 3.1.1 is today. @aspnet/build

@wtgodbe
Member

wtgodbe commented Nov 22, 2019

These can be merged

@analogrelay analogrelay merged commit 9fded4c into release/3.1 Nov 22, 2019
@analogrelay analogrelay deleted the jkotalik/fixRestartIssue branch November 22, 2019 23:50
@amcasey amcasey added the area-networking (Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions) label and removed the area-runtime label Jun 6, 2023
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions Servicing-approved Shiproom has approved the issue

9 participants