-
Notifications
You must be signed in to change notification settings - Fork 605
Fix flaky test #771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flaky test #771
Conversation
dbba130
to
957de05
Compare
Sometimes waiting 2 seconds is not long enough: https://ci.rabbitmq.com/teams/main/pipelines/dotnet/jobs/test/builds/240 refactor Fix WorkPool to execute all jobs in the Work queue in a loop iteration
957de05
to
0b673f0
Compare
@danielmarbach you may find this fix interesting. The failure here... https://ci.appveyor.com/project/rabbitmq/rabbitmq-dotnet-client/builds/31667700 Is due to the |
My original code always had a while c393768#diff-4e86f90f6b775db8f50d55f854995db9R82 To take out all enqueued stuff once spinning because that was the behavior that the sync version had. Could you have missed thah when reviewing stebets PR? |
Pretty sure it is broken without the while because it would always miss one. |
@danielmarbach does the version in this PR look correct to your? |
Give me some time and I'll send a PR-------- Ursprüngliche Nachricht --------Von: Michael Klishin <[email protected]>Datum: Di., 24. März 2020, 21:42An: rabbitmq/rabbitmq-dotnet-client <[email protected]>Cc: Daniel Marbach <[email protected]>, Mention <[email protected]>Betreff: Re: [rabbitmq/rabbitmq-dotnet-client] Fix flaky test (#771)
@danielmarbach does the version in this PR look correct to your?
—You are receiving this because you were mentioned.Reply to this email directly, view it on GitHub, or unsubscribe.
|
@michaelklishin @lukebakken I have one question before I send in the PR that bothers me for a long time. The previous loop implementation did actually block on stop until the loop was properly stopped. Today's service implementation immediately stops and returns eventhough there might still be work going on or in the queue. Is this the behavior you guys wanted really by accepting stebet's PR? |
@danielmarbach it would be nice to stop the service cleanly after processing all pending operations. I assume we stop the service e.g. when the connection or channel are closed, so, not on the hot path? |
The current implementation no longer does cleanly shutdown and that was also backported to 5.2. So another bug?
|
@danielmarbach sounds like it. Such things are easy to miss since shutdown is not seen as an "interesting" or "difficult" problem by most developers :) |
How about #772 as a first step. Also aligns sync consumer service a bit |
@michaelklishin @lukebakken It seems this statement wasn't sufficiently reviewed |
In order to fix it we would need to reintroduce a GetAwaiter().GetResult() call that is called within What do you guys want? |
@danielmarbach, in that case, we might choose to go with the current behavior. The reasoning is that app developers can achieve a clean shutdown in practice e.g. by introducing a delay before models are closed. Unfortunately, there's usually no such path to recover from a deadlock. What would be the conditions for the deadlock? How likely do you think they are? |
It is really only a problem in environment with a SynchronizationContext. So any usage of the model in WinForms, WPF for example got on shutdown hang if the underlying code doesn't opt-out from context capturing. This risk could be largely mitigated by having 4.x and 5.1.x already has the GetAwaiter in place btw. https://github.com/rabbitmq/rabbitmq-dotnet-client/blob/v5.1.2/projects/client/RabbitMQ.Client/src/client/impl/AsyncConsumerDispatcher.cs#L29 So it would be bringing back what we had before in terms of shutdown behavior that was there before stebet's changes |
@danielmarbach OK, those changes are probably worth the risk as most users we see do not use WinForms or WPF. |
@danielmarbach as for missing items in previous reviews, I'm basically trusting the statements made by you, @bording and @stebet as you all have more recent experience with this library. Debugging tis issue has increased my understanding, however 😄 |
Fair enough. FYI I mostly spelunking in this code base only in my free time and don't spend enough time on it that I would trust myself. Sometimes when I get pinged on this repo about a specific question I only look at the PR from the angle of the question and not further to be mindful about my time and that can cause me to not see things. I'm happy to help where I can from time to time but I'm definitely not trustworthy 😁😊 |
Sent in #781 |
For whatever reason
TestBasicCancelNoWait
has been flaky. Trying to figure out why.