
Make consumer-level prefetch-like limit configurable for cases with QoS prefetch > 200 #11822


Conversation

@Ayanda-D (Contributor) commented Jul 24, 2024

Proposed Changes

The rabbit_queue_consumers implementation has a hard-coded UNSENT_MESSAGE_LIMIT = 200, which can block the consumer channel (see the is_ch_blocked/1 predicate) as part of consumer credit-based flow control.

When a consumer channel operates at, say, a prefetch count >= 200, outbound deliveries can drive unsent_message_count to 200 or more before acks are received or consumer credit is processed, causing the channel to block further deliveries more frequently. This can also break the expected AMQP QoS semantics for such high or unlimited prefetch settings.
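For illustration, here is a simplified sketch (not the exact upstream code) of that check: each channel record in the queue tracks how many deliveries are in flight but not yet settled, and the queue stops delivering to a channel once the count crosses the hard-coded limit.

%% Simplified sketch of the flow-control check in rabbit_queue_consumers
%% (other per-channel fields elided).
-define(UNSENT_MESSAGE_LIMIT, 200).

-record(cr, {ch_pid,
             unsent_message_count = 0}).

%% The queue skips delivering to a channel whose unsettled-delivery
%% count has reached the hard-coded limit.
is_ch_blocked(#cr{unsent_message_count = Count}) ->
    Count >= ?UNSENT_MESSAGE_LIMIT.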

This PR introduces a rabbit.classic_queue_consumer_unsent_message_limit config which gives us flexibility over this hard limit, allowing it to be raised so that blocking of classic queue consumers kicks in later. We're keeping the default limit of 200 (happy to bump it up if necessary, e.g. to 1K).
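For example, with the Cuttlefish mapping proposed in this PR, the limit could be raised in rabbitmq.conf (the value below is illustrative, not a recommendation):

classic_queue_consumer_unsent_message_limit = 1000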

An extremely high setting can be risky for slow consumers; however, very fast consumers can benefit from a higher limit than the default 200.

The UNSENT_MESSAGE_LIMIT dates back to some really early releases of RabbitMQ (v1.4.0 and v2.8.0). Having control over this consumer flow-control limit, and increasing it, yields some interesting throughput improvements.

A couple of PerfTest runs (with some very high UNSENT_MESSAGE_LIMIT settings):

FIXED PREFETCH (up to ~250% throughput improvement)

1. UNSENT_MESSAGE_LIMIT = 200, PREFETCH = 1000

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2 --qos 1000
id: test-175008-398, sending rate avg: 147010 msg/s
id: test-175008-398, receiving rate avg: 2900 msg/s

2. UNSENT_MESSAGE_LIMIT = 10000, PREFETCH = 1000

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2 --qos 1000
id: test-175810-192, sending rate avg: 149568 msg/s
id: test-175810-192, receiving rate avg: 4972 msg/s

3. UNSENT_MESSAGE_LIMIT = 20000, PREFETCH = 1000

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2 --qos 1000
id: test-175315-616, sending rate avg: 125433 msg/s
id: test-175315-616, receiving rate avg: 6963 msg/s

4. UNSENT_MESSAGE_LIMIT = 50000, PREFETCH = 1000

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2 --qos 1000
id: test-175601-591, sending rate avg: 135194 msg/s
id: test-175601-591, receiving rate avg: 7394 msg/s

UNBOUNDED PREFETCH (up to ~530% throughput improvement)

1. UNSENT_MESSAGE_LIMIT = 200, PREFETCH = UNLIMITED

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5670 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2
id: test-110749-528, sending rate avg: 146539 msg/s
id: test-110749-528, receiving rate avg: 2819 msg/s

2. UNSENT_MESSAGE_LIMIT = 10000, PREFETCH = UNLIMITED

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2
id: test-114438-293, sending rate avg: 141731 msg/s
id: test-114438-293, receiving rate avg: 9579 msg/s

3. UNSENT_MESSAGE_LIMIT = 20000, PREFETCH = UNLIMITED

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2
id: test-114920-661, sending rate avg: 127584 msg/s
id: test-114920-661, receiving rate avg: 15327 msg/s

4. UNSENT_MESSAGE_LIMIT = 50000, PREFETCH = UNLIMITED

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2
id: test-115210-507, sending rate avg: 127504 msg/s
id: test-115210-507, receiving rate avg: 15222 msg/s

From the 20K to 50K settings, we notice the consumer receive rate/throughput reaching saturation (this could be of use for more high-performance use cases).

We're keen to have this feature in 3.12.x as well. Please take a look 👍

Types of Changes

What types of changes does your code introduce to this project?
Put an x in the boxes that apply

  • Bug fix (non-breaking change which fixes issue #NNNN)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause an observable behavior change in existing systems)
  • Documentation improvements (corrections, new content, etc)
  • Cosmetic change (whitespace, formatting, etc)
  • Build system and/or CI

Checklist

Put an x in the boxes that apply.
You can also fill these out after creating the PR.
If you're unsure about any of them, don't hesitate to ask on the mailing list.
We're here to help!
This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING.md document
  • I have signed the CA (see https://cla.pivotal.io/sign/rabbitmq)
  • I have added tests that prove my fix is effective or that my feature works
  • All tests pass locally with my changes
  • If relevant, I have added necessary documentation to https://github.com/rabbitmq/rabbitmq-website
  • If relevant, I have added this change to the first version(s) in release-notes that I expect to introduce it

Further Comments

@mergify mergify bot added the make label Jul 24, 2024
@michaelklishin (Collaborator)

This constant is used on the hot delivery path (as part of rabbit_queue_consumers:deliver_to_consumer/3, rabbit_queue_consumers:is_ch_blocked/2).

What are the effects of this PR with all defaults?

@michaelklishin (Collaborator)

3.12.x is out of community support so I'd not expect any more OSS RabbitMQ 3.12.x releases.

%%
%% {classic_queue_consumer_unsent_message_limit, 200},

{mapping, "classic_queue_consumer_unsent_message_limit", "rabbit.classic_queue_consumer_unsent_message_limit", [

@@ -22,7 +22,8 @@

 -define(QUEUE, lqueue).

--define(UNSENT_MESSAGE_LIMIT, 200).
+-define(UNSENT_MESSAGE_LIMIT,
+        application:get_env(rabbit, classic_queue_consumer_unsent_message_limit, 200)).
Collaborator:

Can this be initialized in the cr record state, or at least a persistent term? Calling application:get_env/3 for every batch of deliveries sounds substantially more expensive.

@Ayanda-D (Contributor, author):

It sure can. I actually tried initialising it in the cr state, but fetching the limit by pattern match per message delivery yielded slower throughput numbers vs lookups in ets via application:get_env/3. I'll do more samples. Didn't try a persistent term. I'll test all 3 and we can opt for the fastest approach 👍

@michaelklishin michaelklishin changed the title Configurable rabbit.classic_queue_consumer_unsent_message_limit to boost throughput Make consumer-level prefetch-like limit configurable for cases with QoS prefetch > 200 Jul 24, 2024
@Ayanda-D (Contributor, author) commented Jul 29, 2024

Ran some tests locally for unbounded prefetch (at least 3 samples for each approach, including the Process Dictionary), using:

bin/runjava com.rabbitmq.perf.PerfTest -h amqp://localhost:5672 -x 1 -X 1 -y 1 -Y 1 -s 1000 -z 60 -mp delivery-mode=2

From the results below, we can store/retrieve the UNSENT-MESSAGE-LIMIT using either application config or a persistent term. Application config gives the highest throughput samples for larger UNSENT-MESSAGE-LIMIT values (it recorded the highest numbers). Persistent term, on the other hand, gives higher performance at the lower/default UNSENT-MESSAGE-LIMIT of 200 (which is likely to be the most commonly used setting). At larger configs, persistent term is not as fast as application config, but is still fast enough to give very high throughput.

The Channel Record #cr{ } gives very high throughput as well, but it seems to suffer on consistency/stability across multiple samples. I think this is down to the fact that the UNSENT-MESSAGE-LIMIT is being retrieved from the queue's (Erlang) process memory. The Process Dictionary performs very well, but gave low throughput samples at the default UNSENT-MESSAGE-LIMIT, and there are cases where samples aren't stable. Similar to #cr{ }, constant high-rate retrieval of the limit from the queue's (Erlang) process memory/heap can be impacted by the queue being busy with all other AMQP and internal housekeeping operations like GC. Isolating the UNSENT-MESSAGE-LIMIT storage and retrieval outside the queue's memory space, i.e. in application config/ets or a persistent term, is the way to go.

All approaches perform reasonably well. In this case, we can go for the persistent term, mainly due to its generally acceptable high throughput (despite not giving the highest samples) and its higher throughput at the lower/default UNSENT-MESSAGE-LIMIT config. The only downside of application config is lower throughput at the default UNSENT-MESSAGE-LIMIT of 200; however, it performs very well as the limit increases.
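For context, the four storage/lookup styles compared below look roughly like this (a hedged sketch with simplified names, not the PR's exact code):

%% 1. Application config: an ets lookup per call.
limit_from_app_env() ->
    application:get_env(rabbit, classic_queue_consumer_unsent_message_limit, 200).

%% 2. Persistent term: a constant-time global read; small integers
%%    need no copying onto the process heap.
limit_from_persistent_term() ->
    persistent_term:get(unsent_message_limit, 200).

%% 3. Channel record #cr{}: the limit is captured in the queue process
%%    state and pattern-matched out on every delivery (this extra
%%    field is hypothetical).
-record(cr, {unsent_message_count = 0,
             unsent_message_limit = 200}).

is_ch_blocked_via_cr(#cr{unsent_message_count = Count,
                         unsent_message_limit = Limit}) ->
    Count >= Limit.

%% 4. Process dictionary: a per-process slot inside the queue process.
limit_from_pdict() ->
    case get(unsent_message_limit) of
        undefined -> 200;
        Limit     -> Limit
    end.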

UNSENT-MESSAGE-LIMIT = 200

  • Application Config: moderate throughput, stable performance.
  • Persistent Term: high throughput, moderately stable performance.
  • Channel Record: very high throughput (highest sample), but not stable/consistent.
  • Process Dictionary: moderate throughput, moderately stable performance.

1. APPLICATION CONFIG

sending rate avg: 146539 msg/s
receiving rate avg: 2819 msg/s
sending rate avg: 136161 msg/s
receiving rate avg: 2677 msg/s
sending rate avg: 126546 msg/s
receiving rate avg: 2494 msg/s

2. PERSISTENT TERM

sending rate avg: 113543 msg/s
receiving rate avg: 6429 msg/s
sending rate avg: 99878 msg/s
receiving rate avg: 7121 msg/s
sending rate avg: 68435 msg/s
receiving rate avg: 8936 msg/s

3. CHANNEL RECORD / #cr{}

sending rate avg: 104098 msg/s
receiving rate avg: 6509 msg/s
sending rate avg: 126786 msg/s
receiving rate avg: 2538 msg/s
sending rate avg: 111604 msg/s
receiving rate avg: 2579 msg/s
sending rate avg: 38348 msg/s
receiving rate avg: 10947 msg/s
sending rate avg: 96787 msg/s
receiving rate avg: 8014 msg/s
sending rate avg: 68210 msg/s
receiving rate avg: 9364 msg/s

4. PROCESS DICTIONARY

sending rate avg: 144480 msg/s
receiving rate avg: 3760 msg/s
sending rate avg: 128208 msg/s
receiving rate avg: 2349 msg/s
sending rate avg: 109908 msg/s
receiving rate avg: 2167 msg/s

UNSENT-MESSAGE-LIMIT = 10000

  • Application Config: high throughput, not the most stable/consistent.
  • Persistent Term: high throughput (recorded the highest sample), moderately stable performance.
  • Channel Record: high throughput, stable performance.
  • Process Dictionary: high throughput, moderately stable performance.

1. APPLICATION CONFIG

sending rate avg: 141731 msg/s
receiving rate avg: 9579 msg/s

sending rate avg: 142855 msg/s
receiving rate avg: 12831 msg/s
sending rate avg: 81555 msg/s
receiving rate avg: 17983 msg/s

2. PERSISTENT TERM

sending rate avg: 107206 msg/s
receiving rate avg: 16450 msg/s

sending rate avg: 61521 msg/s
receiving rate avg: 22623 msg/s
sending rate avg: 102850 msg/s
receiving rate avg: 13559 msg/s

3. CHANNEL RECORD / #cr{}

sending rate avg: 120883 msg/s
receiving rate avg: 13749 msg/s
sending rate avg: 80352 msg/s
receiving rate avg: 14966 msg/s

sending rate avg: 95623 msg/s
receiving rate avg: 12914 msg/s
sending rate avg: 123071 msg/s
receiving rate avg: 12872 msg/s

4. PROCESS DICTIONARY

sending rate avg: 125005 msg/s
receiving rate avg: 12538 msg/s

sending rate avg: 140233 msg/s
receiving rate avg: 10842 msg/s

sending rate avg: 110047 msg/s
receiving rate avg: 12266 msg/s

UNSENT-MESSAGE-LIMIT = 20000

  • Application Config: very high throughput (recorded the highest sample), moderate stability.
  • Persistent Term: high throughput, however not the most stable performance.
  • Channel Record: high throughput, with some low samples as well; performance not very stable.
  • Process Dictionary: very high throughput, good stable performance.

1. APPLICATION CONFIG

sending rate avg: 127584 msg/s
receiving rate avg: 15327 msg/s
sending rate avg: 135996 msg/s
receiving rate avg: 14375 msg/s
sending rate avg: 91994 msg/s
receiving rate avg: 18480 msg/s

2. PERSISTENT TERM

sending rate avg: 135969 msg/s
receiving rate avg: 11717 msg/s

sending rate avg: 137299 msg/s
receiving rate avg: 14022 msg/s

sending rate avg: 101505 msg/s
receiving rate avg: 12107 msg/s

3. CHANNEL RECORD / #cr{}

sending rate avg: 133265 msg/s
receiving rate avg: 11263 msg/s
sending rate avg: 107598 msg/s
receiving rate avg: 12555 msg/s

sending rate avg: 106408 msg/s
receiving rate avg: 5639 msg/s
sending rate avg: 109276 msg/s
receiving rate avg: 15116 msg/s
sending rate avg: 127528 msg/s
receiving rate avg: 12795 msg/s

4. PROCESS DICTIONARY

sending rate avg: 119738 msg/s
receiving rate avg: 14550 msg/s
sending rate avg: 81773 msg/s
receiving rate avg: 17613 msg/s
sending rate avg: 105823 msg/s
receiving rate avg: 15725 msg/s

UNSENT-MESSAGE-LIMIT = 50000

  • Application Config: very high throughput (recorded the highest sample), moderate stability.
  • Persistent Term: high throughput, moderate stability.
  • Channel Record: very high throughput, with some low samples as well; performance not very stable.
  • Process Dictionary: very high throughput, good stable performance.

1. APPLICATION CONFIG

sending rate avg: 127504 msg/s
receiving rate avg: 15222 msg/s
sending rate avg: 136726 msg/s
receiving rate avg: 17801 msg/s
sending rate avg: 136803 msg/s
receiving rate avg: 14149 msg/s

2. PERSISTENT TERM

sending rate avg: 111343 msg/s
receiving rate avg: 16649 msg/s

sending rate avg: 79634 msg/s
receiving rate avg: 14038 msg/s

sending rate avg: 119138 msg/s
receiving rate avg: 16874 msg/s

3. CHANNEL RECORD / #cr{}

sending rate avg: 142837 msg/s
receiving rate avg: 14381 msg/s
sending rate avg: 139968 msg/s
receiving rate avg: 15294 msg/s
sending rate avg: 116568 msg/s
receiving rate avg: 6523 msg/s

sending rate avg: 108930 msg/s
receiving rate avg: 4630 msg/s
sending rate avg: 93271 msg/s
receiving rate avg: 3978 msg/s
sending rate avg: 57414 msg/s
receiving rate avg: 17723 msg/s

4. PROCESS DICTIONARY

sending rate avg: 140930 msg/s
receiving rate avg: 16686 msg/s

sending rate avg: 123495 msg/s
receiving rate avg: 17229 msg/s
sending rate avg: 146543 msg/s
receiving rate avg: 14492 msg/s

UNSENT-MESSAGE-LIMIT = 100000

  • Application Config: very high throughput (recorded the highest sample across all tests), moderate stability.
  • Persistent Term: very high throughput, very stable performance.
  • Channel Record: very high throughput, stable performance.
  • Process Dictionary: very high throughput, performance not very stable.

1. APPLICATION CONFIG

sending rate avg: 143485 msg/s
receiving rate avg: 16181 msg/s
sending rate avg: 80453 msg/s
receiving rate avg: 29124 msg/s
sending rate avg: 94802 msg/s
receiving rate avg: 17727 msg/s

2. PERSISTENT TERM

sending rate avg: 98352 msg/s
receiving rate avg: 22125 msg/s

sending rate avg: 100332 msg/s
receiving rate avg: 19078 msg/s
sending rate avg: 125238 msg/s
receiving rate avg: 18219 msg/s

3. CHANNEL RECORD / #cr{}

sending rate avg: 111196 msg/s
receiving rate avg: 16868 msg/s

sending rate avg: 97504 msg/s
receiving rate avg: 22506 msg/s

sending rate avg: 103266 msg/s
receiving rate avg: 18330 msg/s

sending rate avg: 90818 msg/s
receiving rate avg: 20880 msg/s

sending rate avg: 100710 msg/s
receiving rate avg: 19311 msg/s

4. PROCESS DICTIONARY

sending rate avg: 89712 msg/s
receiving rate avg: 21237 msg/s

sending rate avg: 89020 msg/s
receiving rate avg: 22835 msg/s
sending rate avg: 113971 msg/s
receiving rate avg: 15453 msg/s

@michaelklishin (Collaborator)

So persistent term and the cr record ended up being optimal. That's good news.

@michaelklishin (Collaborator) left a review:

I'm happy with this implementation.

It does not regress on the default path, which 99.99% of users will start with.

With the limit of 500, I observe no average publishing rate degradation and very comparable variability, and the consumer rate goes up ≈ 7.6%.

With the limit of 1000, the publisher metrics are identical and the consumer throughput is 16.2% higher.

With higher limits (including a higher consumer/PerfTest QoS limit, of course) I do not observe any meaningful improvements, but the variability of results rises significantly.

But I guess it is a useful configurable value to have.

@ansd (Member) left a review:

Making this value configurable is a good idea. See my inline comments.

Comment on lines 171 to 172
persistent_term:put(unsent_message_limit,
application:get_env(rabbit, classic_queue_consumer_unsent_message_limit, 200)),
@ansd (Member) commented Jul 29, 2024:

Not very important, but would it be better to place this into rabbit_queue_consumers:new/0 instead because that's easier to reason about and called less often?

@ansd (Member):

Can we define the default of 200 in only one place?
Either here or in the Makefile.
I know that other RabbitMQ settings also define a default value twice (once in the Makefile and once in the code, as done here), but I never understood why that's done.
FWIW, I prefer this setting not to be exposed in rabbitmq.conf and only to be set in advanced.config. This way we can get rid of the Makefile changes entirely, including the Makefile default setting.

@ansd (Member):

I think that commonly set configuration goes into rabbitmq.conf, which requires writing Cuttlefish translations.
Rarely used or advanced config goes - as the name suggests - into the advanced.config file.
This setting, given that it has never been configurable, is for me a rare / advanced configuration.
Other internal credit flow settings such as msg_store_credit_disc_bound or credit_flow_default_credit are not exposed in rabbitmq.conf either.
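For reference, such internal settings are set as raw Erlang terms in advanced.config; a hedged example, with the values shown purely for illustration:

[
  {rabbit, [
    {credit_flow_default_credit, {400, 200}},
    {msg_store_credit_disc_bound, {4000, 800}}
  ]}
].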

@Ayanda-D (Contributor, author):

@ansd using rabbit_queue_consumers:new/0 won't work here. We want this to be dynamic and configurable in real time, when new consumers attach to a queue, which is why we're adding it to rabbit_queue_consumers:add/9. This suggestion breaks the purpose of the change - we want the capability to "accelerate" consumers on existing queues.

@Ayanda-D (Contributor, author):

@ansd the credit-flow configs use advanced.config solely because they're configured as native Erlang terms/tuples. This config is just a simple integer, hence it should be set in rabbitmq.conf, which is much easier for operators to update/manage. I'm not sure I see the value/advantage of making this unnecessarily more complex for users to configure?

@ansd (Member):

> We want this to be dynamic and configurable in real time, when new consumers attach to a queue.

When exactly would you increase or decrease this value dynamically at runtime, globally for all consumers running on this node?

ansd added a commit that referenced this pull request Jul 29, 2024
Similar to other RabbitMQ internal credit flow configurations such as
`credit_flow_default_credit` and `msg_store_credit_disc_bound`, this
commit makes the `classic_queue_consumer_unsent_message_limit`
configurable via `advanced.config`.

See #11822 for the
original motivation to make this setting configurable.
@ansd (Member) commented Jul 29, 2024

#11857 implements an alternative to this PR, making this value configurable via advanced.config.

@Ayanda-D (Contributor, author)

@ansd this #11857 adds unnecessary complexity. Let's keep this config very simple, please. Managing the advanced.config file and its complexity is definitely not the greatest experience for most operators. Check the docs on why it exists: https://www.rabbitmq.com/docs/configure#advanced-config-file

@mergify mergify bot added the bazel label Jul 29, 2024
@michaelklishin (Collaborator)

@Ayanda-D 99.99% of operators will not use this setting in the first few months after it ships, and a high 90-something % of operators will never use this setting because their applications do not set prefetch to values in the thousands, let alone to something like 30K or 50K.

This is not a common user setting.

We know well why advanced.config exists. We also know what is not exposed in rabbitmq.conf: most settings that hardly anyone ever uses.

@michaelklishin (Collaborator)

I am comparing this PR without the rabbitmq.conf setting to #11857.

@michaelklishin (Collaborator)

#11857 was accepted and will be backported to v3.13.x.

mergify bot pushed a commit that referenced this pull request Jul 30, 2024
(same commit message as above; cherry picked from commit c771b24)

mergify bot pushed a commit that referenced this pull request Jul 30, 2024
(same commit message as above; cherry picked from commits c771b24 and eee851b, with conflicts in deps/rabbit/src/rabbit_queue_consumers.erl)

ansd added a commit that referenced this pull request Jul 30, 2024
(same commit message as above; cherry picked from commits c771b24 and eee851b, with conflicts in deps/rabbit/src/rabbit_queue_consumers.erl)
@Ayanda-D (Contributor, author)

A prefetch of 0 is a very common use case. Please read your documentation - https://www.rabbitmq.com/docs/consumer-prefetch#single-consumer - this setting is at the consumer level; there is no need to restart the queue to change the performance of consumers. So this configuration should maintain the same/similar semantics - it should only affect consumers.

And just because a config is configurable in rabbitmq.conf doesn't mean it's exposed to users. Exposure comes from adding it to the docs - you don't "hide" config by adding it to advanced.config 😄 just leave out the documentation if you want to take this approach of "hiding" performance config from community users. (We don't even agree with this approach of "hiding" config from users - please expose as much config as possible, document it well, and let users decide for themselves what fits their use case. You might be hiding configs that can solve many problems - such as this unsent message limit blocker.)

What you're now doing in #11857 is taking a consumer optimization config and making it tightly coupled to the queue. So queues have to be terminated and restarted for a consumer config/optimization to take effect? That completely doesn't make sense. Queues can be, and usually are, long-running entities - having to restart queues to apply this configuration just doesn't make sense and breaks the way consumers/queues are meant to be optimized.

Like QoS, which can be changed at any time at runtime, you only need to restart a consumer channel to apply a new QoS value. The same should hold for this config. Restarting a queue for a consumer-level config to take effect is probably one of the most incorrect things I've seen being done in Rabbit. Restarting a queue to improve consumer QoS? Very wrong thing to do. That's what #11857 introduces.

& FYI, the follow-up to this work is a CLI command, set_classic_queue_consumer_unsent_message_limit_command <limit>, to speed up new consumers on queues at runtime without the need to kill and restart queues.

Luckily, we maintain our own releases - we don't have to conform to some of these wrong code changes that break normal operational flow. We'll do the right thing on our side 👍

@ansd (Member) commented Jul 30, 2024

@Ayanda-D you didn't answer #11822 (comment). You weren't able to explain when exactly you would increase or decrease this value at runtime globally for all consumers running on the given node. We will never add features to RabbitMQ without a clear and well understood use case.

@lhoguin (Contributor) commented Jul 30, 2024

You can easily update the value via rabbitmqctl eval by running:

persistent_term:put(classic_queue_consumer_unsent_message_limit, NewVal).

It is not tied to the queue at all.
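From a shell against a running node, that one-liner would look like this (the limit value is illustrative):

rabbitmqctl eval 'persistent_term:put(classic_queue_consumer_unsent_message_limit, 1000).'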

There's no point in adding a CLI command, but feel free to do so in your fork.

Note that some time after 4.1 I will go through this module to make it work better for CQv2, potentially making this configuration option obsolete.

@Ayanda-D (Contributor, author)

@ansd I've just answered that in my last comment; I thought you'd see the answer without me needing to repeat myself - please go into the CLI repo and check the set_* commands for updating all sorts of config and limits at runtime:

deps/rabbitmq_cli/lib/rabbitmq/cli/ctl/commands/set_*_command.ex

Even by simply running:

rabbitmqctl eval "application:set_env(rabbit, classic_queue_consumer_unsent_message_limit,<LIMIT>)." on a running node.

This is a very normal thing to do.

@michaelklishin (Collaborator) commented Jul 30, 2024

A CLI command and a rabbitmq.conf setting alike can be added from a plugin.

@Ayanda-D (Contributor, author)

@lhoguin yes, but that's too many moving parts - if we update the persistent term directly, then what we have in the active application config differs from what's in the persistent term. I did this deliberately (https://github.com/rabbitmq/rabbitmq-server/pull/11822/files#diff-31847e286e97c82d99ea03de141a3e10abde94e6d3f13e92d7f9f3c8808499b9R171-R172) on every basic.consume.

If a queue restarts, or a new queue is added, it overrides whatever you had updated directly with persistent_term:put/2.
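A sketch of that behaviour (the function name here is hypothetical; in the PR this happens inline in rabbit_queue_consumers:add/9):

%% Runs on every basic.consume, so an ad-hoc persistent_term:put/2 is
%% overwritten the next time a consumer attaches:
refresh_unsent_message_limit() ->
    persistent_term:put(unsent_message_limit,
                        application:get_env(rabbit,
                                            classic_queue_consumer_unsent_message_limit,
                                            200)).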

@michaelklishin (Collaborator) commented Jul 30, 2024

@Ayanda-D you completely ignore the fact that 99% of RabbitMQ users likely won't do things like "set QoS to 30K so that we must override an internal constant related to classic queues, and then update it at runtime."

RabbitMQ remains open source software in a world where other projects have moved to the Business Source License or ask users to pay for their cloud option. That does not mean that we will accept every single change, but it surely is worth something.

@rabbitmq rabbitmq locked as resolved and limited conversation to collaborators Jul 30, 2024
@lhoguin (Contributor) commented Jul 30, 2024

> if we update persistent_term directly, then what we have in active application config differs from what's in persistent term

Yes and that's fine, you shouldn't update this value in production. It's a band-aid that does some good now but is likely to be removed in the future after the next set of refactorings of classic queues.

@michaelklishin (Collaborator) commented Jul 30, 2024

And anyone who needs a substantial throughput boost should switch from classic queues to streams and/or superstreams. That'd be an order of magnitude change or more with an easier path to multiplying that throughput by growing cluster size.

Which, by the way, were developed and made available for free to everyone by the RabbitMQ Core team.
