feat(router): add max_batch_size #1542

OlivierDehaene · 2024-02-08T16:01:47Z

Some hardware requires a maximum batch size.

HuggingFaceDocBuilderDev · 2024-02-08T16:15:54Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

dacorvo

LGTM, thanks!

dacorvo · 2024-02-08T16:35:30Z

launcher/src/main.rs

@@ -279,6 +279,11 @@ struct Args {
    #[clap(default_value = "20", long, env)]
    max_waiting_tokens: usize,

+    /// Enforce a maximum number of requests per batch
+    /// Specific flag for hardware targets that do not support unpadded inference


'unpadded inference' is a bit unclear to me, but I suppose it corresponds to other configurations.

Unpadded = Flash attention "varlen" (which means requests are stacked linearly without requiring padding tokens).

Padded:

input_ids = [
[3, 53, 63],
[0, 0, 234],
]

Unpadded:

input_ids = [3, 53, 63, 234] + [0, 3, 4] (cumulative sequence lengths)
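
To make the padded/unpadded distinction concrete, here is a minimal sketch that converts the padded batch above into the unpadded layout. It assumes left padding with `0` as the pad token (as in the example); the `unpad` helper and `PAD_TOKEN_ID` are illustrative names, not part of the TGI codebase.

```python
from itertools import accumulate

PAD_TOKEN_ID = 0  # assumption: 0 is the padding token, as in the example above


def unpad(padded_batch, pad_token_id=PAD_TOKEN_ID):
    """Flatten a left-padded batch into one token list plus cumulative lengths."""
    sequences = []
    for row in padded_batch:
        # Skip the leading padding tokens of this row.
        start = 0
        while start < len(row) and row[start] == pad_token_id:
            start += 1
        sequences.append(row[start:])

    # Stack all requests linearly, with no padding in between.
    flat = [tok for seq in sequences for tok in seq]
    # Offsets marking where each request starts and ends in `flat`.
    cu_seqlens = [0] + list(accumulate(len(seq) for seq in sequences))
    return flat, cu_seqlens


padded = [
    [3, 53, 63],
    [0, 0, 234],
]
input_ids, cu_seqlens = unpad(padded)
print(input_ids)   # [3, 53, 63, 234]
print(cu_seqlens)  # [0, 3, 4]
```

In the padded layout the tensor shape depends on the batch size, which is presumably why hardware limited to padded inference needs a fixed upper bound on requests per batch.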

Narsil

Looks much better than my PR

OlivierDehaene added 2 commits February 8, 2024 17:01

feat(router): add max_batch_size

faaa9df

update doc

9e042bd

OlivierDehaene added 2 commits February 8, 2024 17:26

use max_size in the batch task

2af011a

my b

55e29c9

dacorvo approved these changes Feb 8, 2024

Narsil approved these changes Feb 8, 2024

Narsil merged commit 5321463 into main Feb 9, 2024

Narsil deleted the feat/max_batch_size branch February 9, 2024 11:38

dacorvo mentioned this pull request Feb 9, 2024

Allow queueing in Neuron X TGI server beyond batch_size huggingface/optimum-neuron#473

Closed

kdamaszk pushed a commit to kdamaszk/tgi-gaudi that referenced this pull request Apr 29, 2024

feat(router): add max_batch_size (huggingface#1542)

518d30d

Some hardware requires a maximum batch size.
