Add support for batch_input #98
Conversation
Thanks for making this contribution, Markus! This looks like a great change. Before we can start review, can you please submit your Contributor License Agreement (directions here)? Also, we'll need to add tests to validate this change. If you're able to update the L0_batch_input test to ensure this passes, that would help with getting this in. The models used in those tests are generated here and here, I believe.
Left a minor comment; see also @dyastremsky's points on the signed CLA and testing. Otherwise looks good to me.
src/libtorch.cc
Outdated
  batch_input_count_ = config_batch_inputs.ArraySize();
  expected_input_cnt += batch_input_count_;
} else {
  batch_input_count_ = 0;
You can initialize to 0 in the constructor member init list
I added an initialisation at the declaration now, let me know if that's okay.
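For reference, a minimal sketch of the two styles being discussed; the member name follows the diff above, and the surrounding class is illustrative only:

#include <cstddef>

class ModelState {
 public:
  // Reviewer's suggestion: zero the counter in the constructor's
  // member-initializer list.
  ModelState() : batch_input_count_(0) {}

 private:
  // What this PR does instead: a default initializer at the declaration,
  // which keeps the member zeroed even if a constructor omits it.
  size_t batch_input_count_ = 0;
};

Either form makes the explicit else branch in the diff above unnecessary.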
@HennerM Are you able to fill out and send in the CLA?
Sorry for the delay; we sent the CLA through for Speechmatics to [email protected] on the 4th of May.
Thanks Markus, received! Will start working on the tests to get this merged.
force-pushed from 4a7753d to 5506203
Appreciated. I had a look at the test repo but couldn't quite figure out how it all fits together.
Hi Henner, keeping you updated. You can see the updated test here: triton-inference-server/server#5855. You can generate the models from server/qa/common. The batch_input support is failing, so I will try to debug it to see what needs changing. When the batch input is made on the CPU, PyTorch complains because it is on a different device than the other inputs. When it is modified to be made on the GPU, it segfaults.
Thank you for this PR, @HennerM! This works and passes testing if you modify the PR to create the batch inputs on the current device rather than the CPU. As an example, please see these changes. That PR also has some extra differences due to running the auto-formatter on libtorch.cc. The main differences are here, here, and here. If you're able to make these changes, we can get this pull request merged. Let me know if you have any questions or comments.
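For anyone following along, a rough sketch of the kind of change being requested; the function name, dtype, and arguments are placeholders, and it assumes the buffer Triton provides for the batch input was allocated in the target device's memory:

#include <torch/torch.h>
#include <vector>

// Wrap the buffer Triton prepared for this batch input. Passing the
// instance's assigned device in the tensor options (instead of the
// default CPU device) keeps the batch input on the same device as the
// other model inputs, avoiding the mixed-device error mentioned above.
torch::Tensor
MakeBatchInputTensor(
    void* buffer, const std::vector<int64_t>& shape,
    const torch::Device& device)
{
  auto options = torch::TensorOptions().dtype(torch::kInt32).device(device);
  return torch::from_blob(buffer, shape, options);
}

The linked changes may differ in detail; this only illustrates creating the tensor on the current device rather than on the CPU.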
force-pushed from bc38192 to 625dee4
force-pushed from 625dee4 to 8c2bcc8
I changed the batch input to create the tensor on the Triton-assigned device and ran clang-format over the .cc files. I got some formatting changes that are unrelated to my changes, though; I hope that's okay?
Yep, that's perfect! Thank you for making the changes. Let me re-run CI quickly; then I should be able to approve and merge once it passes.
Great work! This pull request is now merged.
The backend was missing support for batch_input inputs; specifically, no batch_input was passed to the model. I added the ability to pass these inputs through to the model, the same way normal inputs are passed through.
I followed the implementation in the onnxruntime_backend and kept to passing the batch_input data as a Tensor on the CPU.
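For context, this is roughly what enabling the feature looks like on the model-config side; the kind, names, and types below are illustrative, so check the Triton ragged-batching documentation for the exact options supported:

batch_input [
  {
    kind: BATCH_ACCUMULATED_ELEMENT_COUNT
    target_name: "INDEX"
    data_type: TYPE_FP32
    source_input: "RAGGED_INPUT"
  }
]

With this change, the backend forwards the generated INDEX tensor to the TorchScript model alongside the regular inputs.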