add thread control for pytorch backend #125

Merged
merged 2 commits on Apr 18, 2024

Conversation

yongbinfeng
Contributor

As noted in the issue here: triton-inference-server/server#6896, we have found that the number of threads can significantly affect PyTorch inference performance. In some cases, PyTorch inference runs extremely slowly on multi-core CPU machines, and adjusting the instance count alone is not enough to solve the problem. We have tested at::set_num_threads(1) and confirmed that it fixes the slow-inference issue.
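For reference, the thread controls discussed here can be exercised from Python with the standard PyTorch API; a minimal sketch (the C++ equivalents in libtorch are at::set_num_threads and at::set_num_interop_threads):

```python
import torch

# Limit inter-op parallelism (threads used to run independent ops concurrently).
# This must be called before any inter-op parallel work has started.
torch.set_num_interop_threads(1)

# Limit intra-op parallelism (threads used within a single op, e.g. a matmul).
torch.set_num_threads(1)

print(torch.get_num_threads(), torch.get_num_interop_threads())
```

On oversubscribed multi-core machines, pinning these to small values often removes the thread-contention overhead described above.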

This PR makes intra_op_thread_count and inter_op_thread_count configurable for PyTorch models, in line with other backends such as TF and ONNX, using syntax such as:

parameters { key: "INTRA_OP_THREAD_COUNT" value: { string_value: "1" } }
parameters { key: "INTER_OP_THREAD_COUNT" value: { string_value: "1" } }
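For context, these parameters sit at the top level of a model's config.pbtxt; a hypothetical complete configuration might look like the following (the model name, input/output names, and dimensions are illustrative, not from this PR):

```
name: "my_pytorch_model"          # illustrative model name
backend: "pytorch"
max_batch_size: 8
input [
  { name: "INPUT__0", data_type: TYPE_FP32, dims: [ 16 ] }
]
output [
  { name: "OUTPUT__0", data_type: TYPE_FP32, dims: [ 4 ] }
]
parameters { key: "INTRA_OP_THREAD_COUNT" value: { string_value: "1" } }
parameters { key: "INTER_OP_THREAD_COUNT" value: { string_value: "1" } }
```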

@Pascualex

Thank you for this. We are experiencing the same problem and have been forced to convert our models to ONNX, which in turn is causing other issues.

I'm not a maintainer so I can only validate that this is a real issue for us too.

@tanmayv25 tanmayv25 self-assigned this Apr 15, 2024
@tanmayv25
Contributor

@yongbinfeng Can you submit Triton CLA?

@yongbinfeng
Contributor Author

Triton CLA

I think I've already done that, through my affiliation (Fermilab) and my affiliation email. (My other PR, #120, has already been merged, so hopefully it should be fine?)

@tanmayv25 tanmayv25 self-requested a review April 18, 2024 21:29
@tanmayv25 tanmayv25 merged commit c50d65b into triton-inference-server:main Apr 18, 2024
@tanmayv25
Contributor

Thanks for your contribution!
