-
Notifications
You must be signed in to change notification settings - Fork 363
WIP: (//core): Added device meta data serialization/deserialization implic… #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@andi4191 Any updates on the comments. Also there was some changes in the execution phase to support the non deterministic binding order. You will probably need to rebase on master |
819f6d6
to
9c70a39
Compare
All existing tests pass on x86 and Jetson for me |
afb0de7
to
1d3529a
Compare
Are the Python API changes for device supposed to be in this PR? |
This PR has not seen activity for 30 days, Remove stale label or comment or this will be closed in 5 days |
1ea0b45
to
3df301e
Compare
core/runtime/TRTEngine.cpp
Outdated
{ | ||
cuda_device = deserialize_device(serialized_device_info); | ||
// Set CUDA device as configured in serialized meta data | ||
set_cuda_device(cuda_device); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we store the deserialized device somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't save the deserialized device information anywhere. Do you want to perform file write in runtime?
755a55e
to
5069368
Compare
@@ -174,7 +183,7 @@ torch::jit::script::Module CompileGraph(const torch::jit::script::Module& mod, C | |||
} | |||
|
|||
void set_device(const int gpu_id) { | |||
TRTORCH_ASSERT(cudaSetDevice(gpu_id) == cudaSuccess, "Unable to set CUDA device: " << gpu_id); | |||
TRTORCH_CHECK((cudaSetDevice(gpu_id) == cudaSuccess), "Unable to set CUDA device: " << gpu_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we call runtime::set_cuda_device here just to centralize responsibility for device management in the runtime section?
core/runtime/TRTEngine.cpp
Outdated
CudaDevice cuda_device; | ||
// Deserialize device meta data if device_info is non-empty | ||
if (!serialized_device_info.empty()) { | ||
cuda_device = deserialize_device(serialized_device_info); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the device should be maintained as a field in TRTEngine
instead of querying set device before serialization
core/runtime/runtime.h
Outdated
@@ -19,14 +40,16 @@ struct TRTEngine : torch::CustomClassHolder { | |||
std::pair<uint64_t, uint64_t> num_io; | |||
EngineID id; | |||
std::string name; | |||
CudaDevice device_info; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we use this but we should
core/runtime/TRTEngine.cpp
Outdated
: logger( | ||
std::string("[") + mod_name + std::string("_engine] - "), | ||
util::logging::get_logger().get_reportable_severity(), | ||
util::logging::get_logger().get_is_colored_output_on()) { | ||
CudaDevice cuda_device; | ||
// Deserialize device meta data if device_info is non-empty | ||
if (!serialized_device_info.empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems potentially unsafe. What happens if we provide a DLA engine but no serialized_device_info?
core/runtime/TRTEngine.cpp
Outdated
TRTEngine::TRTEngine( | ||
std::string mod_name, | ||
std::string serialized_engine, | ||
std::string serialized_device_info = std::string()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we take a CudaDevice struct as the arg here instead?
[](const c10::intrusive_ptr<TRTEngine>& self) -> std::string { | ||
auto serialized_engine = self->cuda_engine->serialize(); | ||
return std::string((const char*)serialized_engine->data(), serialized_engine->size()); | ||
[](const c10::intrusive_ptr<TRTEngine>& self) -> std::vector<std::string> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard would it be to use a map instead of a vector? I think a map will be a more scalable solution long term for metadata but if its too hard its not worth
core/runtime/TRTEngine.cpp
Outdated
util::logging::get_logger().get_reportable_severity(), | ||
util::logging::get_logger().get_is_colored_output_on()) { | ||
std::string _name = "deserialized_trt"; | ||
std::string device_info = serialized_info[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need size or key checking here.
Do you think we need to add a section to the main execution function that sets device on the fly? or does it need to be done at de-serialization time only? What happens if there is one engine on GPU 1 and one engine on GPU 2? |
Do we need to do something in the runtime that moves input tensors around to the right GPUs for execution? |
This PR has not seen activity for 30 days, Remove stale label or comment or this will be closed in 5 days |
5069368
to
fa942e7
Compare
1c2cf90
to
00c1dd3
Compare
Signed-off-by: TRTorch Github Bot <[email protected]>
Signed-off-by: inocsin <[email protected]>
Signed-off-by: inocsin <[email protected]>
Signed-off-by: inocsin <[email protected]>
Signed-off-by: inocsin <[email protected]>
Signed-off-by: TRTorch Github Bot <[email protected]>
00c1dd3
to
574d77e
Compare
574d77e
to
968cb82
Compare
Signed-off-by: Anurag Dixit <[email protected]>
…itly
Signed-off-by: Anurag Dixit [email protected]
Description
This change enables the end-user to bypass setting CUDA device ID at runtime. It automatically sets the CUDA device based on serialization.
Fixes # (issue)
Type of change
Please delete options that are not relevant and/or add your own.
Added abstraction of Device Info configuration at runtime
Checklist: