Make sure torch inputs contiguous before passing data pointer to TRT #21
Conversation
Force-pushed from 792e1f3 to 9fa2e94
Signed-off-by: Junjie Bai <[email protected]>
Force-pushed from 9fa2e94 to b48fc58
core/execution/register_trt_op.cpp (Outdated)
//LOG_DEBUG("In shape:" << in_gpu.sizes() );
ctx->setBindingDimensions(i, shape);
gpu_handles.push_back(in_gpu.data_ptr());
auto dims = core::util::toDimsPad(inputs[i].sizes(), 4);
Should this be 1, as in the test RunGraphEngine? (I assume it's 4 because TRT doesn't have zero-dim Scalars like PyTorch does.)
It should probably be 1
The main goal should be user-facing behavior that is no different than PyTorch, so if it works in PyTorch it should work here. I think my main concern when I first wrote this was things like TensorRT FullyConnected, which expects a 4D input, but the test works fine with pad 1.
Thanks for the clarification. Yeah, if there are particular layers expecting different input dimensions between TRT and PyTorch, it's better to insert a padding op in the corresponding converter (e.g. PyTorch linear -> TRT pad-to-4D + TRT FC). Unconditionally padding the input will break ops that have the same input dimensions between PyTorch and TRT.
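For illustration, here is a rough sketch of what padding inside the converter (rather than at the engine inputs) could look like for PyTorch linear -> TRT FullyConnected. The converter signature, weight handling, and variable names below are assumptions for the example, not this project's actual converter code:

```cpp
#include <NvInfer.h>

// Hypothetical converter body for a linear op (illustration only).
// `kernel` / `bias` are assumed to be nvinfer1::Weights already extracted
// from the PyTorch module, and `out_features` is the layer's output width.
nvinfer1::ITensor* ConvertLinear(nvinfer1::INetworkDefinition* network,
                                 nvinfer1::ITensor* in,
                                 int out_features,
                                 nvinfer1::Weights kernel,
                                 nvinfer1::Weights bias) {
  auto in_dims = in->getDimensions(); // e.g. [N, C] coming from PyTorch

  if (in_dims.nbDims < 4) {
    // TRT FullyConnected wants a 4D input, so pad here in the converter
    // instead of unconditionally padding every engine input.
    auto pad = network->addShuffle(*in);
    pad->setReshapeDimensions(nvinfer1::Dims4(in_dims.d[0], in_dims.d[1], 1, 1));
    in = pad->getOutput(0);
  }

  auto fc = network->addFullyConnected(*in, out_features, kernel, bias);

  // Reshape back to the rank PyTorch's linear produces ([N, out_features]),
  // so downstream converters see the dimensions they expect.
  auto unpad = network->addShuffle(*fc->getOutput(0));
  unpad->setReshapeDimensions(nvinfer1::Dims2(in_dims.d[0], out_features));
  return unpad->getOutput(0);
}
```

This way, ops whose ranks already match between PyTorch and TRT are left untouched, which is the point of the comment above.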
gpu_handles.push_back(in_gpu.data_ptr());
auto dims = core::util::toDimsPad(inputs[i].sizes(), 4);
auto shape = core::util::toVec(dims);
contig_inputs.push_back(inputs[i].to(at::kCUDA).view(shape).contiguous());
Good catch!
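For context, a minimal sketch of the input-prep loop the fixed diff implies. The variable and helper names (contig_inputs, gpu_handles, core::util::toDimsPad, core::util::toVec) come from the diff; the wrapping function and the helpers' exact semantics (pad the shape to rank 4, convert Dims to a vector) are assumptions:

```cpp
#include <vector>
#include <ATen/ATen.h>
#include <NvInfer.h>
// core::util::toDimsPad / toVec are assumed to come from the project's util headers.

// Hypothetical helper mirroring the loop in register_trt_op.cpp.
std::vector<void*> PrepareInputs(nvinfer1::IExecutionContext* ctx,
                                 const std::vector<at::Tensor>& inputs,
                                 std::vector<at::Tensor>& contig_inputs /* keeps copies alive */) {
  std::vector<void*> gpu_handles;
  contig_inputs.reserve(inputs.size());
  gpu_handles.reserve(inputs.size());

  for (size_t i = 0; i < inputs.size(); i++) {
    auto dims = core::util::toDimsPad(inputs[i].sizes(), 4); // assumed: pads shape to rank 4
    auto shape = core::util::toVec(dims);                    // assumed: Dims -> std::vector<int64_t>
    // Materialize a dense CUDA copy when the input is a non-contiguous view
    // (e.g. the result of a transpose); contiguous() is a no-op otherwise.
    contig_inputs.push_back(inputs[i].to(at::kCUDA).view(shape).contiguous());
    ctx->setBindingDimensions(static_cast<int>(i), dims);
    // The pointer handed to TensorRT comes from the contiguous copy,
    // not from the (possibly strided) original input.
    gpu_handles.push_back(contig_inputs.back().data_ptr());
  }
  return gpu_handles;
}
```

The important detail is that contig_inputs outlives the enqueue call, so the raw pointers in gpu_handles stay valid while the engine runs.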
Signed-off-by: Junjie Bai <[email protected]>
Force-pushed from 1187b20 to 1333319
LGTM!
Summary: Pull Request resolved: https://github.com/pytorch/fx2trt/pull/21
Reviewed By: jasonjk-park, yinghai
Differential Revision: D34916991
fbshipit-source-id: c088b4d6fe40444e13433a6eac76bcbd0fa078e6
No description provided.