Handle scalar tensor and mutable buffer inputs in Vulkan delegate runtime (#5930)
Summary:
Pull Request resolved: #5930
## Context
* Handle scalar tensor inputs by adding them to the graph as symbolic ints
* Add support for symint inputs in the Vulkan delegate
* Add type checking for Vulkan delegate inputs and outputs
This is needed for Transformer models, which receive an `input_pos` integer scalar tensor as an input. `input_pos` is used in KV cache updates and determines the sizes of the cache slices.
Additionally, mutable buffer inputs/outputs, which appear as `TensorRef` to the Vulkan graph, are handled as well by ignoring them when copying outputs. More details in the comments.
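The input and output handling described above can be illustrated with a minimal, self-contained sketch. It is not the delegate's actual code: `Tensor`, `SymInt`, `TensorRef`, `GraphValue`, `set_input`, and `copy_outputs` are hypothetical stand-ins, and treating a `TensorRef` input as a no-op is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the kinds of values a graph can hold.
// These are NOT the real ExecuTorch / Vulkan delegate types.
struct Tensor { std::vector<int64_t> data; };
struct SymInt { int32_t value = 0; };
struct TensorRef {};  // models a mutable buffer owned by the graph

using GraphValue = std::variant<Tensor, SymInt, TensorRef>;

// Copy one runtime input into its graph slot, checking the slot type.
void set_input(GraphValue& slot, const std::vector<int64_t>& src) {
  if (auto* t = std::get_if<Tensor>(&slot)) {
    t->data = src;
  } else if (auto* s = std::get_if<SymInt>(&slot)) {
    // A symint slot only accepts a scalar (single-element) tensor.
    if (src.size() != 1) {
      throw std::invalid_argument("symint input must be a scalar tensor");
    }
    s->value = static_cast<int32_t>(src[0]);
  }
  // Assumption: TensorRef slots need no copy because the graph owns the data.
}

// Copy graph outputs to the caller, skipping TensorRef (mutable buffer) slots.
// Returns the number of outputs that were actually copied.
std::size_t copy_outputs(const std::vector<GraphValue>& outputs,
                         std::vector<std::vector<int64_t>>& dst) {
  dst.assign(outputs.size(), std::vector<int64_t>{});
  std::size_t copied = 0;
  for (std::size_t i = 0; i < outputs.size(); ++i) {
    if (std::holds_alternative<TensorRef>(outputs[i])) {
      continue;  // mutable buffer: ignored when copying outputs
    }
    const auto* t = std::get_if<Tensor>(&outputs[i]);
    if (t == nullptr) {
      throw std::invalid_argument("unsupported output type");
    }
    dst[i] = t->data;
    ++copied;
  }
  return copied;
}
```

In this sketch the type check happens at the point of copying, so a mismatched input fails loudly instead of silently writing into the wrong kind of value.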
### Why are scalar tensors added as symint?
Adding scalar tensors as symints makes more sense than adding them as real tensors, since symints are commonly used to inform tensor shapes. Adding scalar tensors as symints allows them to be easily accessible by the CPU at graph encoding and resizing time, as well as easily accessible by the GPU within compute shaders.
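To make the "symints inform tensor shapes" point concrete, here is a small hedged sketch of a CPU-side resize computation that reads a symint value. The names `SymIntValue` and `cache_slice_sizes`, the two-dimensional `[length, head_dim]` layout, and the `input_pos + seq_len` slice length are all assumptions chosen for illustration, not details taken from the delegate.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical CPU-visible symint value (not the real delegate type).
struct SymIntValue { int32_t value; };

// Compute the sizes of a KV cache slice covering positions [0, input_pos + seq_len).
// Assumed layout: [length, head_dim]; max_context_len is the cache capacity.
std::vector<int64_t> cache_slice_sizes(const SymIntValue& input_pos,
                                       int64_t seq_len,
                                       int64_t max_context_len,
                                       int64_t head_dim) {
  const int64_t end = static_cast<int64_t>(input_pos.value) + seq_len;
  if (input_pos.value < 0 || seq_len < 0 || end > max_context_len) {
    throw std::out_of_range("cache slice exceeds the allocated cache");
  }
  // The slice length depends directly on the symint, so it can be
  // recomputed cheaply on the CPU every time input_pos changes.
  return {end, head_dim};
}
```

Because the value is a plain integer on the CPU side, a resize step like this needs no GPU readback, which is the accessibility benefit the paragraph above describes.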
ghstack-source-id: 246752221
Reviewed By: jorgep31415
Differential Revision: D63979312
fbshipit-source-id: ce76993d65c9b5af8de98e4f131c5a6f475900ab