Skip to content

Commit c52f0c7

Browse files
committed
Update on "[ET-VK] Introduce virtual_clone API to support view of view use cases + fix synchronization hazard with view tensors"
## Context This diff fixes some hazards (not necessarily) bugs with view tensors. ### `virtual_clone` API Consider the following sequence of calls which may be common in the view of view use case. ``` t1 = graph.add_tensor(...); // t2 will have the same metadata as t1 t2 = graph.add_tensor_view(t1); // t3 will also have the same metadata as t2 at this point. t3 = graph.add_tensor_view(t2); // t2 metadata will be updated correctly. t2 = add_transpose_view_node(t1, 0, 1, t2); // Unfortunately, this node will have an assumption that t3 has the same metadata as t2 to start. However, this is not true. // As a result, t3 will have incorrect metadata after this node. t3 = add_transpose_view_node(t2, 1, 2, t3); ``` To address this, the `virtual_clone` API is introduced which will allow view nodes to set the metadata of the output equal to the input before modifying the output. ### WAW synchronization hazards `vTensorStorage` maintains a `last_access` state which facilitates inserting the correct memory barriers for the underlying `vkImage` or `vkBuffer`. However, when we create a tensor view, `last_access` is not shared between `vTensor` instances that use the same resource. As as result, writing into a `vTensor` will not update the `last_access` of its views, and vice versa. Therefore, sebsequent accesses of the other tensor that references the same resource will result in a synchronization hazard. This diff fixes this hazard in a bit of a crude way; if the `vTensor` is a copy, or has copies, then cowardly assume that it has been written to before the current access so that appropriate memory barriers are inserted. This was the selected solution because I thought that adding a map to track last access of tensors that share resources is a bit overkill when the assumption that the underlying resource has been written to before the current access should hold most of the time. Differential Revision: [D63642092](https://our.internmc.facebook.com/intern/diff/D63642092/) [ghstack-poisoned]
1 parent bd11b11 commit c52f0c7

File tree

1 file changed

+43
-0
lines changed

1 file changed

+43
-0
lines changed

backends/vulkan/test/vulkan_compute_api_test.cpp

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1274,6 +1274,49 @@ TEST(VulkanComputeGraphTest, test_simple_graph_with_view) {
12741274
}
12751275
}
12761276

1277+
TEST(VulkanComputeGraphTest, test_graph_view_of_view) {
1278+
GraphConfig config;
1279+
config.set_storage_type_override(utils::kTexture3D);
1280+
ComputeGraph graph(config);
1281+
1282+
constexpr int N = 3;
1283+
constexpr int C = 5;
1284+
constexpr int H = 17;
1285+
constexpr int W = 19;
1286+
1287+
std::vector<int64_t> orig_sizes = {N, C, H, W};
1288+
1289+
// Test a common view of view usage pattern. In delegate execution, the values
1290+
// of the graph are created first; then operators are added. As a result,
1291+
// creating views of views is a bit tricky because metadata updates to a view
1292+
// does not update the metadata of the view's views. Nonetheless, view
1293+
// operators have an implicit assumption that the metadata of the output is
1294+
// equivalent to the metadata of the input. Therefore, view operators must
1295+
// account for unseen updates to the input view by first calling
1296+
// `virtual_clone()` to make the output equivalent to the input before.
1297+
// modifying metadata.
1298+
1299+
ValueRef t1 = graph.add_tensor(orig_sizes, vkapi::kFloat);
1300+
ValueRef t2 = graph.add_tensor_view(t1);
1301+
ValueRef t3 = graph.add_tensor_view(t2);
1302+
1303+
ValueRef channels = graph.add_scalar<int64_t>(1);
1304+
ValueRef height = graph.add_scalar<int64_t>(2);
1305+
ValueRef width = graph.add_scalar<int64_t>(3);
1306+
1307+
auto opFn = VK_GET_OP_FN("aten.transpose.int");
1308+
1309+
opFn(graph, {t1, channels, height, t2});
1310+
std::vector<int64_t> t2_sizes = graph.sizes_of(t2);
1311+
std::vector<int64_t> expected_t2_sizes = {N, H, C, W};
1312+
EXPECT_TRUE(t2_sizes == expected_t2_sizes);
1313+
1314+
opFn(graph, {t2, height, width, t3});
1315+
std::vector<int64_t> t3_sizes = graph.sizes_of(t3);
1316+
std::vector<int64_t> expected_t3_sizes = {N, H, W, C};
1317+
EXPECT_TRUE(t2_sizes == expected_t2_sizes);
1318+
}
1319+
12771320
TEST(VulkanComputeGraphTest, test_simple_graph) {
12781321
GraphConfig config;
12791322
ComputeGraph graph(config);

0 commit comments

Comments
 (0)