Update on "[ET-VK][Ops] aten.convolution (Bias=False)"

The final touches to get ET-VK convolution on par with ATen-VK's convolution.

## Idea

In our shaders, we add the bias to our sum.

```
${VEC4_T[DTYPE]} sum = texelFetch(bias_in, ivec2(pos.z, 0), 0);
```

To keep our shaders as is, we implement the no-bias case by allocating a buffer of zeros. Our shader then adds zero to our sum.

## Issue

If `Bias=False`, the dummy buffer of zeros is not serialized with the graph. The bias ValueRef is deserialized in the runtime as `TypeTag::NONE`, not `TypeTag::TENSORREF`.

## Solution

If `TypeTag::NONE` is given, (1) create the `vTensor` using the `out_channels` value from the weights and (2) allocate a StagingBuffer of that size. The StagingBuffer is initialized to zeros and then transferred to GPU memory.

Differential Revision: [D55814589](https://our.internmc.facebook.com/intern/diff/D55814589/)

[ghstack-poisoned]
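
For reviewers who want to see the zero-bias idea in isolation, here is a minimal host-side sketch. It is not the ET-VK implementation: `bias_nbytes`, `make_zero_bytes`, `zero_bias`, and `accumulate_with_bias` are hypothetical names, and plain `std::vector` stands in for `api::StorageBuffer` and the GPU texture. It only illustrates that sizing a zero-filled buffer from `out_channels` and starting the sum from it leaves the convolution result unchanged.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: byte size of a bias holding one element per
// output channel (assumption: the bias is a 1-D tensor of out_channels).
std::size_t bias_nbytes(std::size_t out_channels, std::size_t element_size) {
  return out_channels * element_size;
}

// Hypothetical helper: a zero-filled host buffer of nbytes. std::vector
// owns the memory, so there is no manual free and no unchecked allocation.
std::vector<std::uint8_t> make_zero_bytes(std::size_t nbytes) {
  return std::vector<std::uint8_t>(nbytes, 0);
}

// Hypothetical helper: the dummy bias used when Bias=False.
std::vector<float> zero_bias(std::size_t out_channels) {
  return std::vector<float>(out_channels, 0.0f);
}

// Mirrors the shader's structure on the CPU: the sum starts from the bias
// of the output channel, then accumulates input * weight products.
float accumulate_with_bias(
    const std::vector<float>& bias,
    std::size_t out_channel,
    const std::vector<float>& inputs,
    const std::vector<float>& weights) {
  float sum = bias[out_channel];
  for (std::size_t i = 0; i < inputs.size() && i < weights.size(); ++i) {
    sum += inputs[i] * weights[i];
  }
  return sum;
}
```

With a bias from `zero_bias`, `accumulate_with_bias` returns the plain dot product, which is why the shader needs no separate no-bias variant.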
pytorch · junpi3 · Apr 5, 2024 · Apr 8, 2024 · Apr 8, 2024 · Apr 9, 2024
commit bf72506cb5b3021c8ae03873f7c7020ec9759964
diff --git a/backends/vulkan/runtime/graph/ops/PrepackNode.cpp b/backends/vulkan/runtime/graph/ops/PrepackNode.cpp
@@ -35,11 +35,13 @@ PrepackNode::PrepackNode(
 api::StorageBuffer PrepackNode::create_staging_buffer(ComputeGraph* graph) {
   vTensor& packed = graph->get_val(packed_).toTensor();
 
-  // If no TensorRef is provided, create a zeroed staging buffer according to
+  // If no TensorRef is provided, create a staging buffer of zeros according to
   // the vTensor metadata.
   if (graph->get_val(tref_).isNone()) {
     size_t numel = api::utils::multiply_integers(packed.sizes());
     api::StorageBuffer staging(graph->context(), packed.dtype(), numel);
+    size_t nbytes = numel * api::element_size(packed.dtype());
+    copy_zeros_to_staging(staging, nbytes);
     return staging;
   }
 

diff --git a/backends/vulkan/runtime/graph/ops/utils/StagingUtils.cpp b/backends/vulkan/runtime/graph/ops/utils/StagingUtils.cpp
@@ -89,6 +89,13 @@ void copy_staging_to_ptr(
   memcpy_from_mapping(mapping, dst, nbytes, staging.dtype());
 }
 
+void copy_zeros_to_staging(api::StorageBuffer& staging, const size_t nbytes) {
+  void* data = malloc(nbytes);
+  memset(data, 0, nbytes);
+  copy_ptr_to_staging(data, staging, nbytes);
+  free(data);
+}
+
 api::ShaderInfo get_nchw_to_image_shader(const vTensor& v_dst) {
   if (v_dst.is_quantized()) {
     VK_THROW("Quantized Tensors are currently not supported!");

diff --git a/backends/vulkan/runtime/graph/ops/utils/StagingUtils.h b/backends/vulkan/runtime/graph/ops/utils/StagingUtils.h
@@ -25,6 +25,8 @@ void copy_staging_to_ptr(
     void* dst,
     const size_t nbytes);
 
+void copy_zeros_to_staging(api::StorageBuffer& staging, const size_t nbytes);
+
 //
 // Functions to get shaders
 //