
[mlir][bytecode] Add bytecode writer config API to skip serialization of resources #71991

@mfrancio (Contributor) commented Nov 10, 2023

When serializing to bytecode, users can select the option to elide resources from the bytecode file. This will instruct the bytecode writer to serialize only the key and resource kind, while skipping serialization of the data buffer. At parsing, the IR is built in memory with valid (but empty) resource handlers.
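As a rough illustration of the behavior described above, here is a minimal, self-contained sketch that does not use MLIR. Every name in it (`ToyResourceBuilder`, `EntryRecord`, `writeToy`) is hypothetical, and the per-entry payload size bookkeeping is an assumption made for the example, not a statement about the real bytecode layout. It only shows the idea: the key and kind are always recorded, while the payload bytes are skipped when elision is on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-ins, NOT the MLIR classes. All names here are hypothetical.
enum class EntryKind { Blob, Bool };

// What the toy writer remembers about each resource entry.
struct EntryRecord {
  std::string key;
  EntryKind kind;
  std::size_t payloadSize; // number of payload bytes written for this entry
};

class ToyResourceBuilder {
public:
  ToyResourceBuilder(std::vector<std::uint8_t> &out,
                     std::vector<EntryRecord> &entries, bool shouldElideData)
      : out(out), entries(entries), shouldElideData(shouldElideData) {}

  void buildBlob(const std::string &key,
                 const std::vector<std::uint8_t> &data) {
    std::size_t before = out.size();
    // The payload is written only when elision is off.
    if (!shouldElideData)
      out.insert(out.end(), data.begin(), data.end());
    record(key, EntryKind::Blob, before);
  }

  void buildBool(const std::string &key, bool value) {
    std::size_t before = out.size();
    if (!shouldElideData)
      out.push_back(value ? 1 : 0);
    record(key, EntryKind::Bool, before);
  }

private:
  // The key and kind are recorded unconditionally.
  void record(const std::string &key, EntryKind kind, std::size_t before) {
    entries.push_back({key, kind, out.size() - before});
  }

  std::vector<std::uint8_t> &out;
  std::vector<EntryRecord> &entries;
  bool shouldElideData;
};

// Writes two example entries and returns the recorded metadata.
std::vector<EntryRecord> writeToy(bool elide, std::vector<std::uint8_t> &out) {
  std::vector<EntryRecord> entries;
  ToyResourceBuilder builder(out, entries, elide);
  builder.buildBlob("resource", {1, 2, 3, 4});
  builder.buildBool("flag", true);
  return entries;
}
```

In this toy model, calling `writeToy(true, buffer)` leaves `buffer` empty but still returns two records, each with its key and kind and a payload size of zero. That mirrors the "valid (but empty)" entries the description mentions.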

@llvmbot added the mlir:core (MLIR Core Infrastructure) and mlir labels Nov 10, 2023
@llvmbot (Member) commented Nov 10, 2023

@llvm/pr-subscribers-mlir-core

Author: Matteo Franciolini (mfrancio)

Changes

When serializing to bytecode, users can select the option to elide resources from the bytecode file. This will instruct the bytecode writer to serialize only the key and resource kind, while skipping serialization of the data buffer. At parsing, the IR is built in memory with valid (but empty) resource handlers.


Full diff: https://github.com/llvm/llvm-project/pull/71991.diff

5 Files Affected:

  • (modified) mlir/include/mlir/Bytecode/BytecodeWriter.h (+3)
  • (modified) mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h (+6)
  • (modified) mlir/lib/Bytecode/Writer/BytecodeWriter.cpp (+20-6)
  • (modified) mlir/lib/Tools/mlir-opt/MlirOptMain.cpp (+7)
  • (added) mlir/test/Bytecode/resources_elision.mlir (+18)
diff --git a/mlir/include/mlir/Bytecode/BytecodeWriter.h b/mlir/include/mlir/Bytecode/BytecodeWriter.h
index b82d8ddad38ed1c..ea4b36832e0bac3 100644
--- a/mlir/include/mlir/Bytecode/BytecodeWriter.h
+++ b/mlir/include/mlir/Bytecode/BytecodeWriter.h
@@ -152,6 +152,9 @@ class BytecodeWriterConfig {
   // Resources
   //===--------------------------------------------------------------------===//
 
+  /// Set a boolean flag to skip emission of resources into the bytecode file.
+  void setElideResourceDataFlag(bool shouldElideResourceData = true);
+
   /// Attach the given resource printer to the writer configuration.
   void attachResourcePrinter(std::unique_ptr<AsmResourcePrinter> printer);
 
diff --git a/mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h b/mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
index a1530936f55caee..e255d9fa70b6594 100644
--- a/mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
+++ b/mlir/include/mlir/Tools/mlir-opt/MlirOptMain.h
@@ -82,6 +82,9 @@ class MlirOptMainConfig {
     return *this;
   }
   bool shouldEmitBytecode() const { return emitBytecodeFlag; }
+  bool shouldElideResourceDataFromBytecode() const {
+    return elideResourceDataFromBytecodeFlag;
+  }
 
   /// Set the IRDL file to load before processing the input.
   MlirOptMainConfig &setIrdlFile(StringRef file) {
@@ -185,6 +188,9 @@ class MlirOptMainConfig {
   /// Emit bytecode instead of textual assembly when generating output.
   bool emitBytecodeFlag = false;
 
+  /// Elide resources when generating bytecode.
+  bool elideResourceDataFromBytecodeFlag = false;
+
   /// Enable the Debugger action hook: Debugger can intercept MLIR Actions.
   bool enableDebuggerActionHookFlag = false;
 
diff --git a/mlir/lib/Bytecode/Writer/BytecodeWriter.cpp b/mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
index 01dcea1ca3848eb..6097f0eda469cd2 100644
--- a/mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
+++ b/mlir/lib/Bytecode/Writer/BytecodeWriter.cpp
@@ -39,6 +39,10 @@ struct BytecodeWriterConfig::Impl {
   /// Note: This only differs from kVersion if a specific version is set.
   int64_t bytecodeVersion = bytecode::kVersion;
 
+  /// A flag specifying whether to elide emission of resources into the bytecode
+  /// file.
+  bool shouldElideResourceData = false;
+
   /// A map containing dialect version information for each dialect to emit.
   llvm::StringMap<std::unique_ptr<DialectVersion>> dialectVersionMap;
 
@@ -89,6 +93,11 @@ void BytecodeWriterConfig::attachResourcePrinter(
   impl->externalResourcePrinters.emplace_back(std::move(printer));
 }
 
+void BytecodeWriterConfig::setElideResourceDataFlag(
+    bool shouldElideResourceData) {
+  impl->shouldElideResourceData = shouldElideResourceData;
+}
+
 void BytecodeWriterConfig::setDesiredBytecodeVersion(int64_t bytecodeVersion) {
   impl->bytecodeVersion = bytecodeVersion;
 }
@@ -1170,22 +1179,25 @@ class ResourceBuilder : public AsmResourceBuilder {
   using PostProcessFn = function_ref<void(StringRef, AsmResourceEntryKind)>;
 
   ResourceBuilder(EncodingEmitter &emitter, StringSectionBuilder &stringSection,
-                  PostProcessFn postProcessFn)
+                  PostProcessFn postProcessFn, bool shouldElideData)
       : emitter(emitter), stringSection(stringSection),
-        postProcessFn(postProcessFn) {}
+        postProcessFn(postProcessFn), shouldElideData(shouldElideData) {}
   ~ResourceBuilder() override = default;
 
   void buildBlob(StringRef key, ArrayRef<char> data,
                  uint32_t dataAlignment) final {
-    emitter.emitOwnedBlobAndAlignment(data, dataAlignment);
+    if (!shouldElideData)
+      emitter.emitOwnedBlobAndAlignment(data, dataAlignment);
     postProcessFn(key, AsmResourceEntryKind::Blob);
   }
   void buildBool(StringRef key, bool data) final {
-    emitter.emitByte(data);
+    if (!shouldElideData)
+      emitter.emitByte(data);
     postProcessFn(key, AsmResourceEntryKind::Bool);
   }
   void buildString(StringRef key, StringRef data) final {
-    emitter.emitVarInt(stringSection.insert(data));
+    if (!shouldElideData)
+      emitter.emitVarInt(stringSection.insert(data));
     postProcessFn(key, AsmResourceEntryKind::String);
   }
 
@@ -1193,6 +1205,7 @@ class ResourceBuilder : public AsmResourceBuilder {
   EncodingEmitter &emitter;
   StringSectionBuilder &stringSection;
   PostProcessFn postProcessFn;
+  bool shouldElideData = false;
 };
 } // namespace
 
@@ -1225,7 +1238,8 @@ void BytecodeWriter::writeResourceSection(Operation *op,
 
   // Builder used to emit resources.
   ResourceBuilder entryBuilder(resourceEmitter, stringSection,
-                               appendResourceOffset);
+                               appendResourceOffset,
+                               config.shouldElideResourceData);
 
   // Emit the external resource entries.
   resourceOffsetEmitter.emitVarInt(config.externalResourcePrinters.size());
diff --git a/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp b/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
index c36afae716b12c5..d7d47619ef4ac98 100644
--- a/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
+++ b/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp
@@ -90,6 +90,11 @@ struct MlirOptMainConfigCLOptions : public MlirOptMainConfig {
         "emit-bytecode", cl::desc("Emit bytecode when generating output"),
         cl::location(emitBytecodeFlag), cl::init(false));
 
+    static cl::opt<bool, /*ExternalStorage=*/true> elideResourcesFromBytecode(
+        "elide-resource-data-from-bytecode",
+        cl::desc("Elide resources when generating bytecode"),
+        cl::location(elideResourceDataFromBytecodeFlag), cl::init(false));
+
     static cl::opt<std::optional<int64_t>, /*ExternalStorage=*/true,
                    BytecodeVersionParser>
         bytecodeVersion(
@@ -385,6 +390,8 @@ performActions(raw_ostream &os,
     BytecodeWriterConfig writerConfig(fallbackResourceMap);
     if (auto v = config.bytecodeVersionToEmit())
       writerConfig.setDesiredBytecodeVersion(*v);
+    if (config.shouldElideResourceDataFromBytecode())
+      writerConfig.setElideResourceDataFlag();
     return writeBytecodeToFile(op.get(), os, writerConfig);
   }
 
diff --git a/mlir/test/Bytecode/resources_elision.mlir b/mlir/test/Bytecode/resources_elision.mlir
new file mode 100644
index 000000000000000..5238ae8f1ebc583
--- /dev/null
+++ b/mlir/test/Bytecode/resources_elision.mlir
@@ -0,0 +1,18 @@
+// RUN: mlir-opt -emit-bytecode -elide-resource-data-from-bytecode %s | mlir-opt | FileCheck %s
+
+// CHECK-LABEL: @TestDialectResources
+module @TestDialectResources attributes {
+  bytecode.test = dense_resource<decl_resource> : tensor<2xui32>,
+  bytecode.test2 = dense_resource<resource> : tensor<4xf64>,
+  bytecode.test3 = dense_resource<resource_2> : tensor<4xf64>
+} {}
+
+// CHECK-NOT: dialect_resources
+{-#
+  dialect_resources: {
+    builtin: {
+      resource: "0x08000000010000000000000002000000000000000300000000000000",
+      resource_2: "0x08000000010000000000000002000000000000000300000000000000"
+    }
+  }
+#-}
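
The plumbing in this diff (a tool-level boolean that is forwarded to a setter whose parameter defaults to `true`) can be sketched without MLIR as follows. This is a toy model under stated assumptions: `ToyWriterConfig`, `ToyToolOptions`, and `makeWriterConfig` are hypothetical names, and the `shouldElideResourceData()` getter exists only so the example can be checked; the diff itself adds no such public getter.

```cpp
#include <cassert>
#include <memory>

// Hypothetical miniature of a pimpl-style writer config, not the MLIR class.
class ToyWriterConfig {
public:
  ToyWriterConfig() : impl(std::make_unique<Impl>()) {}

  // Same shape as the setter in the diff: calling it with no argument
  // turns elision on.
  void setElideResourceDataFlag(bool shouldElideResourceData = true) {
    impl->shouldElideResourceData = shouldElideResourceData;
  }

  // Assumption for illustration only: a getter so the state is observable.
  bool shouldElideResourceData() const {
    return impl->shouldElideResourceData;
  }

private:
  struct Impl {
    bool shouldElideResourceData = false; // off unless explicitly requested
  };
  std::unique_ptr<Impl> impl;
};

// Hypothetical stand-in for the tool-level options parsed from the CLI.
struct ToyToolOptions {
  bool elideResourceDataFromBytecode = false;
};

// Forwards the tool option to the writer config, touching the config only
// when the option is set.
ToyWriterConfig makeWriterConfig(const ToyToolOptions &options) {
  ToyWriterConfig config;
  if (options.elideResourceDataFromBytecode)
    config.setElideResourceDataFlag();
  return config;
}
```

With this shape the default stays "write everything": a config built from default options reports `false`, and elision has to be requested explicitly, either through the option or by calling the setter directly.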

@llvmbot (Member) commented Nov 10, 2023

@llvm/pr-subscribers-mlir

Author: Matteo Franciolini (mfrancio)


@mfrancio force-pushed the dev/mfrancio/skipSerializationOfResourcesIntoBytecode branch from 3b02bec to be01399 on November 13, 2023 16:41
@mfrancio mfrancio merged commit 4488f49 into llvm:main Nov 13, 2023
@mfrancio mfrancio deleted the dev/mfrancio/skipSerializationOfResourcesIntoBytecode branch November 13, 2023 18:59
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
… of resources (llvm#71991)

When serializing to bytecode, users can select the option to elide
resources from the bytecode file. This will instruct the bytecode writer
to serialize only the key and resource kind, while skipping
serialization of the data buffer. At parsing, the IR is built in memory
with valid (but empty) resource handlers.