
Commit ac62d04

mcr229 authored and facebook-github-bot committed
Introduce XNNPACKHeader to manage flatbuffer data and constant data
Summary:
Introducing the XNNPACKHeader to manage the flatbuffer data and constant data. Previously, we serialized constant data along with the flatbuffer. However, with large weights and large tensors in general, converting our dataclass --> json --> flatbuffer takes a large amount of time and memory. This has become a blocker on some larger models.

To fix this, we circumvent serializing constant tensors via flatbuffer by appending the constant data after the flatbuffer payload. In order to do this, we need an XNNPACKHeader, which gives us the flatbuffer offset, flatbuffer size, constant data offset, and constant data size.

It will look something like this:
```
┌───────────────────────────────────┐
│XNNPACK Header                     │
├───────────────────────────────────┤
│Padding for 16 byte alignment      │
├───────────────────────────────────┤
│Flatbuffer-serialized payload data │
│                                   │
│                                   │
├───────────────────────────────────┤
│Padding for 16 byte alignment      │
├───────────────────────────────────┤
│Constant Data                      │
│                                   │
│                                   │
└───────────────────────────────────┘
```

Within the XNNPACK Header, we hold the following:
- 4 bytes to offset the header magic
- 4 bytes for the header magic
- 4 bytes for the header length
- 8 bytes for the flatbuffer offset
- 8 bytes for the flatbuffer size
- 8 bytes for constant data offset
- 8 bytes for constant data size

Differential Revision: D52497977
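The field list above adds up to a 44-byte header (4 + 4 + 4 + 8 × 4). As a rough illustration of how a consumer might read it back, here is a minimal sketch using Python's `struct` module. It assumes little-endian encoding and the `b"XH00"` magic, both taken from the diff in this commit; `parse_header`, `HEADER_FORMAT`, and the returned dictionary keys are hypothetical names and are not part of the commit.

```python
import struct

# Hypothetical format string mirroring the field list in the summary.
# "<" = little-endian with no implicit padding, "4x" = the 4 zero bytes
# that offset the magic, "4s" = magic, "I" = uint32, "Q" = uint64.
HEADER_FORMAT = "<4x4sIQQQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4 + 4 + 4 + 8 * 4 = 44
EXPECTED_MAGIC = b"XH00"  # value taken from the diff


def parse_header(data: bytes) -> dict:
    """Decode the fixed-size header fields from the start of `data`."""
    if len(data) < HEADER_SIZE:
        raise ValueError("buffer is too small to contain a header")
    magic, length, fb_offset, fb_size, const_offset, const_size = (
        struct.unpack_from(HEADER_FORMAT, data)
    )
    if magic != EXPECTED_MAGIC:
        # No header magic at bytes 4..8, so this is presumably not a
        # header-prefixed payload.
        raise ValueError(f"unexpected magic: {magic!r}")
    return {
        "header_length": length,
        "flatbuffer_offset": fb_offset,
        "flatbuffer_size": fb_size,
        "constant_data_offset": const_offset,
        "constant_data_size": const_size,
    }
```

Because the magic sits at bytes 4 to 8, a reader can inspect the same position in either kind of payload to decide whether the header is present, which is the reason the summary gives for the leading 4-byte offset.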
1 parent 504366f commit ac62d04

1 file changed (+77, -1)

backends/xnnpack/serialization/xnnpack_graph_serialize.py

Lines changed: 77 additions & 1 deletion
@@ -8,14 +8,20 @@
 import os
 import tempfile

-from dataclasses import fields, is_dataclass
+from dataclasses import dataclass, fields, is_dataclass
+from typing import ClassVar, Literal

 import pkg_resources
 from executorch.backends.xnnpack.serialization.xnnpack_graph_schema import XNNGraph
 from executorch.exir._serialize._dataclass import _DataclassEncoder

 from executorch.exir._serialize._flatbuffer import _flatc_compile

+# Byte order of numbers written to program headers. Always little-endian
+# regardless of the host system, since all commonly-used modern CPUs are little
+# endian.
+_HEADER_BYTEORDER: Literal["little"] = "little"
+


 def sanity_check_xnngraph_dataclass(table, name: str = ""):
     """
@@ -68,6 +74,76 @@ def check_for_sym(obj, name):
             check_for_sym(o, _name_field)


+@dataclass
+class XNNPACKHeader:
+    # Class Constants
+
+    # magic bytes that should be at the beginning of the header
+    EXPECTED_MAGIC: ClassVar[bytes] = b"XH00"
+    # The length of the header in bytes.
+    EXPECTED_LENGTH: ClassVar[int] = (
+        # Zeros magic
+        # We offset the magic by 4 bytes so that it is in the same location
+        # as the flatbuffer payload's magic. This way we can dynamically
+        # choose between the XNNPACK Header and Flatbuffer Header
+        4
+        # Header magic
+        + 4
+        # Header Length
+        + 4
+        # Flatbuffer offset
+        + 8
+        # Flatbuffer size
+        + 8
+        # Constant Data offset
+        + 8
+        # Constant Data size
+        + 8
+    )
+
+    # Instance attributes. @dataclass will turn these into ctor args.
+
+    # offset to the flatbuffer data
+    flatbuffer_offset: int
+
+    # flatbuffer size
+    flatbuffer_size: int
+
+    # offset to the constant data
+    constant_data_offset: int
+
+    # constant data size
+    constant_data_size: int
+
+    def to_bytes(self) -> bytes:
+        """
+        Returns the binary representation of the XNNPACK Header.
+        """
+
+        data: bytes = (
+            # Padding for magic bytes. This is so that header magic is in the same position
+            # as the flatbuffer magic, and allows consumer to detect whether the header is
+            # being used or not
+            b"\x00\x00\x00\x00"
+            # XNNPACK Header's magic. This allows consumer to detect whether or not the header
+            # is being used or the flatbuffer header is being used
+            + self.EXPECTED_MAGIC
+            # uint32_t: Size of this header. This makes it easier to add new fields to the header
+            # in the future.
+            + self.EXPECTED_LENGTH.to_bytes(4, byteorder=_HEADER_BYTEORDER)
+            # uint64_t: Offset to the start of the flatbuffer data
+            + self.flatbuffer_offset.to_bytes(8, byteorder=_HEADER_BYTEORDER)
+            # uint64_t: Size of the flatbuffer data payload
+            + self.flatbuffer_size.to_bytes(8, byteorder=_HEADER_BYTEORDER)
+            # uint64_t: Offset to the start of the constant data
+            + self.constant_data_offset.to_bytes(8, byteorder=_HEADER_BYTEORDER)
+            # uint64_t: Size of the constant data
+            + self.constant_data_size.to_bytes(8, byteorder=_HEADER_BYTEORDER)
+        )
+
+        return data
+
+
 def convert_to_flatbuffer(xnnpack_graph: XNNGraph) -> bytes:
     sanity_check_xnngraph_dataclass(xnnpack_graph)
     xnnpack_graph_json = json.dumps(xnnpack_graph, cls=_DataclassEncoder)
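
This commit only adds the header class; it does not yet show how the full payload is written. The sketch below is one possible way to assemble the layout from the commit summary (header, padding, flatbuffer, padding, constant data). It is an illustration under stated assumptions, not the project's implementation: `pad_to` and `assemble` are hypothetical helpers, and offsets are assumed to be measured from the start of the blob, which the commit does not specify.

```python
import struct

ALIGNMENT = 16  # from the layout diagram in the commit summary
# Same field layout as XNNPACKHeader.to_bytes: 4 zero bytes, magic,
# uint32 header length, then four little-endian uint64 values.
HEADER_FORMAT = "<4x4sIQQQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 44 bytes


def pad_to(size: int, alignment: int = ALIGNMENT) -> int:
    """Round `size` up to the next multiple of `alignment`."""
    return (size + alignment - 1) // alignment * alignment


def assemble(flatbuffer: bytes, constant_data: bytes) -> bytes:
    """Lay out header, flatbuffer, and constant data with 16-byte padding."""
    # Assumption: offsets are relative to the start of the whole blob.
    fb_offset = pad_to(HEADER_SIZE)
    const_offset = pad_to(fb_offset + len(flatbuffer))
    header = struct.pack(
        HEADER_FORMAT,
        b"XH00",
        HEADER_SIZE,
        fb_offset,
        len(flatbuffer),
        const_offset,
        len(constant_data),
    )
    out = bytearray(header)
    out += b"\x00" * (fb_offset - len(out))     # pad header to 16 bytes
    out += flatbuffer
    out += b"\x00" * (const_offset - len(out))  # pad flatbuffer to 16 bytes
    out += constant_data
    return bytes(out)
```

With a 44-byte header, the flatbuffer in this sketch starts at offset 48, and the constant data starts at the next multiple of 16 after the flatbuffer ends.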
