Commit da17f66

pytorchbot and lucylq authored
[flat_tensor] Persist FreeableBuffers of external constants in method (#8599)
Pull Request resolved: #8437

## Problem

Currently, the FlatTensorDataMap persists tensors, and returns a FreeableBuffer with an empty free function. The NamedDataMap should not persist data, as most cases (e.g. delegate) will want it to be freed.

Ownership should be on the caller; `get_data` returns a FreeableBuffer that 'owns' the data. The FreeableBuffer in turn is owned by the caller.

NOTE: this doesn't support the case where we want to share plain tensors between methods/pte files at runtime. A custom NDM could support that use-case.

## This diff:

1. Introduces a 'NamedData' struct to method.h. This holds a key and a FreeableBuffer.
2. Iterates over all the flatbuffer tensors to count the constants tagged with EXTERNAL. NOTE: this will increase load time for all users. Potentially allocate chunks of 16 and use a linked list to store external constants, or store this number in the PTE file (see D69618283).
3. Allocates space for num_external_constants using the method allocator.
4. Iterates over all flatbuffer tensors and uses the named_data_map to resolve EXTERNAL tensors into the array of NamedData.
5. Passes the resolved external constants to tensor_parser, along with the NDM (used for mutable external tensors).
6. Stores the resolved external tensors inside the method. They are freed when the method is destructed.

Some notes: https://docs.google.com/document/d/1_PBi4JgODuClUPD4PCUWrKNjyUH54zOUHGUJ3QHDNes/edit?tab=t.0#heading=h.blsvwraxss7g

ghstack-source-id: 267364187

TODO: add test case when two fqns point to the same data buffer.

Differential Revision: [D69477027](https://our.internmc.facebook.com/intern/diff/D69477027/)

Co-authored-by: lucylq <[email protected]>
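The ownership model described under "Problem" can be illustrated with a small standalone sketch. `OwnedBuffer` and `CountingDataMap` are hypothetical stand-ins written for this example; they are not the ExecuTorch `FreeableBuffer` or `NamedDataMap` classes, and the free-function signature is an assumption. The point is only that each `get_data` call returns a buffer the caller owns, and that the data is released when that buffer is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical stand-in for FreeableBuffer (not the ExecuTorch class): it owns
// `data` and calls `free_fn` at most once, when the owning object is destroyed.
class OwnedBuffer {
 public:
  using FreeFn = void (*)(void* context, const void* data, std::size_t size);

  OwnedBuffer(const void* data, std::size_t size, FreeFn free_fn, void* context)
      : data_(data), size_(size), free_fn_(free_fn), context_(context) {}

  // Moving transfers ownership and leaves the source empty.
  OwnedBuffer(OwnedBuffer&& rhs) noexcept
      : data_(rhs.data_),
        size_(rhs.size_),
        free_fn_(rhs.free_fn_),
        context_(rhs.context_) {
    rhs.data_ = nullptr;
    rhs.size_ = 0;
    rhs.free_fn_ = nullptr;
    rhs.context_ = nullptr;
  }

  OwnedBuffer(const OwnedBuffer&) = delete;
  OwnedBuffer& operator=(const OwnedBuffer&) = delete;

  ~OwnedBuffer() {
    if (free_fn_ != nullptr && data_ != nullptr) {
      free_fn_(context_, data_, size_);
    }
  }

  const void* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  const void* data_;
  std::size_t size_;
  FreeFn free_fn_;
  void* context_;
};

// Hypothetical data map: every get_data() call hands out a fresh buffer that
// the caller owns; the map itself keeps nothing alive.
struct CountingDataMap {
  int live_buffers = 0;

  static void release(void* context, const void* data, std::size_t /*size*/) {
    delete[] static_cast<const char*>(data);
    --static_cast<CountingDataMap*>(context)->live_buffers;
  }

  OwnedBuffer get_data(const char* key) {
    const std::size_t n = std::strlen(key) + 1;
    char* copy = new char[n];
    std::memcpy(copy, key, n);
    ++live_buffers;
    return OwnedBuffer(copy, n, &CountingDataMap::release, this);
  }
};
```

In this sketch the counter in `CountingDataMap` drops back to zero once every buffer handed out has gone out of scope, which is the behaviour the commit message asks of a data map that does not persist data.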
1 parent 463119e commit da17f66

6 files changed, +279 -80 lines changed

runtime/executor/method.cpp

Lines changed: 138 additions & 11 deletions
@@ -33,6 +33,7 @@
 namespace executorch {
 namespace runtime {

+using deserialization::NamedData;
 using internal::PlatformMemoryAllocator;

 /**
@@ -289,6 +290,113 @@ Result<bool> parse_cond_value(const EValue& cond_value) {

 } // namespace

+Result<size_t> Method::get_num_external_constants() {
+  auto flatbuffer_values = serialization_plan_->values();
+  size_t n_value = flatbuffer_values->size();
+
+  size_t n_external_constants = 0;
+  for (size_t i = 0; i < n_value; ++i) {
+    auto serialization_value = flatbuffer_values->Get(i);
+    // Ensure values are non-null.
+    // Note that as a side-effect of this check, we're guaranteed that all
+    // values are non-null, so later loops can skip that check.
+    ET_CHECK_OR_RETURN_ERROR(
+        serialization_value != nullptr &&
+            (serialization_value->val_type() ==
+                 executorch_flatbuffer::KernelTypes::Null ||
+             serialization_value->val() != nullptr),
+        InvalidProgram,
+        "Null value at index %" ET_PRIsize_t,
+        i);
+    // Ignore non-tensor types.
+    if (serialization_value->val_type() !=
+        executorch_flatbuffer::KernelTypes::Tensor) {
+      continue;
+    }
+    const auto s_tensor = static_cast<const executorch_flatbuffer::Tensor*>(
+        serialization_value->val());
+
+    // An external constant is tagged with EXTERNAL and has no
+    // allocation_info.
+    if (s_tensor->extra_tensor_info() != nullptr &&
+        s_tensor->extra_tensor_info()->location() ==
+            executorch_flatbuffer::TensorDataLocation::EXTERNAL &&
+        s_tensor->allocation_info() == nullptr) {
+      n_external_constants++;
+    }
+  }
+  return n_external_constants;
+}
+
+Error Method::parse_external_constants(const NamedDataMap* named_data_map) {
+  auto flatbuffer_values = serialization_plan_->values();
+  size_t n_value = flatbuffer_values->size();
+
+  // n_external_constants_ counts the number of successfully-initialized
+  // external constants for ~Method() to clean up, and is incremented at the
+  // bottom of the loop. This makes it safe for errors to return without
+  // updating any state.
+  n_external_constants_ = 0;
+  for (size_t i = 0; i < n_value; ++i) {
+    auto serialization_value = flatbuffer_values->Get(i);
+    // Ignore non-tensor types.
+    if (serialization_value->val_type() !=
+        executorch_flatbuffer::KernelTypes::Tensor) {
+      continue;
+    }
+    const auto s_tensor = static_cast<const executorch_flatbuffer::Tensor*>(
+        serialization_value->val());
+    // Constant tensors are resolved here; tensors with allocation_info are
+    // mutable and are resolved in parse_values.
+    if (s_tensor->extra_tensor_info() == nullptr ||
+        s_tensor->extra_tensor_info()->location() !=
+            executorch_flatbuffer::TensorDataLocation::EXTERNAL ||
+        s_tensor->allocation_info() != nullptr) {
+      continue;
+    }
+    ET_CHECK_OR_RETURN_ERROR(
+        s_tensor->extra_tensor_info()->fully_qualified_name() != nullptr,
+        InvalidExternalData,
+        "Fully qualified name of external tensor is null at index %zu",
+        i);
+
+    const char* key =
+        s_tensor->extra_tensor_info()->fully_qualified_name()->c_str();
+
+    // Check if this tensor has already been resolved.
+    if (get_data_by_key(
+            key, Span<NamedData>(external_constants_, n_external_constants_)) !=
+        nullptr) {
+      continue;
+    }
+    Result<const TensorLayout> tensor_layout =
+        named_data_map->get_metadata(key);
+    if (!tensor_layout.ok()) {
+      return tensor_layout.error();
+    }
+    // Check external tensor compatibility.
+    Error err =
+        deserialization::validateTensorLayout(s_tensor, tensor_layout.get());
+    if (err != Error::Ok) {
+      return err;
+    }
+    // Save the key.
+    external_constants_[n_external_constants_].key = key;
+
+    // Save the buffer.
+    Result<FreeableBuffer> buffer = named_data_map->get_data(key);
+    ET_CHECK_OR_RETURN_ERROR(
+        buffer.ok(),
+        InvalidExternalData,
+        "Buffer retrieved from get_data is not valid");
+    new (&external_constants_[n_external_constants_].buffer)
+        FreeableBuffer(std::move(buffer.get()));
+
+    n_external_constants_ += 1;
+  }
+  return Error::Ok;
+}
+
 Error Method::parse_values(const NamedDataMap* named_data_map) {
   auto flatbuffer_values = serialization_plan_->values();
   ET_CHECK_OR_RETURN_ERROR(
@@ -299,23 +407,37 @@ Error Method::parse_values(const NamedDataMap* named_data_map) {
     return Error::MemoryAllocationFailed;
   }

+  // Count the number of tensors marked as EXTERNAL for this method. The actual
+  // number of external constants may be smaller, eg. if multiple tensors point
+  // to the same underlying data buffer.
+  // This function also ensures that all flatbuffer_values entries
+  // are non-null, so `val_as_X()` calls below are guaranteed to return
+  // non-null pointers.
+  Result<size_t> max_external_constants = get_num_external_constants();
+  if (!max_external_constants.ok()) {
+    return max_external_constants.error();
+  }
+  if (max_external_constants.get() > 0) {
+    // Allocate space for external tensors.
+    external_constants_ =
+        memory_manager_->method_allocator()->allocateList<NamedData>(
+            max_external_constants.get());
+    if (external_constants_ == nullptr) {
+      return Error::MemoryAllocationFailed;
+    }
+    Error err = parse_external_constants(named_data_map);
+    if (err != Error::Ok) {
+      return err;
+    }
+  }
+
   // n_value_ counts the number of successfully-initialized values for ~Method()
   // to clean up, and is incremented at the bottom of the loop. This makes it
   // safe for errors to return without updating any state.
   n_value_ = 0;

   for (size_t i = 0; i < n_value; ++i) {
     auto serialization_value = flatbuffer_values->Get(i);
-    // Ensure that the `val_as_X()` calls will return non-null pointers.
-    ET_CHECK_OR_RETURN_ERROR(
-        serialization_value != nullptr &&
-            (serialization_value->val_type() ==
-                 executorch_flatbuffer::KernelTypes::Null ||
-             serialization_value->val() != nullptr),
-        InvalidProgram,
-        "Null value at index %" ET_PRIsize_t,
-        i);
-
     const auto val = serialization_value->val();

     switch (serialization_value->val_type()) {
@@ -416,7 +538,8 @@ Error Method::parse_values(const NamedDataMap* named_data_map) {
             program_,
             memory_manager_,
             static_cast<const executorch_flatbuffer::Tensor*>(val),
-            named_data_map);
+            named_data_map,
+            Span<NamedData>(external_constants_, n_external_constants_));
         if (!t.ok()) {
           ET_LOG(
               Error,
@@ -1496,6 +1619,10 @@ Method::~Method() {
       delegates_[i].~BackendDelegate();
     }
   }
+  // Free resources associated with external constants.
+  for (int i = 0; i < n_external_constants_; i++) {
+    external_constants_[i].buffer.~FreeableBuffer();
+  }
   // All other fields are trivially destructible.
 }
 } // namespace runtime
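The pattern used in `parse_external_constants` and `~Method()` (allocate raw storage for the maximum count, construct entries in place one at a time, and bump a counter only after each entry is fully built) can be sketched in isolation. Everything below is hypothetical: `Entry` stands in for `NamedData`, `Holder` for `Method`, and `std::malloc`/`std::free` replace the method allocator, which the real code does not free per allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical element type with a non-trivial destructor.
struct Entry {
  const char* key;
  std::string payload;
};

class Holder {
 public:
  Holder() = default;
  Holder(const Holder&) = delete;
  Holder& operator=(const Holder&) = delete;

  ~Holder() {
    // Only the first n_entries_ slots were constructed, so only those are
    // destroyed; the remaining raw storage is never touched.
    for (std::size_t i = 0; i < n_entries_; ++i) {
      entries_[i].~Entry();
    }
    std::free(entries_);
  }

  // Returns false on allocation failure or on the first null key. In both
  // cases n_entries_ already reflects exactly the live entries.
  bool init(const char* const* keys, std::size_t max_entries) {
    if (max_entries == 0) {
      return true;
    }
    entries_ = static_cast<Entry*>(std::malloc(max_entries * sizeof(Entry)));
    if (entries_ == nullptr) {
      return false;
    }
    n_entries_ = 0;
    for (std::size_t i = 0; i < max_entries; ++i) {
      if (keys[i] == nullptr) {
        return false; // Early return is safe: no state needs fixing up.
      }
      // Placement new constructs the entry inside the raw storage.
      new (&entries_[n_entries_]) Entry{keys[i], std::string("data:") + keys[i]};
      ++n_entries_; // Incremented only after the entry is fully constructed.
    }
    return true;
  }

  std::size_t size() const { return n_entries_; }
  const Entry& at(std::size_t i) const { return entries_[i]; }

 private:
  Entry* entries_ = nullptr;
  std::size_t n_entries_ = 0;
};
```

Because the counter trails construction, an error partway through the loop leaves the destructor with an accurate count, which is the property the comment above `n_external_constants_ = 0;` relies on. The sketch uses `std::size_t` for the loop index so the comparison with the counter stays unsigned.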

runtime/executor/method.h

Lines changed: 36 additions & 0 deletions
@@ -31,6 +31,12 @@ struct EValue;
 namespace executorch {
 namespace runtime {

+// Forward declare NamedData. This is a public header and must not include
+// internal data types.
+namespace deserialization {
+struct NamedData;
+} // namespace deserialization
+
 // Forward declare Program to avoid a circular reference.
 class Program;
@@ -42,6 +48,7 @@ using OpFunction = void (*)(KernelRuntimeContext&, EValue**);
 /// A list of pointers into the master values table that together compose the
 /// argument list for a single instruction
 using InstructionArgs = Span<EValue*>;
+using deserialization::NamedData;

 /**
  * An executable method of an executorch program. Maps to a python method like
@@ -66,13 +73,17 @@ class Method final {
         delegates_(rhs.delegates_),
         n_chains_(rhs.n_chains_),
         chains_(rhs.chains_),
+        external_constants_(rhs.external_constants_),
+        n_external_constants_(rhs.n_external_constants_),
         init_state_(rhs.init_state_) {
     // Required: clear out fields that the dtor looks at, so that we don't free
     // anything twice.
     rhs.n_value_ = 0;
     rhs.values_ = nullptr;
     rhs.n_delegate_ = 0;
     rhs.delegates_ = nullptr;
+    rhs.n_external_constants_ = 0;
+    rhs.external_constants_ = nullptr;

     // Helpful: Try to ensure that any other interactions with the old object
     // result in failures.
@@ -288,6 +299,8 @@ class Method final {
         delegates_(nullptr),
         n_chains_(0),
         chains_(nullptr),
+        external_constants_(nullptr),
+        n_external_constants_(0),
         init_state_(InitializationState::Uninitialized) {}

   /// Static factory used by Program.
@@ -336,8 +349,31 @@ class Method final {
   size_t n_chains_;
   Chain* chains_;

+  NamedData* external_constants_;
+  size_t n_external_constants_ = 0;
+
   InitializationState init_state_;

+  /**
+   * Counts the number of tensors marked as EXTERNAL in the flatbuffer
+   * for this method.
+   */
+  ET_NODISCARD Result<size_t> get_num_external_constants();
+
+  /**
+   * Parses the flatbuffer for constant tensors tagged as EXTERNAL.
+   * Retrieves the external constants using the named_data_map and places them
+   * into `external_constants_`. Updates `n_external_constants_` to count the
+   * number of successfully-initialized external constants.
+   * FreeableBuffers returned by the named_data_map are owned by the
+   * method and are freed on method destruction.
+   *
+   * @param[in] named_data_map, to retrieve external constants from.
+   * @returns Error::Ok on success, non-Ok on failure.
+   */
+  ET_NODISCARD Error
+  parse_external_constants(const NamedDataMap* named_data_map);
+
   /**
    * Parses the elements of the values_ array. On error, n_value_ will be set to
    * the number of successfully-initialized entries so that ~Method doesn't try
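The header changes combine two ideas: a forward declaration lets a public class hold a pointer to an internal type without including its definition, and the move constructor clears the moved-from fields so the destructor does not release anything twice. A single-file sketch of both, under stated assumptions: `detail::Record` and `Owner` are hypothetical names, and `new[]`/`delete[]` stand in for the method allocator and the per-element destructor calls in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// "Public header" part: only the name of the internal type is visible.
namespace detail {
struct Record; // Forward declaration; the type is incomplete here.
} // namespace detail

class Owner final {
 public:
  Owner() : records_(nullptr), n_records_(0) {}

  Owner(Owner&& rhs) noexcept
      : records_(rhs.records_), n_records_(rhs.n_records_) {
    // Clear the fields the destructor looks at so nothing is freed twice.
    rhs.records_ = nullptr;
    rhs.n_records_ = 0;
  }

  Owner(const Owner&) = delete;
  Owner& operator=(const Owner&) = delete;

  ~Owner(); // Defined below, where Record is a complete type.

  // Takes ownership of an array allocated with new[].
  void adopt(detail::Record* records, std::size_t n);
  std::size_t size() const { return n_records_; }
  int value_at(std::size_t i) const;

 private:
  detail::Record* records_; // A pointer to an incomplete type is allowed.
  std::size_t n_records_;
};

// "Implementation file" part: the full definition is available from here on.
namespace detail {
struct Record {
  const char* key;
  int value;
};
} // namespace detail

Owner::~Owner() {
  delete[] records_;
}

void Owner::adopt(detail::Record* records, std::size_t n) {
  delete[] records_;
  records_ = records;
  n_records_ = n;
}

int Owner::value_at(std::size_t i) const {
  return records_[i].value;
}
```

Any member function that needs the size or layout of the internal type (the destructor, `adopt`, `value_at`) is defined after the full definition, which mirrors keeping such code in the `.cpp` file.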

runtime/executor/tensor_parser.h

Lines changed: 24 additions & 3 deletions
@@ -21,18 +21,34 @@ namespace executorch {
 namespace runtime {
 namespace deserialization {

+/// Data structure to hold key and data buffer for external data used
+/// in a method.
+struct NamedData {
+  const char* key;
+  FreeableBuffer buffer;
+};
+
+NamedData* get_data_by_key(const char* key, Span<NamedData> entries);
+
 ET_NODISCARD Result<executorch::aten::Tensor> parseTensor(
     const Program* program,
     MemoryManager* memory_manager,
     const executorch_flatbuffer::Tensor* s_tensor,
-    const NamedDataMap* named_data_map = nullptr);
+    const NamedDataMap* named_data_map = nullptr,
+    Span<NamedData> external_constants = {});

 ET_NODISCARD Result<BoxedEvalueList<executorch::aten::Tensor>> parseTensorList(
     const flatbuffers::Vector<int32_t>* tensor_indices,
     EValue* values,
     size_t values_len,
     MemoryManager* memory_manager);

+// Checks that the sizes, dim_order and scalar_type match between tensors
+// stored in the PTE and externally.
+ET_NODISCARD Error validateTensorLayout(
+    const executorch_flatbuffer::Tensor* s_tensor,
+    const TensorLayout& expected_layout);
+
 // Deserializes a List of optional type. The code here is the same between all
 // list of optionals: list of optional Tensor, list of optional float etc, so we
 // just use a template to avoid boilerplate.
@@ -105,7 +121,11 @@ parseListOptionalType(
  * @param[in] nbytes The amount of memory to get from the allocator.
  * @param[in] allocator The source of memory for non-constant tensors.
  * @param[in] named_data_map An optional map of {name, blob} used to resolve
- *     data that is external to the PTE, if any.
+ *     data that is mutable and external to the PTE, if any.
+ * @param[in] external_constants An optional span containing tensor fqn to
+ *     corresponding tensor data. Used to resolve data that is constant and
+ *     external to the PTE, if any. Referencing data from external_constants is
+ *     safe, as it has the same lifetime as the method.
  *
  * @returns On success, the data pointer to use for the tensor. On failure, a
  *     non-Ok Error.
@@ -115,7 +135,8 @@ ET_NODISCARD Result<void*> getTensorDataPtr(
     const Program* program,
     size_t nbytes,
     HierarchicalAllocator* allocator,
-    const NamedDataMap* named_data_map = nullptr);
+    const NamedDataMap* named_data_map = nullptr,
+    Span<NamedData> external_constants = {});

 } // namespace deserialization
 } // namespace runtime
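The header declares `get_data_by_key` but its definition is not part of the hunks shown here. One plausible shape for such a lookup is a linear scan that compares keys with `strcmp`; the sketch below is an assumption, not the committed implementation. `NamedBlob` and `find_by_key` are hypothetical names, and a pointer plus length replaces `Span<NamedData>` to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical simplified entry: a key and a non-owning data pointer.
struct NamedBlob {
  const char* key;
  const void* data;
};

// Linear search over the resolved entries. Returns nullptr when the key is
// absent, which callers can use both for lookup and for duplicate detection.
const NamedBlob* find_by_key(
    const char* key,
    const NamedBlob* entries,
    std::size_t n_entries) {
  for (std::size_t i = 0; i < n_entries; ++i) {
    if (std::strcmp(key, entries[i].key) == 0) {
      return &entries[i];
    }
  }
  return nullptr;
}
```

A linear scan costs O(n) per lookup, which is usually acceptable when the number of external constants per method is small; a sorted array or hash map would be an alternative if that assumption does not hold.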

runtime/executor/tensor_parser_aten.cpp

Lines changed: 4 additions & 2 deletions
@@ -33,7 +33,8 @@ Result<at::Tensor> parseTensor(
     const Program* program,
     MemoryManager* memory_manager,
     const executorch_flatbuffer::Tensor* s_tensor,
-    const NamedDataMap* named_data_map) {
+    const NamedDataMap* named_data_map,
+    Span<NamedData> external_constants) {
   EXECUTORCH_SCOPE_PROF("TensorParser::parseTensor");

   ET_CHECK_OR_RETURN_ERROR(
@@ -108,7 +109,8 @@ Result<at::Tensor> parseTensor(
       program,
       tensor.nbytes(),
       memory_manager->planned_memory(),
-      named_data_map);
+      named_data_map,
+      external_constants);
   if (!data_ptr.ok()) {
     ET_LOG(
         Error,
