Commit a848175

Merge pull request #78 from bratpiorka/rrudnick_doc_ext

add introduction to docs

2 parents c1c55b6 + 69eccd1

File tree

6 files changed: +205 −2 lines changed

README.md

Lines changed: 3 additions & 1 deletion

@@ -15,7 +15,9 @@ A UMF memory pool is a combination of a pool allocator and a memory provider. A
 
 Pool allocator can leverage existing allocators (e.g. jemalloc or tbbmalloc) or be written from scratch.
 
-UMF comes with predefined pool allocators (see include/pool) and providers (see include/provider). UMF can also work with user-defined pools and providers that implement a specific interface (see include/umf/memory_pool_ops.h and include/umf/memory_provider_ops.h)
+UMF comes with predefined pool allocators (see include/pool) and providers (see include/provider). UMF can also work with user-defined pools and providers that implement a specific interface (see include/umf/memory_pool_ops.h and include/umf/memory_provider_ops.h).
+
+More detailed documentation is available here: https://oneapi-src.github.io/unified-memory-framework/
 
 ## Memory providers

intro_architecture.png: 39.5 KB (binary image, not shown)

scripts/docs_config/api.rst

Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 ==========================================
-Unified Memory Framework API Documentation
+API Documentation
 ==========================================
 
 Globals

scripts/docs_config/glossary.rst

Lines changed: 99 additions & 0 deletions

Glossary
==========================================================

Homogeneous Memory System
    A system that operates on a single type of memory implemented using a
    single technology.

Heterogeneous Memory System
    A system that operates on multiple types of memory, possibly implemented
    using different technologies, often managed by different entities.

Memory Tiering
    An organization of different types of memory storage within a system, each
    having distinct characteristics, performance, and cost attributes. These
    memory tiers are typically organized in a hierarchy, with faster, more
    expensive memory located closer to the processor and slower, less expensive
    memory located further away.

Memory Access Initiator
    A component in a computer system that initiates or requests access to the
    computer's memory subsystem. This could be a CPU, GPU, or other I/O and
    cache devices.

Memory Target
    Any part of the memory subsystem that can handle memory access requests.
    This could be the OS-accessible main memory (RAM), video memory that
    resides on graphics cards, memory caches, storage, external memory devices
    connected using the CXL.mem protocol, etc.

Memory Page
    A fixed-length contiguous block of virtual memory, described by a single
    entry in the page table. It is the smallest unit of data for memory
    management in a virtual memory operating system.

Enlightened Application
    An application that explicitly manages data allocation distribution among
    different types of memory and handles data migration between them.

Unenlightened Application
    An application that relies on the underlying infrastructure (OS,
    frameworks, libraries), which offers various memory tiering and migration
    solutions without any code modifications.

Memory Pool
    A memory management technique used in computer programming and software
    development, where relatively large blocks of memory are preallocated
    using a memory provider and then passed to a pool allocator for
    fine-grained management. The pool allocator may divide these blocks into
    smaller chunks and use them for application allocations depending on its
    needs. Typically, pool allocators focus on low fragmentation and constant
    allocation time, so they are used to optimize memory allocation and
    deallocation in scenarios where efficiency and performance are critical.

Pool Allocator
    A type of memory allocator used to efficiently manage memory pools.
    Existing examples include jemalloc and oneTBB's Scalable Memory Allocator.

Memory Provider
    A software component responsible for supplying memory or managing memory
    targets. A single memory provider can efficiently manage the memory
    operations for one or multiple devices within the system, or for other
    memory sources such as file-backed or user-provided memory. Memory
    providers are responsible for coarse-grained allocations and management of
    memory pages.

High Bandwidth Memory (HBM)
    A high-speed computer memory. It is used in conjunction with
    high-performance graphics accelerators, network devices, and
    high-performance data centers, as on-package cache in CPUs, and in FPGAs,
    supercomputers, etc.

Compute Express Link (`CXL`_)
    An open standard for high-speed, high-capacity central processing unit
    (CPU)-to-device and CPU-to-memory connections, designed for
    high-performance data center computers. CXL is built on the serial PCI
    Express (PCIe) physical and electrical interface and includes a PCIe-based
    block input/output protocol (CXL.io) and cache-coherent protocols for
    accessing system memory (CXL.cache) and device memory (CXL.mem).

oneAPI Threading Building Blocks (`oneTBB`_)
    A C++ template library developed by Intel for parallel programming on
    multi-core processors. oneTBB breaks computation down into tasks that can
    run in parallel. The library manages and schedules threads to execute
    these tasks.

jemalloc
    A general-purpose malloc implementation that emphasizes fragmentation
    avoidance and scalable concurrency support. It provides introspection,
    memory management, and tuning functionalities. `Jemalloc`_ uses separate
    pools ("arenas") for each CPU, which avoids lock contention problems in
    multithreaded applications and makes them scale linearly with the number
    of threads.

Unified Shared Memory (USM)
    A programming model that provides a single memory address space shared
    between CPUs, GPUs, and possibly other accelerators. It simplifies memory
    management by transparently handling data migration between the CPU and
    the accelerator device as needed.

.. _CXL: https://www.computeexpresslink.org/
.. _oneTBB: https://oneapi-src.github.io/oneTBB/
.. _Jemalloc: https://jemalloc.net/

scripts/docs_config/index.rst

Lines changed: 2 additions & 0 deletions

@@ -7,4 +7,6 @@ Intel Unified Memory Framework documentation
 .. toctree::
    :maxdepth: 3
 
+   introduction.rst
    api.rst
+   glossary.rst

scripts/docs_config/introduction.rst

Lines changed: 100 additions & 0 deletions

==============
Introduction
==============

The amount of data that needs to be processed by modern workloads is
continuously growing. To address the increasing demand, the memory subsystem
of modern server platforms is becoming heterogeneous. For example,
High-Bandwidth Memory (HBM) addresses throughput needs, while the CXL protocol
closes the capacity gap and helps improve memory utilization through memory
pooling capabilities. Beyond CPU use cases, there are GPU accelerators with
their own on-board memory.

Modern heterogeneous memory platforms present a range of opportunities. At the
same time, they introduce new challenges that may require software updates to
fully utilize the hardware features. There are two main problems that modern
applications need to deal with. The first is appropriate data placement and
data migration between different types of memory. The second is how software
should leverage different memory topologies.

All applications can be divided into two big groups: enlightened and
unenlightened. Enlightened applications explicitly manage data allocation
distribution among memory tiers and any subsequent data migration.
Unenlightened applications do not require any code modifications and rely on
the underlying infrastructure. The underlying infrastructure refers not only
to the OS, with its various memory tiering solutions for migrating memory
pages between tiers, but also to middleware: frameworks and libraries.
==============
Architecture
==============

The Unified Memory Framework (`UMF`_) is a library for constructing allocators
and memory pools. It also contains broadly useful abstractions and utilities
for memory management. UMF allows users to create and manage multiple memory
pools characterized by different attributes, allowing certain allocation types
to be isolated from others and allocated using different hardware resources as
required.

A memory pool is a combination of a pool allocator instance and a memory
provider instance along with their properties and allocation policies.
Specifically, a memory provider is responsible for coarse-grained memory
allocations, while the pool allocator controls the pool and handles
fine-grained memory allocations. UMF defines distinct interfaces for both pool
allocators and memory providers. Users can use the pool allocators and memory
providers shipped with UMF or create their own.

.. figure:: ../assets/images/intro_architecture.png

The UMF library contains various pool allocators and memory providers but also
allows for the integration of external ones, giving users the flexibility to
either use existing solutions or provide their own implementations.
Memory Providers
================

A memory provider is an abstraction for coarse-grained (memory page)
allocations and deallocations of target memory types, such as host CPU, GPU,
or CXL memory. A single distinct memory provider can efficiently operate the
memory of devices on the platform or other memory sources such as file-backed
or user-provided memory.

UMF comes with several bundled memory providers. Please refer to `README.md`_
for a full list of them. It is also possible to use externally defined memory
providers if they implement the UMF interface.

To instantiate a memory provider, the user must pass an additional context
that contains the details about the specific memory target that should be
used. This would be a NUMA node mask for the OS memory provider, a file path
for the file-backed memory provider, etc. After creation, the memory provider
context can't be changed.
Pool Allocators
===============

A pool allocator is an abstraction over object-level memory management based
on coarse chunks acquired from the memory provider. It manages the memory pool
and services fine-grained malloc/free requests.

Pool allocators can be implemented to be general purpose or to fulfill
specific use cases. Implementations of the pool allocator interface can
leverage existing allocators (e.g., jemalloc or oneTBB) or be fully
customizable. The pool allocator abstraction may contain basic memory
management interfaces, as well as more complex ones that can be used, for
example, by the implementation for page monitoring or control (e.g.,
`madvise`).

UMF comes with several bundled pool allocators. Please refer to `README.md`_
for a full list of them. It is also possible to use externally defined pool
allocators if they implement the UMF interface.
Memory Pools
============

A memory pool consists of a pool allocator instance and a memory provider
instance along with their properties and allocation policies. Memory pools are
passed to the `allocation API`_ as the first argument. It is also possible to
retrieve a memory pool from an existing memory pointer that points to memory
previously allocated by UMF.
.. _UMF: https://github.com/oneapi-src/unified-memory-framework
.. _README.md: https://github.com/oneapi-src/unified-memory-framework/blob/main/README.md
.. _allocation API: https://oneapi-src.github.io/unified-memory-framework/api.html#memory-pool
