
Commit e6d50d2

add introduction and glossary to html docs
1 parent 6c485a2 commit e6d50d2

5 files changed: +202 -1 lines changed

intro_architecture.png (binary image, 22.4 KB)

scripts/docs_config/api.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ==========================================
-Unified Memory Framework API Documentation
+API Documentation
 ==========================================

 Globals

scripts/docs_config/glossary.rst

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
Glossary
==========================================================

Homogeneous Memory
   A set of memory used in a system that is composed of a single type of
   memory technology, managed by a single driver using a uniform approach.

Heterogeneous Memory
   A set of memory used in a system that is composed of multiple types of
   memory technologies, each requiring distinct handling approaches, often
   managed by separate drivers.

Memory Tiering
   An organization of different types of memory storage within a system, each
   having distinct characteristics, performance, and cost attributes. These
   memory tiers are typically organized in a hierarchy, with faster, more
   expensive memory located closer to the processor and slower, less expensive
   memory located further away.

Memory Access Initiator
   A component in a computer system that initiates or requests access to the
   computer's memory subsystem. This could be a CPU, GPU, or other I/O and
   cache devices.

Memory Target
   Any part of the memory subsystem that can handle memory access requests.
   This could be the OS-accessible main memory (RAM), video memory that
   resides on graphics cards, memory caches, storage, external memory devices
   connected using the CXL.mem protocol, etc.

Memory Page
   A fixed-length contiguous block of virtual memory, described by a single
   entry in the page table. It is the smallest unit of data for memory
   management in a virtual memory operating system.

Enlightened Application
   An application that explicitly manages data allocation distribution among
   different types of memory and handles data migration between them.

Unenlightened Application
   An application that relies on the underlying infrastructure (OS,
   frameworks, libraries), which offers various memory tiering and migration
   solutions, without requiring any code modifications.

Memory Pool
   A memory management technique used in computer programming and software
   development, where relatively large, fixed-size blocks of memory are
   preallocated using one or more memory providers and then passed to a pool
   allocator for fine-grained management. The pool allocator can divide these
   blocks into smaller chunks and use them for application allocations,
   depending on its needs. Pool allocators typically focus on low
   fragmentation and constant allocation time, so they are used to optimize
   memory allocation and deallocation in scenarios where efficiency and
   performance are critical.

Pool Allocator
   A type of memory allocator used to efficiently manage memory pools.
   Existing examples include jemalloc and oneTBB's Scalable Memory Allocator.

Memory Provider
   A software component responsible for supplying memory or managing memory
   targets. A single memory provider kind can efficiently manage the memory
   operations for one or multiple devices within the system, or for other
   memory sources like file-backed or user-provided memory. Memory providers
   are responsible for coarse-grained allocations and the management of
   memory pages.

High Bandwidth Memory (HBM)
   A high-speed computer memory. It is used in conjunction with
   high-performance graphics accelerators, network devices, and
   high-performance data centers, and as on-package cache or on-package RAM
   in CPUs, FPGAs, supercomputers, etc.

Compute Express Link (`CXL`_)
   An open standard for high-speed, high-capacity central processing unit
   (CPU)-to-device and CPU-to-memory connections, designed for
   high-performance data center computers. CXL is built on the serial PCI
   Express (PCIe) physical and electrical interface and includes a PCIe-based
   block input/output protocol (CXL.io) and cache-coherent protocols for
   accessing system memory (CXL.cache) and device memory (CXL.mem).

oneAPI Threading Building Blocks (`oneTBB`_)
   A C++ template library developed by Intel for parallel programming on
   multi-core processors. oneTBB breaks a computation down into tasks that
   can run in parallel, and the library manages and schedules threads to
   execute these tasks.

jemalloc
   A general-purpose malloc implementation that emphasizes fragmentation
   avoidance and scalable concurrency support. It provides introspection,
   memory management, and tuning functionalities. `Jemalloc`_ uses separate
   pools (“arenas”) for each CPU, which avoids lock contention problems in
   multithreaded applications and makes them scale linearly with the number
   of threads.

Unified Shared Memory (USM)
   A programming model that provides a single memory address space shared
   between CPUs, GPUs, and possibly other accelerators. It simplifies memory
   management by transparently handling data migration between the CPU and
   the accelerator device as needed.

.. _CXL: https://www.computeexpresslink.org/
.. _oneTBB: https://oneapi-src.github.io/oneTBB/
.. _Jemalloc: https://jemalloc.net/

scripts/docs_config/index.rst

Lines changed: 2 additions & 0 deletions
@@ -7,4 +7,6 @@ Intel Unified Memory Framework documentation
 .. toctree::
    :maxdepth: 3

+   introduction.rst
    api.rst
+   glossary.rst

scripts/docs_config/introduction.rst

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
==============
Introduction
==============

The amount of data that needs to be processed by modern workloads is
continuously growing. To address the increasing demand, the memory subsystem
of modern server platforms is becoming heterogeneous. For example,
High-Bandwidth Memory (HBM) addresses throughput needs, while the CXL protocol
closes the capacity gap and tends to improve memory utilization through memory
pooling capabilities. Beyond CPU use cases, there are also GPU accelerators
with their own on-board memory.

The opportunities provided by modern heterogeneous memory platforms come
together with additional challenges: additional software changes might be
required to fully leverage the new hardware capabilities. There are two main
problems that modern applications need to deal with. The first is appropriate
data placement and data migration between different types of memory. The
second is how software should leverage the different memory topologies.

All applications can be divided into two big groups: enlightened and
unenlightened. Enlightened applications explicitly manage data allocation
distribution among memory tiers and the subsequent data migration.
Unenlightened applications do not require any code modifications and instead
rely on the underlying infrastructure, which is in turn enlightened. The
underlying infrastructure refers not only to the OS with its various memory
tiering solutions for migrating memory pages between tiers, but also to
middleware: frameworks and libraries.

==============
Architecture
==============

The Unified Memory Framework (UMF) is a library for constructing allocators
and memory pools. It also contains broadly useful abstractions and utilities
for memory management. UMF allows users to manage multiple memory pools
characterized by different attributes, allowing certain allocation types to be
isolated from others and allocated using different hardware resources as
required.

A memory pool is a combination of a pool allocator and one or more memory
targets accessed by memory providers, along with their properties and
allocation policies. Specifically, a memory provider is responsible for
coarse-grained memory allocations, while the pool allocator controls the pool
and handles fine-grained memory allocations. UMF provides distinct interfaces
for both pool allocators and memory providers, allowing integration into
various applications.

.. figure:: ../assets/images/intro_architecture.png

The UMF library contains various pool allocators and memory providers, but it
also allows for the integration of external ones, giving users the flexibility
to either use existing solutions or provide their own implementations.
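
To give a rough sense of how these two interfaces are kept distinct, here is a
minimal, hypothetical sketch of the two operation tables. The struct and
member names are illustrative assumptions only and do not reproduce the actual
UMF headers:

.. code-block:: c

   #include <stddef.h>

   // Hypothetical, simplified shapes of the two UMF interfaces; the real
   // headers define richer operation tables (initialization, finalization,
   // queries, etc.).
   typedef int result_t; // stand-in for the library's result/error type

   typedef struct memory_provider_ops_t {
       // Coarse-grained, page-level operations on a memory target.
       result_t (*alloc)(void *provider, size_t size, size_t alignment,
                         void **out_ptr);
       result_t (*free)(void *provider, void *ptr, size_t size);
   } memory_provider_ops_t;

   typedef struct pool_allocator_ops_t {
       // Fine-grained, object-level operations served from coarse chunks
       // that the pool obtains through its memory provider(s).
       void *(*malloc)(void *pool, size_t size);
       void (*free)(void *pool, void *ptr);
   } pool_allocator_ops_t;

An external pool allocator or memory provider plugs into UMF by supplying such
an operation table, which is what lets existing solutions and custom
implementations be used interchangeably.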

Memory Providers
================

A memory provider is an abstraction for coarse-grained (memory page)
allocations and deallocations of target memory types, such as host CPU, GPU,
or CXL memory. A single memory provider kind can efficiently manage the memory
operations for one or multiple devices within the system, or for other memory
sources like file-backed or user-provided memory.

UMF comes with several bundled memory providers. Please refer to the README.md
for a full list of them. It is also possible to use externally defined memory
providers if they implement the UMF interface.

To instantiate a memory provider, the user must pass an additional context
that contains the details about the specific memory target that should be
used. This would be a NUMA node mask for the OS memory provider, a file path
for the file-backed memory provider, etc. After creation, the memory provider
context can't be changed.
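
As a minimal sketch, instantiating the OS memory provider could look like the
following. The ``umfOsMemoryProviderOps()``, ``umfOsMemoryProviderParamsDefault()``,
and ``umfMemoryProviderCreate()`` names follow published UMF releases and are
assumptions with respect to this revision of the library:

.. code-block:: c

   #include <umf/memory_provider.h>
   #include <umf/providers/provider_os_memory.h>

   int create_os_provider(umf_memory_provider_handle_t *out_provider) {
       // Context describing the memory target; it cannot be changed after
       // the provider has been created.
       umf_os_memory_provider_params_t params =
           umfOsMemoryProviderParamsDefault();
       params.numa_mode = UMF_NUMA_MODE_INTERLEAVE; // spread pages across nodes

       // Bind the OS memory provider implementation to this context.
       umf_result_t ret = umfMemoryProviderCreate(umfOsMemoryProviderOps(),
                                                  &params, out_provider);
       return ret == UMF_RESULT_SUCCESS ? 0 : -1;
   }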

Pool Allocators
===============

A pool allocator is an abstraction over object-level memory management based
on coarse chunks acquired from the memory provider. It manages the memory pool
and services fine-grained malloc/free requests.

Pool allocators can be implemented to be general purpose or to fulfill
specific use cases. Implementations of the pool allocator interface can
leverage existing allocators (e.g., jemalloc or oneTBB) or be fully
customizable. The pool allocator abstraction can contain basic memory
management interfaces, as well as more complex ones that can be used, for
example, by the implementation for page monitoring or control (e.g.,
`madvise`).

UMF comes with several bundled pool allocators. Please refer to the README.md
for a full list of them. It is also possible to use externally defined pool
allocators if they implement the UMF interface.
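
As a sketch of selecting a bundled pool allocator, the following pairs the
jemalloc-based pool allocator with a previously created memory provider. The
``umfJemallocPoolOps()`` entry point and the ``umfPoolCreate()`` signature
follow published UMF releases and are assumptions with respect to this
revision:

.. code-block:: c

   #include <umf/memory_pool.h>
   #include <umf/memory_provider.h>
   #include <umf/pools/pool_jemalloc.h>

   // 'provider' would come from umfMemoryProviderCreate(), as sketched in
   // the Memory Providers section above.
   int create_jemalloc_pool(umf_memory_provider_handle_t provider,
                            umf_memory_pool_handle_t *out_pool) {
       // Pair the bundled jemalloc pool allocator with the given provider;
       // NULL params and zero flags request the default pool behavior.
       umf_result_t ret =
           umfPoolCreate(umfJemallocPoolOps(), provider, NULL, 0, out_pool);
       return ret == UMF_RESULT_SUCCESS ? 0 : -1;
   }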

Memory Pools
============

A memory pool is a combination of a pool allocator and one or more memory
targets accessed by memory providers. In UMF, the user can either use one of
the predefined memory pools or construct a user-defined one using the Pool
Creation API.

After construction, memory pools are used by the Allocation API as the first
argument. It is also possible to retrieve a memory pool from an existing
memory pointer that points to memory previously allocated by UMF.
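
A minimal usage sketch, assuming a pool created as in the previous section and
the allocation entry points of published UMF releases (``umfPoolMalloc()``,
``umfPoolFree()``, ``umfPoolByPtr()``), which are assumptions with respect to
this revision:

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <umf/memory_pool.h>

   void use_pool(umf_memory_pool_handle_t pool) {
       // Every allocation names the memory pool it is served from.
       void *ptr = umfPoolMalloc(pool, 4096);
       if (ptr == NULL) {
           return; // allocation failed
       }

       // The owning pool can be recovered from a UMF-allocated pointer.
       umf_memory_pool_handle_t owner = umfPoolByPtr(ptr);
       assert(owner == pool);

       umfPoolFree(pool, ptr);
   }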
