
Commit 103f3a8

add introduction and glossary to html docs
1 parent 89e431b commit 103f3a8

File tree

5 files changed: +151 −1 lines changed

scripts/docs_config/api.rst
scripts/docs_config/glossary.rst
scripts/docs_config/index.rst
scripts/docs_config/introduction.rst
plus one binary image file (22.4 KB)

scripts/docs_config/api.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ==========================================
-Unified Memory Framework API Documentation
+API Documentation
 ==========================================
 
 Globals

scripts/docs_config/glossary.rst

Lines changed: 97 additions & 0 deletions
Glossary
==========================================================

Homogeneous Memory
    A collection of memory composed of a single memory type, managed by a
    single driver using a uniform approach.

Heterogeneous Memory
    A set of memory composed of multiple types of memory technologies, each
    requiring distinct handling approaches, often managed by separate
    drivers.

Memory Tiering
    An organization and hierarchy of different types of memory storage within
    a system, with each type of memory having distinct characteristics,
    performance, and cost attributes. These memory tiers are typically
    organized in a hierarchy, with faster, more expensive memory located
    closer to the processor and slower, less expensive memory located
    further away.

Memory Access Initiator
    A component in a computer system that initiates or requests access to the
    computer's memory subsystem. This could be a CPU, GPU, or other I/O and
    cache devices.

Memory Target
    Any part of the memory subsystem that can handle memory access requests.
    This could be the OS memory (RAM), video memory that resides on graphics
    cards, memory caches, storage, external memory devices connected using
    the CXL.mem protocol, etc.

Memory Page
    A fixed-length contiguous block of virtual memory, described by a single
    entry in the page table. It is the smallest unit of data for memory
    management in a virtual memory operating system.

Enlightened Application
    An application that explicitly manages data allocation distribution among
    memory tiers and further data migration.

Unenlightened Application
    An application that requires no code modifications and instead relies on
    the underlying infrastructure (OS, frameworks, libraries), which offers
    various memory tiering and migration solutions.

Memory Pool
    A memory management technique used in computer programming and software
    development, where fixed-size blocks of memory are preallocated using one
    or more memory providers and then divided into smaller, fixed-size blocks
    or chunks. These smaller blocks are then allocated and deallocated by a
    pool allocator depending on the needs of the program or application.
    Thanks to low fragmentation and constant allocation time, memory pools
    are used to optimize memory allocation and deallocation in scenarios
    where efficiency and performance are critical.

Pool Allocator
    A memory allocator type used to efficiently manage memory pools.
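
To make the Memory Pool and Pool Allocator entries above concrete, here is a
minimal, self-contained free-list sketch in C. It illustrates the general
technique only; it is not UMF code.

.. code-block:: c

    /* Minimal fixed-size memory pool: one preallocated chunk is carved into
     * equal blocks, handed out from an intrusive free list in O(1) time. */
    #include <stdlib.h>

    typedef struct block { struct block *next; } block_t;

    typedef struct {
        void    *arena;     /* the preallocated chunk            */
        block_t *free_list; /* singly linked list of free blocks */
    } pool_t;

    int pool_init(pool_t *p, size_t block_size, size_t block_count)
    {
        if (block_size < sizeof(block_t))
            block_size = sizeof(block_t);
        if (!(p->arena = malloc(block_size * block_count)))
            return -1;
        p->free_list = NULL;
        for (size_t i = 0; i < block_count; i++) { /* thread every block */
            block_t *b = (block_t *)((char *)p->arena + i * block_size);
            b->next = p->free_list;
            p->free_list = b;
        }
        return 0;
    }

    void *pool_alloc(pool_t *p)          /* O(1): pop a free block */
    {
        block_t *b = p->free_list;
        if (b)
            p->free_list = b->next;
        return b;
    }

    void pool_free(pool_t *p, void *ptr) /* O(1): push it back */
    {
        block_t *b = ptr;
        b->next = p->free_list;
        p->free_list = b;
    }

Because every block has the same size, allocation never searches and the pool
never fragments internally, which is where the constant allocation time and
low fragmentation mentioned above come from.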

Memory Provider
    A software component responsible for supplying memory or managing memory
    targets. A single memory provider kind can efficiently manage the memory
    operations for one or multiple devices within the system, or for other
    memory sources such as file-backed or user-provided memory.

High Bandwidth Memory (HBM)
    A high-speed computer memory. It is used in conjunction with
    high-performance graphics accelerators, network devices, and
    high-performance data centers, and as on-package cache or on-package RAM
    in CPUs, FPGAs, supercomputers, etc.

Compute Express Link (CXL_)
    An open standard for high-speed, high-capacity central processing unit
    (CPU)-to-device and CPU-to-memory connections, designed for
    high-performance data center computers. CXL is built on the serial PCI
    Express (PCIe) physical and electrical interface and includes a
    PCIe-based block input/output protocol (CXL.io) and cache-coherent
    protocols for accessing system memory (CXL.cache) and device memory
    (CXL.mem).

oneAPI Threading Building Blocks (oneTBB_)
    A C++ template library developed by Intel for parallel programming on
    multi-core processors. oneTBB breaks computations down into tasks that
    can run in parallel. The library manages and schedules threads to execute
    these tasks.

jemalloc
    A general-purpose malloc implementation that emphasizes fragmentation
    avoidance and scalable concurrency support. It provides introspection,
    memory management, and tuning features. Jemalloc_ uses separate pools
    ("arenas") for each CPU, which avoids lock contention problems in
    multithreaded applications and makes them scale linearly with the number
    of threads.

Unified Shared Memory (USM)
    A programming model which provides a single memory address space shared
    between CPUs, GPUs, and possibly other accelerators. It simplifies memory
    management by transparently handling data migration between the CPU and
    the accelerator device as needed.

.. _CXL: https://www.computeexpresslink.org/
.. _oneTBB: https://oneapi-src.github.io/oneTBB/
.. _Jemalloc: https://jemalloc.net/

scripts/docs_config/index.rst

Lines changed: 2 additions & 0 deletions
@@ -7,4 +7,6 @@ Intel Unified Memory Framework documentation
 .. toctree::
    :maxdepth: 3
 
+   introduction.rst
    api.rst
+   glossary.rst

scripts/docs_config/introduction.rst

Lines changed: 51 additions & 0 deletions
==============
Introduction
==============

Motivation
==========

The amount of data that modern workloads need to process is continuously
growing. To address this increasing demand, the memory subsystem of modern
server platforms is becoming heterogeneous. For example, the High-Bandwidth
Memory (HBM) introduced in Sapphire Rapids addresses throughput needs, while
the emerging CXL protocol closes the capacity gap and improves memory
utilization through memory pooling capabilities. Beyond CPU use cases, there
are also GPU accelerators with their own on-board memory.

The opportunities provided by modern heterogeneous memory platforms come
together with additional challenges, meaning that additional software changes
might be required to fully leverage the new hardware capabilities. There are
two main problems that modern applications need to deal with. The first is
appropriate data placement and data migration between different types of
memory. The second is how software should deal with different memory
topologies.

All applications can be divided into two big groups: enlightened and
unenlightened. Enlightened applications explicitly manage data allocation
distribution among memory tiers and further data migration. Unenlightened
applications do not require any code modifications and rely on the underlying
infrastructure, which is in turn enlightened. This underlying infrastructure
is not only the OS, with its various memory tiering solutions for migrating
memory pages between tiers, but also middleware: frameworks and libraries.

Architecture
============

The Unified Memory Framework (UMF) is a library for constructing allocators
and memory pools. It also contains broadly useful abstractions and utilities
for memory management. UMF allows users to manage multiple memory pools
characterized by different attributes, allowing certain allocation types to
be isolated from others and allocated using different hardware resources as
required.

A memory pool is a combination of a pool allocator and one or more memory
targets accessed by memory providers, along with their properties and
allocation policies. Specifically, a memory provider is responsible for
coarse-grained memory allocations, while the pool allocator controls the pool
and handles fine-grained memory allocations. UMF provides distinct interfaces
for both pool allocators and memory providers, allowing integration into
various applications.
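
As a rough sketch of how these two interfaces compose, the example below
creates a memory provider, builds a pool on top of it, and allocates from
that pool. The names used here (``umfMemoryProviderCreate``,
``umfPoolCreate``, ``umfOsMemoryProviderOps``, ``umfScalablePoolOps``, the
handle types, and the header paths) are illustrative assumptions rather than
authoritative signatures; see the API documentation for the exact interface.

.. code-block:: c

    /* Hedged sketch: names, signatures, and headers are assumptions. */
    #include <umf/memory_pool.h>
    #include <umf/memory_provider.h>
    #include <umf/pools/pool_scalable.h>
    #include <umf/providers/provider_os_memory.h>

    int main(void)
    {
        umf_memory_provider_handle_t provider;
        umf_memory_pool_handle_t pool;

        /* The provider serves coarse-grained allocations from one
         * memory target -- here, plain OS memory. */
        umfMemoryProviderCreate(umfOsMemoryProviderOps(), NULL, &provider);

        /* The pool allocator carves fine-grained allocations out of
         * the coarse chunks obtained from the provider. */
        umfPoolCreate(umfScalablePoolOps(), provider, NULL, 0, &pool);

        void *ptr = umfPoolMalloc(pool, 4096); /* fine-grained alloc */
        umfPoolFree(pool, ptr);

        umfPoolDestroy(pool);
        umfMemoryProviderDestroy(provider);
        return 0;
    }

The same application could create several such pools, each backed by a
different provider (e.g., HBM, CXL memory, GPU memory), and route each
allocation type to the appropriate pool.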

.. figure:: ../assets/images/intro_architecture.png

The UMF library contains various pool allocators and memory providers, but it
also allows for the integration of external ones, giving users the
flexibility to either use existing solutions or provide their own
implementations.
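
As for integrating external components, a provider plugs in as (roughly) a
table of callbacks. The ops structure below is a self-contained mock for
illustration only; the real ``umf_memory_provider_ops_t`` in the UMF headers
is the authoritative shape.

.. code-block:: c

    /* Illustration only: an assumed ops-table shape for an external
     * provider; the real UMF structure differs in detail. */
    #include <stdlib.h>

    typedef struct {
        int (*alloc)(size_t size, size_t alignment, void **out_ptr);
        int (*free)(void *ptr, size_t size);
    } provider_ops_t; /* assumed shape, not the UMF struct */

    /* Coarse-grained callbacks backed here by plain malloc/free. */
    static int my_alloc(size_t size, size_t alignment, void **out_ptr)
    {
        (void)alignment; /* this sketch ignores alignment */
        *out_ptr = malloc(size);
        return *out_ptr ? 0 : -1;
    }

    static int my_free(void *ptr, size_t size)
    {
        (void)size;
        free(ptr);
        return 0;
    }

    /* An external provider is then just this table, registered with
     * the framework in place of a built-in one. */
    static const provider_ops_t my_provider_ops = { my_alloc, my_free };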

0 commit comments