add introduction to docs #78
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
Glossary | ||
========================================================== | ||
|
||
Homogeneous Memory System | ||
A system that operates on a single type of memory implemented using a single | ||
technology. | ||
|
||
Heterogeneous Memory System | ||
A system that operates on multiple types of memories, possibly implemented | ||
using different technologies, often managed by different entities. | ||
|
||
Memory Tiering | ||
An organization of different types of memory storage within a system, each | ||
having distinct characteristics, performance, and cost attributes. These | ||
memory tiers are typically organized in a hierarchy, with faster, more | ||
expensive memory located closer to the processor and slower, less expensive | ||
memory located further away. | ||
|
||
Memory Access Initiator | ||
A component in a computer system that initiates or requests access to the | ||
computer's memory subsystem. This could be a CPU, GPU, or other I/O and cache | ||
devices. | ||
|
||
Memory Target | ||
Any part of the memory subsystem that can handle memory access requests. This | ||
could be the OS-accessible main memory (RAM), video memory that resides on | ||
the graphics cards, memory caches, storage, external memory devices connected | ||
using CXL.mem protocol, etc. | ||
|
||
Memory Page | ||
A fixed-length contiguous block of virtual memory, described by a single | ||
entry in the page table. It is the smallest unit of data for memory | ||
management in a virtual memory operating system. | ||
|
||
Enlightened Application | ||
An application that explicitly manages how its allocations are distributed | ||
among different types of memory and handles data migration between them. | ||
|
||
Unenlightened Application | ||
An application that, without any code modifications, relies on the underlying | ||
infrastructure (OS, frameworks, libraries) and the memory tiering and | ||
migration solutions it offers. | ||
|
||
Memory Pool | ||
A memory management technique used in computer programming and software | ||
development, where relatively large blocks of memory are preallocated using | ||
a memory provider and then passed to a pool allocator for fine-grained | ||
management. The pool allocator can divide these blocks into smaller chunks | ||
and use them for application allocations depending on its needs. Pool | ||
allocators typically focus on low fragmentation and constant allocation time, | ||
so they are used to optimize memory allocation and deallocation in scenarios | ||
where efficiency and performance are critical. | ||
|
||
Pool Allocator | ||
A memory allocator type used to efficiently manage memory pools. Examples | ||
include jemalloc and oneTBB's Scalable Memory Allocator. | ||
|
||
Memory Provider | ||
A software component responsible for supplying memory or managing memory | ||
targets. A single memory provider can efficiently manage the memory | ||
operations for one or multiple devices within the system or other memory | ||
sources like file-backed or user-provided memory. Memory providers are | ||
responsible for coarse-grained allocations and management of memory pages. | ||
|
||
High Bandwidth Memory (HBM) | ||
A high-speed computer memory. It is used in conjunction with high-performance | ||
graphics accelerators, network devices, and high-performance data centers, as | ||
on-package cache in CPUs, FPGAs, supercomputers, etc. | ||
|
||
Compute Express Link (`CXL`_) | ||
An open standard for high-speed, high-capacity central processing unit | ||
(CPU)-to-device and CPU-to-memory connections, designed for high-performance | ||
data center computers. CXL is built on the serial PCI Express (PCIe) physical | ||
and electrical interface and includes a PCIe-based block input/output protocol | ||
(CXL.io), cache-coherent protocols for accessing system memory (CXL.cache), | ||
and device memory (CXL.mem). | ||
|
||
oneAPI Threading Building Blocks (`oneTBB`_) | ||
A C++ template library developed by Intel for parallel programming on | ||
multi-core processors. oneTBB breaks a computation down into tasks that can | ||
run in parallel. The library manages and schedules threads to execute these | ||
tasks. | ||
|
||
jemalloc | ||
A general-purpose malloc implementation that emphasizes fragmentation | ||
avoidance and scalable concurrency support. It provides introspection, memory | ||
management, and tuning features. `Jemalloc`_ uses multiple independent pools | ||
(“arenas”) to reduce lock contention in multithreaded applications, which | ||
helps them scale with the number of threads. | ||
|
||
Unified Shared Memory (USM) | ||
A programming model which provides a single memory address space that is | ||
shared between CPUs, GPUs, and possibly other accelerators. It simplifies | ||
memory management by transparently handling data migration between the CPU | ||
and the accelerator device as needed. | ||
|
||
.. _CXL: https://www.computeexpresslink.org/ | ||
.. _oneTBB: https://oneapi-src.github.io/oneTBB/ | ||
.. _Jemalloc: https://jemalloc.net/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,4 +7,6 @@ Intel Unified Memory Framework documentation | |
.. toctree:: | ||
:maxdepth: 3 | ||
|
||
introduction.rst | ||
api.rst | ||
glossary.rst |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
============== | ||
Introduction | ||
============== | ||
|
||
The amount of data that needs to be processed by modern workloads is continuously | ||
growing. To address the increasing demand, the memory subsystem of modern server | ||
platforms is becoming heterogeneous. For example, High-Bandwidth Memory (HBM) | ||
addresses throughput needs, while the CXL protocol closes the capacity gap and | ||
can improve memory utilization through memory pooling. Beyond CPU use cases, | ||
there are GPU accelerators with their own on-board memory. | ||
|
||
Modern heterogeneous memory platforms present a range of opportunities. At the | ||
same time, they introduce new challenges that could require software updates to | ||
fully utilize the hardware features. There are two main problems that modern | ||
applications need to deal with. The first is appropriate data placement and | ||
data migration between different types of memory. The second is how software | ||
should leverage different memory topologies. | ||
|
||
Applications can be divided into two broad groups: enlightened and | ||
unenlightened. Enlightened applications explicitly manage how allocations are | ||
distributed among memory tiers and how data later migrates between them. | ||
Unenlightened applications do not require any code modifications and rely on the | ||
underlying infrastructure. The underlying infrastructure includes not only the | ||
OS, with its various memory tiering solutions that migrate memory pages between | ||
tiers, but also middleware: frameworks and libraries. | ||
|
||
============== | ||
Architecture | ||
============== | ||
|
||
The Unified Memory Framework (`UMF`_) is a library for constructing allocators | ||
and memory pools. It also contains broadly useful abstractions and utilities | ||
for memory management. UMF allows users to create and manage multiple memory | ||
pools characterized by different attributes, allowing certain allocation types | ||
to be isolated from others and allocated using different hardware resources as | ||
required. | ||
|
||
A memory pool is a combination of a pool allocator instance and a memory | ||
provider instance along with their properties and allocation policies. | ||
Specifically, a memory provider is responsible for coarse-grained memory | ||
allocations, while the pool allocator controls the pool and handles | ||
fine-grained memory allocations. UMF defines distinct interfaces for both pool | ||
allocators and memory providers. Users can use pool allocators and memory | ||
providers provided by UMF or create their own. | ||
|
||
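The split between coarse-grained and fine-grained allocation described above can be illustrated with a minimal Python sketch. This is not the UMF API: `PageProvider`, `BumpPoolAllocator`, and the 4096-byte `PAGE_SIZE` are hypothetical names and values chosen only for illustration, and the bump allocator deliberately omits `free` to stay short.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageProvider:
    """Hypothetical memory provider: hands out whole pages (coarse-grained)."""

    def __init__(self):
        self.pages_allocated = 0

    def alloc(self, size):
        pages = -(-size // PAGE_SIZE)  # ceiling division: round up to whole pages
        self.pages_allocated += pages
        return bytearray(pages * PAGE_SIZE)


class BumpPoolAllocator:
    """Hypothetical pool allocator: carves small blocks out of provider chunks."""

    def __init__(self, provider, chunk_size=PAGE_SIZE):
        self.provider = provider
        self.chunk_size = chunk_size
        self.chunk = None
        self.offset = 0

    def malloc(self, size):
        # Ask the provider for a new chunk only when the current one is exhausted.
        if self.chunk is None or self.offset + size > len(self.chunk):
            self.chunk = self.provider.alloc(max(size, self.chunk_size))
            self.offset = 0
        block = memoryview(self.chunk)[self.offset:self.offset + size]
        self.offset += size
        return block


# A "memory pool" in this sketch is simply the pair (allocator, provider).
provider = PageProvider()
pool = BumpPoolAllocator(provider)
block = pool.malloc(100)
block[:5] = b"hello"  # the block is a writable view into the provider's chunk
```

In this sketch many small `malloc` calls are served from a single provider allocation, which is the point of pairing the two components.
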
.. figure:: ../assets/images/intro_architecture.png | ||
|
||
The UMF library contains various pool allocators and memory providers but also | ||
allows for the integration of external ones, giving users the flexibility to | ||
either use existing solutions or provide their own implementations. | ||
|
||
Memory Providers | ||
================ | ||
|
||
A memory provider is an abstraction for coarse (memory page) allocations and | ||
deallocations of target memory types, such as host CPU, GPU, or CXL memory. | ||
A single memory provider can manage the memory of one or more devices on the | ||
platform, or other memory sources such as file-backed or user-provided memory. | ||
|
||
UMF comes with several bundled memory providers. Please refer to `README.md`_ | ||
for the full list. Externally defined memory providers can also be used if they | ||
implement the UMF interface. | ||
|
||
To instantiate a memory provider, the user must pass an additional context that | ||
contains the details of the specific memory target to use. This would be a NUMA | ||
node mask for the OS memory provider, a file path for the file-backed memory | ||
provider, etc. After creation, the memory provider context can't be changed. | ||
|
||
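The idea of a provider-specific context that is fixed at creation time could be modeled as follows. This is a sketch under stated assumptions, not UMF code: `OsProviderParams`, `FileProviderParams`, and `MemoryProvider` are hypothetical names, and frozen dataclasses are just one way to express immutability in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OsProviderParams:
    """Hypothetical context for an OS memory provider."""
    numa_node_mask: int  # bit i set means NUMA node i may be used


@dataclass(frozen=True)
class FileProviderParams:
    """Hypothetical context for a file-backed memory provider."""
    path: str


class MemoryProvider:
    """Hypothetical provider that stores its context read-only."""

    def __init__(self, params):
        self._params = params

    @property
    def params(self):
        # No setter is defined, so the context cannot be replaced after creation.
        return self._params


os_provider = MemoryProvider(OsProviderParams(numa_node_mask=0b0011))
file_provider = MemoryProvider(FileProviderParams(path="/tmp/example_backing_file"))
```

Attempting to assign to `os_provider.params` or to a field of the frozen dataclass raises `AttributeError`, which mirrors the rule that the context is immutable once the provider exists.
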
Pool Allocators | ||
=============== | ||
|
||
A pool allocator is an abstraction over object-level memory management based | ||
on coarse chunks acquired from the memory provider. It manages the memory pool | ||
and services fine-grained malloc/free requests. | ||
|
||
Pool allocators can be implemented to be general purpose or to fulfill | ||
specific use cases. Implementations of the pool allocator interface can | ||
leverage existing allocators (e.g., jemalloc or oneTBB) or be fully custom. | ||
The pool allocator abstraction can contain basic memory management interfaces | ||
as well as more complex ones that an implementation can use, for example, for | ||
page monitoring or control (e.g., `madvise`). | ||
|
||
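As an illustration of a special-purpose allocator behind a common interface, here is a small Python sketch of a fixed-size block allocator with a free list. The `PoolAllocator` base class and `FixedBlockAllocator` are hypothetical names and do not correspond to UMF's actual interface; offsets into a chunk stand in for pointers.

```python
from abc import ABC, abstractmethod


class PoolAllocator(ABC):
    """Hypothetical minimal pool allocator interface."""

    @abstractmethod
    def malloc(self, size):
        """Return an offset into the chunk, or None if the request cannot be met."""

    @abstractmethod
    def free(self, offset):
        """Release a block previously returned by malloc."""


class FixedBlockAllocator(PoolAllocator):
    """Serves requests from equally sized blocks of one provider chunk."""

    def __init__(self, chunk, block_size):
        self.chunk = chunk
        self.block_size = block_size
        # Offsets of free blocks; reversed so the lowest offset is popped first.
        self.free_list = list(range(0, len(chunk) - block_size + 1, block_size))
        self.free_list.reverse()
        self.in_use = set()

    def malloc(self, size):
        if size > self.block_size or not self.free_list:
            return None
        offset = self.free_list.pop()  # O(1)
        self.in_use.add(offset)
        return offset

    def free(self, offset):
        if offset not in self.in_use:
            raise ValueError("offset was not allocated by this allocator")
        self.in_use.remove(offset)
        self.free_list.append(offset)  # O(1); the block is reused next


allocator = FixedBlockAllocator(bytearray(256), block_size=64)
first = allocator.malloc(10)
second = allocator.malloc(64)
```

Because every block has the same size, both operations run in constant time and freed blocks are reused directly, at the cost of wasting space for requests smaller than `block_size`.
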
UMF comes with several bundled pool allocators. Please refer to `README.md`_ | ||
for the full list. Externally defined pool allocators can also be used if they | ||
implement the UMF interface. | ||
|
||
Memory Pools | ||
============ | ||
|
||
A memory pool consists of a pool allocator instance and a memory provider | ||
instance along with their properties and allocation policies. A memory pool is | ||
passed as the first argument to the functions of the `allocation API`_. It is | ||
also possible to retrieve the memory pool from a pointer to memory that was | ||
previously allocated by UMF. | ||
|
||
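One possible way to map a pointer back to the pool that owns it is to track allocated address ranges in a sorted structure. The sketch below is an assumption made for illustration and is not necessarily how UMF implements this lookup; `RangeTracker` and its methods are hypothetical names, and plain integers stand in for addresses.

```python
import bisect


class RangeTracker:
    """Hypothetical registry of [start, start + size) ranges and their pools."""

    def __init__(self):
        self._starts = []   # sorted start addresses
        self._entries = {}  # start -> (end, pool)

    def add(self, start, size, pool):
        bisect.insort(self._starts, start)
        self._entries[start] = (start + size, pool)

    def remove(self, start):
        del self._entries[start]
        self._starts.remove(start)

    def pool_by_ptr(self, addr):
        # Find the last range that starts at or before addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        end, pool = self._entries[self._starts[i]]
        return pool if addr < end else None


tracker = RangeTracker()
tracker.add(0x1000, 0x1000, "pool_a")
tracker.add(0x4000, 0x2000, "pool_b")
owner = tracker.pool_by_ptr(0x4800)
```

The lookup accepts any address inside a tracked range, not only the start of an allocation, and returns `None` for addresses that no pool owns.
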
.. _UMF: https://github.com/oneapi-src/unified-memory-framework | ||
.. _README.md: https://github.com/oneapi-src/unified-memory-framework/blob/main/README.md | ||
.. _allocation API: https://oneapi-src.github.io/unified-memory-framework/api.html#memory-pool |
nit: you can add links in the repo to these files: