oneapi-src · bratpiorka · Jan 17, 2024 · Dec 19, 2023 · lukaszstolarczuk · Jan 16, 2024
@@ -15,7 +15,9 @@ A UMF memory pool is a combination of a pool allocator and a memory provider. A
 
 Pool allocator can leverage existing allocators (e.g. jemalloc or tbbmalloc) or be written from scratch. 
 
-UMF comes with predefined pool allocators (see include/pool) and providers (see include/provider). UMF can also work with user-defined pools and providers that implement a specific interface (see include/umf/memory_pool_ops.h and include/umf/memory_provider_ops.h)
+UMF comes with predefined pool allocators (see include/pool) and providers (see include/provider). UMF can also work with user-defined pools and providers that implement a specific interface (see include/umf/memory_pool_ops.h and include/umf/memory_provider_ops.h).
+
+More detailed documentation is available here: https://oneapi-src.github.io/unified-memory-framework/
 
 ## Memory providers
 

@@ -1,5 +1,5 @@
 ==========================================
-Unified Memory Framework API Documentation
+API Documentation
 ==========================================
 
 Globals

@@ -0,0 +1,99 @@
+Glossary
+==========================================================
+
+Homogeneous Memory System  
+  A system that operates on a single type of memory implemented using a single 
+  technology.
+
+Heterogeneous Memory System 
+  A system that operates on multiple types of memories, possibly implemented 
+  using different technologies, often managed by different entities.
+
+Memory Tiering
+  An organization of different types of memory storage within a system, each 
+  having distinct characteristics, performance, and cost attributes. These 
+  memory tiers are typically organized in a hierarchy, with faster, more 
+  expensive memory located closer to the processor and slower, less expensive 
+  memory located further away.
+
+Memory Access Initiator 
+  A component in a computer system that initiates or requests access to the 
+  computer's memory subsystem. This could be a CPU, GPU, or other I/O and cache 
+  devices.
+
+Memory Target 
+  Any part of the memory subsystem that can handle memory access requests. This 
+  could be the OS-accessible main memory (RAM), video memory that resides on 
+  the graphics cards, memory caches, storage, external memory devices connected 
+  using CXL.mem protocol, etc.
+
+Memory Page 
+  A fixed-length contiguous block of virtual memory, described by a single 
+  entry in the page table. It is the smallest unit of data for memory 
+  management in a virtual memory operating system.
+
+Enlightened Application 
+  An application that explicitly manages how its data allocations are
+  distributed among different types of memory and handles data migration
+  between them.
+
+Unenlightened Application 
+  An application that requires no code modifications and instead relies on the
+  underlying infrastructure (OS, frameworks, libraries) for memory tiering and
+  migration.
+
+Memory Pool 
+  A memory management technique used in computer programming and software 
+  development, where relatively large blocks of memory are preallocated using
+  a memory provider and then passed to a pool allocator for fine-grained
+  management. The pool allocator can divide these blocks into smaller chunks
+  and use them for application allocations depending on its needs. Typically,
+  pool allocators focus on low fragmentation and constant allocation time,
+  so they are used to optimize memory allocation and deallocation in scenarios 
+  where efficiency and performance are critical.
+
+Pool Allocator 
+  A type of memory allocator used to manage memory pools efficiently. Existing
+  examples include jemalloc and oneTBB's Scalable Memory Allocator.
+
+Memory Provider 
+  A software component responsible for supplying memory or managing memory 
+  targets. A single memory provider can efficiently manage the memory 
+  operations for one or multiple devices within the system or other memory 
+  sources like file-backed or user-provided memory. Memory providers are 
+  responsible for coarse-grained allocations and management of memory pages.
+
+High Bandwidth Memory (HBM)
+  A high-speed computer memory. It is used with high-performance graphics
+  accelerators and network devices, in high-performance data centers, as
+  on-package cache in CPUs, and in FPGAs and supercomputers.
+
+Compute Express Link (`CXL`_)
+  An open standard for high-speed, high-capacity central processing unit 
+  (CPU)-to-device and CPU-to-memory connections, designed for high-performance 
+  data center computers. CXL is built on the serial PCI Express (PCIe) physical 
+  and electrical interface and includes a PCIe-based block input/output
+  protocol (CXL.io) and cache-coherent protocols for accessing system memory
+  (CXL.cache) and device memory (CXL.mem).
+
+oneAPI Threading Building Blocks (`oneTBB`_)
+  A C++ template library developed by Intel for parallel programming on
+  multi-core processors. oneTBB breaks the computation down into tasks that can
+  run in parallel. The library manages and schedules threads to execute these
+  tasks.
+
+jemalloc 
+  A general-purpose malloc implementation that emphasizes fragmentation 
+  avoidance and scalable concurrency support. It provides introspection, memory
+  management, and tuning functionalities. `Jemalloc`_ uses separate pools
+  (“arenas”) for each CPU, which avoids lock contention problems in
+  multithreaded applications and makes them scale linearly with the number of
+  threads.
+
+Unified Shared Memory (USM) 
+  A programming model which provides a single memory address space that is 
+  shared between CPUs, GPUs, and possibly other accelerators. It simplifies 
+  memory management by transparently handling data migration between the CPU 
+  and the accelerator device as needed.
+
+.. _CXL: https://www.computeexpresslink.org/
+.. _oneTBB: https://oneapi-src.github.io/oneTBB/
+.. _Jemalloc: https://jemalloc.net/
@@ -7,4 +7,6 @@ Intel Unified Memory Framework documentation
 .. toctree::
    :maxdepth: 3
 
+   introduction.rst
    api.rst
+   glossary.rst
@@ -0,0 +1,100 @@
+==============
+ Introduction
+==============
+
+The amount of data that needs to be processed by modern workloads is continuously 
+growing. To address the increasing demand, the memory subsystem of modern
+server platforms is becoming heterogeneous. For example, High-Bandwidth Memory
+(HBM) addresses throughput needs; the CXL protocol closes the capacity gap and
+tends to improve memory utilization through memory pooling. Beyond CPU use
+cases, there are GPU accelerators with their own memory on board. 
+
+Modern heterogeneous memory platforms present a range of opportunities. At the 
+same time, they introduce new challenges that could require software updates to
+fully utilize the hardware features. There are two main problems that modern
+applications need to deal with. The first is appropriate data placement and
+data migration between different types of memory. The second is how software
+should leverage different memory topologies.
+
+Applications can be divided into two broad groups: enlightened and
+unenlightened. Enlightened applications explicitly manage how data allocations
+are distributed among memory tiers and how data later migrates between them.
+Unenlightened applications do not require any code modifications and rely on
+the underlying infrastructure. The underlying infrastructure includes not only
+the OS, with its various memory tiering solutions for migrating memory pages
+between tiers, but also middleware: frameworks and libraries.
+
+==============
+ Architecture
+==============
+
+The Unified Memory Framework (`UMF`_) is a library for constructing allocators 
+and memory pools. It also contains broadly useful abstractions and utilities 
+for memory management. UMF allows users to create and manage multiple memory 
+pools characterized by different attributes, allowing certain allocation types 
+to be isolated from others and allocated using different hardware resources as 
+required. 
+
+A memory pool is a combination of a pool allocator instance and a memory 
+provider instance along with their properties and allocation policies. 
+Specifically, a memory provider is responsible for coarse-grained memory 
+allocations, while the pool allocator controls the pool and handles 
+fine-grained memory allocations. UMF defines distinct interfaces for both pool 
+allocators and memory providers. Users can use pool allocators and memory 
+providers provided by UMF or create their own.
+
+.. figure:: ../assets/images/intro_architecture.png
+
+The UMF library contains various pool allocators and memory providers but also
+allows for the integration of external ones, giving users the flexibility to
+either use existing solutions or provide their own implementations.
+
+Memory Providers
+================
+
+A memory provider is an abstraction for coarse-grained (memory page)
+allocations and deallocations of target memory types, such as host CPU, GPU,
+or CXL memory. A single memory provider can manage the memory of one or more
+devices on the platform, or of other memory sources such as file-backed or
+user-provided memory.
+
+UMF comes with several bundled memory providers. Please refer to `README.md`_
+for the full list. Externally defined memory providers can also be used,
+provided they implement the UMF interface.
+
+To instantiate a memory provider, the user must pass a context that contains
+the details of the specific memory target to be used. This would be, for
+example, a NUMA node mask for the OS memory provider or a file path for the
+file-backed memory provider. After creation, the memory provider context
+cannot be changed.
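
The idea of a creation-time context can be illustrated with a small, self-contained C sketch. This is not the UMF API: every name below (``toy_provider_t``, ``toy_provider_create``, ``toy_provider_alloc``, and so on) is hypothetical, the 4096-byte page size is an assumption, and the NUMA node field is only stored, not used to bind memory. The sketch shows a provider that copies its context when created, exposes no setter for it, and serves page-granular allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; this is NOT the UMF API. */

#define TOY_PAGE_SIZE ((size_t)4096) /* assumed page size */

/* Describes the memory target. Illustrative only: the node id is stored
 * but not used to bind memory in this sketch. */
typedef struct {
    unsigned numa_node;
} toy_provider_ctx_t;

typedef struct {
    toy_provider_ctx_t ctx;   /* private copy taken at creation time */
    size_t bytes_outstanding; /* bytes currently handed out */
} toy_provider_t;

/* Create a provider. The context is copied, so later changes the caller
 * makes to *ctx do not affect the provider. */
toy_provider_t *toy_provider_create(const toy_provider_ctx_t *ctx) {
    if (ctx == NULL) {
        return NULL;
    }
    toy_provider_t *p = malloc(sizeof(*p));
    if (p == NULL) {
        return NULL;
    }
    p->ctx = *ctx;
    p->bytes_outstanding = 0;
    return p;
}

/* Read-only accessor; there is deliberately no setter. */
unsigned toy_provider_node(const toy_provider_t *p) {
    return p->ctx.numa_node;
}

/* Coarse-grained allocation: the request is rounded up to whole pages and
 * the size actually reserved is reported through *out_size. */
void *toy_provider_alloc(toy_provider_t *p, size_t size, size_t *out_size) {
    if (p == NULL || out_size == NULL || size == 0 ||
        size > SIZE_MAX - (TOY_PAGE_SIZE - 1)) {
        return NULL;
    }
    size_t rounded = (size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE * TOY_PAGE_SIZE;
    void *ptr = aligned_alloc(TOY_PAGE_SIZE, rounded); /* C11 */
    if (ptr == NULL) {
        return NULL;
    }
    p->bytes_outstanding += rounded;
    *out_size = rounded;
    return ptr;
}

/* Return a block; size must be the value reported by toy_provider_alloc. */
void toy_provider_free(toy_provider_t *p, void *ptr, size_t size) {
    if (p == NULL || ptr == NULL) {
        return;
    }
    free(ptr);
    p->bytes_outstanding -= size;
}

void toy_provider_destroy(toy_provider_t *p) {
    free(p);
}
```

A caller would fill in a ``toy_provider_ctx_t``, create the provider once, and then request blocks from it; a different memory target would need a second provider instance rather than a change to the first one.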
+
+Pool Allocators
+===============
+
+A pool allocator is an abstraction over object-level memory management based 
+on coarse chunks acquired from the memory provider. It manages the memory pool 
+and services fine-grained malloc/free requests. 
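
As a rough illustration of this split, the following self-contained C sketch carves small allocations out of coarse chunks. It is not the UMF API and does not reflect how any bundled pool allocator works: all names (``toy_pool_t``, ``toy_pool_malloc``, ...) are hypothetical, the chunk size and alignment are assumptions, and the technique is a simple bump allocator that cannot free individual objects. The coarse-grained source is passed in as a pair of function pointers standing in for a memory provider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is NOT the UMF API. */

#define TOY_CHUNK_SIZE ((size_t)4096) /* assumed coarse chunk size */
#define TOY_ALIGN ((size_t)16)        /* assumed object alignment */

/* Stand-ins for a memory provider's coarse-grained alloc/free. The chunks
 * they return are assumed to be at least TOY_ALIGN-aligned. */
typedef void *(*toy_coarse_alloc_fn)(size_t size);
typedef void (*toy_coarse_free_fn)(void *ptr);

/* Header stored at the start of every chunk. */
typedef struct toy_chunk {
    struct toy_chunk *next; /* previously acquired chunk, or NULL */
    size_t used;            /* bytes already handed out from this chunk */
} toy_chunk_t;

typedef struct {
    toy_coarse_alloc_fn coarse_alloc;
    toy_coarse_free_fn coarse_free;
    toy_chunk_t *chunks;    /* most recently acquired chunk first */
    size_t chunks_acquired; /* how many times the provider was called */
} toy_pool_t;

static size_t toy_align_up(size_t n) {
    return (n + TOY_ALIGN - 1) / TOY_ALIGN * TOY_ALIGN;
}

/* Bytes reserved at the start of each chunk for the header. */
static size_t toy_header_size(void) {
    return toy_align_up(sizeof(toy_chunk_t));
}

void toy_pool_init(toy_pool_t *pool, toy_coarse_alloc_fn coarse_alloc,
                   toy_coarse_free_fn coarse_free) {
    pool->coarse_alloc = coarse_alloc;
    pool->coarse_free = coarse_free;
    pool->chunks = NULL;
    pool->chunks_acquired = 0;
}

/* Fine-grained allocation: bump a cursor inside the current chunk and ask
 * the coarse source for a new chunk only when the current one is full. */
void *toy_pool_malloc(toy_pool_t *pool, size_t size) {
    size_t capacity = TOY_CHUNK_SIZE - toy_header_size();
    if (size == 0 || size > capacity) {
        return NULL; /* this sketch does not serve oversized requests */
    }
    size = toy_align_up(size);
    toy_chunk_t *c = pool->chunks;
    if (c == NULL || capacity - c->used < size) {
        c = pool->coarse_alloc(TOY_CHUNK_SIZE);
        if (c == NULL) {
            return NULL;
        }
        c->next = pool->chunks;
        c->used = 0;
        pool->chunks = c;
        pool->chunks_acquired++;
    }
    void *ptr = (char *)c + toy_header_size() + c->used;
    c->used += size;
    return ptr;
}

/* Release every chunk back to the coarse source. Individual objects cannot
 * be freed in a bump allocator; everything goes away at once. */
void toy_pool_destroy(toy_pool_t *pool) {
    toy_chunk_t *c = pool->chunks;
    while (c != NULL) {
        toy_chunk_t *next = c->next;
        pool->coarse_free(c);
        c = next;
    }
    pool->chunks = NULL;
}
```

Passing ``malloc`` and ``free`` as the coarse source is enough to try it out, for example ``toy_pool_init(&pool, malloc, free)``. Many small requests then result in only a few calls to the coarse source, which is the behaviour the coarse/fine split is meant to achieve.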
+
+Pool allocators can be implemented to be general purpose or to fulfill 
+specific use cases. Implementations of the pool allocator interface can 
+leverage existing allocators (e.g., jemalloc or oneTBB) or be written from
+scratch. The pool allocator abstraction can contain basic memory management
+interfaces, as well as more complex ones that an implementation can use, for
+example, for page monitoring or control (e.g., ``madvise``).
+
+UMF comes with several bundled pool allocators. Please refer to `README.md`_
+for the full list. Externally defined pool allocators can also be used,
+provided they implement the UMF interface.
+
+Memory Pools
+============
+
+A memory pool consists of a pool allocator instance and a memory provider
+instance along with their properties and allocation policies. A memory pool is
+passed to the `allocation API`_ as the first argument. It is also possible to
+retrieve the memory pool from an existing pointer to memory that was
+previously allocated by UMF.
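
One possible way to map a pointer back to its pool is to record the address range of every block together with the owning pool. The C sketch below illustrates only that idea under stated assumptions: it is not the UMF API and makes no claim about how UMF implements the lookup. All names (``toy_track``, ``toy_pool_by_ptr``, ...) are hypothetical, the registry is a fixed-size global array, and it is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; this is NOT the UMF API. */

#define TOY_MAX_RANGES 16

/* Minimal pool handle; a real pool would hold an allocator and a provider. */
typedef struct toy_pool {
    const char *name;
} toy_pool_t;

/* One tracked block: [base, base + size) belongs to pool. */
typedef struct {
    uintptr_t base;
    size_t size;
    toy_pool_t *pool;
} toy_range_t;

/* Global registry; fixed-size and not thread-safe, for illustration only. */
static toy_range_t toy_ranges[TOY_MAX_RANGES];
static size_t toy_range_count;

/* Record that a block belongs to a pool.
 * Returns 0 on success, -1 on invalid arguments or a full registry. */
int toy_track(toy_pool_t *pool, void *base, size_t size) {
    if (pool == NULL || base == NULL || size == 0 ||
        toy_range_count == TOY_MAX_RANGES) {
        return -1;
    }
    toy_ranges[toy_range_count].base = (uintptr_t)base;
    toy_ranges[toy_range_count].size = size;
    toy_ranges[toy_range_count].pool = pool;
    toy_range_count++;
    return 0;
}

/* Find the pool owning ptr, which may point anywhere inside a tracked
 * block. Returns NULL if the pointer is not tracked. */
toy_pool_t *toy_pool_by_ptr(const void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    for (size_t i = 0; i < toy_range_count; i++) {
        const toy_range_t *r = &toy_ranges[i];
        if (p >= r->base && p - r->base < r->size) {
            return r->pool;
        }
    }
    return NULL;
}
```

With such a registry, code that receives only a pointer can recover the pool handle and pass it as the first argument to further pool operations. A linear scan is used here for brevity; a sorted structure or tree would be a more realistic choice for many ranges.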
+
+.. _UMF: https://github.com/oneapi-src/unified-memory-framework
+.. _README.md: https://github.com/oneapi-src/unified-memory-framework/blob/main/README.md
+.. _allocation API: https://oneapi-src.github.io/unified-memory-framework/api.html#memory-pool