[SandboxIR][Doc] Add a Doc file for Sandbox IR (#98691)

vporpo · yuxuanchen1997 · commit 9c429833f208 · 2024-07-25T12:51:21.000-07:00
This is under User Guides > Additional Topics > Sandbox IR.
diff --git a/llvm/docs/SandboxIR.md b/llvm/docs/SandboxIR.md
@@ -0,0 +1,53 @@
+# Sandbox IR: A transactional layer over LLVM IR
+
+Sandbox IR is an IR layer on top of LLVM IR that allows you to save/restore its state.
+
+# API
+The Sandbox IR API is designed to feel like the LLVM API: it replicates many of the common LLVM classes and functions.
+The class hierarchy is similar (but in the `llvm::sandboxir` namespace).
+For example, here is a small part of it:
+```
+namespace sandboxir {
+              Value
+              /  \
+            User BasicBlock ...
+           /   \
+  Instruction Constant
+        /
+     ...
+}
+```
+
+# Design
+
+## Sandbox IR Value <-> LLVM IR Value Mapping
+Each LLVM IR Value maps to a single Sandbox IR Value.
+The reverse is also true in most cases, except for Sandbox IR Instructions that map to more than one LLVM IR Instruction.
+Such instructions can be defined in extensions of the base Sandbox IR.
+
+- Forward mapping: Sandbox IR Value -> LLVM IR Value
+Each Sandbox IR Value contains an `llvm::Value *Val` member variable that points to the corresponding LLVM IR Value.
+
+- Reverse mapping: LLVM IR Value -> Sandbox IR Value
+This mapping is stored in `sandboxir::Context::LLVMValueToValue`.
+
+For example, `sandboxir::User::getOperand(OpIdx)` for a `sandboxir::User *U` works as follows:
+- First we find the LLVM User: `llvm::User *LLVMU = cast<llvm::User>(U->Val)`.
+- Next we get the LLVM Value operand: `llvm::Value *LLVMOp = LLVMU->getOperand(OpIdx)`.
+- Finally we get the Sandbox IR operand that corresponds to `LLVMOp` by querying the map in the Sandbox IR context: `return Ctx.getValue(LLVMOp)`.
+
+## Sandbox IR is Write-Through
+Sandbox IR is designed to rely on LLVM IR for its state.
+So any change made to Sandbox IR objects directly updates the corresponding LLVM IR.
+
+This has the following benefits:
+- It minimizes the replication of state.
+- It keeps Sandbox IR and LLVM IR always in sync, which helps avoid bugs and removes the need for a lowering step.
+- It removes the need for serialization/deserialization infrastructure, since we can rely on LLVM IR for it.
+- It lets one pass actual `llvm::Instruction`s to cost modeling APIs.
+
+Sandbox IR API functions that modify the IR state call the corresponding LLVM IR function that modifies the LLVM IR's state.
+For example, for `sandboxir::User::setOperand(OpIdx, sandboxir::Value *Op)`:
+- We get the corresponding LLVM User: `llvm::User *LLVMU = cast<llvm::User>(Val)`.
+- Next we get the corresponding LLVM Operand: `llvm::Value *LLVMOp = Op->Val`.
+- Finally we modify `LLVMU`'s operand: `LLVMU->setOperand(OpIdx, LLVMOp)`.
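
Note on the `getOperand`/`setOperand` walkthroughs in the new doc: the two-way mapping and the write-through behavior can be sketched with a toy model. This is a minimal sketch, not the actual Sandbox IR implementation. The `toyllvm` and `toysbir` namespaces, the `getOrCreateValue` helper, and the merging of `User` into `Value` are assumptions made for illustration. Only the names `Val`, `LLVMValueToValue`, and `getValue` are taken from the doc text.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy stand-in for llvm::Value/llvm::User (hypothetical, NOT the real LLVM classes).
namespace toyllvm {
struct Value {
  std::vector<Value *> Operands; // Only meaningful when the value acts as a user.
  Value *getOperand(unsigned Idx) const { return Operands[Idx]; }
  void setOperand(unsigned Idx, Value *V) { Operands[Idx] = V; }
};
} // namespace toyllvm

// Toy stand-in for the sandboxir layer (hypothetical names throughout).
namespace toysbir {
class Context;

// The wrapper stores no operands of its own; all state lives in the wrapped value.
class Value {
public:
  Value(toyllvm::Value *V, Context &C) : Val(V), Ctx(C) {}

  // Read path: wrapped operand -> reverse map -> wrapper.
  Value *getOperand(unsigned OpIdx) const;

  // Write path: the change is applied directly to the wrapped value.
  void setOperand(unsigned OpIdx, Value *Op) { Val->setOperand(OpIdx, Op->Val); }

  toyllvm::Value *Val; // Forward mapping: wrapper -> wrapped value.

private:
  Context &Ctx;
};

class Context {
public:
  // Assumed helper: returns the unique wrapper for LLVMV, creating it on first use.
  Value *getOrCreateValue(toyllvm::Value *LLVMV) {
    auto &Slot = LLVMValueToValue[LLVMV];
    if (!Slot)
      Slot = std::make_unique<Value>(LLVMV, *this);
    return Slot.get();
  }

  // Reverse mapping lookup; returns nullptr if no wrapper exists yet.
  Value *getValue(toyllvm::Value *LLVMV) const {
    auto It = LLVMValueToValue.find(LLVMV);
    return It == LLVMValueToValue.end() ? nullptr : It->second.get();
  }

private:
  std::unordered_map<toyllvm::Value *, std::unique_ptr<Value>> LLVMValueToValue;
};

inline Value *Value::getOperand(unsigned OpIdx) const {
  return Ctx.getValue(Val->getOperand(OpIdx));
}
} // namespace toysbir
```

In this model, calling `setOperand` on a wrapper is immediately visible through the wrapped `toyllvm::Value`, and a later `getOperand` on the wrapper reads the updated operand back through the map, so the two layers cannot drift apart.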
diff --git a/llvm/docs/UserGuides.rst b/llvm/docs/UserGuides.rst
@@ -67,6 +67,7 @@ intermediate LLVM representation.
    RISCV/RISCVVectorExtension
    SourceLevelDebugging
    SPIRVUsage
+   SandboxIR
    StackSafetyAnalysis
    SupportLibrary
    TableGen/index
@@ -192,6 +193,7 @@ Optimizations
    This document specifies guidelines for contributions for InstCombine and
    related passes.
 
+
 Code Generation
 ---------------
 
@@ -288,3 +290,6 @@ Additional Topics
 
 :doc:`RISCV/RISCVVectorExtension`
    This document describes how the RISC-V Vector extension can be expressed in LLVM IR and how code is generated for it in the backend.
+
+:doc:`Sandbox IR <SandboxIR>`
+   This document describes the design and usage of Sandbox IR, a transactional layer over LLVM IR.