
Commit 2e7d83d

[mlir][sparse] replace "sparse compiler" with "sparsifier" in doc (#67082)
Rationale: The term "sparse compiler", although dear to my heart, is often mistaken for a completely separate compiler rather than a pass within a full compiler pipeline. Therefore, we start migrating to the term "sparsifier".
1 parent 2f98ff7

File tree: 6 files changed (+35, -24 lines)

mlir/include/mlir/Dialect/SparseTensor/IR/Enums.h

Lines changed: 1 addition & 2 deletions
@@ -190,8 +190,7 @@ enum class DimLevelType : uint8_t {
   TwoOutOfFour = 64,  // 0b10000_00
 };
 
-/// This enum defines all the storage formats supported by the sparse compiler,
-/// without the level properties.
+/// This enum defines all supported storage formats without the level properties.
 enum class LevelFormat : uint8_t {
   Dense = 4,       // 0b00001_00
   Compressed = 8,  // 0b00010_00
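
The bit-pattern comments above hint at the layout: the higher bits select the format, while the two low bits (shown as `_00`) are left clear for the per-level properties. A small illustrative sketch under that assumption follows; the `buildLevelType` helper is hypothetical and not part of the actual header:

```
#include <cstdint>

// Format values as in the enum above; the two low bits are kept clear
// for the per-level properties (assumption based on the bit comments).
enum class LevelFormat : uint8_t {
  Dense = 4,       // 0b00001_00
  Compressed = 8,  // 0b00010_00
};

// Hypothetical helper: combine a format with the ordered/unique properties.
constexpr uint8_t buildLevelType(LevelFormat fmt, bool ordered, bool unique) {
  return static_cast<uint8_t>(fmt) |
         (unique ? 0 : 1) |   // low bit 0: non-unique
         (ordered ? 0 : 2);   // low bit 1: non-ordered
}

// A compressed level with default properties keeps the low bits clear.
static_assert(buildLevelType(LevelFormat::Compressed, true, true) == 8, "");
```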

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorAttrDefs.td

Lines changed: 20 additions & 13 deletions
@@ -106,35 +106,36 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     sparsity-agnostic representation of the computation, i.e., an implicit sparse
     representation is converted to an explicit sparse representation where co-iterating
     loops operate on sparse storage formats rather than tensors with a sparsity
-    encoding. Compiler passes that run before this sparse compiler pass need to
-    be aware of the semantics of tensor types with such a sparsity encoding.
+    encoding. Compiler passes that run before this sparsifier pass need to be aware
+    of the semantics of tensor types with such a sparsity encoding.
 
-    In this encoding, we use `dimension` to refer to the axes of the semantic tensor,
-    and `level` to refer to the axes of the actual storage format, i.e., the
+    In this encoding, we use **dimension** to refer to the axes of the semantic tensor,
+    and **level** to refer to the axes of the actual storage format, i.e., the
     operational representation of the sparse tensor in memory. The number of
     dimensions is usually the same as the number of levels (such as CSR storage format).
     However, the encoding can also map dimensions to higher-order levels (for example,
     to encode a block-sparse BSR storage format) or to lower-order levels
     (for example, to linearize dimensions as a single level in the storage).
 
-    The encoding contains a `map` that provides the following:
+    The encoding contains a map that provides the following:
 
     - An ordered sequence of dimension specifications, each of which defines:
       - the dimension-size (implicit from the tensor’s dimension-shape)
       - a **dimension-expression**
     - An ordered sequence of level specifications, each of which includes a required
       **level-type**, which defines how the level should be stored. Each level-type
       consists of:
+      - a **level-expression**, which defines what is stored
       - a **level-format**
       - a collection of **level-properties** that apply to the level-format
-      - a **level-expression**, which defines what is stored
 
     Each level-expression is an affine expression over dimension-variables. Thus, the
     level-expressions collectively define an affine map from dimension-coordinates to
     level-coordinates. The dimension-expressions collectively define the inverse map,
     which only needs to be provided for elaborate cases where it cannot be inferred
     automatically. Within the sparse storage format, we refer to indices that are
-    stored explicitly as `coordinates` and indices into the storage format as `positions`.
+    stored explicitly as **coordinates** and offsets into the storage format as
+    **positions**.
 
     The supported level-formats are the following:
@@ -145,26 +146,26 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     Different level-formats may have different collections of level-properties.
     By default, each level-type has the property of being unique (no duplicate
     coordinates at that level), ordered (coordinates appear sorted at that
-    level), and, for compression, storing the positions in a compact way where
-    an interval is defined by a lower bound "pos(i)" and an upper bound "pos(i+1)-1".
+    level), and, for compression, storing each position interval in a compact
+    way with a lower bound `pos(i)` and an upper bound `pos(i+1) - 1`.
     The following properties can be added to a level-format to change this
     default behavior:
 
     - **nonunique** : duplicate coordinates may appear at the level
     - **nonordered** : coordinates may appear in arbitrary order
-    - **high** : the upper bound is stored explicitly in a separate array
+    - **high** : position interval upper bounds are stored explicitly
    - **block2_4** : the compression uses a 2:4 encoding per 1x4 block
 
-    In addition to the `map`, the following two fields are optional:
+    In addition to the map, the following two fields are optional:
 
-    - The required bitwidth for `position` storage (integral offsets
+    - The required bitwidth for position storage (integral offsets
       into the sparse storage scheme). A narrow width reduces the memory
       footprint of overhead storage, as long as the width suffices to
       define the total required range (viz. the maximum number of stored
       entries over all indirection levels). The choices are `8`, `16`,
       `32`, `64`, or, the default, `0` to indicate the native bitwidth.
 
-    - The required bitwidth for `coordinate` storage (the coordinates
+    - The required bitwidth for coordinate storage (the coordinates
       of stored entries). A narrow width reduces the memory footprint
       of overhead storage, as long as the width suffices to define
       the total required range (viz. the maximum value of each tensor
@@ -231,20 +232,26 @@ def SparseTensorEncodingAttr : SparseTensor_Attr<"SparseTensorEncoding",
     ```
   }];
 
+  //
   // Data in sparse tensor encoding.
+  //
   let parameters = (
     ins
     // A level-type for each level of the sparse storage.
     ArrayRefParameter<
      "::mlir::sparse_tensor::DimLevelType",
       "level-types"
     >: $lvlTypes,
+
     // A mapping from dimension-coordinates to level-coordinates.
     "AffineMap":$dimToLvl,
+
     // The required bitwidth for position storage.
     "unsigned":$posWidth,
+
     // The required bitwidth for coordinate storage.
     "unsigned":$crdWidth,
+
     // A slice attribute for each dimension of the tensor type.
     ArrayRefParameter<
       "::mlir::sparse_tensor::SparseTensorDimSliceAttr",

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorBase.td

Lines changed: 6 additions & 1 deletion
@@ -25,11 +25,16 @@ def SparseTensor_Dialect : Dialect {
     means of a small sparse runtime support library.
 
     The concept of **treating sparsity as a property, not a tedious
-    implementation detail**, by letting a **sparse compiler** generate
+    implementation detail**, by letting a **sparsifier** generate
     sparse code automatically was pioneered for linear algebra by [Bik96]
     in MT1 (see https://www.aartbik.com/sparse.php) and formalized
     to tensor algebra by [Kjolstad17,Kjolstad20] in the Sparse Tensor
     Algebra Compiler (TACO) project (see http://tensor-compiler.org).
+    Please note that we started to prefer the term "sparsifier" over
+    the also commonly used "sparse compiler" terminology to refer to
+    such a pass, to make it clear that the sparsifier pass is not a
+    separate compiler, but should be an integral part of any compiler
+    pipeline that is built with the MLIR compiler infrastructure.
 
     The MLIR implementation [Biketal22] closely follows the "sparse
     iteration theory" that forms the foundation of TACO. A rewriting

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorOps.td

Lines changed: 4 additions & 4 deletions
@@ -74,7 +74,7 @@ def SparseTensor_PackOp : SparseTensor_Op<"pack", [Pure]>,
     sources; e.g., when passing two numpy arrays from Python.
 
     Disclaimer: It is the user's responsibility to provide input that can be
-    correctly interpreted by the sparse compiler, which does not perform
+    correctly interpreted by the sparsifier, which does not perform
     any sanity checks at runtime to verify data integrity.
 
     TODO: The returned tensor is allowed (in principle) to have non-identity
@@ -120,7 +120,7 @@ def SparseTensor_UnpackOp : SparseTensor_Op<"unpack", [Pure, SameVariadicResultSize]>,
     unpacked MLIR sparse tensor to the frontend; e.g., returning two numpy arrays to Python.
 
     Disclaimer: It is the user's responsibility to allocate large enough buffers
-    to hold the sparse tensor. The sparse compiler simply copies each field
+    to hold the sparse tensor. The sparsifier simply copies each field
     of the sparse tensor into the user-supplied buffer without bounds checking.
 
     TODO: the current implementation does not yet support non-identity mappings.
@@ -362,7 +362,7 @@ def SparseTensor_ToSliceOffsetOp : SparseTensor_Op<"slice.offset", [Pure]>,
     Extracts the offset of the sparse tensor slice at the given dimension.
 
     Currently, sparse tensor slices are still a work in progress, and only
-    work when the runtime library is disabled (i.e., running the sparse compiler
+    work when the runtime library is disabled (i.e., running the sparsifier
     with `enable-runtime-library=false`).
 
     Example:
@@ -389,7 +389,7 @@ def SparseTensor_ToSliceStrideOp : SparseTensor_Op<"slice.stride", [Pure]>,
     Extracts the stride of the sparse tensor slice at the given dimension.
 
     Currently, sparse tensor slices are still a work in progress, and only
-    work when the runtime library is disabled (i.e., running the sparse compiler
+    work when the runtime library is disabled (i.e., running the sparsifier
     with `enable-runtime-library=false`).
 
     Example:
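
For the `enable-runtime-library=false` mode referenced above, here is a hedged sketch of building the corresponding pipeline programmatically. It assumes the options struct and builder exposed by the dialect's Pipelines/Passes.h (still registered under the older "sparse-compiler" name when this commit landed); the option field name is an assumption:

```
#include "mlir/Dialect/SparseTensor/Pipelines/Passes.h"
#include "mlir/Pass/PassManager.h"

// Sketch: run the sparsifier pipeline with the runtime library disabled,
// the mode currently required by the slice.offset/slice.stride ops above.
void buildNoRuntimeSketch(mlir::PassManager &pm) {
  mlir::sparse_tensor::SparseCompilerOptions options;
  options.enableRuntimeLibrary = false;  // assumed option field name
  mlir::sparse_tensor::buildSparseCompiler(pm, options);
}
```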

mlir/include/mlir/Dialect/SparseTensor/IR/SparseTensorType.h

Lines changed: 2 additions & 2 deletions
@@ -127,8 +127,8 @@ class SparseTensorType {
   /// Allow implicit conversion to `RankedTensorType`, `ShapedType`,
   /// and `Type`. These are implicit to help alleviate the impedance
   /// mismatch for code that has not been converted to use `SparseTensorType`
-  /// directly. Once more of the sparse compiler has been converted to
-  /// using `SparseTensorType`, we may want to make these explicit instead.
+  /// directly. Once more uses have been converted to `SparseTensorType`,
+  /// we may want to make these explicit instead.
   ///
   /// WARNING: This user-defined-conversion method causes overload
   /// ambiguity whenever passing a `SparseTensorType` directly to a
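
A brief sketch of what the implicit conversion buys, and the ambiguity the WARNING refers to; the free functions here are hypothetical stand-ins for existing code:

```
#include "mlir/Dialect/SparseTensor/IR/SparseTensorType.h"

using namespace mlir;
using namespace mlir::sparse_tensor;

// Hypothetical pre-existing helper that has not been converted to
// accept SparseTensorType directly.
void inspect(RankedTensorType rtt);

void demo(SparseTensorType stt) {
  inspect(stt);  // OK: implicit user-defined conversion to RankedTensorType.
}

// The WARNING above: if inspect were overloaded for, say, both Type and
// ShapedType, the call inspect(stt) would be ambiguous, because each
// candidate requires exactly one user-defined conversion.
```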

mlir/include/mlir/Dialect/SparseTensor/Transforms/Passes.td

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@ def PreSparsificationRewrite : Pass<"pre-sparsification-rewrite", "ModuleOp"> {
 def SparsificationPass : Pass<"sparsification", "ModuleOp"> {
   let summary = "Automatically generate sparse tensor code from sparse tensor types";
   let description = [{
-    A pass that implements the core functionality of a **sparse compiler**.
+    A pass that implements the core functionality of a **sparsifier**.
     Each Linalg operation (MLIR's tensor index notation) that operates on
     sparse tensor types is converted into code in which the sparsity is
     explicit both in terms of co-iterating looping logic as well as
@@ -332,7 +332,7 @@ def SparseVectorization : Pass<"sparse-vectorization", "ModuleOp"> {
 def SparseGPUCodegen : Pass<"sparse-gpu-codegen", "ModuleOp"> {
   let summary = "Generates GPU code during sparsification";
   let description = [{
-    Enables sparse compiler to use GPU acceleration.
+    Enables the sparsifier to use GPU acceleration.
   }];
   let constructor = "mlir::createSparseGPUCodegenPass()";
   let dependentDialects = [
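
As a usage sketch, these passes are typically added to a pass manager through their `create*` entry points from the dialect's Transforms/Passes.h; a full pipeline would surround them with further lowering passes, which are omitted here:

```
#include "mlir/Dialect/SparseTensor/Transforms/Passes.h"
#include "mlir/Pass/PassManager.h"

// Sketch of the sparsifier core wired into a pipeline: run the
// pre-sparsification rewriting first, then the sparsification pass itself.
void buildSparsifierCore(mlir::PassManager &pm) {
  pm.addPass(mlir::createPreSparsificationRewritePass());
  pm.addPass(mlir::createSparsificationPass());
}
```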
