
Commit e379a68

[mlir] Remove obsolete "Quantization" section from the rationale.

* It reads as more of a TODO for the future and has been long obsoleted by later work.
* One of the authors of the referenced paper called this out as "weird stuff from two years ago" when reviewing the more recent TOSA RFC.

Differential Revision: https://reviews.llvm.org/D89329

1 parent 20e78eb commit e379a68


1 file changed: +0 -26 lines changed


mlir/docs/Rationale/Rationale.md

Lines changed: 0 additions & 26 deletions
@@ -427,32 +427,6 @@ arguments to explicitly break the use-def chains in the current proposal. This
 can be combined with an attribute-imposed semantic requirement disallowing the
 body of the region to refer to any value from outside it.
 
-### Quantized integer operations
-
-We haven't designed integer quantized operations in MLIR, but experience from
-TensorFlow suggests that it is better to put information about the quantization
-range/scale into the type itself, rather than have a single type like "qint8"
-and put these on attributes of the operation.
-
-There are a few ways to do this with MLIR, including at least:
-
-* We could do the same thing TensorFlow does - and we will _have_ to support
-that model to some extent for compatibility.
-* We can encode the fp range of quantized integers directly into the types
-when they are constants. The best practice on this seems to be to encode the
-zero point as well as a scale factor. This ensures that 0.0 is always
-exactly representable, e.g. `qi8<-1.42, 31.23x>`.
-* We could theoretically encode dynamically determined ranges into the types
-using something like `qi8<?,?>` with the bounds being determined through the
-SSA dataflow graph dynamically - similar to how dynamic shapes are handled.
-
-We will definitely need to do #1 for compatibility, we probably want to do #2,
-and we should investigate #3 over time. That said, our short term plan is to get
-more implementation experience with the rest of the system first, then come back
-to re-examine the representation for quantized arithmetic when we have that
-experience. When we do, we should chat with benoitjacob@ and
-[read the paper](https://arxiv.org/abs/1712.05877).
-
 ### Dialect type extensions
 
 This section describes the design decisions that shaped the dialect extensible
