
Commit f7c1459

Michael Gschwind authored and larryliu0820 committed
4b embedding quantizer (#3081)
Summary: 4b embedding quantizer

Reviewed By: larryliu0820

Differential Revision: D56229021
1 parent 73d5e7e commit f7c1459

File tree

1 file changed (+8 -0 lines)


examples/models/llama2/quantize.py

Lines changed: 8 additions & 0 deletions

```diff
@@ -436,10 +436,18 @@ def __init__(
     @torch.no_grad()
     def forward(self, indices: torch.Tensor) -> torch.Tensor:
         if not self.packed:  # 8bit
+<<<<<<< HEAD
             return torch.ops.quantized_decomposed.embedding_byte.dtype(
                 self.weight, self.scales, None, 0, 0, indices, dtype=self.dtype
             )
         else:  # 4bit packed
             return torch.ops.quantized_decomposed.embedding_4bit.dtype(
+=======
+            return torch.ops.llama_quantized.DEPRECATED_DO_NOT_USE_embedding_byte.dtype(
+                self.weight, self.scales, None, 0, 0, indices, dtype=self.dtype
+            )
+        else:  # 4bit packed
+            return torch.ops.llama_quantized.embedding_4bit.dtype(
+>>>>>>> 6b3b7228c (4b embedding quantizer (#3081))
                 self.weight, self.scales, None, 0, 0, indices, dtype=self.dtype
             )
```
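For context, the sketch below illustrates what a 4-bit packed embedding lookup like the `embedding_4bit.dtype` op computes: two 4-bit values packed per byte, unpacked and dequantized with per-row scales at lookup time. This is an illustration only, not the ExecuTorch kernel; the helper names `pack_4bit` and `embedding_4bit_dequant` are hypothetical, and it assumes symmetric per-row quantization with values in [-8, 7].

```python
import torch

def pack_4bit(weight_q: torch.Tensor) -> torch.Tensor:
    # weight_q: int8 tensor in [-8, 7], shape (num_embeddings, dim), dim even.
    # Shift to unsigned [0, 15], then pack two nibbles into each byte.
    w = (weight_q + 8).to(torch.uint8)
    return (w[:, ::2] << 4) | w[:, 1::2]

def embedding_4bit_dequant(packed, scales, indices, dtype=torch.float32):
    # Gather packed rows, unpack the nibbles, and apply per-row scales.
    rows = packed[indices]                          # (n, dim // 2) uint8
    hi = (rows >> 4).to(torch.int8) - 8             # first value of each byte
    lo = (rows & 0xF).to(torch.int8) - 8            # second value of each byte
    w = torch.stack([hi, lo], dim=-1).flatten(-2)   # interleave back to (n, dim)
    return w.to(dtype) * scales[indices].to(dtype).unsqueeze(-1)
```

A row quantized as `[-8, 7, 0, 3]` with scale 0.5 round-trips through pack and lookup to `[-4.0, 3.5, 0.0, 1.5]`, halving the embedding table's storage relative to the 8-bit path.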
