
Commit 16d7d83: create dir on download

1 parent 25a105f

2 files changed: 3 additions, 2 deletions


README.md (2 additions, 2 deletions)

@@ -101,9 +101,9 @@ Designed for interactive graphical conversations using the familiar web browser

  ## Quantizing your model (suggested for mobile)

- Quantization is the process of converting a model into a more memory-efficient representation. Quantization is particularly important for accelerators, to take advantage of the available memory bandwidth and to fit in the often limited high-speed memory of accelerators, and for mobile devices, to fit in their typically very limited memory.
+ Quantization is the process of converting a model into a more memory-efficient representation. Quantization is particularly important for accelerators, to take advantage of the available memory bandwidth and to fit in the often limited high-speed memory of accelerators, and for mobile devices, to fit in their typically very limited memory.

- With quantization, 32-bit floating-point numbers can be represented with as few as 8 or even 4 bits plus a scale shared by a group of these weights. This transformation is lossy and modifies the behavior of models. While research is being conducted on how to efficiently quantize large language models for use on mobile devices, this transformation invariably results in both quality loss and reduced control over the output of the models, leading to an increased risk of undesirable responses, hallucinations and stuttering.
+ With quantization, 32-bit floating-point numbers can be represented with as few as 8 or even 4 bits plus a scale shared by a group of these weights. This transformation is lossy and modifies the behavior of models. While research is being conducted on how to efficiently quantize large language models for use on mobile devices, this transformation invariably results in both quality loss and reduced control over the output of the models, leading to an increased risk of undesirable responses, hallucinations and stuttering.

  In effect, a developer quantizing a model has much control, and even more responsibility, to quantify and reduce these effects.
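The group-wise scheme described in the README hunk above can be illustrated with a short sketch. The code below is not part of the repository; the function names, the group size of 32, and the use of NumPy are assumptions chosen only for illustration. Each group of weights is rounded to low-bit signed integers and stores one shared floating-point scale.

```python
# Illustrative sketch of group-wise quantization (hypothetical helpers, not repository code).
import numpy as np

def quantize_groupwise(weights: np.ndarray, group_size: int = 32, bits: int = 4):
    """Quantize a 1-D float32 vector to signed low-bit integers, one scale per group."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for 4-bit signed values
    groups = weights.reshape(-1, group_size)   # assumes len(weights) is a multiple of group_size
    scales = np.abs(groups).max(axis=1, keepdims=True) / qmax
    scales[scales == 0] = 1.0                  # avoid dividing by zero for all-zero groups
    # Stored in int8 for clarity; a real implementation would pack two 4-bit values per byte.
    q = np.clip(np.round(groups / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_groupwise(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Lossily reconstruct the float weights from integers and per-group scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

w = np.random.randn(128).astype(np.float32)
q, s = quantize_groupwise(w)
print("max reconstruction error:", np.abs(w - dequantize_groupwise(q, s)).max())
```

Printing the reconstruction error makes the lossy nature of the transformation, which the README warns about, directly visible.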

download.py (1 addition, 0 deletions)

@@ -93,6 +93,7 @@ def download_and_convert(
          # overwriting if necessary.
          if os.path.isdir(model_dir):
              shutil.rmtree(model_dir)
+         os.makedirs(model_dir, exist_ok=True)
          shutil.move(temp_dir, model_dir)

      finally:
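For context, the sketch below shows the directory-handling pattern this one-line change affects. The helper name and paths are hypothetical and this is not the repository's actual download_and_convert implementation; it only illustrates why the destination directory is created before the move.

```python
# Minimal sketch (hypothetical helper, not repository code) of the pattern the patch touches.
import os
import shutil

def install_download(temp_dir: str, model_dir: str) -> None:
    # Replace any previous download, overwriting if necessary.
    if os.path.isdir(model_dir):
        shutil.rmtree(model_dir)
    # The line this commit adds: create the destination directory (and any
    # missing parents) before moving the freshly downloaded files.
    os.makedirs(model_dir, exist_ok=True)
    # Because model_dir now exists, shutil.move places temp_dir inside it
    # (i.e. as model_dir/<basename of temp_dir>).
    shutil.move(temp_dir, model_dir)
```

Using exist_ok=True keeps the call idempotent, so the helper behaves the same whether or not the directory was already present.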
