create dir on download

lucylq · lucylq · commit 517531b15f65 · 2024-04-24T18:41:29.000-07:00
diff --git a/README.md b/README.md
@@ -110,7 +110,6 @@ Designed for interactive graphical conversations using the familiar web browser
 
 Quantization is the process of converting a model into a more memory-efficient representation.  Quantization is particularly important for accelerators, to take advantage of the available memory bandwidth and fit in their often limited high-speed memory, and for mobile devices, to fit in their typically very limited memory.
 
-
 Depending on the model and the target device, different quantization recipes may be applied.  Torchchat contains two example configurations to optimize performance: `config/data/cuda.json` for GPU-based systems and `config/data/mobile.json` for mobile systems.  The GPU configuration targets memory bandwidth, which is a scarce resource in powerful GPUs (and, to a lesser degree, memory footprint, to fit large models into a device's memory).  The mobile configuration targets memory footprint because on many devices a single application is limited to as little as GB or less of memory.
 
 You can use the quantization recipes in conjunction with any of the `chat`, `generate` and `browser` commands to test their impact and accelerate model execution. You will apply these recipes to the export commands below to optimize the exported models.  To adapt these recipes or write your own, please refer to the [quantization overview](docs/quantization.md).
diff --git a/download.py b/download.py
@@ -105,6 +105,7 @@ def download_and_convert(
         # overwriting if necessary.
         if os.path.isdir(model_dir):
             shutil.rmtree(model_dir)
+        os.makedirs(model_dir, exist_ok=True)
         shutil.move(temp_dir, model_dir)
 
     finally:
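
Note on the `download.py` hunk: the added `os.makedirs(model_dir, exist_ok=True)` runs immediately before `shutil.move(temp_dir, model_dir)`. Python's `shutil.move` documents that when the destination is an existing directory, the source is moved inside it rather than renamed to it. The sketch below demonstrates that standard-library behavior in isolation and shows one possible alternative, creating only the parent directory. The helper name `move_into_place`, the `demo` function, and all paths and file names are hypothetical and are not taken from torchchat.

```python
import os
import shutil
import tempfile


def move_into_place(temp_dir: str, model_dir: str) -> None:
    """Hypothetical helper: replace model_dir with temp_dir.

    Creates missing parent directories, but not model_dir itself, so that
    shutil.move renames temp_dir to model_dir instead of nesting it.
    """
    if os.path.isdir(model_dir):
        shutil.rmtree(model_dir)
    parent = os.path.dirname(os.path.abspath(model_dir))
    os.makedirs(parent, exist_ok=True)
    shutil.move(temp_dir, model_dir)


def make_source(root: str, name: str) -> str:
    """Create a stand-in download directory containing one placeholder file."""
    src = os.path.join(root, name)
    os.makedirs(src)
    with open(os.path.join(src, "weights.bin"), "w") as f:
        f.write("placeholder")
    return src


def demo() -> dict:
    root = tempfile.mkdtemp()
    try:
        # Case 1: destination directory is created first, then moved onto.
        src_a = make_source(root, "tmp_a")
        dst_a = os.path.join(root, "models", "a")
        os.makedirs(dst_a, exist_ok=True)
        shutil.move(src_a, dst_a)
        nested = os.path.exists(os.path.join(dst_a, "tmp_a", "weights.bin"))

        # Case 2: only the parent directory is created before the move.
        src_b = make_source(root, "tmp_b")
        dst_b = os.path.join(root, "models", "b")
        move_into_place(src_b, dst_b)
        flat = os.path.exists(os.path.join(dst_b, "weights.bin"))

        return {"nested": nested, "flat": flat}
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

Whether the nested layout matters depends on how the rest of `download_and_convert` locates files under `model_dir`, which this diff does not show, so treat the sketch as something to check rather than a confirmed defect.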