
Commit 217d783

Added parameterised search and download for Hugging Face. Updated README.md
1 parent 483b6ba commit 217d783

File tree

- .gitignore
- docker/README.md
- docker/auto_docker/hug_model.py

3 files changed: +47 −27 lines changed


.gitignore

Lines changed: 3 additions & 0 deletions
@@ -164,3 +164,6 @@ cython_debug/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 .idea/
+
+# model .bin files
+docker/auto_docker/*.bin

docker/README.md

Lines changed: 21 additions & 20 deletions
@@ -1,3 +1,11 @@
+# Install Docker Server
+
+**Note #1:** This was tested with Docker running on Linux. If you can get it working on Windows or MacOS, please update this `README.md` with a PR!
+
+[Install Docker Engine](https://docs.docker.com/engine/install)
+
+**Note #2:** NVidia GPU CuBLAS support requires a NVidia GPU with sufficient VRAM (approximately as much as the size above) and Docker NVidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
+
 # Simple Dockerfiles for building the llama-cpp-python server with external model bin files
 - `./openblas_simple/Dockerfile` - a simple Dockerfile for non-GPU OpenBLAS, where the model is located outside the Docker image
   - `cd ./openblas_simple`
@@ -15,14 +23,14 @@
 - `hug_model.py` - a Python utility for interactively choosing and downloading the latest `5_1` quantized models from [huggingface.co/TheBloke](https://huggingface.co/TheBloke)
 - `Dockerfile` - a single OpenBLAS and CuBLAS combined Dockerfile that automatically installs a previously downloaded model `model.bin`
 
-## Get model from Hugging Face
-`python3 ./hug_model.py`
-
-You should now have a model in the current directory and `model.bin` symlinked to it for the subsequent Docker build and copy step. e.g.
+## Download a Llama Model from Hugging Face
+- To download an MIT-licensed Llama model, run: `python3 ./hug_model.py -a vihangd -s open_llama_7b_700bt_ggml`
+- To select and install a restricted-license Llama model, run: `python3 ./hug_model.py -a TheBloke -t llama`
+- You should now have a model in the current directory and `model.bin` symlinked to it for the subsequent Docker build and copy step, e.g.
 ```
 docker $ ls -lh *.bin
--rw-rw-r-- 1 user user 4.8G May 23 18:30 <downloaded-model-file>.q5_1.bin
-lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>.q5_1.bin
+-rw-rw-r-- 1 user user 4.8G May 23 18:30 <downloaded-model-file>q5_1.bin
+lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>q5_1.bin
 ```
 **Note #1:** Make sure you have enough disk space to download the model. As the model is then copied into the image you will need at least
 **TWICE** as much disk space as the size of the model:
@@ -36,22 +44,15 @@ lrwxrwxrwx 1 user user 24 May 23 18:30 model.bin -> <downloaded-model-file>.q5
 
 **Note #2:** If you want to pass or tune additional parameters, customise `./start_server.sh` before running `docker build ...`
 
-# Install Docker Server
-
-**Note #3:** This was tested with Docker running on Linux. If you can get it working on Windows or MacOS, please update this `README.md` with a PR!
-
-[Install Docker Engine](https://docs.docker.com/engine/install)
-
-# Use OpenBLAS
+## Use OpenBLAS
 Use if you don't have a NVidia GPU. Defaults to `python:3-slim-bullseye` Docker base image and OpenBLAS:
-## Build:
-`docker build --build-arg -t openblas .`
-## Run:
+### Build:
+`docker build -t openblas .`
+### Run:
 `docker run --cap-add SYS_RESOURCE -t openblas`
 
-# Use CuBLAS
-Requires a NVidia GPU with sufficient VRAM (approximately as much as the size above) and Docker NVidia support (see [container-toolkit/install-guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html))
-## Build:
+## Use CuBLAS
+### Build:
 `docker build --build-arg IMAGE=nvidia/cuda:12.1.1-devel-ubuntu22.04 -t cublas .`
-## Run:
+### Run:
 `docker run --cap-add SYS_RESOURCE -t cublas`
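
The `model.bin` symlink referenced in the README is what the Dockerfile's copy step picks up. As a minimal sketch of how such a link can be created, assuming a helper along these lines (the function name `link_model` and the example filename are illustrative; the actual logic lives inside `hug_model.py` and is not shown in this diff):

```python
import os

def link_model(downloaded_file, link_name="model.bin"):
    """Point model.bin at a freshly downloaded .bin file."""
    # Remove any stale symlink left over from a previous download.
    if os.path.islink(link_name):
        os.remove(link_name)
    os.symlink(downloaded_file, link_name)

# Hypothetical usage with an illustrative filename:
# link_model("open-llama-7b-700bt.ggmlv3.q5_1.bin")
```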

docker/auto_docker/hug_model.py

Lines changed: 23 additions & 7 deletions
@@ -2,6 +2,7 @@
 import json
 import os
 import struct
+import argparse
 
 def make_request(url, params=None):
     print(f"Making request to {url}...")
@@ -69,21 +70,28 @@ def get_user_choice(model_list):
 
     return None
 
-import argparse
-
 def main():
     # Create an argument parser
-    parser = argparse.ArgumentParser(description='Process the model version.')
+    parser = argparse.ArgumentParser(description='Process some parameters.')
+
+    # Arguments
     parser.add_argument('-v', '--version', type=int, default=0x0003,
                         help='an integer for the version to be used')
+    parser.add_argument('-a', '--author', type=str, default='TheBloke',
+                        help='an author to be filtered')
+    parser.add_argument('-t', '--tags', type=str, default='llama',
+                        help='tags for the content')
+    parser.add_argument('-s', '--search', type=str, default='',
+                        help='search term')
 
     # Parse the arguments
     args = parser.parse_args()
 
     # Define the parameters
     params = {
-        "author": "TheBloke",  # Filter by author
-        "tags": "llama"
+        "author": args.author,
+        "tags": args.tags,
+        "search": args.search
     }
 
     models = make_request('https://huggingface.co/api/models', params=params)
@@ -103,14 +111,22 @@ def main():
         if rfilename and 'q5_1' in rfilename:
             model_list.append((model_id, rfilename))
 
-    model_choice = get_user_choice(model_list)
+    # Choose the model
+    if len(model_list) == 1:
+        model_choice = model_list[0]
+    else:
+        model_choice = get_user_choice(model_list)
+
     if model_choice is not None:
         model_id, rfilename = model_choice
         url = f"https://huggingface.co/{model_id}/resolve/main/{rfilename}"
         download_file(url, rfilename)
         _, version = check_magic_and_version(rfilename)
         if version != args.version:
-        print(f"Warning: Expected version {args.version}, but found different version in the file.")
+            print(f"Warning: Expected version {args.version}, but found different version in the file.")
+    else:
+        print("Error - model choice was None")
+        exit(1)
 
 if __name__ == '__main__':
     main()
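
To see how the new flags drive the search, here is a self-contained sketch of the same query against the Hugging Face models API using only the standard library (the script's own `make_request()` helper may differ; the `author`, `tags`, and `search` keys mirror the `params` dict in the diff above):

```python
import json
import urllib.parse
import urllib.request

def search_models(author="TheBloke", tags="llama", search=""):
    # Build the same query parameters that hug_model.py now takes
    # from argparse, then fetch the model list as JSON.
    params = {"author": author, "tags": tags, "search": search}
    url = "https://huggingface.co/api/models?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# e.g. the MIT-licensed example from the README:
# models = search_models(author="vihangd", search="open_llama_7b_700bt_ggml")
```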
