Improved quantize script #222
Changes from 1 commit
@@ -0,0 +1,59 @@
#!/usr/bin/python3

"""Script to execute quantization on a given model."""

import subprocess
import argparse
import sys
import os


def main():
    """Parse the command line arguments and execute the script."""

    parser = argparse.ArgumentParser(
        prog='Quantization Script',
        description='This script quantizes a model.'
    )

    # Positional arguments may not set `dest` explicitly; argparse derives
    # it from the name and raises a ValueError otherwise.
    parser.add_argument("models", nargs='+')
    parser.add_argument(
        '-r', '--remove-16', action='store_true', dest='remove_f16',
        help='Remove the f16 model after quantizing it.'
    )

    args = parser.parse_args()

    for model in args.models:

        # Each model lives in its own sub-directory of "models/"; list that
        # directory rather than the f16 file itself, since os.listdir()
        # cannot iterate over a single file.
        model_dir = os.path.join("models", model)

        for i in os.listdir(model_dir):
Review comment: there's another PR about parallelizing the quantizations. Here it would be easy just to wrap this in a …

Reply: I don't understand exactly what can be parallelized here, is it quantizing many models at the same time? The code from the latest commit removed the loop that acts upon the incorrect listing of the file (…
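Presumably the suggestion is to wrap the per-model work in something like multiprocessing.Pool. A minimal sketch of that idea, assuming the quantize invocations are independent of one another; the helper name quantize_one and the example model list are purely illustrative and not part of this PR:

```python
import os
import subprocess
from multiprocessing import Pool


def quantize_one(model):
    """Quantize the f16 file of a single model directory."""
    f16_path = os.path.join("models", model, "ggml-model-f16.bin")
    subprocess.run(
        ["./quantize", f16_path, f16_path.replace("f16", "q4_0"), "2"],
        check=True
    )


if __name__ == "__main__":
    # Example model names; in the real script these would come from argparse.
    models = ["7B", "13B"]
    with Pool() as pool:
        pool.map(quantize_one, models)
```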
            # Only quantize the f16 model files; ignore anything else that
            # may be in the directory.
            if not i.endswith("f16.bin"):
                continue

            f16_path = os.path.join(model_dir, i)

            # Pass the argument list directly: with shell=True only the
            # first list element would reach the shell as the command.
            subprocess.run(
                ["./quantize", f16_path, f16_path.replace("f16", "q4_0"), "2"],
                check=True
            )

            if args.remove_f16:
                os.remove(f16_path)


if __name__ == "__main__":
    try:
        main()

    except subprocess.CalledProcessError:
        print("An error occurred while trying to quantize the models.")
        sys.exit(1)

    except FileNotFoundError as err:
        print(
            f'A FileNotFoundError exception was raised while executing the '
            f'script:\n{err}\nMake sure you are located in the root of the '
            f'repository and that the models are in the "models" directory.'
        )
        sys.exit(1)

    except KeyboardInterrupt:
        sys.exit(0)