-
Notifications
You must be signed in to change notification settings - Fork 250
add aoti c/c++ runner to hqq tests; check output for gibberish using spell #824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/824
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 7e67742 with merge base 7c2d949 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
…ed sequence extraction
#! /bin/bash | ||
|
||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do it twice?
cat ${TMPFILE} | aspell -a -c | grep '^[\&#]' >/tmp/out.$$ | ||
# Exit with a non-zero status code if there were any spelling errors because: | ||
# * Finding one or more lines with & or # means we found a spelling error, might be gibberish | ||
if [ $? -ne 0 ]; then | ||
echo "No spelling errors found; likely correct operation. Success." | ||
exit 0 | ||
fi | ||
cat /tmp/out.$$ | ||
echo "Spelling errors found; might indicate garbage output. Failing." | ||
exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we introduce some sort of a tolerance criteria here (i.e. total number of unknown words)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea -- I've had one instance where generate produced an invented non-dictionary work with and 'he calls it "whateversomething".' That being said it only occurred once, so I decided to not overindex on that.
Allowing for some rate of output is a good solution to this. Should we proactively do that, or wait till we see if in actual test runs?
# print( | ||
# f"warning: {name} is padded to satisfy in_features % 1024 == 0" | ||
# ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, how this change is related to PR in question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you output that, it enters the spell checked sequence, and fails! We are too wordy with debug messages, this is primarily my fault in the quantization logic.
@@ -748,7 +748,7 @@ def callback(x): | |||
aggregate_metrics["tokens_per_sec"].append(tokens_sec) | |||
|
|||
if jit_compile: | |||
print(f"JIT compilation time (incl runtime): {compilation_time:.2} seconds") | |||
print(f"just-in-time compilation time (incl run time): {compilation_time:.2} seconds") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but shouldn't be part of the PR, should it?
@@ -441,6 +441,7 @@ def _initialize_model( | |||
|
|||
model.to(dtype=builder_args.precision) | |||
|
|||
print("-----------------------------------------------------------") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking for a delimiter for our gibberish (load, quantization and other time....). As far as model output it's as much gibberish as a hallucinating model (in terms of utlliity and connectedness to the model).
That being said, there's a point where I strongly believe in dumping performance data -- we had missed regressions when we did not.
@@ -244,3 +244,4 @@ jobs: | |||
echo "tests complete" | |||
echo "*******************************************" | |||
echo "::endgroup::" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
|
||
./cmake-out/aoti_run ${MODEL_DIR}/${MODEL_NAME}.so -z ${TOKENIZER_PATH} -i "${PROMPT}" > ./output_runner_aoti | ||
cat ./output_runner_aoti | ||
# .ci/scripts/check_gibberish ./output_runner_aoti --no-extract |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why check is skipped there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because aoti_runner does not work properly and I can't run proper tests.
Would love to add test here, and add aoti runner everywhere where we call generate --dso-path
today as a second test. Pending resolution of beoing able to load CPU model when the aoti_runner was built on a host with cuda.
|
||
if __name__ == "__main__": | ||
if len(sys.argv) < 2: | ||
print("Usage: python scriptname.py filename") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should have been something like `f"Usage:\n {sys.executable} {sys.argv[0]} filename")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
…spell (#824) * add runner to hqq tests * replace cat with a gibberish check * typo * create script to check for gibberish * update gibberish check * update gibberish check * use variable for tokenizer path * aspell dictionaries for english * exclude device name from gibberish check * handle JIT time line * handle Warning: * grep update * fix line exclusion * remove warning which causes gibberish check fail * add sequence extraction for principled handling of perf info and user messages * typo * change output to pass spell check * updates * handle runner which does not have sequence delimiters b/c does not need sequence extraction * add updated workflow yml * typo * native runner weirdness * remove secrets * don't log in for GGUF open_orca model
…spell (#824) * add runner to hqq tests * replace cat with a gibberish check * typo * create script to check for gibberish * update gibberish check * update gibberish check * use variable for tokenizer path * aspell dictionaries for english * exclude device name from gibberish check * handle JIT time line * handle Warning: * grep update * fix line exclusion * remove warning which causes gibberish check fail * add sequence extraction for principled handling of perf info and user messages * typo * change output to pass spell check * updates * handle runner which does not have sequence delimiters b/c does not need sequence extraction * add updated workflow yml * typo * native runner weirdness * remove secrets * don't log in for GGUF open_orca model
…spell (#824) * add runner to hqq tests * replace cat with a gibberish check * typo * create script to check for gibberish * update gibberish check * update gibberish check * use variable for tokenizer path * aspell dictionaries for english * exclude device name from gibberish check * handle JIT time line * handle Warning: * grep update * fix line exclusion * remove warning which causes gibberish check fail * add sequence extraction for principled handling of perf info and user messages * typo * change output to pass spell check * updates * handle runner which does not have sequence delimiters b/c does not need sequence extraction * add updated workflow yml * typo * native runner weirdness * remove secrets * don't log in for GGUF open_orca model
…spell (#824) * add runner to hqq tests * replace cat with a gibberish check * typo * create script to check for gibberish * update gibberish check * update gibberish check * use variable for tokenizer path * aspell dictionaries for english * exclude device name from gibberish check * handle JIT time line * handle Warning: * grep update * fix line exclusion * remove warning which causes gibberish check fail * add sequence extraction for principled handling of perf info and user messages * typo * change output to pass spell check * updates * handle runner which does not have sequence delimiters b/c does not need sequence extraction * add updated workflow yml * typo * native runner weirdness * remove secrets * don't log in for GGUF open_orca model
…spell (#824) * add runner to hqq tests * replace cat with a gibberish check * typo * create script to check for gibberish * update gibberish check * update gibberish check * use variable for tokenizer path * aspell dictionaries for english * exclude device name from gibberish check * handle JIT time line * handle Warning: * grep update * fix line exclusion * remove warning which causes gibberish check fail * add sequence extraction for principled handling of perf info and user messages * typo * change output to pass spell check * updates * handle runner which does not have sequence delimiters b/c does not need sequence extraction * add updated workflow yml * typo * native runner weirdness * remove secrets * don't log in for GGUF open_orca model
…spell (#824) * add runner to hqq tests * replace cat with a gibberish check * typo * create script to check for gibberish * update gibberish check * update gibberish check * use variable for tokenizer path * aspell dictionaries for english * exclude device name from gibberish check * handle JIT time line * handle Warning: * grep update * fix line exclusion * remove warning which causes gibberish check fail * add sequence extraction for principled handling of perf info and user messages * typo * change output to pass spell check * updates * handle runner which does not have sequence delimiters b/c does not need sequence extraction * add updated workflow yml * typo * native runner weirdness * remove secrets * don't log in for GGUF open_orca model
add aoti c/c++ runner to hqq tests
use spell check to detect garbage output