Clean up CLI output #473

Merged
merged 1 commit on Apr 25, 2024

Conversation

@GregoryComer (Member) commented on Apr 24, 2024

Move a few log messages to debug level, and update the chat messages to a slightly more minimal form.

With changes to chat:

Entering Chat Mode. Will continue chatting back and forth with the language model until the models max context length of 2048 tokens is hit or until the user says /bye
System Prompt [Optional]:
User: Hello llama. Please write a script to print the numbers 1 through 10 in python.
Model: Hello there! As a llama, I'd be delighted to help you with that. Here's a simple script to print the numbers 1 through 10 in Python:

for i in range(1, 11):
    print(i)

Let me explain what this script does:

...
User: /bye
Exiting Chat.

==========
Average tokens/sec: 2.32
Memory used: 0.00 GB

Also, I tested generate stories15m. Should be unchanged:

Using device=cpu Apple M1 Pro
Loading model...
Time to load model: 0.03 seconds
Quantizing the model with: { }
Time to quantize model: 0.00 seconds
Hello, my name is Sue. He is a brave bear. He is always ready to help.
Sue loves to bounce around in the forest. One day, she was bouncing around when she spotted some old, tough honey.
She hopped closer and said, "Oh no! What can I do?"
Suddenly, a bear appeared. He said, "Don't worry, Sue. I can get you some honey!"
Sue was very happy. She said, "Thank you!" The bear smiled, and then said, "Now I can help you eat."
Sue thanked the bear again and went off to bounce around in the forest. She had lots of fun and was so happy that the bear helped her. Once upon a time, there was a little girl named Lily. She loved to play outside in the snow. One day, she went outside to build a snowman.
Max Sequence Length Reached. Ending Conversation.
==========
Average tokens/sec: 238.14
Memory used: 0.00 GB

@facebook-github-bot added the "CLA Signed" label on Apr 24, 2024 (this label is managed by the Meta Open Source bot).
@@ -667,10 +667,10 @@ def callback(x):
tokens_generated = y.size(0) - prompt_length
tokens_sec = tokens_generated / t
aggregate_metrics["tokens_per_sec"].append(tokens_sec)
logging.info(
Contributor:

Hmm, what's wrong with this being an info? I think for generate it's pretty important to see the perf, isn't it?

Member Author:

These messages are printed after every model response, which breaks up the chat session. The metrics are still printed at the info level at the end of the conversation (verified for both chat and generate).
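The pattern described here can be sketched as follows. This is a minimal illustration of the idea, not the actual torchchat code: the function names and the sample numbers are hypothetical, and only the `aggregate_metrics["tokens_per_sec"]` bookkeeping mirrors the diff snippet above. Per-response stats go to debug so they no longer interrupt the chat session, while the averaged summary stays at info.

```python
import logging

logging.basicConfig(level=logging.INFO)

# Hypothetical sketch: mirrors the aggregate_metrics bookkeeping from the
# diff, with per-response logging demoted from info to debug.
aggregate_metrics = {"tokens_per_sec": []}

def log_response_metrics(tokens_generated: int, elapsed_sec: float) -> None:
    tokens_sec = tokens_generated / elapsed_sec
    aggregate_metrics["tokens_per_sec"].append(tokens_sec)
    # Demoted to debug: hidden during normal chat, visible with DEBUG logging.
    logging.debug(
        "Generated %d tokens in %.2f sec (%.2f tokens/sec)",
        tokens_generated, elapsed_sec, tokens_sec,
    )

def log_conversation_summary() -> None:
    # Still emitted at info level once, at the end of the conversation.
    rates = aggregate_metrics["tokens_per_sec"]
    if rates:
        logging.info("Average tokens/sec: %.2f", sum(rates) / len(rates))

# Hypothetical usage: two model responses, then the summary.
log_response_metrics(100, 43.1)
log_response_metrics(120, 50.0)
log_conversation_summary()
```

With the root logger at INFO, only the final average reaches the console; running with the level set to DEBUG restores the per-response lines.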

@GregoryComer GregoryComer merged commit 1d4841f into pytorch:main Apr 25, 2024
malfet pushed a commit that referenced this pull request on Jul 17, 2024