
feat: improve temperature logic in chat #1749


Merged
merged 12 commits into main from greedy-chat-tokens on Apr 25, 2024

Conversation

drbh
Collaborator

@drbh drbh commented Apr 15, 2024

This PR adds support for do_sample to chat to enable greedy sampling

@drbh drbh changed the title from "feat: support do_sample param in ChatRequest" to "feat: improve temperature logic in chat" on Apr 15, 2024
@drbh drbh self-assigned this Apr 16, 2024
@drbh drbh force-pushed the greedy-chat-tokens branch from 7c8b473 to a1df6c5 on April 16, 2024 16:36
@drbh drbh requested a review from Narsil April 17, 2024 13:38
@@ -775,6 +776,10 @@ async fn chat_completions(
let logprobs = logprobs.unwrap_or(false);
let tool_prompt = tool_prompt.unwrap_or_default();
let stop = stop.unwrap_or_default();
// rescale temperature starting from 0.0 to 1.0
let adjusted_temperature = temperature.map_or(1.0, |t| t + 1.0);
Collaborator

I do not understand this. Why are you adding 1.0?

Collaborator Author

1 is added since OpenAI's API is essentially zero-indexed: they treat 0 as deterministic and higher values as more random.

In our case 1 is deterministic and greater is more random, so adding 1 allows users to get deterministic output with temp 0, and more random with larger values.

temperature:
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
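
For illustration, a minimal standalone sketch of that offset mapping (a hypothetical adjust helper mirroring the diff above, not the PR's final code):

// Hypothetical helper illustrating the +1.0 offset described above:
// an incoming OpenAI-style temperature is shifted so that a request
// with temperature 0 runs at internal temperature 1.
fn adjust(temperature: Option<f32>) -> f32 {
    temperature.map_or(1.0, |t| t + 1.0)
}

fn main() {
    assert_eq!(adjust(None), 1.0);      // no temperature supplied
    assert_eq!(adjust(Some(0.0)), 1.0); // "deterministic" request
    assert_eq!(adjust(Some(0.5)), 1.5); // "more random" request
}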

Collaborator

No it's not.

logits.pow(temperature).

temperature == 0 -> Pretty much forces deterministic sampling (with float issues).
temperature == 1 -> Regular distribution.
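
For concreteness, a minimal sketch of that behavior, assuming the standard warp where logits are divided by the temperature before the softmax (the form used by transformers' TemperatureLogitsWarper):

// Temperature warping sketch: scores are divided by the temperature,
// then normalized. t == 1 leaves the distribution unchanged; t -> 0
// concentrates all probability mass on the argmax (near-greedy).
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::MIN, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn warp(logits: &[f32], temperature: f32) -> Vec<f32> {
    let scaled: Vec<f32> = logits.iter().map(|l| l / temperature).collect();
    softmax(&scaled)
}

fn main() {
    let logits = [2.0_f32, 1.0, 0.5];
    println!("{:?}", warp(&logits, 1.0));  // regular distribution
    println!("{:?}", warp(&logits, 0.01)); // ~[1.0, 0.0, 0.0]: argmax wins
    // At exactly 0.0 the division yields inf and the softmax NaNs out --
    // the "float issues" above, and why sampling is disabled instead.
}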

Collaborator Author

makes sense! PR updated to:

  1. only set do_sample to false when temperature is 0
  2. rescale temperature from 0 to 1 only when temp is 0

I believe this fits the requirements, and enables greedy ONLY when temperature is 0. Please lmk if I should change! 🙏

@drbh drbh requested a review from Narsil April 17, 2024 14:12
@drbh
Collaborator Author

drbh commented Apr 17, 2024

tiny reproduction script

import requests

headers = {
    "Content-Type": "application/json",
}

data = {
    "model": "tgi",
    "messages": [
        {
            "role": "user",
            "content": "Summarize the main ideas of Jeff Walker's Product Launch Formula into bullet points as it pertains to a growth marketing agency implementing these strategies and tactics for their clients...",
        }
    ],
    "stream": False,
    "max_tokens": 100,
    "temperature": 0,
}

url = "http://localhost:3000/v1/chat/completions"

response = requests.post(url, headers=headers, json=data)

print(response.text)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@drbh drbh force-pushed the greedy-chat-tokens branch from a461eaa to a2c935d on April 17, 2024 14:42
Comment on lines 1014 to 1015
let adjusted_temperature = temperature.map_or(1.0, |t| if t == 0.0 { 1.0 } else { t });
let temperature = Some(adjusted_temperature);
Collaborator

Isn't that:

let temperature = if temperature == 0.0 {
    None
} else {
    Some(temperature)
};

Collaborator

Better yet, make the condition == 0.0 common to both do_sample and temperature (makes it more readable imho).
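
One possible reading of that suggestion, sketched with a hypothetical map_params helper (not the code that was merged): branch on == 0.0 once and derive both values from it.

fn map_params(temperature: Option<f32>) -> (bool, Option<f32>) {
    match temperature {
        // the single == 0.0 condition decides both values; 0 is replaced
        // with 1 here, though returning None would also skip the warp
        Some(t) if t == 0.0 => (false, Some(1.0)),
        other => (true, other),
    }
}

fn main() {
    assert_eq!(map_params(Some(0.0)), (false, Some(1.0))); // greedy path
    assert_eq!(map_params(Some(0.5)), (true, Some(0.5)));  // pass-through
}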

Collaborator Author

I don't think that's exactly equivalent, since we need the following:

  1. do_sample to be true in all cases except if the incoming temp is 0
  2. we need temperature to be an Option<f32>, and we need to replace 0 with 1.

most readable/simple I've come up with so far is...

let do_sample: bool = temperature.map_or(true, |t| t != 0.0);
let temperature = temperature.map(|t| if t == 0.0 { 1.0 } else { t });
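
As a quick sanity check, the same two lines wrapped in a hypothetical map_temperature helper (not code from the PR) and exercised on the three interesting inputs:

fn map_temperature(temperature: Option<f32>) -> (bool, Option<f32>) {
    let do_sample: bool = temperature.map_or(true, |t| t != 0.0);
    let temperature = temperature.map(|t| if t == 0.0 { 1.0 } else { t });
    (do_sample, temperature)
}

fn main() {
    assert_eq!(map_temperature(None), (true, None));            // default: sample
    assert_eq!(map_temperature(Some(0.0)), (false, Some(1.0))); // greedy
    assert_eq!(map_temperature(Some(0.7)), (true, Some(0.7)));  // pass-through
}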

whatcha think? happy to make any changes!

@Narsil Narsil merged commit 0acac5c into main Apr 25, 2024
@Narsil Narsil deleted the greedy-chat-tokens branch April 25, 2024 13:31
Nilabhra pushed a commit to TII-AI-Research-Center/text-generation-inference that referenced this pull request May 14, 2024
This PR adds support for `do_sample` to chat to enable greedy sampling

---------

Co-authored-by: Nicolas Patry <[email protected]>
kdamaszk pushed a commit to kdamaszk/tgi-gaudi that referenced this pull request Jun 10, 2024
This PR adds support for `do_sample` to chat to enable greedy sampling

---------

Co-authored-by: Nicolas Patry <[email protected]>