-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: improve temperature logic in chat #1749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7c8b473
to
a1df6c5
Compare
router/src/server.rs
Outdated
@@ -775,6 +776,10 @@ async fn chat_completions( | |||
let logprobs = logprobs.unwrap_or(false); | |||
let tool_prompt = tool_prompt.unwrap_or_default(); | |||
let stop = stop.unwrap_or_default(); | |||
// rescale temperature starting from 0.0 to 1.0 | |||
let adjusted_temperature = temperature.map_or(1.0, |t| t + 1.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand this. Why are you adding 1.0 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 is added since openais api is essentially zero indexed. They treat 0
as deterministic and higher values as more random.
In our case 1 is deterministic and greater is more random, so adding 1 allows users to get deterministic output with temp 0, and more random with larger values.
temperature:
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not.
logits.pow(temperature).
temperature == 0 -> Pretty much forces deteministic sampling (with float issues).
temperature == 1 -> Regular distribution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense! PR updated to:
- only set
do_sample
tofalse
whentemperature
is 0 - rescale
temperature
from 0 to 1 only when temp is 0
I believe this fits the requirements, and enables greedy ONLY when temperature is 0. Please lmk if I should change! 🙏
tiny reproduction script import requests
headers = {
"Content-Type": "application/json",
}
data = {
"model": "tgi",
"messages": [
{
"role": "user",
"content": "Summarize the main ideas of Jeff Walker's Product Launch Formula into bullet points as it pertains to a growth marketing agency implementing these strategies and tactics for their clients...",
}
],
"stream": False,
"max_tokens": 100,
"temperature": 0,
}
url = "http://localhost:3000/v1/chat/completions"
response = requests.post(url, headers=headers, json=data)
print(response.text) |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
a461eaa
to
a2c935d
Compare
router/src/server.rs
Outdated
let adjusted_temperature = temperature.map_or(1.0, |t| if t == 0.0 { 1.0 } else { t }); | ||
let temperature = Some(adjusted_temperature); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn´t that.
let temperature = if temperature == 0.0{
None
} else{ Some(temperature) }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better yet, make the condition == 0.0 common to both do_sample and temperature (makes it more readable imho).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that exactly equivalent since we need the following:
- do_sample to be true in all cases except if the incoming temp is 0
- we need temperature to be a
Option<f32>
and we need to replace 0 with 1.
most readable/simple I've come up with so far is...
let do_sample: bool = temperature.map_or(true, |t| t != 0.0);
let temperature = temperature.map(|t| if t == 0.0 { 1.0 } else { t });
whatcha think? happy to make any changes!
This PR adds support for `do_sample` to chat to enable greedy sampling --------- Co-authored-by: Nicolas Patry <[email protected]>
This PR adds support for `do_sample` to chat to enable greedy sampling --------- Co-authored-by: Nicolas Patry <[email protected]>
This PR adds support for
do_sample
to chat to enable greedy sampling