Commit 58f8ae9

readme change
1 parent fa0f22f commit 58f8ae9

1 file changed: +40 -2 lines changed

examples/server/README.md

Lines changed: 40 additions & 2 deletions
@@ -114,6 +114,8 @@ node index.js
114114

115115
*Options:*
116116

117+
`prompt`: Provide the prompt for this completion as a string or as an array of strings or numbers representing tokens. Internally, the prompt is compared to the previous completion and only the "unseen" suffix is evaluated. If the prompt is a string or an array with the first element given as a string, a `bos` token is inserted in the front like `main` does.
118+
117119
`temperature`: Adjust the randomness of the generated text (default: 0.8).
118120

119121
`top_k`: Limit the next token selection to the K most probable tokens (default: 40).
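As a rough illustration of the `prompt` forms described above, the following sketch builds a request body and reports whether a `bos` token would be inserted under the stated rule. The helper names `expects_bos` and `build_request_body` are hypothetical and not part of the server; the defaults for `temperature` and `top_k` are copied from the option list.

```python
import json


def expects_bos(prompt):
    """Return True when, per the rule above, a `bos` token would be inserted."""
    if isinstance(prompt, str):
        return True
    # An array prompt gets a `bos` token only if its first element is a string.
    return isinstance(prompt, list) and bool(prompt) and isinstance(prompt[0], str)


def build_request_body(prompt, temperature=0.8, top_k=40):
    """Serialize a request body (hypothetical helper, defaults from the option list)."""
    if not isinstance(prompt, (str, list)):
        raise TypeError("prompt must be a string or an array")
    if isinstance(prompt, list):
        for item in prompt:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(item, bool) or not isinstance(item, (str, int)):
                raise TypeError("array elements must be strings or token numbers")
    return json.dumps({"prompt": prompt, "temperature": temperature, "top_k": top_k})
```

Passing a plain string or a mixed array such as `["Hello", 3]` makes `expects_bos` return `True`, while an array of token numbers only returns `False`. The token numbers here are placeholders, not real vocabulary ids.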
@@ -122,8 +124,8 @@ node index.js
122124

123125
`n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When 0, no tokens will be generated but the prompt is evaluated into the cache. (default: -1, -1 = infinity).
124126

125-
`n_keep`: Specify the number of tokens from the initial prompt to retain when the model resets its internal context.
126-
By default, this value is set to 0 (meaning no tokens are kept). Use `-1` to retain all tokens from the initial prompt.
127+
`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded.
128+
By default, this value is set to 0 (meaning no tokens are kept). Use `-1` to retain all tokens from the prompt.
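The effect of `n_keep` can be pictured with a small sketch. The discard policy below (keep the first `n_keep` prompt tokens, then as many of the most recent tokens as still fit) is an assumption made for illustration; the text above does not say which tokens the server actually drops. Both function names are hypothetical.

```python
def resolve_n_keep(n_keep, prompt_len):
    """Map the documented values to a token count: -1 keeps the whole prompt."""
    if n_keep < 0:
        return prompt_len
    return min(n_keep, prompt_len)


def discard_to_fit(tokens, n_keep, n_ctx, prompt_len):
    """Trim `tokens` to at most `n_ctx` entries (assumed policy, not the server's)."""
    if len(tokens) <= n_ctx:
        return list(tokens)
    keep = min(resolve_n_keep(n_keep, prompt_len), n_ctx)
    n_tail = n_ctx - keep
    head = list(tokens[:keep])
    # Fill the remaining room with the most recent tokens.
    tail = list(tokens[len(tokens) - n_tail:]) if n_tail > 0 else []
    return head + tail
```

With ten tokens, a four-token prompt, and `n_ctx` of 6, `n_keep=0` keeps only the six most recent tokens, while `n_keep=-1` keeps the four prompt tokens plus the two most recent ones.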
127129

128130
`stream`: It allows receiving each predicted token in real-time instead of waiting for the completion to finish. To enable this, set to `true`.
129131

@@ -160,6 +162,42 @@ node index.js
160162

161163
`n_probs`: If greater than 0, the response also contains the probabilities of top N tokens for each generated token (default: 0)
162164

165+
*Result JSON:*
166+
167+
Note: In streaming mode (`stream`), only `content` and `stop` are returned until the end of the completion.
168+
169+
`content`: Completion result as a string (excluding `stopping_word` if any). In case of streaming mode, will contain the next token as a string.
170+
171+
`stop`: Boolean for use with `stream` to check whether the generation has stopped (Note: This is not related to stopping words array `stop` from input options)
172+
173+
`generation_settings`: The provided options above excluding `prompt` but including `n_ctx`, `model`
174+
175+
`model`: The path to the model loaded with `-m`
176+
177+
`prompt`: The provided `prompt`
178+
179+
`stopped_eos`: Indicating whether the completion has stopped because it encountered the EOS token
180+
181+
`stopped_limit`: Indicating whether the completion stopped because `n_predict` tokens were generated before a stopping word or EOS was encountered
182+
183+
`stopped_word`: Indicating whether the completion stopped due to encountering a stopping word from `stop` JSON array provided
184+
185+
`stopping_word`: The stopping word encountered which stopped the generation (or "" if not stopped due to a stopping word)
186+
187+
`timings`: Hash of timing information about the completion such as the number of tokens `predicted_per_second`
188+
189+
`tokens_cached`: Number of tokens from the prompt which could be re-used from previous completion (`n_past`)
190+
191+
`tokens_evaluated`: Number of tokens evaluated in total from the prompt
192+
193+
`truncated`: Boolean indicating if the context size was exceeded during generation, i.e. the number of tokens provided in the prompt (`tokens_evaluated`) plus tokens generated (`tokens_predicted`) exceeded the context size (`n_ctx`)
194+
195+
`slot_id`: Assign the completion task to a specific slot. If -1, the task is assigned to an idle slot (default: -1)
196+
197+
`cache_prompt`: Save the prompt and generation to avoid reprocessing the entire prompt when only part of it has changed (default: false)
198+
199+
`system_prompt`: Change the system prompt (the initial prompt of all slots). This is useful for chat applications. [See more](#change-system-prompt-on-runtime)
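To show how the result fields listed above might be consumed, here is a minimal client-side sketch. It assumes each streamed chunk has already been parsed into a dict (the transport is not shown), and the helper names `collect_stream` and `stop_reason` are hypothetical.

```python
def collect_stream(chunks):
    """Join streamed `content` pieces until a chunk reports `stop`."""
    parts = []
    final = None
    for chunk in chunks:
        parts.append(chunk.get("content", ""))
        if chunk.get("stop"):
            # Only the last chunk is expected to carry the remaining fields.
            final = chunk
            break
    return "".join(parts), final


def stop_reason(result):
    """Summarize why generation ended, using the `stopped_*` flags."""
    if result.get("stopped_eos"):
        return "eos"
    if result.get("stopped_word"):
        return "word:" + result.get("stopping_word", "")
    if result.get("stopped_limit"):
        return "limit"
    return "unknown"
```

A caller would pass the parsed chunks to `collect_stream`, then hand the returned final chunk to `stop_reason` to distinguish an EOS stop from a stopping word or the `n_predict` limit.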
200+
163201
- **POST** `/tokenize`: Tokenize a given text.
164202

165203
*Options:*
