by Humans for All.

+ ## quickstart
+
+ To run from the build dir
+
+ bin/llama-server -m path/model.gguf --path ../examples/server/public_simplechat
+
+ Continue reading for the details.

## overview
@@ -14,7 +21,7 @@ own system prompts.
This allows seeing the generated text / ai-model response in oneshot at the end, after it is fully generated,
or potentially as it is being generated, in a streamed manner from the server/ai-model.

- ![Chat and Settings screens](./simplechat_screens.png)
+ ![Chat and Settings screens](./simplechat_screens.png "Chat and Settings screens")

Auto saves the chat session locally as and when the chat is progressing and inturn at a later time when you
open SimpleChat, option is provided to restore the old chat session, if a matching one exists.
@@ -183,10 +190,11 @@ It is attached to the document object. Some of these can also be updated using t
user at runtime by directly modifying gMe.chatRequestOptions, setting ui entries will be auto
created.

- cache_prompt option supported by example/server is allowed to be controlled by user. So that
- any caching supported with system-prompt and chat history if usable can get used. If one has
- enabled chat history sliding window, then the chat history caching may or maynot kick in at
- the backend, based on aspects related to positional encoding, attention mechanism etal.
+ cache_prompt option supported by example/server is allowed to be controlled by user, so that
+ any caching supported wrt system-prompt and chat history, if usable can get used. When chat
+ history sliding window is enabled, cache_prompt logic may or may not kick in at the backend
+ wrt same, based on aspects related to model, positional encoding, attention mechanism etal.
+ However system prompt should ideally get the benefit of caching.

headers - maintains the list of http headers sent when request is made to the server. By default
Content-Type is set to application/json. Additionally Authorization entry is provided, which can