SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, Settings UI, Streaming mode, OpenAi Compat (Model, Authorization Bearer), Save/Restore session, Auto Settings UI #7548


Merged
merged 59 commits into from
Jun 1, 2024
c83c19a
SimpleChat:DU:BringIn local helper js modules using importmap
hanishkvc May 25, 2024
54802dc
SimpleChat:DU: Add trim garbage at end in loop helper
hanishkvc May 25, 2024
6390f34
SimpleChat:DU:TrimGarbage if unable try skip char and retry
hanishkvc May 25, 2024
f33aa28
SimpleChat:DU: Try trim using histogram based info
hanishkvc May 25, 2024
d1e73d8
SimpleChat:DU: Switch trim garbage hist based to maxUniq simple
hanishkvc May 25, 2024
ae9f610
SimpleChat:DU: Bring in maxType to the mix along with maxUniq
hanishkvc May 25, 2024
15152af
SimpleChat:DU: Cleanup debug log messages
hanishkvc May 25, 2024
a41f701
SimpleChat:UI: Move html ui base helpers into its own module
hanishkvc May 26, 2024
ed345ab
SimpleChat:DU:Avoid setting frequence/Presence penalty
hanishkvc May 26, 2024
ae7e66d
SimpleChat:UI: Add and use a para-create-append helper
hanishkvc May 26, 2024
e42249d
SimpleChat:UI: Helper to create bool button and use it wrt settings
hanishkvc May 26, 2024
1e47a48
SimpleChat:UI: Add Select helper and use it wrt ChatHistoryInCtxt
hanishkvc May 26, 2024
94bc0b0
SimpleChat:UI:Select: dict-name-value, value wrt default, change
hanishkvc May 26, 2024
e17f5e0
SimpleChat:UI: Add Div wrapped label+element helpers
hanishkvc May 26, 2024
0dae12b
SimpleChat:UI:Add settings button and bring in settings ui
hanishkvc May 26, 2024
452813f
SimpleChat:UI:Settings make boolean button text show meaning
hanishkvc May 26, 2024
1db965d
SimpleChat: Update a bit wrt readme and notes in du
hanishkvc May 26, 2024
42b4fe5
SimpleChat: GarbageTrim enable/disable, show trimmed part ifany
hanishkvc May 26, 2024
f9fc543
SimpleChat: highlight trim, garbage trimming bitmore aggressive
hanishkvc May 27, 2024
b2c10b9
SimpleChat: Cleanup a bit wrt Api end point related flow
hanishkvc May 27, 2024
269cf3f
SimpleChat:Move extracting assistant response to SimpleChat class
hanishkvc May 27, 2024
f5f9a2b
SimpleChat:DU: Bring in both trim garbage logics to try trim
hanishkvc May 27, 2024
060925c
SimpleChat: Cleanup readme a bit, add one more chathistory length
hanishkvc May 27, 2024
9d0e65d
SimpleChat:Stream:Initial handshake skeleton
hanishkvc May 28, 2024
8f97c23
SimpleChat: Move handling oneshot mode server response
hanishkvc May 28, 2024
aecf0e2
SimpleChat: Move multi part server response handling in
hanishkvc May 28, 2024
08b117b
SimpleChat: Add MultiPart Response handling, common trimming
hanishkvc May 28, 2024
4d35455
SimpleChat: show streamed generative text as it becomes available
hanishkvc May 28, 2024
b7a5424
SimpleChat:DU: Add NewLines helper class
hanishkvc May 28, 2024
7251714
SimpleChat:DU: Make NewLines shift more robust and flexible
hanishkvc May 28, 2024
0792374
SimpleChat:HandleResponseMultiPart using NewLines helper
hanishkvc May 28, 2024
fcd385c
SimpleChat: Disable console debug by default by making it dummy
hanishkvc May 28, 2024
ace3704
SimpleChat:MultiPart/Stream flow cleanup
hanishkvc May 29, 2024
104848b
SimpleChat: Move baseUrl to Me and inturn gMe
hanishkvc May 29, 2024
ebf978d
SimpleChat:UI: Add input element helper
hanishkvc May 29, 2024
f54e000
SimpleChat: Add support for changing the base url
hanishkvc May 29, 2024
dce4e6a
SimpleChat: Move request headers into Me and gMe
hanishkvc May 29, 2024
c9559d2
SimpleChat: Rather need to use append to insert headers
hanishkvc May 29, 2024
af342b3
SimpleChat: Allow Authorization header to be set by end user
hanishkvc May 29, 2024
7a0399e
SimpleChat:UI+: Return div and element wrt creatediv helpers
hanishkvc May 29, 2024
85fd2d0
SimpleChat: readme wrt authorization, maybe minimal openai testing
hanishkvc May 29, 2024
0e7880a
SimpleChat: model request field for openai/equivalent compat
hanishkvc May 29, 2024
48f02e0
SimpleChat: readme stream-utf-8 trim-english deps, exception2error
hanishkvc May 29, 2024
009563d
Readme: Add a entry for simplechat in the http server section
hanishkvc May 29, 2024
b75b3db
SimpleChat:WIP:Collate internally, Stream mode Trap exceptions
hanishkvc May 29, 2024
cdb4f6d
SimpleChat:theResp-origMsg: Undo a prev change to fix non trim
hanishkvc May 29, 2024
872ee2c
SimpleChat: Save message internally in handle_response itself
hanishkvc May 29, 2024
ec79b8d
SimpleChat:Cleanup: Add spacing wrt shown req-options
hanishkvc May 30, 2024
803ee72
SimpleChat:UI: CreateDiv Divs map to GridX2 class
hanishkvc May 30, 2024
3d925cb
SimpleChat: Show Non SettingsUI config field by default
hanishkvc May 30, 2024
1d7739b
SimpleChat: Allow for multiline system prompt
hanishkvc May 30, 2024
e2efcb4
SimpleChat: Add basic skeleton for saving and loading chat
hanishkvc May 30, 2024
a15d4dc
SimpleChat:ODS: Add a prefix to chatid wrt ondiskstorage key
hanishkvc May 30, 2024
5d40866
SimpleChat:ODS:WIP:TMP: Add UI to load previously saved chat
hanishkvc May 30, 2024
4abcfde
SimpleChat:ODS:Move restore/load saved chat btn setup to Me
hanishkvc May 30, 2024
6ef57cc
SimpleChat:Readme updated wrt save and restore chat session info
hanishkvc May 30, 2024
bc68803
SimpleChat:Show chat session restore button, only if saved session
hanishkvc May 31, 2024
bb0f0c8
SimpleChat: AutoCreate ChatRequestOptions settings to an extent
hanishkvc May 31, 2024
c4141a5
SimpleChat: Update main README wrt usage with server
hanishkvc May 31, 2024
2 changes: 2 additions & 0 deletions README.md
@@ -150,6 +150,8 @@ Typically finetunes of the base models below are supported as well.

[llama.cpp web server](./examples/server) is a lightweight [OpenAI API](https://github.com/openai/openai-openapi) compatible HTTP server that can be used to serve local models and easily connect them to existing clients.

[simplechat](./examples/server/public_simplechat) is a simple chat client, which can be used from a local web browser to chat with the model exposed by the above web server (use --path to point the server at simplechat).

**Bindings:**

- Python: [abetlen/llama-cpp-python](https://github.com/abetlen/llama-cpp-python)
266 changes: 266 additions & 0 deletions examples/server/public_simplechat/datautils.mjs
@@ -0,0 +1,266 @@
//@ts-check
// Helpers to work with different data types
// by Humans for All
//

/**
* Given the limited context size of local LLMs, once the context gets filled
* between the prompt and the response, the model can start generating repeating garbage text.
* Setting a repetition penalty often just leads to over-intelligent garbage, ie
* repetition with slight variations. This garbage in turn can overload the
* available model context, leading to less valuable responses for subsequent prompts/queries,
* if chat history is sent to the ai model.
*
* So two simple-minded garbage trimming logics are experimented with below.
* * one based on progressively-larger-substring-based-repeat-matching-with-partial-skip and
* * another based on char-histogram-driven garbage trimming.
* * in future, characteristics of the histogram over varying lengths could be used to allow for
* a more aggressive and adaptive trimming logic.
*/


/**
* Simple-minded logic to help remove repeating garbage at the end of the string.
* The repetition needs to match perfectly.
*
* The logic progressively probes for repetition of longer and longer substrings
* (lengths 1 to maxSubL-1) and in turn picks the one with the longest chain.
* Nothing is trimmed unless that chain covers more than maxSubL chars and
* at least maxMatchLenThreshold chars.
*
* @param {string} sIn
* @param {number} maxSubL
* @param {number} maxMatchLenThreshold
*/
export function trim_repeat_garbage_at_end(sIn, maxSubL=10, maxMatchLenThreshold=40) {
let rCnt = [0];
let maxMatchLen = maxSubL;
let iMML = -1;
for(let subL=1; subL < maxSubL; subL++) {
rCnt.push(0);
let i;
let refS = sIn.substring(sIn.length-subL, sIn.length);
for(i=sIn.length; i > 0; i -= subL) {
let curS = sIn.substring(i-subL, i);
if (refS != curS) {
let curMatchLen = rCnt[subL]*subL;
if (maxMatchLen < curMatchLen) {
maxMatchLen = curMatchLen;
iMML = subL;
}
break;
}
rCnt[subL] += 1;
}
}
console.debug("DBUG:DU:TrimRepeatGarbage:", rCnt);
if ((iMML == -1) || (maxMatchLen < maxMatchLenThreshold)) {
return {trimmed: false, data: sIn};
}
console.debug("DBUG:TrimRepeatGarbage:TrimmedCharLen:", maxMatchLen);
let iEnd = sIn.length - maxMatchLen;
return { trimmed: true, data: sIn.substring(0, iEnd) };
}


/**
* Simple-minded logic to help remove repeating garbage at the end of the string, till it can't.
* If it's not able to trim, then it will try to skip a char at the end and then trim, a few times.
* This ensures that even if there are multiple runs of garbage with different patterns, the
* logic still tries to munch through them.
*
* @param {string} sIn
* @param {number} maxSubL
* @param {number | undefined} [maxMatchLenThreshold]
* @param {number} [skipMax] max number of consecutive skip-a-char retries before giving up
*/
export function trim_repeat_garbage_at_end_loop(sIn, maxSubL, maxMatchLenThreshold, skipMax=16) {
let sCur = sIn;
let sSaved = "";
let iTry = 0;
while(true) {
let got = trim_repeat_garbage_at_end(sCur, maxSubL, maxMatchLenThreshold);
if (got.trimmed != true) {
if (iTry == 0) {
sSaved = got.data;
}
iTry += 1;
if (iTry >= skipMax) {
return sSaved;
}
got.data = got.data.substring(0,got.data.length-1);
} else {
iTry = 0;
}
sCur = got.data;
}
}
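
The two repeat-matching helpers above can be illustrated with a small standalone run. This is only a condensed sketch, not the PR's exact code: the names `trimRepeatOnce` and `trimRepeatLoop` and the sample strings are made up for illustration, and the inner loop is simplified (it also counts a run that reaches the start of the string).

```javascript
// Hypothetical, condensed sketch of the repeat-matching trim; not the PR's exact code.
function trimRepeatOnce(sIn, maxSubL = 10, threshold = 40) {
    let best = 0; // longest run (in chars) of a perfectly repeating suffix
    for (let subL = 1; subL < maxSubL; subL++) {
        const ref = sIn.slice(-subL); // candidate repeating unit taken from the end
        let i = sIn.length;
        let count = 0;
        while (i >= subL && sIn.substring(i - subL, i) === ref) {
            count += 1;
            i -= subL;
        }
        best = Math.max(best, count * subL);
    }
    if (best < threshold) {
        return { trimmed: false, data: sIn };
    }
    return { trimmed: true, data: sIn.substring(0, sIn.length - best) };
}

// Keep trimming; when a trim fails, drop one trailing char and retry,
// giving up after skipMax consecutive failures.
function trimRepeatLoop(sIn, maxSubL, threshold, skipMax = 16) {
    let cur = sIn;
    let saved = "";
    let tries = 0;
    while (true) {
        const got = trimRepeatOnce(cur, maxSubL, threshold);
        if (got.trimmed) {
            tries = 0;
            cur = got.data;
            continue;
        }
        if (tries === 0) {
            saved = got.data; // text as it was before skipping started
        }
        tries += 1;
        if (tries >= skipMax) {
            return saved;
        }
        cur = got.data.substring(0, got.data.length - 1);
    }
}

const garbage = "ab".repeat(30); // 60 chars of exact repetition
const direct = trimRepeatOnce("The answer is 42." + garbage);
const noisy = trimRepeatLoop("The answer is 42." + garbage + "!?", 10, 40, 4);
console.log(direct.data); // The answer is 42.
console.log(noisy);       // The answer is 42.
```

In the second call the trailing `!?` hides the repetition from a single pass, so the loop has to skip two characters before the 60-char run becomes trimmable.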


/**
* A simple-minded attempt to trim garbage at the end using histogram-driven characteristics.
* There can be variation in the repetitions, as long as no new char crops up.
*
* This tracks the chars and their frequency in a specified length of substring at the end
* and in turn checks if moving further into the generated text from the end remains within
* the same char subset or goes beyond it, and based on that either trims the string at the
* end or not. This allows filtering garbage at the end, even if there are certain
* kinds of small variations in the repeated text wrt position of seen chars.
*
* Allow the garbage to contain fewer than maxUniq distinct chars, but at the same time ensure
* that a given type of char, ie numerals or alphabets or other types, doesn't cross the
* specified maxType limit. This allows intermixed text garbage to be identified and trimmed.
*
* ALERT: This is not perfect and only provides a rough garbage identification logic.
* Also it currently only differentiates between character classes wrt English.
*
* @param {string} sIn
* @param {number} maxType
* @param {number} maxUniq
* @param {number} maxMatchLenThreshold
*/
export function trim_hist_garbage_at_end(sIn, maxType, maxUniq, maxMatchLenThreshold) {
if (sIn.length < maxMatchLenThreshold) {
return { trimmed: false, data: sIn };
}
let iAlp = 0;
let iNum = 0;
let iOth = 0;
// Learn
let hist = {};
let iUniq = 0;
for(let i=0; i<maxMatchLenThreshold; i++) {
let c = sIn[sIn.length-1-i];
if (c in hist) {
hist[c] += 1;
} else {
if(c.match(/[0-9]/) != null) {
iNum += 1;
} else if(c.match(/[A-Za-z]/) != null) {
iAlp += 1;
} else {
iOth += 1;
}
iUniq += 1;
if (iUniq >= maxUniq) {
break;
}
hist[c] = 1;
}
}
console.debug("DBUG:TrimHistGarbage:", hist);
if ((iAlp > maxType) || (iNum > maxType) || (iOth > maxType)) {
return { trimmed: false, data: sIn };
}
// Catch and Trim
for(let i=0; i < sIn.length; i++) {
let c = sIn[sIn.length-1-i];
if (!(c in hist)) {
if (i < maxMatchLenThreshold) {
return { trimmed: false, data: sIn };
}
console.debug("DBUG:TrimHistGarbage:TrimmedCharLen:", i);
return { trimmed: true, data: sIn.substring(0, sIn.length-i+1) };
}
}
console.debug("DBUG:TrimHistGarbage:Trimmed fully");
return { trimmed: true, data: "" };
}
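
The histogram idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the function above: `trimHistSketch` and `windowLen` (standing in for `maxMatchLenThreshold`) are hypothetical names, it uses a `Set` rather than a frequency map, and it cuts exactly at the first char outside the learned set, so its output may differ by one character from the function above.

```javascript
// Hypothetical sketch of the histogram-driven trim; simplified relative to the PR's function.
function trimHistSketch(sIn, maxType, maxUniq, windowLen) {
    if (sIn.length < windowLen) {
        return { trimmed: false, data: sIn };
    }
    // Learn: the set of distinct chars in the trailing window, counted per class.
    const seen = new Set(sIn.slice(-windowLen).split(""));
    const counts = { alpha: 0, num: 0, other: 0 };
    for (const c of seen) {
        if (/[0-9]/.test(c)) {
            counts.num += 1;
        } else if (/[A-Za-z]/.test(c)) {
            counts.alpha += 1;
        } else {
            counts.other += 1;
        }
    }
    const tooVaried = (seen.size >= maxUniq) ||
        (counts.alpha > maxType) || (counts.num > maxType) || (counts.other > maxType);
    if (tooVaried) {
        return { trimmed: false, data: sIn }; // tail looks like ordinary text
    }
    // Catch and trim: extend backwards while chars stay inside the learned set.
    let keep = sIn.length - windowLen;
    while (keep > 0 && seen.has(sIn[keep - 1])) {
        keep -= 1;
    }
    return { trimmed: true, data: sIn.substring(0, keep) };
}

// Garbage whose pattern changes midway but which only ever uses "=" and "-".
const mixed = "Done." + "=-".repeat(20) + "==--".repeat(10);
const histGot = trimHistSketch(mixed, 8, 24, 72);
console.log(histGot.data); // Done.

// Ordinary prose uses too many distinct letters in its tail, so it is left alone.
const prose = "The quick brown fox jumps over the lazy dog near the quiet river bank today.";
const proseGot = trimHistSketch(prose, 8, 24, 72);
console.log(proseGot.trimmed); // false
```

The values 8, 24 and 72 mirror the ones passed by `trim_garbage_at_end` further down in this file.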

/**
* Keep trimming repeatedly using hist_garbage logic, till you no longer can.
* This ensures that even if there are multiple runs of garbage with different patterns,
* the logic still tries to munch through them.
*
* @param {any} sIn
* @param {number} maxType
* @param {number} maxUniq
* @param {number} maxMatchLenThreshold
*/
export function trim_hist_garbage_at_end_loop(sIn, maxType, maxUniq, maxMatchLenThreshold) {
let sCur = sIn;
while (true) {
let got = trim_hist_garbage_at_end(sCur, maxType, maxUniq, maxMatchLenThreshold);
if (!got.trimmed) {
return got.data;
}
sCur = got.data;
}
}

/**
* Try trim garbage at the end by using both the hist-driven-garbage-trimming as well as
* skip-a-bit-if-reqd-then-repeat-pattern-based-garbage-trimming, with blind retrying.
* @param {string} sIn
*/
export function trim_garbage_at_end(sIn) {
let sCur = sIn;
for(let i=0; i<2; i++) {
sCur = trim_hist_garbage_at_end_loop(sCur, 8, 24, 72);
sCur = trim_repeat_garbage_at_end_loop(sCur, 32, 72, 12);
}
return sCur;
}


/**
* NewLines array helper.
* Allows maintaining a list of lines.
* Allows a line to be built up/appended part by part.
*/
export class NewLines {

constructor() {
/** @type {string[]} */
this.lines = [];
}

/**
* Extracts lines from the passed string and in turn either
* appends to a previous partial line or adds a new line.
* @param {string} sLines
*/
add_append(sLines) {
let aLines = sLines.split("\n");
let lCnt = 0;
for(let line of aLines) {
lCnt += 1;
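// Add back newline removed if any during split
if (lCnt < aLines.length) {
line += "\n";
} else {
// split() leaves an empty last piece when sLines ends with a newline (or is empty);
// skip it rather than pushing a spurious extra line.
if (line.length == 0) {
continue;
}
}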
// Append if required
if (lCnt == 1) {
let lastLine = this.lines[this.lines.length-1];
if (lastLine != undefined) {
if (!lastLine.endsWith("\n")) {
this.lines[this.lines.length-1] += line;
continue;
}
}
}
// Add new line
this.lines.push(line);
}
}

/**
* Shift the oldest/earliest/0th line in the array. [Old-New|Earliest-Latest]
* Optionally control whether only full lines (ie those with newline at end) will be returned
* or will a partial line without a newline at end (can only be the last line) be returned.
* @param {boolean} bFullWithNewLineOnly
*/
shift(bFullWithNewLineOnly=true) {
let line = this.lines[0];
if (line == undefined) {
return undefined;
}
if ((line[line.length-1] != "\n") && bFullWithNewLineOnly){
return undefined;
}
return this.lines.shift();
}

}
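
To show why a line buffer of this kind helps with streamed responses, here is a minimal standalone sketch. `LineBuffer` is a hypothetical, condensed stand-in for the class above (it splits on newlines with `indexOf` rather than `split`), and the chunk contents are made up for illustration.

```javascript
// Hypothetical, condensed sketch of a NewLines-style buffer; not the PR's exact class.
class LineBuffer {
    constructor() {
        /** @type {string[]} */
        this.lines = [];
    }

    // Cut the chunk after every "\n"; the first piece completes a pending partial line.
    add(chunk) {
        let start = 0;
        while (start < chunk.length) {
            const nl = chunk.indexOf("\n", start);
            const end = (nl === -1) ? chunk.length : nl + 1;
            const part = chunk.substring(start, end);
            const last = this.lines[this.lines.length - 1];
            if (last !== undefined && !last.endsWith("\n")) {
                this.lines[this.lines.length - 1] += part;
            } else {
                this.lines.push(part);
            }
            start = end;
        }
    }

    // Return the oldest line; by default only if it is complete (ends with "\n").
    shift(fullOnly = true) {
        const line = this.lines[0];
        if (line === undefined) {
            return undefined;
        }
        if (fullOnly && !line.endsWith("\n")) {
            return undefined;
        }
        return this.lines.shift();
    }
}

// Network chunks can split a line anywhere; only complete lines are handed out.
const buf = new LineBuffer();
const got = [];
for (const chunk of ['data: {"a"', ':1}\n\ndata: [DO', 'NE]\n']) {
    buf.add(chunk);
    let line;
    while ((line = buf.shift()) !== undefined) {
        got.push(line);
    }
}
console.log(JSON.stringify(got)); // ["data: {\"a\":1}\n","\n","data: [DONE]\n"]
```

Passing `false` to `shift` releases a trailing partial line as well, which is useful once the stream has ended.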
24 changes: 13 additions & 11 deletions examples/server/public_simplechat/index.html
@@ -8,29 +8,31 @@
<meta name="description" content="SimpleChat: trigger LLM web service endpoints /chat/completions and /completions, single/multi chat sessions" />
<meta name="author" content="by Humans for All" />
<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate" />
-<script src="simplechat.js" defer></script>
+<script type="importmap">
+{
+"imports": {
+"datautils": "./datautils.mjs",
+"ui": "./ui.mjs"
+}
+}
+</script>
+<script src="simplechat.js" type="module" defer></script>
<link rel="stylesheet" href="simplechat.css" />
</head>
<body>
<div class="samecolumn" id="fullbody">

-<div class="sameline">
+<div class="sameline" id="heading">
<p class="heading flex-grow" > <b> SimpleChat </b> </p>
-<div class="sameline">
-<label for="api-ep">Mode:</label>
-<select name="api-ep" id="api-ep">
-<option value="chat" selected>Chat</option>
-<option value="completion">Completion</option>
-</select>
-</div>
+<button id="settings">Settings</button>
</div>

<div id="sessions-div" class="sameline"></div>

<hr>
<div class="sameline">
<label for="system-in">System</label>
-<input type="text" name="system" id="system-in" placeholder="e.g. you are a helpful ai assistant, who provides concise answers" class="flex-grow"/>
+<textarea name="system" id="system-in" rows="2" placeholder="e.g. you are a helpful ai assistant, who provides concise answers" class="flex-grow"></textarea>
</div>

<hr>
@@ -40,7 +42,7 @@

<hr>
<div class="sameline">
-<textarea id="user-in" class="flex-grow" rows="3" placeholder="enter your query to the ai model here" ></textarea>
+<textarea id="user-in" class="flex-grow" rows="2" placeholder="enter your query to the ai model here" ></textarea>
<button id="user-btn">submit</button>
</div>
