
Created a Server example #1025

Closed

zrthxn wants to merge 5 commits

Conversation

zrthxn commented Apr 17, 2023

I've created an example that provides an HTTP interface to LLaMA using Crow. Crow comes as a single header file, which I've committed. The library depends on Boost, so to build this example you need to install Boost first (on macOS: brew install boost).
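
For reference, a minimal Crow server looks roughly like the sketch below. This is illustrative only, not the code committed in this PR; the header name, route, and handler body are assumptions.

// Illustrative Crow sketch (not the actual PR code).
// Crow ships as a single header but pulls in Boost at build time.
#include "crow_all.h"

int main() {
    crow::SimpleApp app;

    // Hypothetical completion endpoint; a real handler would tokenize
    // req.body and run llama inference before responding.
    CROW_ROUTE(app, "/completion").methods("POST"_method)
    ([](const crow::request& req) {
        return crow::response(200, "received: " + req.body);
    });

    app.port(8080).multithreaded().run();
}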

prusnak (Collaborator) commented Apr 17, 2023

Please fix:

  • build failures (the CI above is failing all over the place)
  • conflicts with the main branch
  • whitespace (see editorconfig failure in the CI)

Also, you shouldn't make such massive changes to main.cpp; otherwise this will never get merged and you'll be fixing the conflicts forever.

It's better to just add a server.cpp example that is independent of main.cpp (at the cost of some code duplication).

zrthxn (Author) commented Apr 18, 2023

@prusnak Sorry, I should've marked this as a draft; I meant to fix all of that.

prusnak marked this pull request as draft April 18, 2023 09:30
zrthxn (Author) commented Apr 27, 2023

@prusnak Could you help me get the Windows build (windows-latest-cmake) working? I'm not sure how to install Boost there.

zrthxn marked this pull request as ready for review April 27, 2023 11:27
ggerganov (Member) left a comment

  • Add a LLAMA_BOOST CMake option here, OFF by default (a sketch follows this list):

https://github.com/ggerganov/llama.cpp/blob/5fba3c016bfd1d73a070e7c93dac14162ce118d0/CMakeLists.txt#L65-L71

  • Make this example build only if LLAMA_BOOST is ON
  • Make a separate CI job just for LLAMA_BOOST. Only one OS is enough; we don't want to install and link Boost in all CI jobs
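
A rough sketch of what the option could look like; it follows the existing LLAMA_* option style, but the exact wording and the example's directory are assumptions, not code from the PR:

# Hypothetical sketch, not from the PR; mirrors the existing LLAMA_* options
option(LLAMA_BOOST "llama: build examples that require Boost" OFF)

if (LLAMA_BOOST)
    find_package(Boost REQUIRED)
    add_subdirectory(examples/server)  # assumed location of this example
endif()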

prusnak (Collaborator) commented Apr 28, 2023

@zrthxn Do we even need Boost? We are on C++11 and lots of Boost features are being moved to std::. Doesn't Crow work with plain C++11?

prusnak (Collaborator) commented Apr 28, 2023

Do we even need Boost? We are on C++11 and lots of Boost features are being moved to std::. Doesn't Crow work with plain C++11?

Answering my own question: yes, it seems Crow still needs Boost :-/

prusnak (Collaborator) commented Apr 28, 2023

Wondering whether we shouldn't use a different C++ header-only HTTP(S) server library that does not require Boost, such as https://github.com/yhirose/cpp-httplib
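
A minimal cpp-httplib server looks roughly like this (sketch only, nothing here is from this PR); it's a single header with no Boost dependency:

// Minimal cpp-httplib sketch; single header, no Boost required.
#include "httplib.h"

int main() {
    httplib::Server svr;

    svr.Get("/hi", [](const httplib::Request&, httplib::Response& res) {
        res.set_content("Hello from a llama.cpp server sketch", "text/plain");
    });

    svr.listen("127.0.0.1", 8080);
}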

ggerganov (Member) commented Apr 28, 2023

@prusnak This lib does look much better. Boost comes with many drawbacks, hence the request to put this behind a CMake option.

We can either rework the PR to use the proposed lib, or merge it like this and replace it later, once someone implements an example using cpp-httplib.

zrthxn (Author) commented Apr 28, 2023

@ggerganov I don't think it will be too hard to use another library; I'm only using very minimal functionality from Crow, and I only have one or two endpoints, so I'll rework this to use cpp-httplib. I liked Crow mainly because it advertised that it's a single header.

prusnak (Collaborator) commented Apr 28, 2023

I liked Crow mainly because it advertised that it's a single header.

So is cpp-httplib.

zrthxn (Author) commented Apr 28, 2023

@ggerganov @prusnak By the way, there's a slight issue I've come across with serving the model: if an incoming request is cancelled, i.e. the client disconnects, the eval loop keeps running and consuming CPU. My guess is that, at least with Crow, this is because the endpoint handler you write as a lambda expression runs in a separate thread, and that thread doesn't get stopped or killed when the client disconnects.

prusnak (Collaborator) commented Apr 28, 2023

Let's see how cpp-httplib deals with that.
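
One plausible approach, assuming tokens are streamed via cpp-httplib's chunked content provider: the provider callback can check whether the sink is still writable and return false to stop. Rough sketch only; generate_next_token() is a hypothetical stand-in for one llama eval step:

// Sketch: stop generating when the client disconnects, using cpp-httplib's
// chunked content provider. generate_next_token() is a hypothetical
// placeholder for one llama eval step.
#include "httplib.h"
#include <string>

static std::string generate_next_token() {
    static int n = 0;
    return (n++ < 5) ? "tok " : ""; // empty string signals end of generation
}

int main() {
    httplib::Server svr;

    svr.Get("/completion", [](const httplib::Request&, httplib::Response& res) {
        res.set_chunked_content_provider(
            "text/plain",
            [](size_t /*offset*/, httplib::DataSink& sink) {
                if (!sink.is_writable()) {
                    return false; // client disconnected; stop generating
                }
                std::string token = generate_next_token();
                if (token.empty()) {
                    sink.done();  // normal end of stream
                    return true;
                }
                sink.write(token.data(), token.size());
                return true;      // keep streaming
            });
    });

    svr.listen("127.0.0.1", 8080);
}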

ggerganov (Member) commented Apr 28, 2023

In the C++ world there is no way to terminate a thread once it has started, except to join it (i.e. wait for it to finish).
To solve the described issue, we need an abort mechanism added to the llama API.
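
Roughly, such a mechanism could look like this on the server side. Sketch only; the flag and the loop shape are illustrative assumptions, not existing llama.cpp API:

// Sketch of an abortable eval loop; illustrative, not llama.cpp API.
#include <atomic>
#include <cstdio>

// Set by the HTTP layer when it notices the client has disconnected.
static std::atomic<bool> abort_requested{false};

static void eval_loop(int n_predict) {
    for (int i = 0; i < n_predict; ++i) {
        if (abort_requested.load(std::memory_order_relaxed)) {
            break; // stop evaluating instead of burning CPU for a dead client
        }
        // llama_eval(...) and token emission would go here.
        std::printf("token %d\n", i);
    }
}

int main() {
    eval_loop(8);
}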

zrthxn (Author) commented Apr 28, 2023

@ggerganov One way of implementing aborting: in the eval loop, before writing a token to the output stream (stdout or a file stream), check whether some special character or sequence like ABORT--ABORT has been written; if so, break.

FSSRepo (Collaborator) commented Apr 30, 2023

Hello everyone, I have a fork of llama.cpp with a cpp-httplib server here.

It doesn't require external dependencies.

Limitations:

  • Only tested on Windows and Linux.
  • CMake build only.
  • Only one context at a time.
  • Only Vicuna is supported for interaction.

Usage

Get Code

git clone https://github.com/FSSRepo/llama.cpp.git
cd llama.cpp

Build

mkdir build
cd build
cmake ..
cmake --build . --config Release

Run

Model tested: Vicuna

server -m ggml-vicuna-7b-q4_0.bin --keep -1 --ctx_size 2048

Test the endpoints with Node.js

You need to have Node.js installed.

mkdir llama-client
cd llama-client
npm init
npm install axios

Create an index.js file and put this inside:

const axios = require('axios');

async function Test() {
    let result = await axios.post("http://127.0.0.1:8080/setting-context", {
        context: [
            { role: "system", content: "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions." },
            { role: "user", content: "Hello, Assistant." },
            { role: "assistant", content: "Hello. How may I help you today?" },
            { role: "user", content: "Please tell me the largest city in Europe." },
            { role: "assistant", content: "Sure. The largest city in Europe is Moscow, the capital of Russia." }
        ],
        batch_size: 64,
        temperature: 0.2,
        top_k: 40,
        top_p: 0.9,
        n_predict: 2048,
        threads: 5
    });
    result = await axios.post("http://127.0.0.1:8080/set-message", {
        message: ' What is linux?'
    });
    if(result.data.can_inference) {
        result = await axios.get("http://127.0.0.1:8080/completion?stream=true", { responseType: 'stream' });
        result.data.on('data', (data) => {
            // token by token completion
            let dat = JSON.parse(data.toString());
            process.stdout.write(dat.content);
        });
    }
}

Test();

And run it:

node .
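
For a quick check without Node.js, curl can hit the same endpoints (assuming the context has already been set, as in the script above):

curl -X POST http://127.0.0.1:8080/set-message \
     -H "Content-Type: application/json" \
     -d '{"message": " What is linux?"}'

curl "http://127.0.0.1:8080/completion?stream=true"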

Sorry for my bad English and my C++ practices :(

x4080 commented May 9, 2023

@FSSRepo Hello, I tried running

cmake --build . --config Release

and it gives errors on my Mac mini (M2 Pro), while

cmake ..

works.

Can you help? Thanks

FSSRepo (Collaborator) commented May 9, 2023

@x4080 Could you detail the error in the Issues tab on my fork, please?

x4080 commented May 10, 2023

Could you detail the error in the Issues tab on my fork, please?

I wanted to do that before, but there was no issue yet, hehe. I'll do it now; see you there, and thank you very much, man.

x4080 commented May 10, 2023

@ggerganov I tried this API and I think I love it; I can't wait for it to be integrated into llama.cpp.

@FSSRepo Good work, man.

zrthxn (Author) commented May 11, 2023

Closing in favor of https://github.com/FSSRepo/llama.cpp

zrthxn closed this May 11, 2023
x4080 commented May 12, 2023

@zrthxn So it won't be merged into llama.cpp?

zrthxn (Author) commented May 12, 2023

@x4080 I think the version made by @FSSRepo is better than mine; a lot of stuff is taken care of there, so that one should probably be merged. If you'd like me to convert my version to use cpp-httplib, I can do that too.

x4080 commented May 12, 2023

Oh, I thought you had the same merge request as @FSSRepo. My mistake then.

prusnak mentioned this pull request May 14, 2023