
Commit 90be46f

README - Usage examples updated

1 parent a710b2e commit 90be46f
File tree: 2 files changed (124 additions, 50 deletions)


README.md

Lines changed: 123 additions & 50 deletions
@@ -21,18 +21,22 @@ This is a no-nonsense async Scala client for OpenAI API supporting all the avail
 Note that, in order to be consistent with the OpenAI API naming, the service function names match the API endpoint titles/descriptions exactly, in camel case.
 Also, we aimed for the lib to be self-contained with the fewest dependencies possible; therefore we ended up using only two libs `play-ahc-ws-standalone` and `play-ws-standalone-json` (at the top level). Additionally, if dependency injection is required, we use the `scala-guice` lib as well.
 
+---
+
 (🔥 **New**) In addition to the OpenAI API, this library also supports "API-compatible" providers such as:
 - [Azure OpenAI](https://azure.microsoft.com/en-us/products/ai-services/openai-service) - cloud-based, utilizes OpenAI models but with lower latency
 - [Azure AI](https://azure.microsoft.com/en-us/products/ai-studio) - cloud-based, offers a vast selection of open-source models
 - [Anthropic](https://www.anthropic.com/api) - cloud-based, a major competitor to OpenAI, features proprietary/closed-source models such as Claude 3 - Haiku, Sonnet, and Opus
 - [Groq](https://wow.groq.com/) - cloud-based, known for its super-fast inference with LPUs
-- [Fireworks](https://fireworks.ai/) - cloud-based
+- [Fireworks AI](https://fireworks.ai/) - cloud-based
 - [OctoAI](https://octo.ai/) - cloud-based
 - [Ollama](https://ollama.com/) - runs locally, serves as an umbrella for open-source LLMs including LLaMA3, dbrx, and Command-R
 - [FastChat](https://github.com/lm-sys/FastChat) - runs locally, serves as an umbrella for open-source LLMs such as Vicuna, Alpaca, LLaMA2, and FastChat-T5
 
 See [examples](https://github.com/cequence-io/openai-scala-client/tree/master/openai-examples/src/main/scala/io/cequence/openaiscala/examples/nonopenai) for more details.
 
+---
+
 👉 For background information, read an article about the lib/client on [Medium](https://medium.com/@0xbnd/openai-scala-client-is-out-d7577de934ad).
 
 Try out also our [Scala client for the Pinecone vector database](https://github.com/cequence-io/pinecone-scala), or use both clients together! [This demo project](https://github.com/cequence-io/pinecone-openai-scala-demo) shows how to generate and store OpenAI embeddings (with the `text-embedding-ada-002` model) into Pinecone and query them afterward. The OpenAI + Pinecone combo is commonly used for autonomous AI agents, such as [babyAGI](https://github.com/yoheinakajima/babyagi) and [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT).
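
A rough sketch of the embeddings half of that demo, assuming the lib's standard implicit setup (`ExecutionContext` plus Akka `Materializer`) and an API key supplied via config/env; the response field names (`data`, `embedding`) follow the client's embedding response type as documented, and the Pinecone upsert is left as a comment since it lives in the separate `pinecone-scala` lib:

```scala
import akka.actor.ActorSystem
import akka.stream.Materializer
import io.cequence.openaiscala.domain.ModelId
import io.cequence.openaiscala.domain.settings.CreateEmbeddingsSettings
import io.cequence.openaiscala.service.OpenAIServiceFactory
import scala.concurrent.ExecutionContext

implicit val ec: ExecutionContext = ExecutionContext.global
implicit val materializer: Materializer = Materializer(ActorSystem())

// expects the API key via config, e.g. the OPENAI_SCALA_CLIENT_API_KEY env variable
val service = OpenAIServiceFactory()

service
  .createEmbeddings(
    input = Seq("Some document to embed"),
    settings = CreateEmbeddingsSettings(ModelId.text_embedding_ada_002)
  )
  .map { response =>
    val vector = response.data.head.embedding // Seq[Double]
    // upsert `vector` into a Pinecone index via pinecone-scala (see the demo project)
  }
```
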
@@ -99,13 +103,7 @@ Then you can obtain a service in one of the following ways.
 )
 ```
 
-- Minimal `OpenAICoreService` supporting `listModels`, `createCompletion`, `createChatCompletion`, and `createEmbeddings` calls - e.g. [FastChat](https://github.com/lm-sys/FastChat) service running on the port 8000
-
-```scala
-val service = OpenAICoreServiceFactory("http://localhost:8000/v1/")
-```
-
-- For Azure with API Key
+- For **Azure** with API Key
 
 ```scala
 val service = OpenAIServiceFactory.forAzureWithApiKey(
@@ -116,22 +114,87 @@ Then you can obtain a service in one of the following ways.
 )
 ```
 
-- For Azure with Access Token
+- Minimal `OpenAICoreService` supporting `listModels`, `createCompletion`, `createChatCompletion`, and `createEmbeddings` calls - provided e.g. by a [FastChat](https://github.com/lm-sys/FastChat) service running on port 8000
 
 ```scala
-val service = OpenAIServiceFactory.forAzureWithAccessToken(
-  resourceName = "your-resource-name",
-  deploymentId = "your-deployment-id", // usually model name such as "gpt-35-turbo"
-  apiVersion = "2023-05-15", // newest version
-  accessToken = "your_access_token"
+val service = OpenAICoreServiceFactory("http://localhost:8000/v1/")
+```
+
+- `OpenAIChatCompletionService` providing solely `createChatCompletion`
+
+1. [Groq](https://wow.groq.com/)
+```scala
+val service = OpenAIChatCompletionServiceFactory(
+  coreUrl = "https://api.groq.com/openai/v1/",
+  authHeaders = Seq(("Authorization", s"Bearer ${sys.env("GROQ_API_KEY")}"))
+)
+```
+
+2. [Azure AI](https://azure.microsoft.com/en-us/products/ai-studio) - e.g. Cohere R+ model
+```scala
+val service = OpenAIChatCompletionServiceFactory.forAzureAI(
+  endpoint = sys.env("AZURE_AI_COHERE_R_PLUS_ENDPOINT"),
+  region = sys.env("AZURE_AI_COHERE_R_PLUS_REGION"),
+  accessToken = sys.env("AZURE_AI_COHERE_R_PLUS_ACCESS_KEY")
+)
+```
+
+3. [Anthropic](https://www.anthropic.com/api) (requires our `openai-anthropic-client` lib)
+```scala
+val service = AnthropicServiceFactory.asOpenAI()
+```
+
+4. [Fireworks AI](https://fireworks.ai/)
+```scala
+val service = OpenAIChatCompletionServiceFactory(
+  coreUrl = "https://api.fireworks.ai/inference/v1/",
+  authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
+)
+```
+
+5. [Octo AI](https://octo.ai/)
+```scala
+val service = OpenAIChatCompletionServiceFactory(
+  coreUrl = "https://text.octoai.run/v1/",
+  authHeaders = Seq(("Authorization", s"Bearer ${sys.env("OCTOAI_TOKEN")}"))
+)
+```
+
+6. [Ollama](https://ollama.com/)
+```scala
+val service = OpenAIChatCompletionServiceFactory(
+  coreUrl = "http://localhost:11434/v1/"
+)
+```
+
+- Services with additional streaming support - `createCompletionStreamed` and `createChatCompletionStreamed` provided by [OpenAIStreamedServiceExtra](./openai-client-stream/src/main/scala/io/cequence/openaiscala/service/OpenAIStreamedServiceExtra.scala) (requires the `openai-scala-client-stream` lib)
+
+```scala
+import io.cequence.openaiscala.service.StreamedServiceTypes.OpenAIStreamedService
+import io.cequence.openaiscala.service.OpenAIStreamedServiceImplicits._
+
+val service: OpenAIStreamedService = OpenAIServiceFactory.withStreaming()
+```
+
+and similarly for a chat-completion service
+
+```scala
+import io.cequence.openaiscala.service.OpenAIStreamedServiceImplicits._
+
+val service = OpenAIChatCompletionServiceFactory.withStreaming(
+  coreUrl = "https://api.fireworks.ai/inference/v1/",
+  authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
 )
 ```
 
-**✔️ Important**: If you want streaming support use `OpenAIServiceStreamedFactory` or `OpenAICoreServiceStreamedFactory` from `openai-scala-client-stream` lib instead of `OpenAIServiceFactory` (in the three examples above). Three additional functions - `createCompletionStreamed`, `createChatCompletionStreamed`, and `listFineTuneEventsStreamed` (deprecated) provided by [OpenAIServiceStreamedExtra](./openai-client-stream/src/main/scala/io/cequence/openaiscala/service/OpenAIServiceStreamedExtra.scala) will be then available.
-🔥 **New**: Note that it is now possible to use a streamed service also with a non-OpenAI provider e.g. as:
+or, if only streaming is required:
 
 ```scala
-val service = OpenAICoreServiceStreamedFactory.customInstance("http://localhost:8000/v1/")
+val service: OpenAIChatCompletionStreamedServiceExtra =
+  OpenAIChatCompletionStreamedServiceFactory(
+    coreUrl = "https://api.fireworks.ai/inference/v1/",
+    authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
+  )
 ```
 
 - Via dependency injection (requires `openai-scala-guice` lib)
@@ -140,6 +203,8 @@ Then you can obtain a service in one of the following ways.
 class MyClass @Inject() (openAIService: OpenAIService) {...}
 ```
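
As a rough sketch of the wiring side - the module below is hypothetical and for illustration only; `openai-scala-guice` ships its own bindings, so consult that lib for the actual module to install:

```scala
import akka.actor.ActorSystem
import akka.stream.Materializer
import com.google.inject.{AbstractModule, Guice}
import io.cequence.openaiscala.service.{OpenAIService, OpenAIServiceFactory}
import scala.concurrent.ExecutionContext

implicit val ec: ExecutionContext = ExecutionContext.global
implicit val materializer: Materializer = Materializer(ActorSystem())

// hypothetical module binding OpenAIService to a factory-created instance
class MyOpenAIModule extends AbstractModule {
  override def configure(): Unit =
    bind(classOf[OpenAIService]).toInstance(OpenAIServiceFactory())
}

val injector = Guice.createInjector(new MyOpenAIModule)
val myClass = injector.getInstance(classOf[MyClass])
```
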
 
+---
+
 **II. Calling functions**
 
 Full documentation of each call with its respective inputs and settings is provided in [OpenAIService](./openai-core/src/main/scala/io/cequence/openaiscala/service/OpenAIService.scala). Since all the calls are async, they return responses wrapped in `Future`.
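
Since every call returns a `Future`, even the simplest one is consumed with the usual combinators; a minimal sketch, assuming a `service` built as above and an implicit `ExecutionContext` in scope:

```scala
import scala.concurrent.ExecutionContext.Implicits.global

// listModels takes no parameters and resolves to the available models
service.listModels.map { models =>
  models.foreach(println)
}
```
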
@@ -191,7 +256,7 @@ Examples:
 service.createCompletion(
   text,
   settings = CreateCompletionSettings(
-    model = ModelId.text_davinci_001,
+    model = ModelId.gpt_3_5_turbo_16k,
     max_tokens = Some(1500),
     temperature = Some(0.9),
     presence_penalty = Some(0.2),
@@ -202,23 +267,6 @@ Examples:
 )
 ```
 
-- 🔥 **New**: Count used tokens before calling `createChatCompletions` or `createChatFunCompletions`, this help you select proper model ex. `gpt-3.5-turbo` or `gpt-3.5-turbo-16k` and reduce costs. This is an experimental feature and it may not work for all models.
-
-```scala
-import io.cequence.openaiscala.service.OpenAICountTokensHelper
-import io.cequence.openaiscala.domain.{ChatRole, FunMessageSpec, FunctionSpec}
-
-class MyCompletionService extends OpenAICountTokensHelper {
-  def exec = {
-    val messages: Seq[FunMessageSpec] = ??? // messages to be sent to OpenAI
-    val function: FunctionSpec = ??? // function to be called
-
-    val tokens = countFunMessageTokens(messages, List(function), Some(function.name))
-  }
-}
-
-```
-
 - Create completion with streaming and a custom setting
 
 ```scala
@@ -265,11 +313,12 @@ For this to work you need to use `OpenAIServiceStreamedFactory` from `openai-sca
 
 ```scala
 val messages = Seq(
-  FunMessageSpec(role = ChatRole.User, content = Some("What's the weather like in Boston?")),
+  SystemMessage("You are a helpful assistant."),
+  UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?")
 )
 
 // as a param type we can use "number", "string", "boolean", "object", "array", and "null"
-val functions = Seq(
+val tools = Seq(
   FunctionSpec(
     name = "get_current_weather",
     description = Some("Get the current weather in a given location"),
@@ -278,38 +327,62 @@ For this to work you need to use `OpenAIServiceStreamedFactory` from `openai-sca
       "properties" -> Map(
         "location" -> Map(
           "type" -> "string",
-          "description" -> "The city and state, e.g. San Francisco, CA",
+          "description" -> "The city and state, e.g. San Francisco, CA"
         ),
         "unit" -> Map(
           "type" -> "string",
           "enum" -> Seq("celsius", "fahrenheit")
         )
       ),
-      "required" -> Seq("location"),
+      "required" -> Seq("location")
     )
   )
 )
 
 // if we want to force the model to use the above function as a response
-// we can do so by passing: responseFunctionName = Some("get_current_weather")`
-service.createChatFunCompletion(
+// we can do so by passing: responseToolChoice = Some("get_current_weather")
+service.createChatToolCompletion(
   messages = messages,
-  functions = functions,
-  responseFunctionName = None
+  tools = tools,
+  responseToolChoice = None, // means "auto"
+  settings = CreateChatCompletionSettings(ModelId.gpt_3_5_turbo_1106)
 ).map { response =>
   val chatFunCompletionMessage = response.choices.head.message
-  val functionCall = chatFunCompletionMessage.function_call
+  val toolCalls = chatFunCompletionMessage.tool_calls.collect {
+    case (id, x: FunctionCallSpec) => (id, x)
+  }
+
+  println(
+    "tool call ids : " + toolCalls.map(_._1).mkString(", ")
+  )
+  println(
+    "function/tool call names : " + toolCalls.map(_._2.name).mkString(", ")
+  )
+  println(
+    "function/tool call arguments : " + toolCalls.map(_._2.arguments).mkString(", ")
+  )
+}
+```
+
+- 🔥 **New**: Count the expected number of used tokens before calling `createChatCompletions` or `createChatFunCompletions`; this helps you select the proper model, e.g. `gpt-3.5-turbo` or `gpt-3.5-turbo-16k`, and reduce costs. This is an experimental feature and may not work for all models. Requires the `openai-scala-count-tokens` lib.
 
-  println("function call name : " + functionCall.map(_.name).getOrElse("N/A"))
-  println("function call arguments : " + functionCall.map(_.arguments).getOrElse("N/A"))
+```scala
+import io.cequence.openaiscala.service.OpenAICountTokensHelper
+import io.cequence.openaiscala.domain.{ChatRole, FunMessageSpec, FunctionSpec, ModelId}
+
+class MyCompletionService extends OpenAICountTokensHelper {
+  def exec = {
+    val model = ModelId.gpt_3_5_turbo // the model to count against; defined here so the snippet compiles
+    val messages: Seq[FunMessageSpec] = ??? // messages to be sent to OpenAI
+    val function: FunctionSpec = ??? // function to be called
+
+    val tokens = countFunMessageTokens(model, messages, Seq(function), Some(function.name))
   }
+}
 ```
-Note that instead of `MessageSpec`, the `function_call` version of the chat completion uses the `FunMessageSpec` class to define messages - both as part of the request and the response.
-This extension of the standard chat completion is currently supported by the following `0613` models, all conveniently available in `ModelId` object:
-- `gpt-3.5-turbo-0613` (default), `gpt-3.5-turbo-16k-0613`, `gpt-4-0613`, and `gpt-4-32k-0613`.
 
+**✔️ Important**: After you are done using the service, you should close it by calling `service.close`. Otherwise, the underlying resources/threads won't be released.
 
-**✔️ Important Note**: After you are done using the service, you should close it by calling `service.close`. Otherwise, the underlying resources/threads won't be released.
+---
 
 **III. Using multiple services**
 
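Tying the pieces of this commit together, a minimal end-to-end sketch: the model and prompt are arbitrary, and the setup mirrors the factory, `UserMessage`, `createChatCompletion`, and `service.close` usage shown above:

```scala
import akka.actor.ActorSystem
import akka.stream.Materializer
import io.cequence.openaiscala.domain.{ModelId, UserMessage}
import io.cequence.openaiscala.domain.settings.CreateChatCompletionSettings
import io.cequence.openaiscala.service.OpenAIServiceFactory
import scala.concurrent.ExecutionContext

implicit val ec: ExecutionContext = ExecutionContext.global
implicit val materializer: Materializer = Materializer(ActorSystem())

val service = OpenAIServiceFactory() // API key taken from config/env

service
  .createChatCompletion(
    messages = Seq(UserMessage("What is the capital of France?")),
    settings = CreateChatCompletionSettings(ModelId.gpt_3_5_turbo)
  )
  .map(response => println(response.choices.head.message.content))
  .andThen { case _ => service.close } // release the underlying resources/threads
```
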
openai-client/src/main/scala/io/cequence/openaiscala/service/OpenAIServiceFactoryHelper.scala

Lines changed: 1 addition & 0 deletions

@@ -110,6 +110,7 @@ trait OpenAIServiceFactoryHelper[F] extends OpenAIServiceConsts {
    * The API version to use for this operation. This follows the YYYY-MM-DD format. Supported
    * versions: 2023-03-15-preview, 2022-12-01, 2023-05-15, and 2023-06-01-preview
    */
+  @Deprecated
   def forAzureWithAccessToken(
     resourceName: String,
     deploymentId: String,
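
Since `forAzureWithAccessToken` is now flagged deprecated, the natural migration path is the API-key factory shown earlier in the README; a sketch, with the `apiKey` parameter name assumed by analogy with the access-token variant:

```scala
val service = OpenAIServiceFactory.forAzureWithApiKey(
  resourceName = "your-resource-name",
  deploymentId = "your-deployment-id", // usually the model name, such as "gpt-35-turbo"
  apiVersion = "2023-05-15",
  apiKey = "your_api_key"
)
```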
