Add Mistral AI Chat Completion support to Inference Plugin #128538
Merged: jonathan-buttner merged 24 commits into elastic:main from Jan-Kazlouski-elastic:feature/mistral-chat-completion-integration on Jun 4, 2025
Commits
f7dc246 Add Mistral AI Chat Completion support to Inference Plugin (Jan-Kazlouski-elastic)
0aa8da8 Add changelog file (Jan-Kazlouski-elastic)
c3a8716 Fix tests and typos (Jan-Kazlouski-elastic)
69f16b3 Merge remote-tracking branch 'refs/remotes/origin/main' into feature/… (Jan-Kazlouski-elastic)
91f8ccf Refactor Mistral chat completion integration and add tests (Jan-Kazlouski-elastic)
ff81e36 Refactor Mistral error response handling and extract StreamingErrorRe… (Jan-Kazlouski-elastic)
5a9ce48 Add Mistral chat completion request and response tests (Jan-Kazlouski-elastic)
17dead3 Enhance error response documentation and clarify StreamingErrorRespon… (Jan-Kazlouski-elastic)
74b3df6 Refactor Mistral chat completion request handling and introduce skip … (Jan-Kazlouski-elastic)
d50bc76 Refactor MistralChatCompletionServiceSettings to include rateLimitSet… (Jan-Kazlouski-elastic)
4824f12 Enhance MistralErrorResponse documentation with detailed error examples (Jan-Kazlouski-elastic)
158622e Add comment for Mistral-specific 422 validation error in OpenAiRespon… (Jan-Kazlouski-elastic)
60df2f7 Merge remote-tracking branch 'origin/main' into feature/mistral-chat-… (Jan-Kazlouski-elastic)
34ca847 [CI] Auto commit changes from spotless (elasticsearchmachine)
cc13241 Merge remote-tracking branch 'origin/main' into feature/mistral-chat-… (Jan-Kazlouski-elastic)
24c52e8 Refactor OpenAiUnifiedChatCompletionRequestEntity to remove unused fi… (Jan-Kazlouski-elastic)
f184fc7 Refactor UnifiedChatCompletionRequestEntity and UnifiedCompletionRequ… (Jan-Kazlouski-elastic)
5cc7402 Refactor MistralChatCompletionRequestEntityTests to improve JSON asse… (Jan-Kazlouski-elastic)
977bfc4 Add unit tests for MistralUnifiedChatCompletionResponseHandler to val… (Jan-Kazlouski-elastic)
f49fac2 Add unit tests for MistralService (Jan-Kazlouski-elastic)
7505915 Merge remote-tracking branch 'origin/main' into feature/mistral-chat-… (Jan-Kazlouski-elastic)
fb2be46 Merge remote-tracking branch 'origin/main' into feature/mistral-chat-… (Jan-Kazlouski-elastic)
68a5432 Update expected service count in testGetServicesWithCompletionTaskType (Jan-Kazlouski-elastic)
102da20 Merge remote-tracking branch 'origin/main' into feature/mistral-chat-… (Jan-Kazlouski-elastic)
@@ -0,0 +1,5 @@
```yaml
pr: 128538
summary: "Added Mistral Chat Completion support to the Inference Plugin"
area: Machine Learning
type: enhancement
issues: []
```
128 changes: 128 additions & 0 deletions
...org/elasticsearch/xpack/inference/external/response/streaming/StreamingErrorResponse.java
@@ -0,0 +1,128 @@
```java
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0; you may not use this file except in compliance with the Elastic License
 * 2.0.
 */

package org.elasticsearch.xpack.inference.external.response.streaming;

import org.elasticsearch.core.Nullable;
import org.elasticsearch.xcontent.ConstructingObjectParser;
import org.elasticsearch.xcontent.ParseField;
import org.elasticsearch.xcontent.XContentFactory;
import org.elasticsearch.xcontent.XContentParser;
import org.elasticsearch.xcontent.XContentParserConfiguration;
import org.elasticsearch.xcontent.XContentType;
import org.elasticsearch.xpack.inference.external.http.HttpResult;
import org.elasticsearch.xpack.inference.external.http.retry.ErrorResponse;
import org.elasticsearch.xpack.inference.external.response.ErrorMessageResponseEntity;

import java.util.Objects;
import java.util.Optional;

/**
 * Represents an error response from a streaming inference service.
 * This class extends {@link ErrorResponse} and provides additional fields
 * specific to streaming errors, such as code, param, and type.
 * An example error response for a streaming service might look like:
 * <pre><code>
 *     {
 *         "error": {
 *             "message": "Invalid input",
 *             "code": "400",
 *             "param": "input",
 *             "type": "invalid_request_error"
 *         }
 *     }
 * </code></pre>
 * TODO: {@link ErrorMessageResponseEntity} is nearly identical to this, but doesn't parse as many fields. We must remove the duplication.
 */
public class StreamingErrorResponse extends ErrorResponse {
    private static final ConstructingObjectParser<Optional<ErrorResponse>, Void> ERROR_PARSER = new ConstructingObjectParser<>(
        "streaming_error",
        true,
        args -> Optional.ofNullable((StreamingErrorResponse) args[0])
    );
    private static final ConstructingObjectParser<StreamingErrorResponse, Void> ERROR_BODY_PARSER = new ConstructingObjectParser<>(
        "streaming_error",
        true,
        args -> new StreamingErrorResponse((String) args[0], (String) args[1], (String) args[2], (String) args[3])
    );

    static {
        ERROR_BODY_PARSER.declareString(ConstructingObjectParser.constructorArg(), new ParseField("message"));
        ERROR_BODY_PARSER.declareStringOrNull(ConstructingObjectParser.optionalConstructorArg(), new ParseField("code"));
        ERROR_BODY_PARSER.declareStringOrNull(ConstructingObjectParser.optionalConstructorArg(), new ParseField("param"));
        ERROR_BODY_PARSER.declareString(ConstructingObjectParser.constructorArg(), new ParseField("type"));

        ERROR_PARSER.declareObjectOrNull(
            ConstructingObjectParser.optionalConstructorArg(),
            ERROR_BODY_PARSER,
            null,
            new ParseField("error")
        );
    }

    /**
     * Standard error response parser. This can be overridden for those subclasses that
     * have a different error response structure.
     * @param response The error response as an HttpResult
     */
    public static ErrorResponse fromResponse(HttpResult response) {
        try (
            XContentParser parser = XContentFactory.xContent(XContentType.JSON)
                .createParser(XContentParserConfiguration.EMPTY, response.body())
        ) {
            return ERROR_PARSER.apply(parser, null).orElse(ErrorResponse.UNDEFINED_ERROR);
        } catch (Exception e) {
            // swallow the error
        }

        return ErrorResponse.UNDEFINED_ERROR;
    }

    /**
     * Standard error response parser. This can be overridden for those subclasses that
     * have a different error response structure.
     * @param response The error response as a string
     */
    public static ErrorResponse fromString(String response) {
        try (
            XContentParser parser = XContentFactory.xContent(XContentType.JSON).createParser(XContentParserConfiguration.EMPTY, response)
        ) {
            return ERROR_PARSER.apply(parser, null).orElse(ErrorResponse.UNDEFINED_ERROR);
        } catch (Exception e) {
            // swallow the error
        }

        return ErrorResponse.UNDEFINED_ERROR;
    }

    @Nullable
    private final String code;
    @Nullable
    private final String param;
    private final String type;

    StreamingErrorResponse(String errorMessage, @Nullable String code, @Nullable String param, String type) {
        super(errorMessage);
        this.code = code;
        this.param = param;
        this.type = Objects.requireNonNull(type);
    }

    @Nullable
    public String code() {
        return code;
    }

    @Nullable
    public String param() {
        return param;
    }

    public String type() {
        return type;
    }
}
```
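The parsing rules in StreamingErrorResponse (required `message` and `type`, optional `code` and `param`, and a fallback to an undefined-error sentinel when anything goes wrong) can be illustrated without the Elasticsearch classes. The following is a minimal sketch only: `ParsedStreamingError`, `fromMap`, and `UNDEFINED` are hypothetical names, and it assumes the JSON body has already been decoded into nested maps instead of going through `XContentParser`.

```java
import java.util.Map;

// Hypothetical, simplified model of a parsed streaming error; not the Elasticsearch class.
final class ParsedStreamingError {
    // Sentinel for payloads that cannot be interpreted (plays the role of ErrorResponse.UNDEFINED_ERROR).
    static final ParsedStreamingError UNDEFINED = new ParsedStreamingError("", null, null, "");

    final String message;
    final String code;  // optional, may be null
    final String param; // optional, may be null
    final String type;

    private ParsedStreamingError(String message, String code, String param, String type) {
        this.message = message;
        this.code = code;
        this.param = param;
        this.type = type;
    }

    // Assumption: the body is JSON that was already decoded into nested maps.
    static ParsedStreamingError fromMap(Map<String, ?> body) {
        try {
            Object error = body.get("error");
            if (!(error instanceof Map)) {
                return UNDEFINED; // missing or null "error" object
            }
            Map<?, ?> fields = (Map<?, ?>) error;
            return new ParsedStreamingError(
                required(fields, "message"),
                optional(fields, "code"),
                optional(fields, "param"),
                required(fields, "type")
            );
        } catch (RuntimeException e) {
            // Like the original, swallow parse failures and report an undefined error.
            return UNDEFINED;
        }
    }

    private static String required(Map<?, ?> fields, String name) {
        Object value = fields.get(name);
        if (!(value instanceof String)) {
            throw new IllegalArgumentException("missing required field [" + name + "]");
        }
        return (String) value;
    }

    private static String optional(Map<?, ?> fields, String name) {
        Object value = fields.get(name);
        return value == null ? null : (String) value; // a non-string value fails the cast and is caught by fromMap
    }
}
```

In this sketch a body without an `error` object, or with an `error` object lacking `message` or `type`, yields the `UNDEFINED` sentinel, so callers never need to handle a parse exception.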
29 changes: 29 additions & 0 deletions
.../org/elasticsearch/xpack/inference/services/mistral/MistralCompletionResponseHandler.java
@@ -0,0 +1,29 @@
```java
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0; you may not use this file except in compliance with the Elastic License
 * 2.0.
 */

package org.elasticsearch.xpack.inference.services.mistral;

import org.elasticsearch.xpack.inference.external.http.retry.ResponseParser;
import org.elasticsearch.xpack.inference.services.mistral.response.MistralErrorResponse;
import org.elasticsearch.xpack.inference.services.openai.OpenAiChatCompletionResponseHandler;

/**
 * Handles non-streaming completion responses for Mistral models, extending the OpenAI completion response handler.
 * This class is specifically designed to handle Mistral's error response format.
 */
public class MistralCompletionResponseHandler extends OpenAiChatCompletionResponseHandler {

    /**
     * Constructs a MistralCompletionResponseHandler with the specified request type and response parser.
     *
     * @param requestType The type of request being handled (e.g., "mistral completions").
     * @param parseFunction The function to parse the response.
     */
    public MistralCompletionResponseHandler(String requestType, ResponseParser parseFunction) {
        super(requestType, parseFunction, MistralErrorResponse::fromResponse);
    }
}
```
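The handler above reuses the OpenAI response handling and only swaps in a Mistral-specific error parser through a method reference. A minimal sketch of that pattern, using only the JDK, might look like the following. All class and method names here (`BaseCompletionHandlerSketch`, `MistralErrorSketch`, `MistralHandlerSketch`, `describeFailure`) are hypothetical, and the error parser is reduced to a `Function<String, String>` for illustration; the real classes operate on `HttpResult` and `ErrorResponse`.

```java
import java.util.function.Function;

// Hypothetical base handler: stores a provider-specific error parser supplied by the subclass.
class BaseCompletionHandlerSketch {
    private final String requestType;
    private final Function<String, String> errorParser;

    BaseCompletionHandlerSketch(String requestType, Function<String, String> errorParser) {
        this.requestType = requestType;
        this.errorParser = errorParser;
    }

    // Builds a failure description using whichever error parser was injected.
    String describeFailure(int statusCode, String body) {
        return requestType + " failed with status " + statusCode + ": " + errorParser.apply(body);
    }
}

// Hypothetical stand-in for a provider-specific error parser.
final class MistralErrorSketch {
    static String fromBody(String body) {
        return body == null || body.isBlank() ? "unknown error" : "mistral error: " + body.strip();
    }
}

// The subclass adds no behavior of its own; it only selects the error parser.
final class MistralHandlerSketch extends BaseCompletionHandlerSketch {
    MistralHandlerSketch(String requestType) {
        super(requestType, MistralErrorSketch::fromBody);
    }
}
```

Passing the parser as a constructor argument keeps the shared status-handling logic in one place, so supporting another provider's error format needs only a new parser function and a thin subclass.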
Can you add a comment with an example error message that this would parse? Let's also add a TODO noting that ErrorMessageResponseEntity (https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/external/response/ErrorMessageResponseEntity.java) is nearly identical (it doesn't parse as many fields) and that we should remove the duplication.
Done.