Commit f645183

lcawl karenzone
authored and committed
[DOCS] Add warning to create inference API (#3311)
Co-authored-by: Karen Metts <[email protected]>
(cherry picked from commit 679ae2b)
1 parent f7a774d commit f645183

5 files changed: +24 -5 lines changed
compiler/src/model/utils.ts

Lines changed: 1 addition & 1 deletion
@@ -667,7 +667,7 @@ export function hoistRequestAnnotations (
     } else if (tag === 'cluster_privileges') {
       const privileges = [
         'all', 'cancel_task', 'create_snapshot', 'grant_api_key', 'manage', 'manage_api_key', 'manage_ccr',
-        'manage_enrich', 'manage_ilm', 'manage_index_templates', 'manage_ingest_pipelines', 'manage_logstash_pipelines',
+        'manage_enrich', 'manage_ilm', 'manage_index_templates', 'manage_inference', 'manage_ingest_pipelines', 'manage_logstash_pipelines',
         'manage_ml', 'manage_oidc', 'manage_own_api_key', 'manage_pipeline', 'manage_rollup', 'manage_saml',
         'manage_security', 'manage_service_account', 'manage_slm', 'manage_token', 'manage_transform', 'manage_user_profile',
         'manage_watcher', 'monitor', 'monitor_ml', 'monitor_rollup', 'monitor_snapshot', 'monitor_text_structure',
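
For readers unfamiliar with why the privilege list needs the new entry: the hunk above suggests that `@cluster_privileges` tag values are checked against a fixed allow-list, so `manage_inference` has to be listed before a spec file can reference it. The following is a minimal sketch of that kind of check, not the compiler's actual code. The helper name `parseClusterPrivileges`, the abbreviated list, and the comma-separated tag format are all assumptions made for illustration.

```typescript
// Hypothetical helper (not taken from compiler/src/model/utils.ts):
// validates the values of a `@cluster_privileges` tag against an allow-list.
// The list is deliberately abbreviated; the real one in utils.ts is longer.
const KNOWN_CLUSTER_PRIVILEGES: ReadonlySet<string> = new Set([
  'all', 'manage', 'manage_inference', 'manage_ingest_pipelines', 'monitor'
])

// Assumption: multiple privileges are written as a comma-separated list.
function parseClusterPrivileges (tagValue: string): string[] {
  const values = tagValue
    .split(',')
    .map(v => v.trim())
    .filter(v => v.length > 0)
  for (const value of values) {
    if (!KNOWN_CLUSTER_PRIVILEGES.has(value)) {
      throw new Error(`Unknown cluster privilege: ${value}`)
    }
  }
  return values
}
```

Under this sketch, `parseClusterPrivileges('manage_inference')` returns a one-element array, while a misspelled value throws, which is the failure a spec author would hit if the list were not extended first.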

output/openapi/elasticsearch-openapi.json

Lines changed: 2 additions & 0 deletions

output/openapi/elasticsearch-serverless-openapi.json

Lines changed: 2 additions & 0 deletions

output/schema/schema.json

Lines changed: 8 additions & 3 deletions

specification/inference/put/PutRequest.ts

Lines changed: 11 additions & 1 deletion
@@ -23,10 +23,20 @@ import { RequestBase } from '@_types/Base'
 import { Id } from '@_types/common'

 /**
- * Create an inference endpoint
+ * Create an inference endpoint.
+ * When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running.
+ * After creating the endpoint, wait for the model deployment to complete before using it.
+ * To verify the deployment status, use the get trained model statistics API.
+ * Look for `"state": "fully_allocated"` in the response and ensure that the `"allocation_count"` matches the `"target_allocation_count"`.
+ * Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
+ *
+ * IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face.
+ * For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models.
+ * However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
  * @rest_spec_name inference.put
  * @availability stack since=8.11.0 stability=stable visibility=public
  * @availability serverless stability=stable visibility=public
+ * @cluster_privileges manage_inference
  */
 export interface Request extends RequestBase {
   path_parts: {
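
The new doc comment describes a readiness check: the deployment state must be `fully_allocated` and `allocation_count` must equal `target_allocation_count`. A minimal sketch of that check follows. The interfaces are a trimmed-down assumption about where those fields sit in a trained model statistics entry (under `deployment_stats.allocation_status`); they are not the specification's real types, and `isFullyAllocated` is a hypothetical name.

```typescript
// Assumed, simplified shape of one trained model statistics entry.
// Only the fields named in the doc comment are modelled here.
interface AllocationStatus {
  state: string
  allocation_count: number
  target_allocation_count: number
}

interface TrainedModelStats {
  model_id: string
  deployment_stats?: { allocation_status?: AllocationStatus }
}

// Hypothetical helper: true only when the deployment reports
// `fully_allocated` and the allocation counts match.
function isFullyAllocated (stats: TrainedModelStats): boolean {
  const status = stats.deployment_stats?.allocation_status
  if (status === undefined) return false // model is not deployed at all
  return status.state === 'fully_allocated' &&
    status.allocation_count === status.target_allocation_count
}

// Usage with a hand-written entry ('my-model' is a placeholder id).
const starting: TrainedModelStats = {
  model_id: 'my-model',
  deployment_stats: {
    allocation_status: { state: 'starting', allocation_count: 0, target_allocation_count: 1 }
  }
}
const ready = isFullyAllocated(starting) // false: still starting
```

A caller would typically poll the statistics API and only send inference requests once a check like this passes. The polling interval and timeout are left out because the source does not specify them.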
