elastic · Samiul-TheSoccerFan · Apr 16, 2025 · Apr 3, 2025 · Apr 7, 2025 · Apr 7, 2025
diff --git a/notebooks/search/12-semantic-reranking-elastic-rerank.ipynb b/notebooks/search/12-semantic-reranking-elastic-rerank.ipynb
@@ -12,10 +12,7 @@
     "\n",
     "In this notebook you'll learn how to implement semantic reranking in Elasticsearch using the built-in [Elastic Rerank model](https://www.elastic.co/guide/en/machine-learning/master/ml-nlp-rerank.html). You'll also learn about the `retriever` abstraction, a simpler syntax for crafting queries and combining different search operations.\n",
     "\n",
-    "You will:\n",
-    "\n",
-    "- Create an inference endpoint to manage your `rerank` task. This will download and deploy the Elastic Rerank model.\n",
-    "- Query your data using the `text_similarity_rerank` retriever, leveraging the Elastic Rerank model."
+    "You will query your data using the `text_similarity_reranker` retriever and the Elastic Rerank model to boost the relevance of your search results."
    ]
   },
   {
@@ -234,87 +231,6 @@
     "time.sleep(3)"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "id": "DRIABkGAgV_Q"
-   },
-   "source": [
-    "## Create inference endpoint\n",
-    "\n",
-    "Next we'll create an inference endpoint for the `rerank` task to deploy and manage our model and, if necessary, spin up the necessary ML resources behind the scenes."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "colab": {
-     "base_uri": "https://localhost:8080/"
-    },
-    "id": "DiKsd3YygV_Q",
-    "outputId": "c3c46c6b-b502-4167-c98c-d2e2e0a4613c"
-   },
-   "outputs": [],
-   "source": [
-    "try:\n",
-    "    client.inference.delete(inference_id=\"my-elastic-reranker\")\n",
-    "except exceptions.NotFoundError:\n",
-    "    # Inference endpoint does not exist\n",
-    "    pass\n",
-    "\n",
-    "try:\n",
-    "    client.options(\n",
-    "        request_timeout=60, max_retries=3, retry_on_timeout=True\n",
-    "    ).inference.put(\n",
-    "        task_type=\"rerank\",\n",
-    "        inference_id=\"my-elastic-reranker\",\n",
-    "        inference_config={\n",
-    "            \"service\": \"elasticsearch\",\n",
-    "            \"service_settings\": {\n",
-    "                \"model_id\": \".rerank-v1\",\n",
-    "                \"num_threads\": 1,\n",
-    "                \"adaptive_allocations\": {\n",
-    "                    \"enabled\": True,\n",
-    "                    \"min_number_of_allocations\": 1,\n",
-    "                    \"max_number_of_allocations\": 4,\n",
-    "                },\n",
-    "            },\n",
-    "        },\n",
-    "    )\n",
-    "    print(\"Inference endpoint created successfully\")\n",
-    "except exceptions.BadRequestError as e:\n",
-    "    if e.error == \"resource_already_exists_exception\":\n",
-    "        print(\"Inference endpoint created successfully\")\n",
-    "    else:\n",
-    "        raise e"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Run the following command to confirm your inference endpoint is deployed."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "client.inference.get().body"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "⚠️ When you deploy your model, you might need to sync your ML saved objects in the Kibana (or Serverless) UI.\n",
-    "Go to **Trained Models** and select **Synchronize saved objects**."
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {
@@ -465,7 +381,7 @@
    "source": [
     "## Semantic reranker\n",
     "\n",
-    "In the following `retriever` syntax, we wrap our standard `match` query retriever in a `text_similarity_reranker`. This allows us to leverage the NLP model we deployed to Elasticsearch to rerank the results based on the phrase \"flesh-eating bad guy\"."
+    "In the following `retriever` syntax, we wrap our standard `match` query retriever in a `text_similarity_reranker`. This allows us to leverage the [Elastic Rerank model](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-rerank.html) to rerank the results based on the phrase \"flesh-eating bad guy\"."
    ]
   },
   {
@@ -523,7 +439,6 @@
     "                }\n",
     "            },\n",
     "            \"field\": \"plot\",\n",
-    "            \"inference_id\": \"my-elastic-reranker\",\n",
     "            \"inference_text\": \"flesh-eating bad guy\",\n",
     "        }\n",
     "    },\n",
@@ -543,7 +458,9 @@
    "source": [
     "Success! \"The Silence of the Lambs\" is our top result. Semantic reranking helped us find the most relevant result by parsing a natural language query, overcoming the limitations of lexical search that relies on keyword matching.\n",
     "\n",
-    "Semantic reranking enables semantic search in a few steps, without the need for generating and storing embeddings. This a great tool for testing and building hybrid search systems in Elasticsearch."
+    "Semantic reranking enables semantic search in a few steps, without the need for generating and storing embeddings. This is a great tool for testing and building hybrid search systems in Elasticsearch.\n",
+    "\n",
+    "*Note:* Starting with Elasticsearch version `8.18`, the `inference_id` field is optional. If not specified, it defaults to `.rerank-v1-elasticsearch`. If you are using an earlier version or prefer to manage your own endpoint, you can set up a custom `rerank` inference endpoint using the [create inference API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put)."
    ]
   },
   {