chatbot-rag-app: adds Kubernetes manifest and instructions #396

Merged
merged 7 commits into from
Mar 2, 2025
6 changes: 4 additions & 2 deletions docker/README.md
@@ -9,8 +9,11 @@ Note: If you haven't checked out this repository, all you need is one file:
wget https://raw.githubusercontent.com/elastic/elasticsearch-labs/refs/heads/main/docker/docker-compose-elastic.yml
```

Use docker compose to run Elastic stack in the background:
Before you begin, ensure you have free CPU and memory on your Docker host. If
you plan to use ELSER, assume a minimum of 8 CPUs and 6 GB of memory for the
containers in this compose file.

First, start this Elastic Stack in the background:
```bash
docker compose -f docker-compose-elastic.yml up --force-recreate --wait -d
```
@@ -20,7 +23,6 @@ Then, you can view Kibana at http://localhost:5601/app/home#/
If asked for a username and password, use username: elastic and password: elastic.

Clean up when finished, like this:

```bash
docker compose -f docker-compose-elastic.yml down
```
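
The `--wait` flag above already blocks until the containers report healthy. If a script needs to poll from the host instead, the following is a minimal sketch of the same idea as the Elasticsearch health check in `docker-compose-elastic.yml`. The helper names `retry_until` and `elasticsearch_ready` are hypothetical, and the probe assumes port 9200 is published on localhost.

```bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: retry_until <max_attempts> <sleep_seconds> <command...>
retry_until() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# Hypothetical probe mirroring the compose health check: Elasticsearch is
# treated as reachable once it rejects an unauthenticated request.
# Assumes port 9200 is published on localhost.
elasticsearch_ready() {
  curl --max-time 1 -s http://localhost:9200 |
    grep -q 'missing authentication credentials'
}
```

For example, `retry_until 60 1 elasticsearch_ready` makes up to 60 attempts, pausing one second between them.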
15 changes: 9 additions & 6 deletions docker/docker-compose-elastic.yml
@@ -27,7 +27,7 @@ services:
test: # readiness probe taken from kbn-health-gateway-server script
[
"CMD-SHELL",
"curl -s http://localhost:9200 | grep -q 'missing authentication credentials'",
"curl --max-time 1 -s http://localhost:9200 | grep -q 'missing authentication credentials'",
]
start_period: 10s
interval: 1s
@@ -41,12 +41,15 @@ services:
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.2
container_name: elasticsearch_settings
restart: 'no'
# gen-ai assistants in kibana save state in a way that requires system
# access, so set kibana_system's password to a known value.
command: >
bash -c '
# gen-ai assistants in kibana save state in a way that requires security to be enabled, so we need to create
# a kibana system user before starting it.
bash -c '
echo "Setup the kibana_system password";
until curl -s -u "elastic:elastic" -X POST http://elasticsearch:9200/_security/user/kibana_system/_password -d "{\"password\":\"elastic\"}" -H "Content-Type: application/json" | grep -q "^{}"; do sleep 5; done;
until curl --max-time 1 -s -u "elastic:elastic" \
-X POST http://elasticsearch:9200/_security/user/kibana_system/_password \
-d "{\"password\":\"elastic\"}" \
-H "Content-Type: application/json" | grep -q "^{}"; do sleep 5; done;
'

kibana:
@@ -69,7 +72,7 @@ services:
- XPACK_ENCRYPTEDSAVEDOBJECTS_ENCRYPTIONKEY=fhjskloppd678ehkdfdlliverpoolfcr
- SERVER_PUBLICBASEURL=http://127.0.0.1:5601
healthcheck:
test: ["CMD-SHELL", "curl -s http://localhost:5601/api/status | grep -q 'available'"]
test: ["CMD-SHELL", "curl --max-time 1 -s http://localhost:5601/api/status | grep -q 'available'"]
retries: 300
interval: 1s

11 changes: 3 additions & 8 deletions example-apps/chatbot-rag-app/Dockerfile
@@ -5,9 +5,7 @@ COPY frontend ./frontend
RUN cd frontend && yarn install
RUN cd frontend && REACT_APP_API_HOST=/api yarn build

# langchain and vertexai depend on a large number of system packages including
# linux-headers, g++, geos, geos-dev, rust and cargo. These are already present
# on -slim and adding them to -alpine results in a larger image than -slim.
# Use glibc-based image to get pre-compiled wheels for grpcio and tiktoken
FROM python:3.12-slim

WORKDIR /app
@@ -27,15 +25,12 @@ EXPOSE 4000
# docker invocations to reenable.
ENV OTEL_SDK_DISABLED=true

# https://github.com/elastic/genai-instrumentation/issues/255
# Currently Python SDK has a bug that spams logs when opentelemetry-instrument is used
# with SDK being disabled. Until it is fixed, we handle it in our own entrypoint by
# avoiding opentelemetry-instrument when SDK is disabled.
# TODO remove custom entrypoint when EDOT Python >0.7.0 is released.
RUN echo 'if [ "${OTEL_SDK_DISABLED:-true}" == "false" ]; \
then \
opentelemetry-instrument $@; \
else \
exec $@; \
fi' > entrypoint.sh
ENTRYPOINT [ "bash", "-eu", "./entrypoint.sh" ]
CMD [ "python", "api/app.py"]
CMD [ "python", "api/app.py" ]
68 changes: 66 additions & 2 deletions example-apps/chatbot-rag-app/README.md
@@ -22,8 +22,8 @@ Copy [env.example](env.example) to `.env` and fill in values noted inside.
## Installing and connecting to Elasticsearch

There are a number of ways to install Elasticsearch. Cloud is best for most
use-cases. We also have [docker-compose-elastic.yml](../../docker), that starts
Elasticsearch, Kibana, and APM Server on your laptop with one command.
use-cases. We also have [docker-compose-elastic.yml][docker-compose-elastic],
that starts Elasticsearch, Kibana, and APM Server on your laptop in one step.

Once you have decided on your approach, edit your `.env` file accordingly.

@@ -71,6 +71,68 @@ Clean up when finished, like this:
docker compose down
```

### Run with Kubernetes

Kubernetes is more complicated than Docker, but closer to the production
experience for many users. [k8s-manifest.yml](k8s-manifest.yml) creates the
same services, but needs additional configuration first.

The first step is to set up your environment. [env.example](env.example) must be
copied to a file named `.env` and updated with `ELASTICSEARCH_URL` and
`OTEL_EXPORTER_OTLP_ENDPOINT` values visible to your Kubernetes deployment.

For example, if you started your Elastic Stack with [k8s-manifest-elastic.yml][k8s-manifest-elastic],
you would update these values:
```
ELASTICSEARCH_URL=http://elasticsearch:9200
OTEL_EXPORTER_OTLP_ENDPOINT=http://apm-server:8200
```
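
The short names above rely on the app running in the same namespace as the Elastic Stack. As a rough sketch, the namespace-qualified form mentioned in `env.example` (`http://elasticsearch.default.svc:9200`) can be composed like this; `k8s_service_url` is a hypothetical helper, not part of this repository.

```bash
# Hypothetical helper: compose an in-cluster URL of the form
# http://<service>.<namespace>.svc:<port>
k8s_service_url() {
  local service=$1 namespace=$2 port=$3
  echo "http://${service}.${namespace}.svc:${port}"
}

k8s_service_url elasticsearch default 9200  # → http://elasticsearch.default.svc:9200
k8s_service_url apm-server default 8200     # → http://apm-server.default.svc:8200
```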

Then, import your `.env` file as a configmap like this:
```bash
kubectl create configmap chatbot-rag-app-env --from-env-file=.env
```
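
Because the configmap is a snapshot of `.env`, an empty value only surfaces later when the pod starts. One way to catch this earlier is a small check before running `kubectl create configmap`. This is a sketch only: `missing_env_keys` is a hypothetical helper, and which keys count as required is an assumption based on the two values named above.

```bash
# Hypothetical helper: print each required key that is missing or empty in
# an env file. Returns non-zero if any key is missing.
# Usage: missing_env_keys <env_file> <key...>
missing_env_keys() {
  local env_file=$1
  shift
  local key status=0
  for key in "$@"; do
    # Match KEY=<at least one character> at the start of a line, so
    # commented-out or empty assignments count as missing.
    if ! grep -q "^${key}=." "$env_file"; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_env_keys .env ELASTICSEARCH_URL OTEL_EXPORTER_OTLP_ENDPOINT` prints nothing and returns zero when both values are set.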

<details>
<summary>To use Vertex AI, set `LLM_TYPE=vertex` in your `.env` and follow these steps</summary>
The `api-frontend` container needs access to your Google Cloud credentials.
Share your `application_default_credentials.json` as a Kubernetes secret:
```bash
# Logs you into Google Cloud and creates application_default_credentials.json
gcloud auth application-default login
# Adds your credentials to a Kubernetes secret named gcloud-credentials
kubectl create secret generic gcloud-credentials \
--from-file=application_default_credentials.json=$HOME/.config/gcloud/application_default_credentials.json
```
</details>

Now that your configuration is applied, create the `chatbot-rag-app` deployment
and service by applying this manifest:
```bash
kubectl apply -f k8s-manifest.yml
```

Next, block until `chatbot-rag-app` is available.
```bash
kubectl wait --for=condition=available --timeout=20m deployment/chatbot-rag-app
```

*Note*: The first run may take several minutes to become available. Here's how
to follow logs during this stage:
```bash
kubectl logs deployment.apps/chatbot-rag-app -c create-index -f
```

Next, forward the web UI port:
```bash
kubectl port-forward deployment.apps/chatbot-rag-app 4000:4000 &
```

Clean up when finished, like this:
```bash
kubectl delete -f k8s-manifest.yml
```

### Run with Python

If you want to run this example with Python, you need to do a few things listed
@@ -196,3 +258,5 @@ docker compose up --build --force-recreate
---
[loader-docs]: https://python.langchain.com/docs/how_to/#document-loaders
[install-es]: https://www.elastic.co/search-labs/tutorials/install-elasticsearch
[docker-compose-elastic]: ../../docker/docker-compose-elastic.yml
[k8s-manifest-elastic]: ../../k8s/k8s-manifest-elastic.yml
10 changes: 9 additions & 1 deletion example-apps/chatbot-rag-app/env.example
@@ -6,6 +6,10 @@ FLASK_APP=api/app.py
PYTHONUNBUFFERED=1

# How you connect to Elasticsearch: change details to your instance
# This defaults to an Elastic Stack accessible via localhost.
#
# When running inside Kubernetes, set to http://elasticsearch.default.svc:9200
# or similar.
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_USER=elastic
ELASTICSEARCH_PASSWORD=elastic
@@ -68,7 +72,11 @@ OTEL_SDK_DISABLED=true
# Assign the service name that shows up in Kibana
OTEL_SERVICE_NAME=chatbot-rag-app

# Default to send traces to the Elastic APM server
# Default to send logs, traces and metrics to an Elastic APM server accessible
# via localhost.
#
# When running inside Kubernetes, set to http://apm-server.default.svc:8200
# or similar.
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8200
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

58 changes: 58 additions & 0 deletions example-apps/chatbot-rag-app/k8s-manifest.yml
@@ -0,0 +1,58 @@
---
# chatbot-rag-app deploys "create-index" to install ELSER and load values.
# Then, it starts "api-frontend" to serve the application.
apiVersion: apps/v1
kind: Deployment
metadata:
name: chatbot-rag-app
spec:
replicas: 1
selector:
matchLabels:
app: chatbot-rag-app
template:
metadata:
labels:
app: chatbot-rag-app
spec:
# For `LLM_TYPE=vertex`: create a volume for application_default_credentials.json
volumes:
- name: gcloud-credentials
secret:
secretName: gcloud-credentials
optional: true # only read when `LLM_TYPE=vertex`
Review comment (collaborator, PR author): This part allows the Vertex config to work without making other configurations block on it. The `optional` flag applies indirectly to any mount that uses the volume, so no worries.

initContainers:
- name: create-index
image: &image ghcr.io/elastic/elasticsearch-labs/chatbot-rag-app:latest
command: &command [ "bash", "-eu", "./entrypoint.sh" ] # match image
args: [ "flask", "create-index" ]
# This recreates your configmap based on your .env file:
# kubectl create configmap chatbot-rag-app-env --from-env-file=.env
envFrom: &envFrom
- configMapRef:
name: chatbot-rag-app-env
containers:
- name: api-frontend
image: *image
command: *command
args: [ "python", "api/app.py" ]
ports:
- containerPort: 4000
envFrom: *envFrom
# For `LLM_TYPE=vertex`: mount credentials to the path read by the google-cloud-sdk
volumeMounts:
- name: gcloud-credentials
mountPath: /root/.config/gcloud
readOnly: true
---
apiVersion: v1
kind: Service
metadata:
name: api
spec:
selector:
app: chatbot-rag-app
ports:
- protocol: TCP
port: 4000
targetPort: 4000
47 changes: 47 additions & 0 deletions k8s/README.md
@@ -0,0 +1,47 @@
# Running your own Elastic Stack with Kubernetes

If you'd like to start Elastic with Kubernetes, you can use the provided
[k8s-manifest-elastic.yml](k8s-manifest-elastic.yml) file. This starts
Elasticsearch, Kibana, and APM Server in an existing Kubernetes cluster.

Note: If you haven't checked out this repository, all you need is one file:
```bash
wget https://raw.githubusercontent.com/elastic/elasticsearch-labs/refs/heads/main/k8s/k8s-manifest-elastic.yml
```

Before you begin, ensure you have free CPU and memory in your cluster. If you
plan to use ELSER, assume a minimum of 8 CPUs and 6 GB of memory for the
containers in this manifest.

First, start this Elastic Stack in the background:
```bash
kubectl apply -f k8s-manifest-elastic.yml
```

**Note**: For simplicity, this adds an Elastic Stack to the default namespace,
which keeps the commands below shorter. To use a different namespace, pass
`kubectl`'s `--namespace` flag.

Next, block until the whole stack is available. The first install, or a change
of Elastic Stack version, can take a long time due to image pulling.
```bash
kubectl wait --for=condition=available --timeout=10m \
deployment/elasticsearch \
deployment/kibana \
deployment/apm-server
```

Next, forward the Kibana port:
```bash
kubectl port-forward service/kibana 5601:5601 &
```

Finally, you can view Kibana at http://localhost:5601/app/home#/

If asked for a username and password, use username: elastic and password: elastic.

Clean up when finished, like this:

```bash
kubectl delete -f k8s-manifest-elastic.yml
```
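
The `kubectl port-forward ... &` command above runs as a background job of your shell. When scripting these steps, one option is to record its PID and stop it explicitly. This is a minimal sketch; `start_background`, `stop_background` and `BACKGROUND_PID` are hypothetical names, not part of this repository.

```bash
# Hypothetical helper: run a command in the background and remember its PID.
start_background() {
  "$@" &
  BACKGROUND_PID=$!
}

# Hypothetical helper: stop the remembered process and reap it. Errors are
# ignored so that calling this twice is harmless.
stop_background() {
  kill "$BACKGROUND_PID" 2>/dev/null || true
  wait "$BACKGROUND_PID" 2>/dev/null || true
}
```

For example, `start_background kubectl port-forward service/kibana 5601:5601` starts the forward, and `stop_background` ends it once you are done with Kibana.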