|
1 | 1 |
|
2 |
| -# Kafka as Kubernetes StatefulSet |
3 | 2 |
|
4 |
| -Example of three Kafka brokers depending on five Zookeeper instances. |
| 3 | +# Kafka on Kubernetes |
5 | 4 |
|
6 |
| -To get consistent service DNS names `kafka-N.broker.kafka`(`.svc.cluster.local`), run everything in a [namespace](http://kubernetes.io/docs/admin/namespaces/walkthrough/): |
7 |
| -``` |
8 |
| -kubectl create -f 00namespace.yml |
9 |
| -``` |
| 5 | +Transparent Kafka setup that you can grow with. |
| 6 | +Good for both experiments and production. |
10 | 7 |
|
11 |
| -## Set up Zookeeper |
| 8 | +How to use: |
| 9 | + * Run a Kubernetes cluster, [minikube](https://github.com/kubernetes/minikube) or real. |
| 10 | + * To quickly get a small Kafka cluster running, use the `kubectl create` commands below. |
| 11 | + * To start using Kafka for real, fork and have a look at [addon](https://github.com/Yolean/kubernetes-kafka/labels/addon)s. |
| 12 | + * Join the discussion here in issues and PRs. |
12 | 13 |
|
13 |
| -The Kafka book (Definitive Guide, O'Reilly 2016) recommends that Kafka has its own Zookeeper cluster with at least 5 instances. |
14 |
| -We use the zookeeper build that comes with the Kafka distribution, and tweak the startup command to support StatefulSet. |
| 14 | +Why? |
| 15 | +See for yourself. No single readable readme can properly introduce both Kafka and Kubernetes. |
| 16 | +Back when we read [Newman](http://samnewman.io/books/building_microservices/) we were beginners with both. |
| 17 | +Now we read [Kleppmann](http://dataintensive.net/), [Confluent's blog](https://www.confluent.io/blog/) and [SRE](https://landing.google.com/sre/book.html) and enjoy this "Streaming Platform" lock-in :smile:. |
15 | 18 |
|
16 |
| -``` |
17 |
| -kubectl create -f ./zookeeper/ |
18 |
| -``` |
| 19 | +## What you get |
19 | 20 |
|
20 |
| -## Start Kafka |
| 21 | +Keep an eye on `kubectl --namespace kafka get pods -w`. |
21 | 22 |
|
22 |
| -Assuming you have your PVCs `Bound`, or enabled automatic provisioning (see above), go ahead and: |
| 23 | +[Bootstrap servers](http://kafka.apache.org/documentation/#producerconfigs): `kafka-0.broker.kafka.svc.cluster.local:9092,kafka-1.broker.kafka.svc.cluster.local:9092,kafka-2.broker.kafka.svc.cluster.local:9092` |
| 24 | + |
23 | 25 |
|
24 |
| -``` |
25 |
| -kubectl create -f ./ |
26 |
| -``` |
| 26 | +Zookeeper at `zookeeper.kafka.svc.cluster.local:2181`. |
27 | 27 |
|
28 |
| -You might want to verify in logs that Kafka found its own DNS name(s) correctly. Look for records like: |
29 |
| -``` |
30 |
| -kubectl -n kafka logs kafka-0 | grep "Registered broker" |
31 |
| -# INFO Registered broker 0 at path /brokers/ids/0 with addresses: PLAINTEXT -> EndPoint(kafka-0.broker.kafka.svc.cluster.local,9092,PLAINTEXT) |
32 |
| -``` |
| 28 | +## Start Zookeeper |
33 | 29 |
|
34 |
| -## Testing manually |
| 30 | +The [Kafka book](https://www.confluent.io/resources/kafka-definitive-guide-preview-edition/) recommends that Kafka has its own Zookeeper cluster with at least 5 instances. |
35 | 31 |
|
36 |
| -There's a Kafka pod that doesn't start the server, so you can invoke the various shell scripts. |
37 | 32 | ```
|
38 |
| -kubectl create -f test/99testclient.yml |
| 33 | +kubectl create -f ./zookeeper/ |
39 | 34 | ```
|
40 | 35 |
|
41 |
| -See `./test/test.sh` for some sample commands. |
42 |
| - |
43 |
| -## Automated test, while going chaosmonkey on the cluster |
| 36 | +To support automatic migration when an availability zone becomes unavailable, we mix persistent and ephemeral storage. |
44 | 37 |
|
45 |
| -This is WIP, but topic creation has been automated. Note that as a [Job](http://kubernetes.io/docs/user-guide/jobs/), it will restart if the command fails, including if the topic exists :( |
46 |
| -``` |
47 |
| -kubectl create -f test/11topic-create-test1.yml |
48 |
| -``` |
| 38 | +## Start Kafka |
49 | 39 |
|
50 |
| -Pods that keep consuming messages (but they won't exit on cluster failures) |
51 | 40 | ```
|
52 |
| -kubectl create -f test/21consumer-test1.yml |
| 41 | +kubectl create -f ./ |
53 | 42 | ```
|
54 | 43 |
|
55 |
| -## Teardown & cleanup |
56 |
| - |
57 |
| -Testing and retesting... delete the namespace. PVs are outside namespaces so delete them too. |
| 44 | +You might want to verify in logs that Kafka found its own DNS name(s) correctly. Look for records like: |
58 | 45 | ```
|
59 |
| -kubectl delete namespace kafka |
60 |
| -rm -R ./data/ && kubectl -n kafka delete pv datadir-kafka-0 datadir-kafka-1 datadir-kafka-2 |
| 46 | +kubectl -n kafka logs kafka-0 | grep "Registered broker" |
| 47 | +# INFO Registered broker 0 at path /brokers/ids/0 with addresses: PLAINTEXT -> EndPoint(kafka-0.broker.kafka.svc.cluster.local,9092,PLAINTEXT) |
61 | 48 | ```
|
62 | 49 |
|
63 |
| -## Metrics, Prometheus style |
64 |
| - |
65 |
| -Is the metrics system up and running? |
66 |
| -``` |
67 |
| -kubectl logs -c metrics kafka-0 |
68 |
| -kubectl exec -c broker kafka-0 -- /bin/sh -c 'apk add --no-cache curl && curl http://localhost:5556/metrics' |
69 |
| -kubectl logs -c metrics zoo-0 |
70 |
| -kubectl exec -c zookeeper zoo-0 -- /bin/sh -c 'apk add --no-cache curl && curl http://localhost:5556/metrics' |
71 |
| -``` |
72 |
| -Metrics containers can't be used for the curl because they're too short on memory. |
| 50 | +That's it. Just add business value :wink:. |
| 51 | +For clients we tend to use [librdkafka](https://github.com/edenhill/librdkafka)-based drivers like [node-rdkafka](https://github.com/Blizzard/node-rdkafka). |
| 52 | +To use [Kafka Connect](http://kafka.apache.org/documentation/#connect) and [Kafka Streams](http://kafka.apache.org/documentation/streams/) you may want to take a look at our [sample](https://github.com/solsson/dockerfiles/tree/master/connect-files) [Dockerfile](https://github.com/solsson/dockerfiles/tree/master/streams-logfilter)s. |
| 53 | +Don't forget the [addon](https://github.com/Yolean/kubernetes-kafka/labels/addon)s. |
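
The bootstrap servers string in the new readme follows the StatefulSet DNS pattern `kafka-N.broker.kafka.svc.cluster.local:9092`. A minimal sketch of generating that list for a given replica count, assuming the pod name prefix `kafka`, the service `broker`, the namespace `kafka`, and port 9092 seen in the diff (the function name `bootstrap_servers` is hypothetical and not part of the repository):

```shell
#!/bin/sh
# Hypothetical helper: builds a comma-separated bootstrap servers list.
# Assumes the naming shown in the readme: pods kafka-0..kafka-(N-1),
# headless service "broker", namespace "kafka", broker port 9092.
bootstrap_servers() {
  replicas="$1"
  servers=""
  i=0
  while [ "$i" -lt "$replicas" ]; do
    host="kafka-$i.broker.kafka.svc.cluster.local:9092"
    # Prepend a comma only when the list already has an entry.
    servers="${servers:+$servers,}$host"
    i=$((i + 1))
  done
  echo "$servers"
}

# Three brokers, matching the example in the readme.
bootstrap_servers 3
```

If the StatefulSet is scaled, the replica count passed in has to be kept in sync by hand; this sketch does not query the cluster.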