addon: Schema registry and REST proxy #45

Merged
merged 23 commits into from
Feb 3, 2018

Conversation

solsson
Contributor

@solsson solsson commented Jul 25, 2017

The diff makes no sense until kafka-011 has been merged to master. This branch should only add the 6* and 7* ymls.

solsson added 8 commits July 29, 2017 16:51
I'm not getting the expected behavior.
Might also be because of low resources on minikube,
but 3.3.0 should only be a few days away.
also the sample is from schema-registry.properties while
kafka-rest had only the stdout appender (and no per-package levels)
both zookeeper and bootstrap addresses.
Was getting connection errors to localhost:9092 broker id -1,
resulting in REST requests never returning.
@solsson
Contributor Author

solsson commented Aug 1, 2017

Status now is that I think both Schema Registry and REST work. I've tested with fluent/fluent-bit-kubernetes-logging#6 and scaled deploy/rest to 3, with the topic receiving 200k messages in 10 minutes.

Remaining work:

  • Readiness probes. Might solve some of the connection issues from fluent-bit.
  • The test case must move most calls to run.sh so we assert stuff (by turning unready on unexpected results)
  • Maybe move all of it to a subfolder that can be kubectl create -f'd
  • Resource limits.
    • With modest load fluent-bit takes ~30m CPU, 6Mi memory
    • Rest takes ~22m CPU, 158Mi memory
    • Schema Registry isn't used at all, takes ~1m CPU, 147Mi memory
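
The readiness-probe and resource-limit items above could look roughly like the following for the REST proxy. This is a minimal sketch, not the manifest from this PR: the labels, the image placeholder, the probe path and timings, and the limit values are all assumptions. Only the name `rest` (from "deploy/rest"), the replica count of 3, and the request figures (rounded up from the ~22m / 158Mi observation) come from the comment above. Port 8082 is the REST proxy's default listener.

```yaml
# Hypothetical sketch: image, labels, probe settings and limits are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rest
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rest
  template:
    metadata:
      labels:
        app: rest
    spec:
      containers:
      - name: kafka-rest
        # Placeholder image; substitute your own pinned image.
        image: example.invalid/kafka-rest:placeholder
        ports:
        - containerPort: 8082
        readinessProbe:
          # Assumption: the proxy answers HTTP 200 on its root path once it is up.
          httpGet:
            path: /
            port: 8082
          initialDelaySeconds: 10
          periodSeconds: 10
        resources:
          requests:
            cpu: 25m         # rounded up from the observed ~22m
            memory: 160Mi    # rounded up from the observed ~158Mi
          limits:
            cpu: 200m        # assumed headroom
            memory: 256Mi    # assumed headroom
```

A pod that fails its readiness probe is removed from the Service endpoints, so clients such as fluent-bit are only routed to instances that currently answer.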

@Analect

Analect commented Sep 27, 2017

@solsson ... thanks for your work on this. I had been searching around for a kubernetes implementation of the confluent ecosystem and had been wanting to get kafka-connect and potentially KSQL in a kubernetes context.

I happened upon this from https://github.com/Yolean/confluent-quickstart-kubernetes .. which I have played with in the past. Is the plan to merge this soon?

To get this running in its current state ... is the idea to run kubectl apply -f addon-cp/?

@solsson
Contributor Author

solsson commented Sep 28, 2017

> I happened upon this from https://github.com/Yolean/confluent-quickstart-kubernetes .. which I have played with in the past. Is the plan to merge this soon?

I actually aborted that, because the quickstart is very geared towards host-type Docker networking. Not a good option for kubernetes. Which other parts of Confluent's platform would you like to run?

> To get this running in its current state ... is the idea to run kubectl apply -f addon-cp/?

Yes, this PR requires no changes to core kafka or zookeeper resources -- pretty clean "addon".

@Analect

Analect commented Sep 28, 2017

Thanks @solsson
Just wondering: if this PR represents the add-on part ... are the steps to get everything running therefore:
kubectl apply -f ./zookeeper/
kubectl apply -f ./
and then ... kubectl apply -f addon-cp/

> Which other parts of Confluent's platform would you like to run?

Previously (i.e. not within kubernetes), I was experimenting with running kafka-connect in conjunction with some of the debezium connectors and landoop UI components ... all working from a single docker-compose. I'm now keen to try to emulate that environment on kubernetes ... a blunt way of doing that might be to use kompose with the existing compose file ... but I suspect that's not going to be that stable ... hence I was interested in the approach you are taking here.

I'm also keen to see how KSQL might fit in this context and if it needs to be handled in any special way.

@solsson
Contributor Author

solsson commented Sep 28, 2017

Yes, those are the steps, ideally :)

We're very excited about KSQL too. Waiting for Kafka 1.0 before I experiment with that though.

There's a branch from this branch for Kafka Connect: https://github.com/Yolean/kubernetes-kafka/compare/addon-rest...addon-connect?expand=1. It's rather untested, hence no PR yet. Would be interesting to hear your results from apply -f addon-connect/ on that one too.

@Analect

Analect commented Sep 28, 2017

@solsson
Thanks for the link, I'll have a play around.
It's clearer to me now how you have arranged these various add-on experiments as unmerged PRs ... I presume they are designed to work independently of each other ... except in the case of your link ... where one add-on is an extension of another?

Can I ask why you create your own confluent-platform images ... like solsson/kafka-cp@sha256:a22047b9e... and solsson/kafka-connect-jdbc@sha256:a6108f094 ... I can see you also bundle-in schema-registry to your own generic image here rather than using the official one ... is that just down to having more control over accessing the latest version of this tooling ahead of confluent releasing an image ... or is there something else I'm missing?

I don't have much experience with maven, but if you were to go about enabling KSQL .. would you use something like this to inform a build for an image that has KSQL on board ... and build it akin to how you have done the connect-jdbc image? Would you then structure it to run in an independent pod on kubernetes ... as you have done for connect with the addon-connect branch?

Thanks again. Your insights are greatly valued.

@solsson
Contributor Author

solsson commented Sep 28, 2017

This repository reflects how we use Kafka at Yolean. We started using it only as a backend for the collaborative part of our app, but the "streaming platform" principles that Confluent advocates are increasingly influencing our microservices design. Hence we're experimenting with the entire Kafka ecosystem now, but haven't stabilized those parts.

The reason for keeping PRs unmerged is that I wanted the bare bones setup to stay simple. But... I should probably re-brand this repo to something like Confluent Platform "for Kubernetes".

Back when we set up our first Kafka cluster - and our ops is "Kubernetes only" - it was difficult to find community resources on Kafka and StatefulSet, but now there are semi-official helm charts, several git repos with a similar scope to this one, and blog posts like http://blog.kubernetes.io/2017/09/kubernetes-statefulsets-daemonsets.html.

Confluent's documentation on the other hand doesn't afaik mention Kubernetes, and overall it's quite geared towards enterprise setups rather than startups like us. Which leads to the next question...

> Can I ask why you create your own confluent-platform images

This is a difficult choice. I've sort of concluded that any organization that grows with a Kubernetes+Kafka setup will switch to their own images. For example, to add any 3rd party (non-confluent) connectors you'll have to figure out how to build them and place the resulting jars in your kafka connect image. Other reasons:

  • Smaller images (-500 MB or so) mean less downtime when pods are rescheduled. It also saves time when using kafka locally, during development.
  • Kubernetes has ConfigMaps, which makes the use of .properties files for config very maintainable. Confluent's images use env vars to override individual config entries, which adheres to Docker conventions but IMO gets less transparent.
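
To illustrate the ConfigMap point, here is a minimal sketch of mounting a whole `.properties` file into a container. It is not taken from this PR: the ConfigMap and pod names, the service addresses, the mount path, and the image are hypothetical, and the `command` assumes the image ships Confluent's `kafka-rest-start` script on its PATH. The three keys are regular REST proxy settings. With Confluent's own images the same entries would instead be set one by one through environment variables.

```yaml
# Hypothetical sketch: names, addresses, paths and image are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-rest-config
data:
  kafka-rest.properties: |
    # The complete config is readable in one place and versioned with the manifests.
    bootstrap.servers=bootstrap.kafka:9092
    zookeeper.connect=zookeeper.kafka:2181
    schema.registry.url=http://schemas.kafka:80
---
apiVersion: v1
kind: Pod
metadata:
  name: kafka-rest-example
spec:
  containers:
  - name: kafka-rest
    image: example.invalid/kafka-rest:placeholder
    # Assumption: kafka-rest-start is on the image's PATH.
    command: ["kafka-rest-start", "/etc/kafka-rest/kafka-rest.properties"]
    volumeMounts:
    - name: config
      mountPath: /etc/kafka-rest
  volumes:
  - name: config
    configMap:
      name: kafka-rest-config
```

Setting both the bootstrap and zookeeper addresses explicitly also avoids falling back to a `localhost:9092` default like the one mentioned in the commit notes at the top of this thread.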

@solsson
Contributor Author

solsson commented Sep 28, 2017

> I don't have much experience with maven, but if you were to go about enabling KSQL .. would you use something like this to inform a build for an image that has KSQL on board ... and build it akin to how you have done the connect-jdbc image? Would you then structure it to run in an independent pod on kubernetes ... as you have done for connect with the addon-connect branch?

Hadn't looked at the sources before... Unlike the quickstart we'd have to go directly for https://github.com/confluentinc/ksql/blob/0.1.x/docs/concepts.md#client-server-mode as "same JVM" would be impractical in k8s. I'd have to try and build it to confirm your guess, but it does have a bin folder and a package assembly like the other Confluent components so it can probably be dockerized much like Connect.

@solsson
Contributor Author

solsson commented Oct 2, 2017

@Analect If you'd like to test KSQL feel free to try #68. I won't be able to get further today, but possibly before the end of this week.

@Analect

Analect commented Oct 3, 2017

@solsson ... thanks. I'll take a look. I added a WIP PR #69 to attempt to test the connector using this blog post, so as to be able to mix and match both confluent and a third-party connector ... but as per my notes on the PR, I'm hitting a problem with a crashing connector.

@nuthankumar

nuthankumar commented Oct 24, 2018

@solsson
how different is the image
confluentinc/cp-zookeeper:3.2.0@sha256:cf151144d5cbe094af291eb1a86f5dcfae4b4cdb9f16803bff2e5010a9b16fd1
confluentinc/cp-kafka:3.2.0@sha256:82cb7a49161705ea750f3466700cce7a57e45d772f4bdbcddcdaac336dac77c0

from confluentinc/cp-zookeeper and confluentinc/cp-kafka ??

@solsson
Contributor Author

solsson commented Oct 24, 2018

Sounds like the same image, or do you mean the effect of the sha256? The reason I'm always adding the checksum is that with the default imagePullPolicy in k8s you may otherwise end up with different images for the same pod spec, depending on the pull history on the particular node they run on. And with imagePullPolicy: Always you make every pod start a bit slower. To see how the images actually differ, you'll need to use docker history and compare.
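
As a sketch of the pinning described above, a pod spec can reference the image by digest. The image reference is the one quoted in the question; the pod name is hypothetical, and the environment variable is an assumption about what Confluent's image expects at startup.

```yaml
# Sketch only: pins the image by digest so every node runs the same image.
apiVersion: v1
kind: Pod
metadata:
  name: zookeeper-pinned-example   # hypothetical name
spec:
  containers:
  - name: zookeeper
    # With a digest present, the digest decides what is pulled, not the tag.
    image: confluentinc/cp-zookeeper:3.2.0@sha256:cf151144d5cbe094af291eb1a86f5dcfae4b4cdb9f16803bff2e5010a9b16fd1
    # A cached image with this digest is the same image, so skipping the pull is safe.
    imagePullPolicy: IfNotPresent
    env:
    - name: ZOOKEEPER_CLIENT_PORT   # assumption: Confluent's image expects this variable
      value: "2181"
```

With only a mutable tag and `IfNotPresent`, a node that cached an older build under the same tag would keep running it, which is the inconsistency the checksum avoids.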
