Docker and service clusters
We are looking into using Docker plus either Mesos/Marathon or Kubernetes for hosting a cluster. However, the one issue that we haven’t really seen any answers for is how to allow clustered services to connect to each other correctly. All of the ones that I have seen need to know about at least one other node before they can join the cluster. Some need to know about every node. However, in Kubernetes and Mesos, there’s no way to know what those IP addresses are ahead of time.
So, are there any best practices for this? If it helps, some technologies we’re looking into deploying as containers are ElasticSearch, ActiveMQ, and MongoDB. There may be others.
However, the one issue that we haven’t really seen any answers for is how to allow clustered services to connect to each other correctly.
I think you’re talking about HA/replicated/sharded apps here.
At the moment, in kubernetes, you can accomplish this by making an api call listing all the “endpoints” of the service; that will tell you where your peers are running.
We’d eventually like to support the use case you describe in a more first-class manner.
I filed https://github.com/GoogleCloudPlatform/kubernetes/issues/3419 to maybe get something more standardized started here.
I also wanted to setup an ElasticSearch cluster using Mesos/Marathon. As the existing “solutions” either were merely undocumented, or not working/outdated, I set up my own container.
If you like, have a look at https://github.com/tobilg/docker-elasticsearch-marathon
If you have a running Marathon installation (I use v0.8.1), then setting up an ElasticSearch cluster should be a matter of a few minutes.
The container now uses Elasticsearch v1.5.2 and is able to run on the latest Marathon v0.8.2.
As for Kubernetes, it currently does require
kube-controllers-manager to start with
--machines argument given a list of minion IPs or hostnames.
I don’t see any easy way how to handle this correctly in Kubernetes now. Yes, you could make a call to the API that returns list of endpoints but you must watch for changes and take an action when endpoints change…
I would prefer to use Mesos/Marathon that is well prepared for this scenario. You should implement custom Framework for Mesos. There is already Framework for ElasticSearch prepared: http://mesos.apache.org/documentation/latest/mesos-frameworks/