Trying to understand what a Kubernetes Worker Node and Pod is compared to a Docker “Service”


I’m trying to learn Kubernetes so I can deploy my microservices solution to a managed Kubernetes offering in the cloud (e.g. Azure Kubernetes Service, etc).
As part of this, I’m trying to understand the main concepts, specifically around Pods + Worker Nodes and (in the YAML file) Pods + Services. To do this, I’m comparing what I have in my docker-compose file against the new concepts.
I currently have a docker-compose.yml file which contains about 10 images. I’ve split the solution into two ‘networks’: frontend and backend. The backend network contains 3 microservices and cannot be reached from a browser at all. The frontend network contains a reverse-proxy (Traefik, which plays a similar role to nginx) that routes all requests to the appropriate backend microservice, plus a simple SPA web app. All works 100% awesome.
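For concreteness, the shape of that compose file is roughly this (a hedged sketch – all service and image names here are invented, and the real file has ~10 images):

```yaml
version: "3"
services:
  traefik:                  # reverse-proxy; bridges both networks so it can route
    image: traefik:v2.10
    networks: [frontend, backend]
    ports:
      - "80:80"
  spa:                      # simple SPA web app on the frontend network
    image: example/spa:1.0
    networks: [frontend]
  account-api:              # one of the 3 backend microservices (Web API host)
    image: example/account-api:1.0
    networks: [backend]
  account-tasks:            # background tasks host; never scaled out
    image: example/account-tasks:1.0
    networks: [backend]
networks:
  frontend:
  backend:
```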
Each backend Microservice has at least one of these:

Web API host
Background tasks host

So this means I could scale out the Web API hosts if required, but I should never scale out the background tasks hosts.
Here’s a simple diagram of the solution:

So if the SPA app requests some data on a route like /account/1, the request will hit the reverse-proxy, match a rule, and then be forwarded on to the appropriate backend microservice.
So it’s from here that I’m trying to learn how to write a Kubernetes deployment file based on these docker-compose concepts.

Each ‘Pod’ has its own IP, so I should create a Pod per container. (Yes, a Pod can have multiple containers, and to me that’s like saying ‘install these software products on the same machine’.)
A ‘Worker Node’ is what we replicate/scale out, so we should put our Pods into a Node based on the scaling scenario. For example, the background task hosts should go into one Node because they shouldn’t be scaled, and the hardware requirements for that Node are really small. Meanwhile, the Web APIs should go into another Node so they can be replicated/scaled out.

If I’m on the right path with the understanding above, then I’ll have a lot of nodes and pods … which feels … weird?


Solution 1:

The pod is the unit of workload, and has one or more containers; exactly one container is the norm. You scale that workload by changing the number of pod replicas in a ReplicaSet (or Deployment).
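For example, one of the Web API hosts from the question could be written as a Deployment whose replica count controls scaling (a minimal sketch – the name, image, and port are invented placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: account-api              # hypothetical name for one Web API host
spec:
  replicas: 3                    # scale the workload by changing this number
  selector:
    matchLabels:
      app: account-api
  template:                      # the pod template: one container per pod here
    metadata:
      labels:
        app: account-api
    spec:
      containers:
        - name: account-api
          image: example/account-api:1.0   # placeholder image
          ports:
            - containerPort: 80
```

A background-tasks host would look the same, just with `replicas: 1` left alone.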

A Pod is mostly an accounting construct with no direct parallel in plain Docker; it’s closest to a docker-compose service. A pod is mostly immutable after creation. Like every resource in Kubernetes, a pod is a declaration of desired state – containers to be running somewhere. All containers defined in a pod are scheduled together and share resources (IP, memory limits, disk volumes, etc.).

All pods within a ReplicaSet are both fungible and mortal – a request can be served by any pod in the ReplicaSet, and any pod can be replaced at any time. Each pod does get its own IP, but a replacement pod will probably get a different IP, and if you have multiple replicas of a pod they’ll all have different IPs. You don’t want to manage or track pod IPs. Kubernetes Services provide discovery (how do I find those pods’ IPs), routing (connect to any Ready pod without caring about its identity), and load balancing (round-robin over that group of pods).
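A Service that selects such pods by label might look like this (again a sketch with invented names; the selector must match the labels on the pods it should route to):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: account-api            # becomes a stable DNS name inside the cluster
spec:
  selector:
    app: account-api           # routes to any Ready pod carrying this label
  ports:
    - port: 80                 # stable port on the Service's virtual IP
      targetPort: 80           # the container port on the pods
```

Other pods can then connect to `account-api:80` without ever knowing or tracking individual pod IPs.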

A Node is the compute machine (VM or Physical) running a kernel and a kubelet and a dockerd. (This is a bit of a simplification. Other container runtimes than just dockerd exist, and the virtual-kubelet project aims to turn this assumption on its head.)

All pods are scheduled on Nodes. When a pod (with its containers) is scheduled on a node, the kubelet running on that node acts on it: it talks to dockerd to start the pod’s containers.

Once scheduled on a node, a pod is not moved to another node. Nodes are fungible & mortal too, though. If a node goes down or is being decommissioned, the pod will be evicted/terminated/deleted. If that pod was created by a ReplicaSet (or Deployment) then the ReplicaSet Controller will create a new replica of that pod to be scheduled somewhere else.

You normally start many (1-100) pods+containers on the same node+kubelet+dockerd. If you have more pods than that (or they need a lot of cpu/ram/io), you need more nodes. So the nodes are also a unit of scale, though only very indirectly with respect to the web app.

You do not normally care which Node a pod is scheduled on. You let kubernetes decide.