Kubernetes


Kubernetes Topology

Kubernetes is an open source container management platform created by Google. It is mainly used to manage Docker containers, placing them onto one of many physical hosts that together form a cluster. This wiki is meant to provide a basic, high level overview of Kubernetes cluster topology. Below I give quick explanations of what each cluster node does, what its role is in relation to the other nodes, and how to install and configure a basic Kubernetes cluster using Fedora 20.

I've added some videos at the bottom of the page if you would rather learn about Kubernetes straight from the source (Google).

Kubernetes Pods

Pods are groups of containers that interact with each other on a frequent basis. They are typically placed on the same physical host and operate as one logical group. Containers in a pod share the same host resources (CPU, RAM, network), and typically they only communicate with each other via localhost connections on the same physical host. There are a few different container platforms; the most common is Docker, though Rocket was recently announced by CoreOS as a possible competitor. Regardless of which container platform is best, the concepts remain the same.

A pod also defines the type of application / containers that run in it, as well as the shared storage used by those containers. Pods facilitate horizontal and vertical scaling for the containers within them; essentially, a pod is just an abstraction layer, which makes it easier to manage applications than it would be to handle individual containers.

Pods are meant to host things like a CMS (WordPress), logging systems, snapshot managers, and so on; however, a pod typically should not run multiple instances of the same application. Basically, you don't want to be running HHVM in the same container that runs MySQL.

Kubernetes Apache Pod Example

This is a basic example of a single pod that runs Apache. This may very well be out of date tomorrow, so be sure to check for updated docs.

{
  "id": "Apache-Node",
  "kind": "Pod",
  "apiVersion": "v1beta1",
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "Apache-Node",
      "containers": [{
        "name": "master",
        "image": "dockerfile/apache",
        "ports": [{
          "containerPort": 80,
          "hostPort": 8080
        }]
      }]
    }
  },
  "labels": {
    "name": "web-01"
  }
}
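
Assuming the JSON above is saved to a file such as apache-pod.json (the filename is just an example), the pod can be created with kubecfg in the same way the replication controller is created later on this page, with something like:

cluster/kubecfg.sh -c apache-pod.json create pods

Once created, the pod should show up in the output of cluster/kubecfg.sh list /pods.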

Kubernetes Labels

https://github.com/GoogleCloudPlatform/kubernetes/blob/master/DESIGN.md

Labels define and organize loosely coupled pods by using key/value pairs to describe the function of a pod. This metadata is used to define the role and environment, such as "production", "front-end", or "back-end".

The label selector allows the user to identify a set of pods by querying the metadata set by the labels. This makes it easy to find pods that handle various tasks, and can be used to identify replicas or shards, pool members, or other peers in a group of containers. There are currently two objects that support label selectors:

  • service: "A configuration unit for the proxies that run on every worker node. It is named and points to one or more pods." Another way to look at it would be as a named load balancer that sends traffic to one or more containers via a proxy.

A service finds the containers it should be load balancing based on the pod labels that are applied when each pod is initially created. Traffic is sent based on the "selector" in the service's configuration file, where you enter the label of the pods that should receive the traffic (a minimal example service definition appears after this list).

  • replicationController: "A replication controller takes a template and ensures that there is a specified number of "replicas" of that template running at any one time. If there are too many, it'll kill some. If there are too few, it'll start more."
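
To make the service description above concrete, here is a rough sketch of a v1beta1 service definition that would proxy traffic to pods labeled like the Apache example earlier on this page. The port value is arbitrary, and field names should be treated as approximate since this API changed frequently:

{
  "id": "web-service",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 8000,
  "selector": {
    "name": "web-01"
  }
}

The "selector" block is the label query; any pod carrying the label name=web-01 becomes a backend for this service.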

You could label pods as Apache, PHP-FPM, HHVM, or whatever makes the most sense for what runs inside of the container. You can customize and create your own definitions using labels.

For more details on labels, visit this link: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/labels.md

An example of how to use labels could look something like this (note: the syntax here is not correct, this is simply a basic example):

tier: "frontend" or "backend"
environment: "production" or "staging"
version: "stable"
replication: 10

If you wanted to test out a change on 2 of these 10 nodes, you could add another label: "testing", then apply this to 2 of the 10 pods.

tier: "frontend" or "backend"
environment: "production" or "staging"
version: "testing"
replication: 2

Labels can overlap between different pods. For instance, you can have many pods with a "frontend" label, but with a few different environments, versions, or whatever else you label them with. This makes it easier to query and view the overall layout of the pods and services. Labels are set when a pod is created.
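
In an actual pod definition, labels are just a key/value map (as in the "labels" block of the Apache pod example above), so the informal example could be written roughly like this:

"labels": {
  "tier": "frontend",
  "environment": "production",
  "version": "stable"
}

The replica count itself is not a label; it is set on the replicationController described further down this page.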

Kubernetes Node

Kubernetes consists of two main groups of services:

  • Services that run on every worker node. A Kubernetes node runs the services that handle Docker containers, along with the software needed to communicate with and take commands from the cluster wide master control system. The main agent here is the Kubelet, which works from a .yaml manifest describing the pods it manages and makes sure the containers described in those files are actually started, running, and continue to run.
  • Services that comprise the cluster level control plane (the API server, controller manager, and scheduler), which are covered further down this page.

For more information on how containers are defined, see the manifest options below.

Kubernetes Node Manifest Options

There are four ways that a manifest can be provided to a Kubelet:

  • File: passed along as a flag on the command line; the specified file is then re-checked every 20 seconds, though the frequency of the check can be raised or lowered if needed (see the sketch after this list).
  • HTTP endpoint: a URL can be specified on the command line, which is then checked every 20 seconds; just like the file option, this interval can be changed.
  • etcd server: the Kubelet does a "watch" on an etcd server, so changes are noticed and acted on very quickly, more so than with the options listed above.
  • HTTP server: the Kubelet can also listen for HTTP requests and respond to a simple API in order to submit a new manifest.
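
As a rough illustration of the first two options: the exact flag names vary between Kubernetes releases from this era, so treat the flags below as assumptions and check kubelet --help on your build (the URL is also just a placeholder). The idea is simply that the manifest source is selected on the command line:

# Read pod manifests from a local file or directory, re-checking on an interval
kubelet --config=/etc/kubernetes/manifests --file_check_frequency=20s

# Or pull the manifest from an HTTP endpoint instead
kubelet --manifest_url=http://example.com/pod-manifest.yaml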

Kubernetes Node Proxy

  • Kubernetes Proxy: every node runs a network proxy. The proxy reflects the "services" defined through the Kubernetes API onto each node. It can handle simple TCP stream forwarding, or round robin load balancing / forwarding across a set of backends.

The endpoints are currently found through Docker compatible environment variables which specify internal and external ports for each container. These need to be unique port numbers so the proxy does not mix up other services it may be handling. To view compatible options, visit this link:

https://docs.docker.com/userguide/dockerlinks/
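
For example, a hypothetical service named redis-master exposing port 6379 would show up inside a container as environment variables along these lines (the IP here is made up):

REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11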

Kubernetes Control Plane

The control plane is made up of multiple components; however, they all run on a single master node and work together to provide a unified view of the cluster.

etcd

Used to store configuration data and the persistent master state. With "watch" support, new configuration changes can be pushed out to watchers very quickly.
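
Since etcd exposes a simple HTTP API (on port 4001 in the install steps below), you can see the watch behavior for yourself. The /registry prefix is assumed here as the location where Kubernetes keeps its state; verify the actual keys on your own cluster:

# Blocks until a key under /registry changes, then prints the change
curl -L "http://127.0.0.1:4001/v2/keys/registry?wait=true&recursive=true"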


Kubernetes API Server

The Kubernetes API server handles and configures data for 3 types of objects:

  • Pods
  • Services
  • ReplicationControllers

This data is then stored in etcd once the API server handles the request.

The API server also handles scheduling pods to worker nodes; currently this is a very simple scheduler. On top of scheduling pods, the API server also synchronizes pod information such as where a pod is located, what ports are exposed, and how the service is configured.
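
Because everything flows through the API server, you can also query these objects directly over HTTP. With the v1beta1 API used elsewhere on this page, the requests look roughly like this (the IP matches the master used in the install steps below; paths may differ on newer versions):

curl http://192.168.100.177:8080/api/v1beta1/pods
curl http://192.168.100.177:8080/api/v1beta1/services
curl http://192.168.100.177:8080/api/v1beta1/replicationControllers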

Kubernetes Controller Manager Server

The Kubernetes Controller Manager enforces replica configuration: it watches etcd, and if any changes are made to replicationController objects, it uses the API to implement the changes across all pods, making sure the desired number of replicas is always maintained.

Installing and configuring a 2 node Kubernetes cluster on Fedora 20 / CentOS 6

Source: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/getting-started-guides/fedora/fedora_manual_config.md

Configure Kubernetes Master Node

The first server is going to be the master node; it will run the apiserver and most of the control services. The following commands should get the first Fedora 20 box set up.

yum -y install dnf dnf-plugins-core
dnf copr enable walters/atomic-next
yum repolist walters-atomic-next/x86_64
yum -y install kubernetes
yum -y install iptables-services

You will want to modify these two files before continuing. The apiserver file should be configured so that the master node's API address listens on all interfaces. NOTE: if you are just messing around and doing some testing this is fine, but if you plan on leaving the cluster up you may want to limit the API server to listening on the private network only. The IPs listed below are all private addresses; change these to whatever your IPs are. Ideally you want to use an isolated, private network for node communication. Be sure to include the correct minion (worker node) IPs, otherwise the master will not be aware of any other nodes in the cluster.

The config file should be updated with the IP of the master node so that etcd is aware it's running on the master. Again, replace the private IPs in this example with the IP of the node you want to configure.

vim /etc/kubernetes/apiserver

KUBE_API_ADDRESS="0.0.0.0"
KUBE_API_PORT="8080"
KUBE_MASTER="192.168.100.177:8080"
MINION_ADDRESSES="192.168.100.242,192.168.100.159"
MINION_PORT="10250"


vim /etc/kubernetes/config

KUBE_ETCD_SERVERS="http://192.168.100.177:4001"


Once the configuration files have been modified and saved, you will want to restart and enable the services to run on start up.

systemctl restart etcd
systemctl restart kube-apiserver
systemctl restart kube-controller-manager
systemctl restart kube-scheduler

systemctl enable etcd
systemctl enable kube-apiserver
systemctl enable kube-controller-manager
systemctl enable kube-scheduler
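
Before moving on, it is worth confirming the master services are actually listening. A quick check with ss (or netstat) should show etcd on port 4001 and the API server on port 8080:

ss -tlnp | grep -E ':8080|:4001'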

Create IPtables rules on the master, then save and reload the configuration

/sbin/iptables -I INPUT 1 -p tcp --dport 8080 -j ACCEPT -m comment --comment "kube-apiserver"
/sbin/iptables -I INPUT 1 -p tcp --dport 4001 -j ACCEPT -m comment --comment "etcd_client"
service iptables save
systemctl daemon-reload
systemctl restart iptables

Configure Kubernetes Minion Nodes

Install the same packages that you installed on the master node; do this on every minion node. In addition, we are going to install Docker on the minion nodes.

yum -y install dnf dnf-plugins-core
dnf copr enable walters/atomic-next
yum repolist walters-atomic-next/x86_64
yum -y install kubernetes
yum -y install iptables-services
yum erase docker -y
yum -y install docker-io

On the minion nodes, we are going to modify these two files. For the kubelet file, replace the IP shown below with the IP of that node; this will be different for each node. You can use the IP for the hostname as well. For the config file, set the etcd server to the private IP of the master node.

vim /etc/kubernetes/kubelet

MINION_ADDRESS="192.168.100.242"
MINION_PORT="10250"
MINION_HOSTNAME="192.168.100.242"


vim /etc/kubernetes/config

KUBE_ETCD_SERVERS="http://192.168.100.177:4001"

Restart and enable the following services:

systemctl restart kube-proxy
systemctl restart kubelet
systemctl restart docker

systemctl enable kube-proxy
systemctl enable kubelet
systemctl enable docker
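
On each minion you can run a similar sanity check to confirm Docker is up and the kubelet is listening on its port (10250 in this setup):

docker ps
ss -tlnp | grep ':10250'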

Apply some IPtables rules

/sbin/iptables -I INPUT 1 -p tcp --dport 10250 -j ACCEPT -m comment --comment "kubelet"
service iptables save
systemctl daemon-reload
systemctl restart iptables

At this point you should be able to see all the minions on the master node. If you do not see the minions, you may want to double check IPtables, and make sure that all the nodes in the cluster can ping each other, as well as telnet to the ports for each service. For a cluster with one master and two minions, the command should display something like this:

bin/kubecfg list minions
Minion identifier
----------
192.168.100.159
192.168.100.242

Installing and Configuring Docker, GO, etcd, and Kubernetes on CentOS 6.5

To install Docker on CentOS 6.6 you simply need to install the "docker-io" package. You will want to install Docker on all hosts in the cluster besides the control and API nodes. I suggest using CentOS 7 instead of CentOS 6.6 at this point because CentOS 7 has a much newer Kernel and offers much better performance compared to CentOS 6.

yum install docker-io
service docker start
chkconfig docker on

Get the etcd package and extract it, then copy the etcd binary to /usr/bin/

wget https://github.com/coreos/etcd/releases/download/v0.4.6/etcd-v0.4.6-linux-amd64.tar.gz
tar zxvf etcd-v0.4.6-linux-amd64.tar.gz
cp etcd-v0.4.6-linux-amd64/etcd /usr/bin/etcd

Install golang / Go and extract the package as shown below. You will want to update your environment PATH so that you can find the go binary.

wget https://storage.googleapis.com/golang/go1.3.3.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.3.3.linux-amd64.tar.gz
vim /etc/profile
##Add this line
export PATH=$PATH:/usr/local/go/bin
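
After reloading your profile, a quick check confirms the go binary is on the PATH:

source /etc/profile
go version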


Get the latest version of kubernetes via git. All you need to do is clone this sucker like so:

git clone https://github.com/GoogleCloudPlatform/kubernetes.git

Start up a local kubernetes cluster (in screen)

screen -S kubernetes
cd kubernetes
hack/local-up-cluster.sh

If everything installed correctly, you should see something like this after you run "local-up-cluster.sh"

Building local go components
Starting etcd
etcd: {"action":"get","node":{"key":"/","dir":true}}
apiserver: {"kind":"PodList","creationTimestamp":null,"selfLink":"/api/v1beta1/pods","resourceVersion":2,"apiVersion":"v1beta1","namespace":"","items":[]}
Local Kubernetes cluster is running. Press Ctrl-C to shut it down.
Logs: 
  /tmp/apiserver.log
  /tmp/controller-manager.log
  /tmp/kubelet.log
  /tmp/kube-proxy.log
  /tmp/k8s-scheduler.log

If you are running Kubernetes on a local machine (like a VM, or KVM Guest), modify this kube-env.sh file and change the provider from gce to local. If you don't change the KUBERNETES_PROVIDER value and you don't use Google Compute then things will not work correctly. Or, if you are just testing stuff out and don't want to get charged a fortune by Google, using Local VMs is certainly the way to go.

vim cluster/kube-env.sh

##Change this to local if you are not using the cloud

KUBERNETES_PROVIDER=${KUBERNETES_PROVIDER:-local}

Kubernetes Configuration Files

Local Cluster Configuration File

This file seems to configure various network settings for interacting with a local cluster.

kubernetes/cluster/local/config-default.sh

This is what the default configuration looks like

## Contains configuration values for interacting with a Local cluster

# NUMBER OF MINIONS IN THE CLUSTER
NUM_MINIONS=1

# IP LOCATIONS FOR INTERACTING WITH THE MASTER
export KUBE_MASTER_IP="127.0.0.1"
export KUBERNETES_MASTER="http://127.0.0.1:8080"

# IP LOCATIONS FOR INTERACTING WITH THE MINIONS
for (( i=0; i <${NUM_MINIONS}; i++)) do
        KUBE_MINION_IP_ADDRESSES[$i]="127.0.0.1"
done


Kubernetes Commands and Log File Locations

Once the Kubernetes cluster is up and running you should be able to run some commands to interact with the Kubernetes cluster.

List Kubernetes Minions

You can view Kubernetes minions, or worker nodes in the cluster by running the following command on the master server in the local Kubernetes cluster.

cluster/kubecfg.sh list minions

List Kubernetes Pods

To display a list of pods found in the Kubernetes cluster.

cluster/kubecfg.sh list /pods

List Kubernetes Services

To display a list of services that are running under your Kubernetes cluster

cluster/kubecfg.sh list /services

Delete Kubernetes Service

To delete a Kubernetes service, run the command below, replacing $ID with the ID of the service you want to remove. For example, if my service was called "wordpress" I would use "cluster/kubecfg.sh delete services/wordpress".

cluster/kubecfg.sh delete services/$ID

List Kubernetes replicationControllers

To display a list of replicationControllers

cluster/kubecfg.sh list /replicationControllers

Create Kubernetes replicationController

To create a new replication controller for a pod, you can create a simple JSON file and then create the controller from that file. In this example I am making a replica controller for the wordpress image. Replace the ID and names with whatever you used for the pod. Ports will need to be changed as well, but the layout should be similar.

vim /root/wordpress-replica-controller.json

{
  "id": "wordpressReplicaController",
  "kind": "ReplicationController",
  "apiVersion": "v1beta1",
  "desiredState": {
    "replicas": 1,
    "replicaSelector": {"name": "wordpressreplica"},
    "podTemplate": {
      "desiredState": {
        "manifest": {
          "version": "v1beta1",
          "id": "wordpressReplicaController",
          "containers": [{
            "name": "slave",
            "image": "jbfink/wordpress",
            "ports": [{"containerPort": 80, "hostPort": 81}]
          }]
        }
      },
      "labels": {"name": "wordpressreplica"}
    }
  },
  "labels": {"name": "wordpressreplica"}
}

Once you have saved the file, run the command below to create the replica controller

/bin/kubecfg -c /root/wordpress-replica-controller.json create replicationControllers

Resize Kubernetes replicationControllers Quantity

You can change the number of replicas for a pod on the fly by using the resize option. Replace "$replicationController" with whatever you named the replicationController for the pod; you can then raise or lower the number of replicas to whatever you want. Here the number is 10.

cluster/kubecfg.sh resize $replicationController 10

Remove Kubernetes replicationController

While you can set the number of replicas to 0, you can also remove the controller entirely. To do this, simply run the command below, replacing "$replicationController" with the actual controller name.

cluster/kubecfg.sh rm $replicationController

Start an Nginx Container

To run an Nginx container, use the command below.

cluster/kubecfg.sh -p 8080:80 run dockerfile/nginx 1 myNginx

If everything worked correctly, you can use the cluster/kubecfg.sh list /pods command to view the current status

cluster/kubecfg.sh list /pods

ID                                     Image(s)            Host                Labels                                                Status
----------                             ----------          ----------          ----------                                            ----------
8db4d374-4e3d-11e4-b405-0cc47a089dd2   dockerfile/nginx    127.0.0.1/          replicationController=myNginx,simpleService=myNginx   Waiting

Kubernetes Log Locations

apiserver.log

This log will display any API related logs. This is a good place to look if you run into any issues. An example of what is displayed by the apiserver.log is listed below

cat /tmp/apiserver.log
....
....
W1007 12:23:29.069545 24694 rest.go:232] No network settings: api.ContainerStatus{State:api.ContainerState{Waiting:(*api.ContainerStateWaiting)(0xc208245870), Running:(*api.ContainerStateRunning)(nil), Termination:(*api.ContainerStateTerminated)(nil)}, RestartCount:28, PodIP:"", Image:"kubernetes/pause:latest"}

controller-manager.log

The controller-manager.log will show you any issues relating to the Kubernetes controller.

cat /tmp/controller-manager.log
...
...
E1007 12:27:09.099277 24704 iowatcher.go:88] Unable to decode an event from the watch stream: got invalid watch event type: ERROR

kubelet.log

The kubelet.log displays a lot of useful information on containers, so if a container is not starting, or having some type of issue, look at this log to see what the cause is.

cat /tmp/kubelet.log
...
...
E1007 12:28:30.961850 24705 kubelet.go:458] Failed to introspect network container. (API error (500): Cannot start container 0831f56d62b1aa89f967ccee48bc0cd02f2cce5a3db464eab540e998e7d0c5c6: listen tcp 0.0.0.0:8080: bind: address already in use

E1007 12:28:30.961883 24705 kubelet.go:653] Error syncing pod: API error (500): Cannot start container 0831f56d62b1aa89f967ccee48bc0cd02f2cce5a3db464eab540e998e7d0c5c6: listen tcp 0.0.0.0:8080: bind: address already in use

kube-proxy.log

The kube-proxy.log file shows any connection issues, as well as what configuration is being used. In the example below a lot of stuff is broken, but the point is that this log file is pretty useful for tracking down issues.

cat /tmp/kube-proxy.log
...
...
I1007 12:12:09.065658 24706 proxy.go:58] Using api calls to get config http://127.0.0.1:8080
I1007 12:12:09.065768 24706 proxy.go:87] Using configuration file /tmp/proxy_config
E1007 12:12:09.065863 24706 file.go:88] Couldn't read file: /tmp/proxy_config : open /tmp/proxy_config: no such file or directory

k8s-scheduler.log

Once containers are up and running you should be able to view scheduler data found in k8s-scheduler.log

cat /tmp/k8s-scheduler.log

Kubernetes Example Configuration for Startup Companies

I want to run WordPress and I want to do it Google Style and use Kubernetes because TechCrunch had some article about how it was amazing, and I need to use it if I want to be a successful startup.

I'll need a pod which consists of an Apache container, a database container, and a php-fpm container. All of these containers will need to be defined, and know about each other.

To do this I need to make some labels so I have a way to differentiate each container and let Google know what ports should be open and always listening. I'll go ahead and give my web node a label with a Key/Value of "web,01", and I want to have 3 replicas for once I get on reddit and everyone visits my site. I also need to do the same with my database server, which houses email addresses that I plan on selling to "email marketers" because I want to make money. Gonna set the replication level to over 9000 because user data is really valuable these days.

I'm going to define the configuration for web to listen externally on port 666, and internally on port 80. I also need to do the same with my DB and PHP servers. Once that is defined the Kubelet checks my configuration files every 20 seconds or so to make sure I didn't make a new, awesome change. If I did make a change then kubelet applies the updated config to the proper containers automatically. It listens to etcd, which listens to the Kubernetes API for incoming commands or requests.

The Kubernetes Proxy then handles all my paid for traffic requests so I can get crazy mad SEO and make billions off of Google Ads, and then give all that money directly back to Google to pay for my 9000 replicated servers.

The End.

Kubernetes Links and Videos

Google I/O 2014 - Containerizing the Cloud with Docker on Google Cloud Platform

Provides an excellent overview of Kubernetes. This talk was given at Google I/O 2014 and is about an hour long. Much of what I learned came from this video.

Connecting Containers: Building a PaaS with Docker and Kubernetes

From the 2015 linux.conf.au conference. Speakers from IBM and HP, along with a few smaller companies; this video is more recent than the previous one and more focused on building your own PaaS.