Sebastien Goasguen

Wednesday, September 16, 2015

Running Kubernetes on a Raspberry PI

Running the Docker engine on Raspberry Pi is a breeze thanks to the Docker pirates from Hypriot, just download the image and flash it on your Pi and you are off to the races. I am not going to cover this installation process, it is well documented on the Hypriot website and I also wrote a recipe in the Docker cookbook. Roughly, download the .img file and dd it to your SD card, then boot your PI.
Having Docker on Raspberry Pi offers tons of possibilities for hobbyist and home devices. It also triggered my interest because Kubernetes, one of the Docker orchestrators, can be run standalone on a single node using Docker containers. I wrote a post several months ago about doing it using docker-compose. So I decided to give it a try last week-end, running Kubernetes on a PI using the Hypriot image that has the Docker engine.

Getting `etcd` to run

The first issue is that Kubernetes currently uses etcd, and that you need to run it on ARM. I decided to get the etcd source directly on the PI and updated the Dockerfile to build it directly there. Etcd uses a Golang ONBUILD image and it was causing me grief. So I copied the content of the ONBUILD image and created a new Dockerfile based on hypriot/rpi-golang to build it directly. You can see the Dockerfile. With that you have a Docker container running etcd on ARM.

Getting the Hyperkube to run on ARM

Now, I needed the hyperkube binary to run on ARM. Hyperkube is a single binary that allows you to start all the Kubernetes components. Thankfully there are some binaries already available for ARM. That was handy because I struggled to compile Kubernetes directly on the PI.
With that hyperkube binary on hand, I built an image based on the resin/rpi-raspbian:wheezy image. Quite straightforward:

FROM resin/rpi-raspbian:wheezy

RUN apt-get update
RUN apt-get -yy -q install iptables ca-certificates

COPY hyperkube /hyperkube

The Kubelet systemd unit

The Kubernetes agent running on all nodes in a cluster is called the Kubelet. The Kubelet is in charge of making sure that all containers supposed to be running on the node actually do run. It can also be used with a manifest to start some specific containers at startup. There is a good post from Kelsey Hightower about it. Since The Hypriot image uses systemd I took the systemd unit that creates a Kubelet service directly from Kelsey's post:

[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/bin/kubelet  \
--api-servers=http://127.0.0.1:8080 \
--allow-privileged=true \
--config=/etc/kubernetes/manifests \
--v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

The kubelet binary is downloaded directly from the same location as hyperkube. The manifest is a Kubernetes Pod definition that starts all the containers to get a Kubernetes controller running. It starts etcd, the API server, the scheduler, the controller and the service proxy, all using the hyperkube image built above.

Now the dirty hack

Kubernetes does something interesting. All containers in a Pod actually use the same IP address. This is done by running a fake container that just does nothing. The other containers in the Pod just share the same network namespace as this fake container. This is actually called the pause container. I did not find a way to specify a different image for the pause container in Kubernetes, it seems hard coded to gcr.io/google_containers/pause:0.8.0 which off course is supposed to run on x86_64.
So the dirty trick consisted in taking the pause Goland code from the Kubernetes source, compiling it on the PI using the hypriot/rpi-golang and sticking the binary in a SCRATCH image and tagging it locally to appear as gcr.io/google_containers/pause:0.8.0 and avoid the download of the real image that runs on x86_64. Yeah...right...I told you dirty but that was the quickest way I could think of.

Putting it all together

Now that you have all the images ready directly on the PI, plus a Kubelet service, you can start it. The containers will be created and you will have a single node Kubernetes cluster on the PI. All is left is to use the kubectl CLI to use it. You can download an ARM version of Kubectl form the official Kubernetes release.

HypriotOS: root@black-pearl in ~
$ docker images
REPOSITORY                       TAG         
hyperkube                        latest
gcr.io/google_containers/pause   0.8.0
etcd                             latest
resin/rpi-raspbian               wheezy   
hypriot/rpi-golang               latest 

HypriotOS: root@black-pearl in ~
$ ./kubectl get pods
NAME                          READY     STATUS    RESTARTS   AGE
kube-controller-black-pearl   5/5       Running   5          5m
HypriotOS: root@black-pearl in ~
$ ./kubectl get nodes
NAME          LABELS                               STATUS
black-pearl   kubernetes.io/hostname=black-pearl   Ready

Get it

Everything is on GitHub at https://github.com/skippbox/k8s4pi including a horrible bash script that does the entire build :)

Monday, June 15, 2015

Introducing Kmachine, a Docker machine fork for Kubernetes.

Docker machine is a great tool to easily start a Docker host on most public Cloud providers out there. Very handy as a replacement to Vagrant if all you want is a Docker host in the Cloud.

It automatically installs the Docker daemon and sets up the TLS authentication so that you can communicate with it using your local Docker client. It also has some early features to start a Swarm cluster (i.e Multiple Docker hosts).

Since I have been playing with Kubernetes lately, and that there is a single node install available, all based on Docker images...

I thought, let's hack Docker machine a bit so that in addition to installing the docker daemon it also pulls a few images and starts a few containers on boot.

The result is kmachine. The usage is exactly the same as Docker machine. It goes something like this on exoscale (did not take time to open 6443 on all providers...PR :)):

$ kmachine create -d exoscale foobar
$ kmachine env foobar
kubectl config set-cluster kmachine --server=https://185.19.29.23:6443 --insecure-skip-tls-verify=true
kubectl config set-credentials kuser --token=abcdefghijkl
kubectl config set-context kmachine --user=kuser --cluster=kmachine
kubectl config use-context kmachine
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://185.19.29.23:2376"
export DOCKER_CERT_PATH="/Users/sebastiengoasguen/.docker/machine/machines/foobar5"
export DOCKER_MACHINE_NAME="foobar5"
# Run this command to configure your shell: 
# eval "$(kmachine_darwin-amd64 env foobar5)"

You see that I used kubectl, the Kubernetes client to automatically setup the endpoint created by machine. The only gotcha right now is that I hard coded the token...Easily fixed by a friendly PR. We could also setup proper certificates and TLS authentication. But I opted for the easy route for now. If you setup your env. You will have access to Kubernetes, and Docker of course, the original docker-machine functionality is not broken.

$ eval "$(kmachine env foobar)"
$ kubectl get pods
POD                    IP        CONTAINER(S)         IMAGE(S)                                     
kubernetes-127.0.0.1             controller-manager   gcr.io/google_containers/hyperkube:v0.17.0 
                                 apiserver            gcr.io/google_containers/hyperkube:v0.17.0                                             
                                 scheduler            gcr.io/google_containers/hyperkube:v0.17.0                                             
$ kubectl get nodes
NAME        LABELS        STATUS
127.0.0.1   Schedulable   <none>    Ready

Since all Kubernetes components are started as containers, you will see all of them running from the start. etcd, the kubelet, the controller, proxy etc.

$ docker ps
CONTAINER ID        IMAGE                                        COMMAND               
7e5d356d31d7        gcr.io/google_containers/hyperkube:v0.17.0   "/hyperkube controll     
9cc05adf2b27        gcr.io/google_containers/hyperkube:v0.17.0   "/hyperkube schedule               
7a0e490a44e1        gcr.io/google_containers/hyperkube:v0.17.0   "/hyperkube apiserve               
6d2d743172c6        gcr.io/google_containers/pause:0.8.0         "/pause"                            
7950a0d14608        gcr.io/google_containers/hyperkube:v0.17.0   "/hyperkube proxy --                                                                                               
55fc22c508a9        gcr.io/google_containers/hyperkube:v0.17.0   "/hyperkube kubelet                                                                                                
c67496a47bf3        kubernetes/etcd:2.0.5.1                      "/usr/local/bin/etcd

Have fun ! I think it is very handy to get started with Kubernetes and still have the Docker machine setup working. You get the benefit of both, easy provisioning of a Docker host in the Cloud and a fully working Kubernetes setup to experience with. If we could couple it with Weave of Flannel, we could setup a full Kubernetes cluster in the Cloud, just like Swarm.

Tuesday, June 09, 2015

Building an S3 object store with Docker, Cassandra and Kubernetes

Docker makes building distributed applications relatively painless. At the very least deploying existing distributed systems/framework is made easier since you only need to launch containers. Docker hub is full of MongoDB, Elasticsearch, Cassandra images etc ... Assuming that you like what is inside those images, you can just grab them and run a container and you are done.

With a cluster manager/container orchestration system like Kubernetes, running clustered version of these systems where you need to operate multiple containers and multiple nodes is also made dead simple. Swear to God, it is !

Just check the list of examples and you will find everything that is needed to run a Redis, a Spark, a Storm, an Hazelcast even a Glusterfs cluster. Discovery of all the nodes can be a challenge but with things like Etcd, Consul, registrator, service discovery has never been easier.

What caught my eye in the list of Kubernetes examples is the ability to run an Apache Cassandra cluster. Yes, a Cassandra cluster based on Docker containers. It caught my eye especially that my buddies at exoscale have written an S3 compatible object store that uses Cassandra for storage. It's called Pithos and for those interested is written in Clojure.

So I wondered, let's run Cassandra in Kubernetes, then let's create a Docker image for Pithos and run it in Kubernetes as well. That should give me a S3 compatible object store, built using Docker containers.

To start we need a Kubernetes cluster. The easiest is to use Google container engine. But keep an eye on Kubestack which is a Terraform plan to create one. It could easily be adapted for different cloud providers. If you are new to Kubernetes check my previous post, or get the Docker cookbook in early release I just pushed a chapter on Kubernetes. Whatever technique you use, before proceeding you should be able to use the kubectl client and list the nodes in your cluster. For example:

$ ./kubectl get nodes
NAME                              LABELS                                                   STATUS
k8s-cookbook-935a6530-node-hsdb   kubernetes.io/hostname=k8s-cookbook-935a6530-node-hsdb   Ready
k8s-cookbook-935a6530-node-mukh   kubernetes.io/hostname=k8s-cookbook-935a6530-node-mukh   Ready
k8s-cookbook-935a6530-node-t9p8   kubernetes.io/hostname=k8s-cookbook-935a6530-node-t9p8   Ready
k8s-cookbook-935a6530-node-ugp4   kubernetes.io/hostname=k8s-cookbook-935a6530-node-ugp4   Ready

Running Cassandra in Kubernetes

You can use the Kubernetes example straight up or clone my own repo, you can explore all the pods, replication controllers and service definition there:

$ git clone https://github.com/how2dock/dockbook.git
$ cd ch05/examples

Then launch the Cassandra replication controller, increase the number of replicas and launch the service:

$ kubectl create -f ./cassandra/cassandra-controller.yaml
$ kubectl scale --replicas=4 rc cassandra
$ kubectl create -f ./cassandra/cassandra-service.yaml

Once the image is downloaded you will have your Kubernetes pods in running state. Note that the image currently used comes from the Google registry. That's because this image contains a Discovery class specified in the Cassandra configuration. You could use the Cassandra image from Docker hub but would have to put that Java class in there to allow all cassandra nodes to discover each other. As I said, almost painless !

$ kubectl get pods --selector="name=cassandra"

Once Cassandra discovers all nodes and rebalances the database storage you will get something like:

$ ./kubectl exec cassandra-5f709 -c cassandra nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.16.2.4  84.32 KB   256     46.0%             8a0c8663-074f-4987-b5db-8b5ff10d9774  rack1
UN  10.16.1.3  67.81 KB   256     53.7%             784c8f4d-7722-4d16-9fc4-3fee0569ec29  rack1
UN  10.16.0.3  51.37 KB   256     49.7%             2f551b3e-9314-4f12-affc-673409e0d434  rack1
UN  10.16.3.3  65.67 KB   256     50.6%             a746b8b3-984f-4b1e-91e0-cc0ea917773b  rack1

Note that you can also access the logs of a container in a pod with kubectl logs very handy.

Launching Pithos S3 object store

Pithos is a daemon which "provides an S3 compatible frontend to a cassandra cluster". So if we run Pithos in our Kubernetes cluster and point it to our running Cassandra cluster we can expose an S3 compatible interface.

To that end I created a Docker image for Pithos runseb/pithos on Docker hub. Its an automated build so you can check out the Dockerfile there. The image contains the default configuration file. You will want to change it to edit your access keys and bucket stores definitions. I launch Pithos as a Kubernetes replication controller and expose a service with an external load balancer created on Google compute engine. The Cassandra service that we launched earlier allows Pithos to find Cassandra using DNS resolution. To bootstrap pithos we need to run a non-restarting Pod which installs the Pithos schema in Cassandra. Let's do it:

$ kubectl create -f ./pithos/pithos-bootstrap.yaml

Wait for the bootstrap to happen, i.e for the Pod to get in succeed state. Then launch the replication controller. For now we will launch only one replicas. Using an rc makes it easy to attach a service and expose it via a Public IP address.

$ kubectl create -f ./pithos/pithos-rc.yaml
$ kubectl create -f ./pithos/spithos.yaml
$ ./kubectl get services --selector="name=pithos"
NAME      LABELS        SELECTOR      IP(S)            PORT(S)
pithos    name=pithos   name=pithos   10.19.251.29     8080/TCP
                                      104.197.27.250

Since Pithos will serve on port 8080 by default, make sure that you open the firewall for public IP of the load-balancer.

Use an S3 client

You are now ready to use your S3 object store, offered by Pithos, backed by Cassandra, running on Kubernetes using Docker. Wow...a mouth full !!!

Install s3cmd and create a configuration file like so:

$ cat ~/.s3cfg
[default]
access_key = AKIAIOSFODNN7EXAMPLE
secret_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
check_ssl_certificate = False
enable_multipart = True
encoding = UTF-8
encrypt = False
host_base = s3.example.com
host_bucket = %(bucket)s.s3.example.com
proxy_host = 104.197.27.250 
proxy_port = 8080
server_side_encryption = True
signature_v2 = True
use_https = False
verbosity = WARNING

Note that we use an unencrypted proxy (the load-balancer IP created by the Pithos Kubernetes service, don't forget to change it). The access and secret keys are the default stored in the Dockerfile

With this configuration in place, you are ready to use +s3cmd+:

$ s3cmd mb s3://foobar
Bucket 's3://foobar/' created
$ s3cmd ls
2015-06-09 11:19  s3://foobar

If you wanted to use Boto, this would work as well:

#!/usr/bin/env python

from boto.s3.key import Key
from boto.s3.connection import S3Connection
from boto.s3.connection import OrdinaryCallingFormat

apikey='AKIAIOSFODNN7EXAMPLE'
secretkey='wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'

cf=OrdinaryCallingFormat()

conn=S3Connection(aws_access_key_id=apikey,
                  aws_secret_access_key=secretkey,
                  is_secure=False,host='104.197.27.250',
                  port=8080,
                  calling_format=cf)

conn.create_bucket('foobar')

And that's it. All of these steps make sound like a lot, but honestly it has never been that easy to run an S3 object store. Docker and Kubernetes truly make running distributed applications a breeze.

Monday, May 04, 2015

Running VMs in Docker Containers via Kubernetes

Couple weeks ago Google finally published a technical paper describing Borg, their cluster management system that they built over the last ten years or more and that runs all Google services.

There are several interesting concepts in the paper, one of them of course being that they run everything in containers. Whether they use Docker or not is unknown. Some parts of their workloads probably still use LMCTFY, - Let Me Contain That For You-. What struck me is that they say to not be using full virtualization. It makes sense in terms of timeline, considering that Borg started before the advent of hardware virtualization. However, their Google Compute Engine offers VM as a Service, so it is fair to wonder how they are running their VMs. This reminded me of John Wilkes talk at MesosCon 2014. He discussed scheduling in Borg (without mentioning it) and at 23 minutes in his talk, mentions that they run VMs in containers.

Running VM in containers does make sense when you think in terms of a cluster management system that deals with multiple type of workloads. You treat your IaaS (e.g GCE) as a workload, and contain it so that you can pack all your servers and maximize utilization. It also allows you to run some workloads on bare-metal for performance.

Therefore let's assume that GCE is just another workload for Google and that it runs through Borg.

Since Borg laid out the principles for Kubernetes, the cluster management system designed for containerized workloads and open sourced by Google in June 2014. You are left asking:

"How can we run VMs in Kubernetes ?"

This is where Rancher comes to our help to help us prototype a little some-some. Two weeks ago, Rancher announced RancherVM, basically a startup script that creates KVM VMs inside Docker containers (not really doing it justice calling it a script...). It is available on GitHub and super easy to try. I will spare you the details and tell you to go to GitHub instead. The result is that you can build a Docker image that contains a KVM qcow image, and that running the container starts the VM with the proper networking.

Privilege gotcha

With a Docker image now handy to run a KVM instance in it, using Kubernetes to start this container is straightforward. Create a Pod that launches this container. The only caveat is that the Docker host(s) that you use and that form your Kubernetes cluster need to have KVM installed and that your containers will need to have some level of privileges to access the KVM devices. While this can be tweaked with Docker run parameters like --device and --cap-add, you can brute force it in a very unsecure manner with --privilege. However Kubernetes does not accept to run privileged containers by default (rightfully so). Therefore you need to start you Kubernetes cluster (i.e API server and Kubelet with the --allow_privilege=true option).

If you are new to Kubernetes, check out my previous post where I show you how to start a one node Kubernetes "cluster" with Docker compose. The only modification that I did from that post, is that I am running this on a Docker host that also has KVM installed, that the compose manifest specifies --allow_pivileged=true in the kubelet startup command, and that I modify the /etc/kubernetes/manifests/master.json by specifiying a volume. This allows me not to tamper with the images from Google.

Let's try it out

Build your RancherVM images:

$ git clone https://github.com/rancherio/vm.git
$ cd vm
$ make all

You will now have several RancherVM images:

$ sudo docker images
REPOSITORY                           TAG                 ...
rancher/vm-android                   4.4                 ...
rancher/vm-android                   latest              ...
rancher/ranchervm                    0.0.1               ...
rancher/ranchervm                    latest              ...
rancher/vm-centos                    7.1                 ...
rancher/vm-centos                    latest              ...
rancher/vm-ubuntu                    14.04               ...
rancher/vm-ubuntu                    latest              ...
rancher/vm-rancheros                 0.3.0               ...
rancher/vm-rancheros                 latest              ...
rancher/vm-base                      0.0.1               ...
rancher/vm-base                      latest              ...

Starting one of those will give you access to a KVM instance running in the container.

I will skip the startup of the Kubernetes components. Check my previous post. Once you have Kubernetes running you can list the pods (i.e group of containers/volumes). You will see that the Kubernetes master itself is running as a Pod.

$ ./kubectl get pods
POD         IP        CONTAINER(S)         IMAGE(S)                                     ...
nginx-127             controller-manager   gcr.io/google_containers/hyperkube:v0.14.1   ...
                      apiserver            gcr.io/google_containers/hyperkube:v0.14.1                                             
                      scheduler            gcr.io/google_containers/hyperkube:v0.14.1

Now let's define a RancherVM as a Kubernetes Pod. We do this in a YAML file

apiVersion: v1beta2
kind: Pod
id: ranchervm
labels:
  name: vm
desiredState:
  manifest:
    version: v1beta2
    containers:
      - name: master
        image: rancher/vm-rancheros
        privileged: true
        volumeMounts:
          - name: ranchervm
            mountPath: /ranchervm
        env:
         - name: RANCHER_VM
           value: "true"
    volumes:
      - name: ranchervm
        source:
          hostDir: 
            path: /tmp/ranchervm

To create the Pod use the kubectl CLI:

$ ./kubectl create -f vm.yaml 
pods/ranchervm
$ ./kubectl get pods
POD         IP            CONTAINER(S)         IMAGE(S)                                     ....
nginx-127                 controller-manager   gcr.io/google_containers/hyperkube:v0.14.1   ....
                          apiserver            gcr.io/google_containers/hyperkube:v0.14.1                                             
                          scheduler            gcr.io/google_containers/hyperkube:v0.14.1                                             
ranchervm   172.17.0.10   master               rancher/vm-rancheros                         ....

The RancherVM image specified contains RancherOS. The container will start automatically but of course the actual VM will take couple more seconds to start. Once it's up, you can ping it and you can ssh to the VM instance.

$ ping -c 1 172.17.0.10
PING 172.17.0.10 (172.17.0.10) 56(84) bytes of data.
64 bytes from 172.17.0.10: icmp_seq=1 ttl=64 time=0.725 ms

$ ssh rancher@172.17.0.10 
...
[rancher@ranchervm ~]$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
[rancher@ranchervm ~]$ sudo system-docker ps
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
229a22962a4d        console:latest      "/usr/sbin/entry.sh    2 minutes ago       Up 2 minutes                            console             
cfd06aa73192        userdocker:latest   "/usr/sbin/entry.sh    2 minutes ago       Up 2 minutes                            userdocker          
448e03b18f93        udev:latest         "/usr/sbin/entry.sh    2 minutes ago       Up 2 minutes                            udev                
ff929cddeda9        syslog:latest       "/usr/sbin/entry.sh    2 minutes ago       Up 2 minutes                            syslog

Amazing ! I can feel that you are just wondering what the heck is going on:)

You want to kill the VM ? Just kill the pod:

$ ./kubectl delete pod ranchervm

Remember that a Pod is not a single container but could contain several ones as well as volumes.

Let's go a step further, and scale the number of VMs by using a replication controller.

Using a Replication Controller to scale the VM

Kubernetes is quite nice, it builds on years of experience with fault-tolerance at Google and provides mechanism for keeping your services up, scaling them and rolling new versions. The replication Controller is a primitive to manage the scale of your services.

So say you would like to automatically increase or decrease the number of VMs running in your datacenter. Start them with a replication controller. This is defined in a YAML manifest like so:

id: ranchervm
kind: ReplicationController
apiVersion: v1beta2
desiredState:
  replicas: 1
  replicaSelector:
    name: ranchervm
  podTemplate:
    desiredState:
      manifest:
        version: v1beta2
        id: vm 
        containers:
          - name: vm
            image: rancher/vm-rancheros
            privileged: true
            volumeMounts:
              - name: ranchervm
                mountPath: /ranchervm
            env:
              - name: RANCHER_VM
                value: "true"
        volumes:
          - name: ranchervm
            source:
              hostDir:
                path: /tmp/ranchervm
    labels:
      name: ranchervm

This manifest defines a Pod template (the one that we created earlier), and set a number of replicas. Here we start with one. To launch it, use the kubectl binary again:

$ ./kubectl create -f vmrc.yaml 
replicationControllers/ranchervm
$ ./kubectl get rc
CONTROLLER   CONTAINER(S)   IMAGE(S)               SELECTOR         REPLICAS
ranchervm    vm             rancher/vm-rancheros   name=ranchervm   1

If you list the pods, you will see that your container is running and hence your VM will start shortly.

$ ./kubectl get pods
POD               IP            CONTAINER(S)         IMAGE(S)                                     ...
nginx-127                       controller-manager   gcr.io/google_containers/hyperkube:v0.14.1   ...
                                apiserver            gcr.io/google_containers/hyperkube:v0.14.1                                                    
                                scheduler            gcr.io/google_containers/hyperkube:v0.14.1                                                    
ranchervm-16ncs   172.17.0.11   vm                   rancher/vm-rancheros                         ...

Why is this awesome ? Because you can scale easily:

$ ./kubectl resize --replicas=2 rc ranchervm
resized

And Boom, two VMs:

$ ./kubectl get pods -l name=ranchervm
POD               IP            CONTAINER(S)   IMAGE(S)               ...
ranchervm-16ncs   172.17.0.11   vm             rancher/vm-rancheros   ...
ranchervm-279fu   172.17.0.12   vm             rancher/vm-rancheros   ...

Now of course, this little test is done on one node. But if you had a real Kubernetes cluster, it would schedule these pods on available nodes. From a networking standpoint, RancherVM can provide DHCP service or not. That means that you could let Kubernetes assign the IP to the Pod and the VMs would be networked over the overlay in place.

Now imagine that we had security groups via an OVS switch on all nodes in the cluster...we could have multi-tenancy with network isolation and full VM isolation. While being able to run workloads in "traditional" containers. This has some significant impact on the current IaaS space, and even Mesos itself.

Your Cloud as a containerized distributed workload, anyone ???

For more recipes like these, checkout the Docker cookbook.

Thursday, April 09, 2015

1 command to Kubernetes with Docker compose

After 1 command to Mesos, here is 1 command to Kubernetes.

I had not looked at Kubernetes in over a month. It is a fast paced project so it is hard to keep up. If you have not looked at Kubernetes, it is roughly a cluster manager for containers. It takes a set of Docker hosts under management and schedules groups of containers in them. Kubernetes was open sourced by Google around June last year to bring all the Google knowledge of working with containers to us, a.k.a The people :) There are a lot of container schedulers or orchestrators if you wish out there, Citadel, Docker Swarm, Mesos with the Marathon framework, Cloud Foundry lattice etc. The Docker ecosystem is booming and our heads are spinning.

What I find very interesting with Kubernetes is the concept of replication controllers. Not only can you schedule groups of colocated containers together in a cluster, but you can also define replica sets. Say you have a container you want to scale up or down, you can define a replica controller and use it to resize the number of containers running. It is great for scaling when the load dictates it, but it is also great when you want to replace a container with a new image. Kubernetes also exposes a concept of services basically a way to expose a container application to all the hosts in your cluster as if it were running locally. Think the ambassador pattern of the early Docker days but on steroid.

All that said, you want to try Kubernetes. I know you do. So here is 1 command to try it out. We are going to use docker-compose like we did with Mesos and thanks to this how-to which seems to have landed 3 days ago, we are going to run Kubernetes on a single host with containers. That means that all the Kubernetes components (the "agent", the "master" and various controllers) will run in containers.

Install compose on your Docker host, if you do not have it yet:

curl -L https://github.com/docker/compose/releases/download/1.2.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

Then create this YAML file, call it say k8s.yml:

etcd:
  image: kubernetes/etcd:2.0.5.1
  net: "host"
  command: /usr/local/bin/etcd --addr=127.0.0.1:4001 --bind-addr=0.0.0.0:4001 --data-dir=/var/etcd/data
master:
  image: gcr.io/google_containers/hyperkube:v0.17.0
  net: "host"
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock
  command: /hyperkube kubelet --api_servers=http://localhost:8080 --v=2 --address=0.0.0.0 --enable_server --hostname_override=127.0.0.1 --config=/etc/kubernetes/manifests
proxy:
  image: gcr.io/google_containers/hyperkube:v0.17.0
  net: "host"
  privileged: true
  command: /hyperkube proxy --master=http://127.0.0.1:8080 --v=2

And now, 1 command:

$ docker-compose -f k8s.yml up -d

Quickly there after, you will see a bunch of containers pop-up:

$ docker ps
CONTAINER ID        IMAGE                                       
a17cac87965b        kubernetes/pause:go  
659917e61d3e        gcr.io/google_containers/hyperkube:v0.17.0
caf22057dbad        gcr.io/google_containers/hyperkube:v0.17.0
288fcb4408c7        gcr.io/google_containers/hyperkube:v0.17.0
820cc546b352        kubernetes/pause:go  
0bfac38bdd10        kubernetes/etcd:2.0.5.1                               
81f58059ca8d        gcr.io/google_containers/hyperkube:v0.17.0                     
ca1590c1d5c4        gcr.io/google_containers/hyperkube:v0.17.0

In the YAML file above, you see in the commands that it used a single binary hyperkube that allows you to start all the kubernetes components, the API server, the replication controller etc ... One of the components it started is the kubelet which is normally used to monitor containers on one of the host in your cluster and make sure they stay up. Here by passing the /etc/kubernetes/manifests it helped us start the other components of kubernetes defined in that manifest. Clever ! Note also that the containers where started with a host networking. So these containers have the network stack of the host, you will not see an interface on the docker bridge.

With all those up, grab the kubectl binary, that is your kubernetes client that you will use to interact with the system. The first thing you can do is list the nodes:

$ ./kubectl get nodes
NAME        LABELS    STATUS
127.0.0.1   <none>    Ready

Now start your first container:

./kubectl run-container nginx --image=nginx --port=80

That's a simple example, where you can actually start a single container. You will want to group your containers that need to be colocated and write a POD description in YAML or json than pass that to kubectl. But it looks like they extended kubectl to take single container start up. That's handy for testing.

Now list your pods:

$ ./kubectl get pods
POD           IP           CONTAINER(S)         IMAGE(S)                                    
k8s-master-127.0.0.1       controller-manager   gcr.io/google_containers/hyperkube:v0.14.1
                           apiserver            gcr.io/google_containers/hyperkube:v0.14.1 
                           scheduler            gcr.io/google_containers/hyperkube:v0.14.1                                                         
nginx-p2sq7   172.17.0.4   nginx                nginx

You see that there is actually two pods running. The nginx one that you just started and one pod made of three containers. That's the pod that was started by your kubelet to get Kubernetes up. Kubernetes managed by Kubernetes...

It automatically created a replication controller (rc):

$ ./kubectl get rc
CONTROLLER   CONTAINER(S)   IMAGE(S)   SELECTOR              REPLICAS
nginx        nginx          nginx      run-container=nginx   1

You can have some fun with the resize capability right away and see a new container pop-up.

$ ./kubectl resize --replicas=2 rc nginx
resized

Now that is fine and dandy but there is no port exposed on the host, so you cannot access your application on the outside. That's where you want to define a service. Technically it is used to expose a service to all nodes in a cluster but of course you can bind that service proxy to a publicly routed interface:

$ ./kubectl expose rc nginx --port=80 --public-ip=192.168.33.10

Now take your browser and open it at http://192.168.33.10 (if that's the IP of your host of course) and enjoy a replicated nginx managed by Kubernetes deployed in 1 command.

You will get more of that good stuff in my book, if I manage to finish it. Wish me luck.

Running the CloudStack Simulator in Docker

CloudStack comes with a simulator. It is very handy for testing purposes, we use it to run our smoke tests on TravisCI for each commit to the code base. However if you want to run the simulator, you need to compile from source using some special maven profiles. That requires you to check out the code and setup your working environment with the dependencies for a successfull CloudStack build.

With Docker you can skip all of that and simply download the cloudstack/simulator image from the Docker Hub. Start a container from that image and expose port 8080 where the dashboard is being served. Once the container is running, you can use docker exec to configure a simulated data center. This will allow you to start fake virtual machines, create security groups and so on. You can do all of this through the dashboard or using the CloudStack API.

So you want to give CloudStack a try ? Use Docker :)

$ docker pull cloudstack/simulator

The image is a bit big and we need to work on slimming it down but once the image is pulled, starting the container will be almost instant. If you feel like sending a little PR just the Dockerfile, there might be a few obvious things to slim down the image.

$ docker run -d -p 8080:8080 --name cloudstak cloudstack/simulator

The application needs a few minutes to start however, something that I have not had time to check. Probably we need to give more memory to the container. Once you can access the dashboard at http://localhost:8080/client you can configure the simulated data-center. You can choose between a basic network which gives you L3 network isolation or advanced zone which gives you a VLAN base isolation:

$ docker exec -ti cloudstack python /root/tools/marvin/marvin/deployDataCenter.py -i /root/setup/dev/basic.cfg

Once the configuration completes, head over to the dashboard http://localhost:8080/client and check your simulated infrastructure

Enjoy the CloudStack simulator brought to you by Docker.

Wednesday, March 18, 2015

1 Command to Mesos with Docker Compose

If you have not tried Docker, you should. The sheer power it puts in your hands and the simplicity of the user experience will just wow you. In this post, I will show you how to start a one node Mesos setup with Docker compose.

Docker announced compose on February 26th. Compose allows you to describe a multi-container setup and manage it with one binary docker-compose. The containers and volumes combinations managed by Compose are defined in a YAML file, super easy to read and super easy to write. The UX is very similar to the Docker CLI.

When compose was released, I tried it and was a bit underwhelmed, as it is basically a relooking of Fig. This is not unexpected as Docker Inc, acquired Orchard the makers of Fig. But I was expecting more added functionality and even a tighter integration with the Docker client (something a dev branch actually prototyped), even a common release instead of a separate binary. I am sure this will come.

As I am writing the docker cookbook, I have deployed Wordpress 20 different ways, and it's getting a bit boring ! I was looking for more information on Mesos and its support for Docker, I re-read a terrific blog post that showed how to start a Mesos setup (zookeeper, master, slave, marathon framework) in 7 commands. Can't beat that.

When I re-read this post, I automatically thought this was an exciting use case for docker-compose. One YAML file to start Mesos/Zookeeper/Marathon and experiment with it. Of course I am not talking about a production multi-node setup. I am just looking at it for an easy Mesos experiment.
I will spare you the details of installing compose (just a curl away). The dockers docs are great.

So here is the YAML file describing our Mesos setup:

zookeeper:
  image: garland/zookeeper
  ports:
   - "2181:2181"
   - "2888:2888"
   - "3888:3888"
mesosmaster:
  image: garland/mesosphere-docker-mesos-master
  ports:
   - "5050:5050"
  links:
   - zookeeper:zk
  environment:
   - MESOS_ZK=zk://zk:2181/mesos
   - MESOS_LOG_DIR=/var/log/mesos
   - MESOS_QUORUM=1
   - MESOS_REGISTRY=in_memory
   - MESOS_WORK_DIR=/var/lib/mesos
marathon:
  image: garland/mesosphere-docker-marathon
  links:
   - zookeeper:zk
   - mesosmaster:master
  command: --master zk://zk:2181/mesos --zk zk://zk:2181/marathon
  ports:
   - "8080:8080"
mesosslave:
  image: garland/mesosphere-docker-mesos-master:latest
  ports:
   - "5051:5051"
  links:
   - zookeeper:zk
   - mesosmaster:master
  entrypoint: mesos-slave
  environment:
   - MESOS_HOSTNAME=192.168.33.10
   - MESOS_MASTER=zk://zk:2181/mesos
   - MESOS_LOG_DIR=/var/log/mesos
   - MESOS_LOGGING_LEVEL=INFO

Four containers, images pulled from Docker hub, some ports exposed on the host. Some container linking and some environment variables used to configure the Mesos slave and master. One small hickup in the Slave defintion. You will see that I set the MESOS_HOSTNAME to the IP of the host. This allows me to browse the stdout and stderr of a Marathon task, otherwise we cannot reach it easily (small improvement to be done there.)

Launch this with docker-compose:

$ ./docker-compose up -d
Recreating vagrant_zookeeper_1...
Recreating vagrant_mesosmaster_1...
Recreating vagrant_marathon_1...
Recreating vagrant_mesosslave_1...

And open your browser at http://IP_HOST:5050 then follow the rest of the blog to start a task in marathon.

Bottom line, I went from '7 commands to Mesos' to '1 command to Mesos' thanks to Docker-compose and a fairly simple YAML file. Got to love it. When compose can do this across Docker hosts in a Docker Swarm started by Machine. Then the real fun will begin !

Tuesday, March 03, 2015

Rancher on RancherOS

Someone at Rancher must have some cattle in the middle of Arizona or in the backcountry of California. Or one of their VCs might be in Montana sitting in a big ranch while Docker is eating the IT world. In any case, this post is short and sweet like veggies and not like cattle (TFW) and is about Rancher and the newly announced RancherOS. Check out the rancheros announcement.

Let's keep this short, shall we ? Docker is great, but it is a daemon running on a single host. Since you want to scale :) and operate multiple servers, you need something to manage your Docker containers across multiple hosts. Several solutions are emerging, of course Docker Swarm but also Kubernetes, Lattice from Cloudfoundry and even Apache Mesos. Rancher is one of these cluster management solutions for Docker. It does some nice things like cross-hosts container linking through a custom built network overlay (think Flannel, Weave, Socketplane).

You can use Rancher with any set of Docker hosts. However, a new type of operating systems have started to appear. Container optimized OS. Or Just Enough Operating System for Docker. CoreOS, ProjectAtomic from RedHat, Ubuntu Snappy fit in that space. They aim to provide rolling atomic upgrades to the OS and run everything in it as a container. No more package manager, magic happens and you are always up to date. Package all your apps in containers, and use Rancher to run them in your cluster. End of story. Wait, enters rancherOS.

RancherOS

A couple lines of bash make all the talking:

$ git clone https://github.com/rancherio/os-vagrant.git
$ cd os-vagrant
$ vagrant up
$ vagrant ssh
[rancher@rancher ~]$ docker version
Client version: 1.5.0
…

rancherOS is a super minimalistic OS exclusively for Docker. It goes further and also runs system services as container themselves. And I will let @ibuildthecloud talk about systemd and Docker as PID 1.

[rancher@rancher ~]$ sudo system-docker ps
CONTAINER ID        IMAGE               COMMAND                ...      NAMES
32607470eb78        console:latest      "/usr/sbin/console.s   ...      console             
d0420165c1c0        userdocker:latest   "/docker.sh"           ...      userdocker          
375a8de12183        syslog:latest       "/syslog.sh"           ...      syslog              
d284afd7f628        ntp:latest          "/ntp.sh"              ...      ntp

The next logical question is of course....drum roll... Can I run rancher on rancheros. RinR not R&R ? And the answer is a resounding yes. I expect Rancher to come out in the next weeks maybe months with a solid product based on the two.

Rancher

If you are interested to try out RinR then check out the Ansible playbook I just made. You can use use it to deploy a cluster of rancherOS instances in AWS, and use one of them as a master and the others as workers. The master runs in a container:

$ docker run -d -p 8080:8080 rancher/server

And the workers can register with their agent:

$ sudo docker run --rm -it --privileged -v /var/run/docker.sock:/var/run/docker.sock rancher/agent http://<master_ip>:8080

Once all the workers have registered you can use the UI or the API to start containers.

As you can see I tested this at web scale with two nodes :)

Notes

In this very early super bleeding-edge testing phase (as you can tell in my good spirit today), I did find a few things that were a bit strange. Considering rancherOS was announced just last week, I am sure things will get fixed. Cloud-init support is minimal, not able to add second network interface, support for both keypair and userdata at the same time seems off. The UI was a bit slow to start and building the overlay was also a bit slow. It is also possible that I did something wrong.

Overall though, rancher is quite nice. It builds on years of experience in the team with developing CloudStack and operating clouds at scale and applies it to the Docker world. It does seem that they want to integrate with and provide the native Docker API, this would mean that users will be able to use Docker machine to add hosts to a rancher cluster, or even Docker swarm and that launching a container would also be a docker command away. How that differentiates from Swarm itself is not yet clear, but I would bet we will see additional networking and integration services in Rancher. Blurring the lines with Kubernetes ? Time will tell.