Be a hyper spaz about a hyperconverged GlusterFS setup with dynamically provisioned Kubernetes persistent volumes

I’d recently brought up my GlusterFS for persistent volumes in Kubernetes setup and I was noticing something errant. I had to REALLY baby the persistent volumes. That didn’t sit right with me, so I refactored the setup to use gluster-kubernetes to hook up a hyperconverged setup. This setup improves on the previous one in two ways. First, the Gluster daemon now runs in Kubernetes pods, which is just feeling so fresh and so clean. Difference being that OutKast is like smooth and cool – and I’m an excited spaz about technology with this. Second, gluster-kubernetes also deploys heketi, which is an API for GlusterFS volume management that Kube can use to give us dynamic provisioning. Our goal today is to spin up Kube (using kube-ansible) with gluster-kubernetes for dynamic provisioning, and then we’ll validate it with master-slave replication in MySQL, to one-up our simple MySQL from the last article.

If you’re not familiar with persistent volumes in Kubernetes, or some of the basics of why GlusterFS is pretty darn cool – give my previous article a read for those basics. But, come back here for the setup.

The bulk of the work I was able to do here was thanks to the gluster-kubernetes setup guide, which helps you use the tool embedded in that project called gk-deploy. This article (and the playbook) leans on gk-deploy quite a bit. I’d also like to thank @jarrpa for some help he gave when I ran into some documentation snags bringing up gluster-kubernetes.

Requirements

In short, I recommend my usual setup which is a single CentOS 7 machine you can run VMs on. That’s what I typically use with kube-ansible. You’re going to need approximately 100 gigs of disk. You’ll run 4 virtual machines (one master and 3 minions). I personally use 2 vCPUs per VM, though you’d likely get away with one.

Otherwise, you can also use this on baremetal, just skip the VM portion of kube-ansible. The tricky part is that kube-ansible currently only supports a single disk for this setup, and it’d need to have the same name on all baremetal hosts. If you do give it a go, just change the name of the disk in the spare_disk_dev variable in ./group_vars/all.yml in your kube-ansible clone. And, you’ll need some disks that are free-and-clear of data, and not mounted on your machines. kube-ansible can set this up for you in VMs. I’m also happy to take some pull requests to improve how this works against baremetal!

Also, as per usual, I assume a CentOS 7 distro on all nodes. And while you might be able to do this with other distros, that choice colors how I approach this and what ancillary tools I select.

Lastly, you need a client machine you can run Ansible on, and must have Ansible installed.

But, why? Isn’t the previous article’s method just fine?

First and foremost – the original article I wrote didn’t have heketi – the API that we’re going to have Kube use to dynamically provision Gluster volumes. That’s not as good.

The other thing was cleanliness. It was kind of two ways of managing applications – one running on the host operating system, and the others in containers. Just not nearly as clean.

Lastly, it required that you baby some of the volumes. For example, you’d have to specify new persistent volumes, and then make claims against them. Now we can have claims against a new Kubernetes storageclass, and that storage class will specify that we talk to Heketi, like in this example.
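To make the claim-against-a-storage-class idea concrete, here is a minimal sketch that prints a storage class pointing at heketi plus a claim against it. The names glusterfs-storage and example-claim and the heketi URL are placeholders I made up for illustration, and the storage class the playbook templates may carry different parameters.

```shell
# Print a StorageClass that talks to heketi, followed by a claim against it.
# Hypothetical names: glusterfs-storage, example-claim, and the heketi URL.
render_storage_manifests() {
  heketi_url="$1"
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "${heketi_url}"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  storageClassName: glusterfs-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
}

# Placeholder URL; use the address of your own heketi service.
render_storage_manifests "http://10.0.0.1:8080"
```

If the output looks right for your cluster, you could pipe it into kubectl create -f - and the claim should get a freshly provisioned Gluster volume without you creating a persistent volume by hand.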

Also, we use the gk-deploy tool from gluster-kubernetes here, and it can do a number of things that we just don’t have to maintain anymore – such as “peer probe” all the gluster nodes; which gets them all connected to one another and cooperating.

This raises the question – is there an advantage to running it on the host? I don’t think there is. This setup has all the pieces that one has, it just happens to have them running in containers on the host. Since you’re running Kubernetes – I think that’s an advantage.

It should be noted however that the gk-deploy tool also supports using an existing GlusterFS cluster, and it can just run heketi for us. (However, my playbook doesn’t intend to support that mode, for now.)

Kubernetes Installation (the hard part)

I’ll give a quick review of kube-ansible. If you want a more thorough tutorial check out my article on using it. The most difficult part is just modifying the inventory, and that’s not even that tough. Remember the gist here is that we have a single host that can run virtual machines (which we call the “virthost”, and this playbook has the setup for running those), and then we run virtual machines on which we run Kubernetes (generally for laboratory analysis, in my own case).

Clone up the kube-ansible repo (at a particular tag that has the kube-glusterfs):

$ git clone --branch v0.1.0 https://github.com/redhat-nfvpe/kube-ansible.git && cd kube-ansible

Now go and do the hardest part. Modify the inventories. Modify ./inventory/virthost.inventory to your main CentOS machine to run virtual machines on. Add a vars section to the bottom of it:

[kubehost:vars]
bridge_physical_nic=eth0
bridge_network_cidr=192.168.1.0/24

And set the eth0 to whatever your primary NIC is named (e.g. if you have multiple NICs, it’s likely in your lab this would be the NIC that can access the internet). And set the CIDR for it too. Of course, at the top set the IP address of this host.

Now we’ll run the virthost setup:

$ ansible-playbook -i inventory/virthost.inventory virt-host-setup.yml

Two things you need to do from here:

  • Pay attention to the list of IPs for the VMs that come up in a play described as: Here are the IPs of the VMs
  • Next, go ahead and get the contents of /root/.ssh/id_vm_rsa (the SSH private key) on the virt host. Put those somewhere on your client machine (workstation or what have you).

Modify the ./inventory/vms.inventory. In the first four lines, put the IP addresses you got from the last step. Then, in the last line, point the ansible_ssh_private_key_file variable at the path to the SSH private key you got from the previous step. And lastly – comment out the ansible_ssh_common_args line, you don’t need that now.

Now you can install Kubernetes.

$ ansible-playbook -i inventory/vms.inventory kube-install.yml

To verify it, on the virt host you can ssh to the kube master, like so, and get the list of nodes:

$ ssh -i .ssh/id_vm_rsa centos@kube-master 'kubectl get nodes'

Cool – now you have Kube. We’re going to attach some spare disks to those VMs which will show up as /dev/vdb on each of them. By default they’re 10 gigs (and you can change that in the spare_disk_size_megs variable in ./group_vars/all.yml or put it in your inventory)

$ ansible-playbook -i inventory/virthost.inventory vm-attach-disk.yml

Alright, you’re good to go – now onto the good stuff.

GlusterFS on Kube (the easy part)

Here’s the easy part – just one more playbook to run. Then we can go from there.

$ ansible-playbook -i inventory/vms.inventory gluster-install.yml

This is going to do everything you need to have glusterfs running on each of the minion nodes.

The (at least mock) hyperconverged storage situation is coming now. If you’re not familiar with that terminology – the shortest explanation is that your storage resides on the same hosts as where you run your computational workloads. Awesome.

Great – that’s a whole bunch of magic, what the heck did that playbook actually do!? If you want to see it in stark detail, check out the ./roles/glusterfs-kube-config/tasks/main.yml file which has all of what it does.

Here’s the run-down:

  • Installs some required packages (glusterfs-fuse is required on all nodes)
  • Templates a gk-deploy topology file, from ./roles/glusterfs-kube-config/templates/glusterfs-topology.json.j2
    • You can also check out an example, if you’d like.
  • Clones gluster-kubernetes
  • Installs the heketi CLI application on the kube master.
  • Runs the gk-deploy script
    • Using the topology file we templated
    • Specifying that we’ll run GlusterFS daemon in Kubernetes
  • Creates a storageclass from a template in ./roles/glusterfs-kube-config/templates/glusterfs-storageclass.yaml.j2
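To give a feel for the topology file in that run-down, here is a rough sketch that prints one in the format heketi and gk-deploy read: a cluster with a list of nodes, each with a manage hostname, a storage IP, a zone, and its spare devices. The function name, the minion hostnames, and the IP addresses are made up for illustration; the real file is rendered by the playbook from its own template.

```shell
# Print a gk-deploy topology for "hostname:ip" pairs that all share one
# spare disk device. Hypothetical helper, not part of kube-ansible.
render_topology() {
  device="$1"
  shift
  first=1
  printf '{\n  "clusters": [\n    {\n      "nodes": [\n'
  for pair in "$@"; do
    host="${pair%%:*}"   # text before the colon
    ip="${pair##*:}"     # text after the colon
    # Separate node entries with a comma, but not before the first one.
    [ "$first" -eq 1 ] || printf ',\n'
    first=0
    printf '        {\n'
    printf '          "node": {\n'
    printf '            "hostnames": {"manage": ["%s"], "storage": ["%s"]},\n' "$host" "$ip"
    printf '            "zone": 1\n'
    printf '          },\n'
    printf '          "devices": ["%s"]\n' "$device"
    printf '        }'
  done
  printf '\n      ]\n    }\n  ]\n}\n'
}

# Made-up minion names and addresses, with the /dev/vdb spare disk from above.
render_topology /dev/vdb kube-minion-1:192.168.1.10 kube-minion-2:192.168.1.11
```

With a file like that saved as topology.json, the hand-run equivalent of the playbook step would be roughly ./gk-deploy -g topology.json, where -g asks gk-deploy to run the GlusterFS daemon in Kubernetes. Treat that as an approximation, the playbook may pass more options.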

It’s actually a LOT fewer steps than before. Primarily because we don’t have to worry about such things as:

  • Formatting disks and creating volume groups, etc.
  • Configuring GlusterFS more deeply and manually peering the endpoints.
  • …and more.

Let’s use it!

Alright cool, well, you just hung out for a while waiting for that GlusterFS playbook to run (not to mention, an entire Kubernetes install). Which makes me believe that you’re sufficiently coffee-i-fied at this point. Because of that, we’re going to pick something a little bit more ambitious this time for an example usage of these persistent volumes. Last time we used MariaDB, this time, we’re going to use MySQL with replication.

Setting up MySQL replication in Kubernetes

If you’re interested more deeply in how to do this, check out the k8s docs on running replicated mysql using stateful sets. That’s the origin of my example, but, I have some modified resource definitions here that are specific to what we just spun up so you don’t have to read through every line. However, it is actually fairly interesting to check out, so I do encourage it.

Firstly, let’s curl down those resource definitions. I also have them in a GitHub Gist.

Ok, let’s get the files.

$ curl -s -L https://goo.gl/SKqo8J > mysql-configmap.yaml
$ curl -s -L https://goo.gl/msDQTb > mysql-services.yaml
$ curl -s -L https://goo.gl/doH2gN > mysql-statefulset.yaml

Create from all of those.

$ kubectl create -f mysql-configmap.yaml
$ kubectl create -f mysql-services.yaml
$ kubectl create -f mysql-statefulset.yaml

(One time I had to recreate the stateful set because MySQL complained that I couldn’t connect from an arbitrary IP address. Unsure what caused that, but if it happens to you just kubectl delete -f each of the three mysql-*.yaml files and try again.)

It takes a bit to spin up. Since it’s a stateful set, the pods come up in order, which is nice for a replicated setup. So make sure to do a watch -n1 kubectl get pods (or even a kubectl get pods --watch).

Verifying the MySQL setup.

Now, we can do cool stuff with it. Let’s create a table based on… Honey bees (I keep bees but these numbers aren’t representative of anything scientific, just FYI). Feel free to use whatever data you’d like.

[centos@kube-master ~]$ kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- mysql -h mysql-0.mysql
mysql> CREATE DATABASE beekeeping;
mysql> USE beekeeping;
mysql> CREATE TABLE hive (id INT AUTO_INCREMENT, role VARCHAR(255), counted BIGINT, PRIMARY KEY (id));
mysql> INSERT INTO hive VALUES (NULL,'queen',1);
mysql> INSERT INTO hive VALUES (NULL,'worker',20000);
mysql> INSERT INTO hive VALUES (NULL,'drone',800);
mysql> SELECT * FROM hive;

Ok, that’s all well and good, now, let’s check that the replicated members have data.

[centos@kube-master ~]$ kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- mysql -h mysql-read --execute "SELECT * FROM beekeeping.hive"
+----+--------+---------+
| id | role   | counted |
+----+--------+---------+
|  1 | queen  |       1 |
|  2 | worker |   20000 |
|  3 | drone  |     800 |
+----+--------+---------+

Now, let’s have fun and tear it down, and see if we still have data rollin’.

[centos@kube-master ~]$ kubectl delete -f mysql-statefulset.yaml 
[centos@kube-master ~]$ kubectl create -f mysql-statefulset.yaml 

And then exec the select again, and bammo…

[centos@kube-master ~]$ kubectl run mysql-client --image=mysql:5.7 -i -t --rm --restart=Never -- mysql -h mysql-read --execute "SELECT * FROM beekeeping.hive"
+----+--------+---------+
| id | role   | counted |
+----+--------+---------+
|  1 | queen  |       1 |
|  2 | worker |   20000 |
|  3 | drone  |     800 |
+----+--------+---------+

You’re cookin’ with oil!

Chainmail of NFV (+1 Dexterity) -- Service Chaining in Containers using Koko & Koro

In this episode – we’re going to do some “service chaining” in containers, with some work facilitated by Tomofumi Hayashi in his creation of koko and koro.

Koko (the “container connector”) gives us the ability to connect a network between containers (with veth, vxlan or vlan interfaces) in an isolated way (and it creates multiple interfaces for our containers too, which will allow us to chain them), and then we can use the functionality of Koro (the “container routing” tool) to manipulate those network interfaces, and specifically their routing in order to chain them together, and then further manipulate routing and ip addressing to facilitate the changing of this chain.

Our goal today will be to connect four containers in a chain of services going from a http client, to a firewall, through a router, and terminating at a web server. Once we have that chain together, we’ll intentionally cause a failure of a service and then repair it using koro.

(The title joke is… fairly lame. Since when aren’t the other one’s lame? But! It’s supposed to be a reference to magic items in Dungeons & Dragons)

I’d like to point out that this is not exactly “service function chaining” (SFC) – we can let sdxcentral define that for you. From what I understand, pure SFC uses a “network service header” (which you can see here from IETF) to help perform dynamic routing. This doesn’t use those headers, so I will refer to it as simply “service chaining”. You can think of it as maybe some related tools and ideas to build on to achieve something more like a proper SFC.

In fact… We’re going to perform a series of steps here that are quite manual, but, to demonstrate what you may be able to automate in the future – and my associate Tomofumi has some machinations in the works to do such things. We’ll cover those later.

Now that we’ve established we’re going to chain some services together – let’s go ahead and actually chain ‘em up!

What are we building today?

We’re going to spin up 4 containers, and chain the services in them. All the network connections are veth created by koko.

[Figure: service chain overview]

Here you can see we’ll have 4 services chained together, in essence an HTTP request is made by the client, passes the firewall, gets routed by the router, and then lands at an HTTP server. All of these services run in containers, and the network connections are veth, so all of the containers are on the same host.

The firewall is just iptables, and the router is simply kernel routing and allowing ip forwarding in the container. These are shortcuts to help simplify those services, allowing us to look at the pieces that we use to deploy and manage their networking. I tried to put in an example with DPI, and I realized quickly it was too big of a piece to chew, and that it’d detract from the other core functionality to explore in this article.

Requirements

Note that this article assumes you have the setup left over from the previous how-to article showing koko+vpp. If you’re not interested in the VPP part (we don’t use it in this article) you can skip those sections, but, you will need koko & koro installed and Docker.

Limitations and what’s next

This setup could be further extended and made cooler by making all vxlan (or maybe even vlan) connections to the containers and backing them with the VPP host we create in the last article. However, it’s a further number of steps, and between these articles I believe one could make a portmanteau of the two and give that a whirl, too!

Tomo has other cool goodies in the works, and without spoiling the surprise of how cool his designs are, the gist is that they further the automation of what we’re doing here. In a more realistic scenario – that’s the real use-case, to have these types of operations happen very quickly and automatically – instead of babying them at each step. However, this helps to expose you to the pieces at work for something like that to happen.

A warm-up using iptables (optional)

Ok, let’s have a warm-up quick. We can go through the most basic steps, and we’ll operate a firewall. So here we’ll create two endpoints with a firewall between them. This part is optional and you can skip down to the next header.

But, I encourage you to run through this quick, it won’t take extra time and you can see stepwise how koro is used after, say, not using it.

I’m going to use someone’s dockerhub iptables, and here’s the Dockerfile should you need it.

$ docker pull vimagick/iptables

Now run that image, and two more.

$ docker run --name=iptables -dt --privileged -e 'TCP_PORTS=80,443' -e 'UDP_PORTS=53' -e 'RATE=4mbit' -e 'BURST=4kb' vimagick/iptables:latest
$ docker run --name test1 --privileged --net=none -dt dougbtv/centos-network sleep 2000000
$ docker run --name test2 --privileged --net=none -dt dougbtv/centos-network sleep 2000000

We can use koko to connect them together with veth connections.

$ ./gocode/bin/koko -d test1,link1,10.0.1.1/24 -d iptables,link2,10.0.1.2/24
$ ./gocode/bin/koko -d iptables,link3,10.0.2.1/24 -d test2,link4,10.0.2.2/24

Then, you need default routes on both test1 and test2, like:

$ docker exec -it test1 /bin/bash -c 'ip route add default via 10.0.1.2 dev link1'
$ docker exec -it test2 /bin/bash -c 'ip route add default via 10.0.2.1 dev link4'

And the iptables container needs to have ip forwarding…

[root@koko1 centos]# docker exec -it iptables /bin/sh
/ # echo 1 > /proc/sys/net/ipv4/ip_forward

Then you should be able to ping 10.0.2.2 from test1.

Now let’s block icmp to make sure iptables is working. The rule needs to go into the FORWARD chain.

/ # iptables -A FORWARD -p icmp  -j DROP

And you can remove that too…

/ # iptables -D FORWARD 1

Cool, those are the working bits, minus koro. So let’s bring in koro.

First, delete those containers (this removes ALL the containers on the host).

$ docker kill $(docker ps -aq)
$ docker rm $(docker ps -aq)

Run those containers again, and now use koko but without assigning IP addresses.

$ docker run --name=iptables -dt --privileged -e 'TCP_PORTS=80,443' -e 'UDP_PORTS=53' -e 'RATE=4mbit' -e 'BURST=4kb' vimagick/iptables:latest
$ docker run --name test1 --privileged --net=none -dt dougbtv/centos-network sleep 2000000
$ docker run --name test2 --privileged --net=none -dt dougbtv/centos-network sleep 2000000
$ ./gocode/bin/koko -d test1,link1 -d iptables,link2
$ ./gocode/bin/koko -d iptables,link3 -d test2,link4

Alright, now, you’ve gotta still set ip forwarding on the iptables container.

[root@koko1 centos]# docker exec -it iptables /bin/sh
/ # echo 1 > /proc/sys/net/ipv4/ip_forward

We’ve got links now, but, no ip addressing. Koro should be able to fix this up for us.

This adds the addresses…

$ ./gocode/bin/koro docker test1 address add 10.0.1.1/24 dev link1
$ ./gocode/bin/koro docker iptables address add 10.0.1.2/24 dev link2
$ ./gocode/bin/koro docker iptables address add 10.0.2.1/24 dev link3
$ ./gocode/bin/koro docker test2 address add 10.0.2.2/24 dev link4

Let’s add a default route to test1 & 2.

$ ./gocode/bin/koro docker test1 route add default via 10.0.1.2 dev link1
$ ./gocode/bin/koro docker test2 route add default via 10.0.2.1 dev link4

With those in place, we can now ping across the containers.

$ docker exec -it test1 ping -c 5 10.0.2.2

Alright, and now… we’ll take those down. (This kills all containers running on your host, btw.)

$ docker kill $(docker ps -aq)
$ docker rm $(docker ps -aq)

Creating a service chain with koro

Let’s get to the good stuff – time to go ahead and make a service chain, it’ll look like…

[Figure: service chain]

Note that those are all containers, and the interfaces created in them are veth pairs.

With that in hand – let’s spin up all the pieces that we need. Pull my dougbtv/pickle-nginx, we’ll use that.

$ docker pull dougbtv/pickle-nginx

Now, let’s run all the containers.

$ docker run --name client --privileged --net=none -dt dougbtv/centos-network sleep 2000000
$ docker run --name=firewall -dt --privileged -e 'TCP_PORTS=80,443' -e 'UDP_PORTS=53' -e 'RATE=4mbit' -e 'BURST=4kb' vimagick/iptables:latest
$ docker run --name router --privileged --net=none -dt dougbtv/centos-network sleep 2000000
$ docker run -dt --net=none --name webserver dougbtv/pickle-nginx

And run a docker ps to make sure they’re all running.

Ok, these need a bit of grooming. Firstly, we need IP forwarding on the firewall and router.

$ docker exec -it firewall /bin/sh -c 'echo 1 > /proc/sys/net/ipv4/ip_forward'
$ docker exec -it router /bin/sh -c 'echo 1 > /proc/sys/net/ipv4/ip_forward'

Great. Now we can create koko links between all the containers. That’s three veth pairs…

$ ./gocode/bin/koko -d client,link1 -d firewall,link2
$ ./gocode/bin/koko -d firewall,link3 -d router,link4
$ ./gocode/bin/koko -d router,link5 -d webserver,link6

And now we’ll add addresses to them all.

$ ./gocode/bin/koro docker client address add 10.0.1.1/24 dev link1
$ ./gocode/bin/koro docker firewall address add 10.0.1.2/24 dev link2
$ ./gocode/bin/koro docker firewall address add 10.0.2.1/24 dev link3
$ ./gocode/bin/koro docker router address add 10.0.2.2/24 dev link4
$ ./gocode/bin/koro docker router address add 10.0.3.1/24 dev link5
$ ./gocode/bin/koro docker webserver address add 10.0.3.2/24 dev link6
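The links and addresses above follow a regular pattern: the Nth hop gets the subnet 10.0.N.0/24, with the upstream container on .1 and the downstream container on .2. As a small sketch of how that might be scripted, this function only prints the koko and koro commands for a chain and does not run anything. The function name and the KOKO_BIN_DIR variable are my own inventions, not part of koko or koro.

```shell
# Print (not run) the koko/koro commands that link a linear chain of
# containers and address each hop. Hypothetical helper for illustration.
chain_commands() {
  bin="${KOKO_BIN_DIR:-./gocode/bin}"
  prev=""
  hop=0
  for name in "$@"; do
    if [ -n "$prev" ]; then
      left="link$((2 * hop - 1))"   # interface in the upstream container
      right="link$((2 * hop))"      # interface in the downstream container
      echo "$bin/koko -d $prev,$left -d $name,$right"
      echo "$bin/koro docker $prev address add 10.0.$hop.1/24 dev $left"
      echo "$bin/koro docker $name address add 10.0.$hop.2/24 dev $right"
    fi
    prev="$name"
    hop=$((hop + 1))
  done
}

chain_commands client firewall router webserver
```

For the four containers here it prints the same three koko commands and six address commands as above, grouped per hop. It leaves routing out on purpose, since the routes depend on which containers are endpoints and which ones forward.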

And we’re going to need some more routing.

[root@koko1 centos]# ./gocode/bin/koro docker client route add default via 10.0.1.2 dev link1
[root@koko1 centos]# ./gocode/bin/koro docker webserver route add default via 10.0.3.1 dev link6
[root@koko1 centos]# ./gocode/bin/koro docker firewall route add 10.0.3.0/24 via 10.0.2.2 dev link3
[root@koko1 centos]# ./gocode/bin/koro docker router route add 10.0.1.0/24 via 10.0.2.1 dev link4

Check all the routing.

[root@koko1 centos]# docker exec -it client ip route
default via 10.0.1.2 dev link1 
10.0.1.0/24 dev link1  proto kernel  scope link  src 10.0.1.1 

[root@koko1 centos]# docker exec -it firewall ip route
default via 172.17.0.1 dev eth0 
10.0.1.0/24 dev link2 proto kernel scope link src 10.0.1.2 
10.0.2.0/24 dev link3 proto kernel scope link src 10.0.2.1 
10.0.3.0/24 via 10.0.2.2 dev link3 
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2 

[root@koko1 centos]# docker exec -it router ip route
10.0.1.0/24 via 10.0.2.1 dev link4 
10.0.2.0/24 dev link4  proto kernel  scope link  src 10.0.2.2 
10.0.3.0/24 dev link5  proto kernel  scope link  src 10.0.3.1 

[root@koko1 centos]# docker exec -it webserver ip route
default via 10.0.3.1 dev link6 
10.0.3.0/24 dev link6  proto kernel  scope link  src 10.0.3.2 

Now we have a service chain! Huzzah! You can curl the nginx.

[root@koko1 centos]# docker exec -it client /bin/bash -c 'curl -s 10.0.3.2 | grep -i pickle'
<title>This is pickle-nginx</title>

Let’s cause some chaos, some mass confusion. It’s all well and good we have these four pieces all setup together.

However, the reality is… Something is going to happen. In the real world – everything is broken. To emulate that let’s create this scenario – the firewall goes down. In a more realistic scenario, this pod will be recreated. For this demonstration we’re just going to let it be gone, and we’ll just create new links with koko directly to the router, and then re-route.

Here’s what we’ll do…

[Figure: service chain failure mode]

Note that the firewall winds up failing and is gone, and we’ll fix the routing and ip addressing surrounding it to patch it up.

[root@koko1 centos]# docker kill firewall

That should do it. Alright now we can’t run our same curl, it fails.

[root@koko1 centos]# docker exec -it client /bin/bash -c 'curl 10.0.3.2'
curl: (7) Failed to connect to 10.0.3.2: Network is unreachable

We can use koko & koro to fix this up for us. Let’s create some new interfaces with koko. We’ll also just use a new subnet for this connection (we could finesse the existing, but, this is a couple steps less).

Go ahead and create that veth pair.

$ ./gocode/bin/koko -d client,link7 -d router,link8

Now, we’ll need some IP addresses, too.

$ ./gocode/bin/koro docker client address add 10.0.4.1/24 dev link7
$ ./gocode/bin/koro docker router address add 10.0.4.2/24 dev link8

And we have to fix the client container’s default route. We don’t have to delete the existing default route: a veth is a pair, so when the firewall went away the client’s end of the link (and the route through it) went down with it. (In a vxlan setup, we’d have to detect the failure ourselves and do some cleanup.) So all we have to do is add a route.

$ ./gocode/bin/koro docker client route add default via 10.0.4.2 dev link7
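Since the repair is just a new link, two addresses and a default route, it is the kind of thing a script could emit when a middle container dies. Here is a sketch that prints the commands for bridging around a failed hop. The function name, its arguments and KOKO_BIN_DIR are hypothetical, and nothing here detects the failure for you.

```shell
# Print (not run) the commands to bridge two containers around a failed hop.
# Arguments: upstream container, downstream container, a free 10.0.N.0/24
# subnet number, and the first unused link number. Hypothetical helper.
bypass_commands() {
  bin="${KOKO_BIN_DIR:-./gocode/bin}"
  upstream="$1"
  downstream="$2"
  subnet="$3"
  up_if="link$4"
  down_if="link$(($4 + 1))"
  echo "$bin/koko -d $upstream,$up_if -d $downstream,$down_if"
  echo "$bin/koro docker $upstream address add 10.0.$subnet.1/24 dev $up_if"
  echo "$bin/koro docker $downstream address add 10.0.$subnet.2/24 dev $down_if"
  echo "$bin/koro docker $upstream route add default via 10.0.$subnet.2 dev $up_if"
}

bypass_commands client router 4 7
```

Called with client, router, subnet 4 and link 7, it prints the four repair commands used above.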

And – we’re back in business, you can curl the pickle-nginx again.

[root@koko1 centos]# docker exec -it client /bin/bash -c 'curl -s 10.0.3.2 | grep -i pickle'
<title>This is pickle-nginx</title>

In closing.

Using the basics from this technique for a failed service in a container, you could build a number of other operations on the same foundation, e.g. other failure modes (a container that has died is replaced with a new one), or extensions of the service chain, say… Adding a DPI container somewhere in the chain.

The purpose of this is to show the steps manually that could be taken automatically – by say a CNI plugin for example. That could make these changes automatically and much more quickly than us lowly humans can make them by punching commands in a terminal.

Using Koko to create vxlan interfaces for cross-host container network isolation -- and cross-connecting them with VPP!

I’ve blogged about koko in the past – the container connector. Due to the awesome work put forward by my associate Tomofumi Hayashi – today we can run it and connect to FD.io VPP (vector packet processing), which is used for a fast data path, something we’re quite interested in within the NFV space. We’re going to set up vxlan links between containers (on separate hosts) back to a VPP forwarding host, where we’ll create cross-connects to forward packets between those containers. As a bonus, we’ll also compile koro, an auxiliary utility to use with koko for “container routing”, which we’ll be using in a following companion article. Put your gloves on and start up your terminals, we’re going to put our hands right on it and have it all up and running.

Also – If you haven’t been paying attention, Tomo has been putting some awesome work into Koko. He’s working on getting it packaged into an RPM, he has significantly improved it by breaking out the go code so you can use it as a library and not just at the command line, and even beautified the syntax for the arguments! …Among other great stuff. Great work, Tomo. Next time we can RPM install it instead of building it ourselves (it’s not hard, but, so handy to have the packages, I can’t wait.)

Since we’ll be in the thick of it here, it’s almost free to compile koro while we’re at it. I’m excited to put my hands on koro, and we’ll cover it in the next article (spoiler alert: it’s got service chains in containers using koko and koro!). I’ll refer to the build for koro here for those looking for it.

What are we building?

Here’s a diagram showing the layout of what we’re going to build today:

[Figure: koko vpp scenario]

The gist: we’ll build three boxes (I used VMs), and we’ll install VPP on one, and the two other hosts are container hosts where we run containers that we’ll modify using koko.

Some limitations

When we deploy VPP, we are deploying it directly on the host, and not in containers. Also, the containers use VXLAN interfaces that have pairs on the VPP host. This isn’t a limitation per-se, but, more that in the future we’d like to explore further the concepts around user-space networking with containers – so it can feel like a limitation when you know there’s more territory to explore!

Note that this is a manual process shown here, to show you the working parts of these applications. A likely end goal would be to automate these processes in order to have this happen for larger, more complex systems – and in much shorter time periods (and I mean MUCH shorter!) Tomo is working towards these implementations, but I won’t spoil all the fun yet.

Boxen setup & requirements.

For my setup, I used 3 boxes… 4 vcpus, 2048 megs of ram each. I assume a CentOS 7 distro for each of them. I highly recommend CentOS, but, you can probably mentally convert to another distro if you so please. I used VMs, but either VMs or baremetal will work – I tend to like this approach to spinning up a CentOS cloud image with virsh.

Specifically I had these hosts, so you can refer back if you need to:

  • koko1 - 192.168.1.165
  • koko2 - 192.168.1.143
  • vpp1 - 192.168.1.179

VPP setup

Time to install VPP – it’s not too bad. But, one of the first things you’re going to need to do is enable hugepages.

Setup hugepages

First, take a look to see if huge pages are enabled, likely not:

[root@vpp1 vpp]# cat /proc/meminfo | grep Huge
AnonHugePages:      6144 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

Ok, huge pages aren’t enabled. So what I did was…

[root@vpp1 centos]# echo 'vm.nr_hugepages = 1024' >> /etc/sysctl.conf

That should do the trick and live through a reboot.

If you want it to show up now, issue a:

[root@vpp1 centos]# sysctl -p
vm.nr_hugepages = 1024

And then check with:

[root@vpp1 centos]# cat /proc/meminfo | grep Huge
AnonHugePages:      4096 kB
HugePages_Total:    1024
HugePages_Free:     1024
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

That’s what mine looked like. I also recommend, optionally, to reboot at this point to make sure it sticks.
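If you want the reservation as a single number, it’s just HugePages_Total times Hugepagesize. Here is a small sketch that does that multiplication from a meminfo-style file. The function name is made up, and it assumes the two fields appear in the file the way they do in the output above.

```shell
# Print how much memory is reserved for huge pages, in MiB, by multiplying
# HugePages_Total by Hugepagesize (kB). Hypothetical helper for illustration.
hugepage_reserved_mib() {
  meminfo="${1:-/proc/meminfo}"
  awk '/^HugePages_Total:/ { total = $2 }
       /^Hugepagesize:/    { size_kb = $2 }
       END { print total * size_kb / 1024 }' "$meminfo"
}
```

Run without arguments, hugepage_reserved_mib reads /proc/meminfo. With the values above, 1024 pages of 2048 kB each work out to 2048 MiB.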

Compile VPP

Go ahead and install git, and clone the VPP repo from fd.io:

[root@vpp1 centos]# yum install -y git
[root@vpp1 centos]# git clone https://gerrit.fd.io/r/vpp
[root@vpp1 centos]# cd vpp/

Now you should be able to run the make commands, up to and including make run. It will install the deps for us in the first step (there’s a fair amount of them)

[root@vpp1 vpp]# yes | make install-dep
[root@vpp1 vpp]# make bootstrap
[root@vpp1 vpp]# make build
[root@vpp1 vpp]# make run

If you get something like this:

dpdk_config: not enough free huge pages

You didn’t properly setup huge pages in the first steps. For what it’s worth I got a few hints from this fd.io jira issue.

Getting your interfaces to show up in VPP

I didn’t see my interface, just a loopback, in the show interface command. What I saw it say was:

vlib_pci_bind_to_uio: Skipping PCI device 0000:00:03.0 as host interface eth0 is up

Important: We’re going to bring an interface down on this machine, which may impact your ability to ssh to it. In my case, it’s just virtual machines with a single nic. That being the case, I assigned root a password (e.g. sudo su root and then passwd) then logged into the box with virsh console vpp1 and did an ifdown eth0, then ran make run.

Now bring down your interface on the vpp1 host…

$ ifdown eth0

The interface will show as being in a down state in vpp, but, it shows up in the list.

DBGvpp# show interface
              Name               Idx       State          Counter          Count     
GigabitEthernet0/3/0              1        down      
local0                            0        down      
DBGvpp# show int GigabitEthernet0/3/0
              Name               Idx       State          Counter          Count     
GigabitEthernet0/3/0              1        down      

And I set the interface up, but I only want to do that after I assign it a static address…

DBGvpp# set interface state GigabitEthernet0/3/0 down
DBGvpp# set int ip address GigabitEthernet0/3/0 192.168.1.223/24
DBGvpp# set int state GigabitEthernet0/3/0 up
DBGvpp# ping 192.168.1.1 
[...snip...]

Great! We’re mostly there. We’ll come back here to setup some vxlan and cross-connects in a little bit.

Compiling koko

Ahhh, go apps – they’re easy on us. We don’t need much to do it, mostly, we need git and golang, so go ahead and install up git.

[root@koko1 centos]# yum install -y git

However, for the latest editions of koko we need go version 1.7 or greater (as documented in this issue). We’ll use repos from go-repo.io – which at the time of writing installs Go 1.8.3.

Install the .repo file and then just say let’s install golang.

[root@koko1 doug]# rpm --import https://mirror.go-repo.io/centos/RPM-GPG-KEY-GO-REPO
[root@koko1 doug]# curl -s https://mirror.go-repo.io/centos/go-repo.repo | tee /etc/yum.repos.d/go-repo.repo
[root@koko1 doug]# yum install -y golang
[root@koko1 doug]# go version
go version go1.8.3 linux/amd64

Set up your go path.

[root@koko1 centos]# mkdir -p /home/centos/gocode/{bin,pkg,src}
[root@koko1 centos]# export GOPATH=/home/centos/gocode/

And go ahead and clone koko.

[root@koko1 centos]# git clone https://github.com/redhat-nfvpe/koko.git $GOPATH/src/koko

Get the koko deps, and then build it.

[root@koko1 centos]# go get koko
[root@koko1 centos]# go build koko
[root@koko1 centos]# ls $GOPATH/bin
koko

That results in a koko binary in the $GOPATH/bin. So you can get some help out of it if you need.

[root@koko1 centos]# $GOPATH/bin/koko --help

Usage:
./koko -d centos1,link1,192.168.1.1/24 -d centos2,link2,192.168.1.2/24 #with IP addr
./koko -d centos1,link1 -d centos2,link2  #without IP addr
./koko -d centos1,link1 -c link2
./koko -n /var/run/netns/test1,link1,192.168.1.1/24 <other>

    See https://github.com/redhat-nfvpe/koko/wiki/Examples for the detail.

Compiling koro

Next we’re going to compile koro, the tool for container routing (hence its name). Since you’ve already done all the setup for koko, it’s basically free (work-wise) to compile koro.

Alright, so assuming you’ve got koko installed, you’re most of the way there. Using the same installed applications and the same $GOPATH, you can now clone it up.

[root@koko1 centos]# git clone https://github.com/s1061123/koro.git $GOPATH/src/koro

Get the deps, build it, and run the help.

[root@koko1 centos]# go get koro
[root@koko1 centos]# go build koro
[root@koko1 centos]# $GOPATH/bin/koro

Easy street.

Install a compatible Docker.

You’re going to need an up-to-date docker for koko to perform at its best. So let’s get that up and running for us.

These instructions are basically the verbatim docker install instructions for Docker CE on CentOS.

[root@koko1 centos]# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
[root@koko1 centos]# yum install -y docker-ce
[root@koko1 centos]# systemctl enable docker
[root@koko1 centos]# systemctl start docker
[root@koko1 centos]# docker version | grep -A1 Server | grep Version
 Version:      17.06.0-ce

Alright, that’s great.

Wash, rinse, and repeat on koko2

Now go ahead, and compile koko and koro and install Docker on the second koko host.

Fire up a few containers and run koko to create vxlan interfaces

Alright, let’s start some containers. First, we’ll pull my handy utility image (it’s just centos:centos7 but has a few handy packages installed, like… iproute).

[root@koko1 centos]# docker pull dougbtv/centos-network

Do that on both hosts, koko1 and koko2, and now we can run that rascal.

[root@koko1 centos]# docker run --name test1 --net=none -dt dougbtv/centos-network sleep 2000000

And on koko2.

[root@koko2 centos]# docker run --name test2 --net=none -dt dougbtv/centos-network sleep 2000000

Now, let’s connect those to a vxlan interface using koko, on the first koko host.

[root@koko1 centos]# /home/centos/gocode/bin/koko -d test1,link1,10.0.1.1/24 -x eth0,192.168.1.223,11
Create vxlan link1

And on koko2 host.

[root@koko2 centos]# /home/centos/gocode/bin/koko -d test2,link2,10.0.1.2/24 -x eth0,192.168.1.223,12
Create vxlan link2

Dissecting the koko parameters

Let’s dissect the parameters we’ve used here. Looking at this command I had you run earlier:

/home/centos/gocode/bin/koko -d test1,link1,10.0.1.1/24 -x eth0,192.168.1.223,11
  • /home/centos/gocode/bin/koko is the path to the compiled koko binary.
  • -d is for the Docker arguments (“Docker == d”)
    • test1 is the name of the container
    • link1 is the name of the interface we’ll create in the container
    • 10.0.1.1/24 is the IP address we’ll assign to link1
  • -x is for the vxlan argument (“v X lan = x”)
    • eth0 is the parent interface that exists on the host.
    • 192.168.1.223 is the address of the VPP host.
    • 11 is the vxlan ID.
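To make the shape of those two arguments concrete, here’s a small sketch that splits them apart the same way the list above does. This is not koko’s own parsing code (koko is written in Go), and the class and function names are made up for illustration. It only handles the three-field forms used in this article, not the variants without an IP address.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_interface


@dataclass
class DockerLink:
    container: str  # name of the container
    ifname: str     # interface to create inside the container
    address: str    # address with prefix, e.g. 10.0.1.1/24


@dataclass
class VxlanSpec:
    parent_if: str  # parent interface on the host
    remote: str     # address of the far end (the VPP host here)
    vni: int        # the vxlan ID


def parse_docker_arg(value):
    """Split the value given to -d, e.g. 'test1,link1,10.0.1.1/24'."""
    container, ifname, address = value.split(",")
    ip_interface(address)  # raises ValueError on a malformed address/prefix
    return DockerLink(container, ifname, address)


def parse_vxlan_arg(value):
    """Split the value given to -x, e.g. 'eth0,192.168.1.223,11'."""
    parent_if, remote, vni = value.split(",")
    ip_address(remote)  # raises ValueError on a malformed IP
    return VxlanSpec(parent_if, remote, int(vni))


if __name__ == "__main__":
    print(parse_docker_arg("test1,link1,10.0.1.1/24"))
    print(parse_vxlan_arg("eth0,192.168.1.223,11"))
```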

Inspecting the containers

Ok cool, let’s enter a container and see what’s been done. We can see that there’s a link1 interface created…

[root@koko1 centos]# docker exec -it test1 ip a

And we can see that it’s a vxlan interface

[root@koko1 centos]# docker exec -it test1 ip -d link show
[... snip ...]
    vxlan id 11 remote 192.168.1.223 dev 2 srcport 0 0 dstport 4789 l2miss l3miss ageing 300 addrgenmode eui64 

Creating vxlan tunnels and cross-connects in VPP

It’s all well and good that the containers are set up, but right now they’re in a state where the vxlan isn’t actually working because the far end of these, the VPP host, isn’t aware of them. Now, we’ll have to set this up in VPP.

Go back to your VPP console. We’re going to create vxlan tunnels, so keep the cli docs for vxlan tunnels handy.

For your reference again, note that:

  • 192.168.1.223 is the vpp host itself.
  • 192.168.1.165 is koko1, and 192.168.1.143 is koko2.
  • 11 & 12 are the vxlan IDs we have chosen.

Here’s how we create them:

DBGvpp# create vxlan tunnel src 192.168.1.223 dst 192.168.1.165 vni 11
DBGvpp# create vxlan tunnel src 192.168.1.223 dst 192.168.1.143 vni 12

(If you need to, you can delete those by issuing the same create command and then putting del on the end.)

And you can see what we created…

DBGvpp# show interface
DBGvpp# show interface vxlan_tunnel0
DBGvpp# show interface vxlan_tunnel1

And that’s all well and good, but, it’s not perfect until we set up the cross connect.

DBGvpp# set interface l2 xconnect vxlan_tunnel0 vxlan_tunnel1
DBGvpp# set interface l2 xconnect vxlan_tunnel1 vxlan_tunnel0
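If you end up retyping these for different hosts, the pattern is regular enough to generate. Here’s a rough sketch that builds the same four VPP CLI lines from the addresses and vxlan IDs. The function name is hypothetical, and it assumes VPP names the tunnels vxlan_tunnel0 and vxlan_tunnel1 in creation order, which is what happened in my session.

```python
def vxlan_xconnect_commands(vpp_ip, peers):
    """Build VPP CLI lines for two vxlan tunnels cross-connected to each other.

    peers is a list of exactly two (remote_ip, vni) tuples.
    """
    if len(peers) != 2:
        raise ValueError("an l2 xconnect joins exactly two tunnels")
    commands = [
        f"create vxlan tunnel src {vpp_ip} dst {remote} vni {vni}"
        for remote, vni in peers
    ]
    # Assumption: the tunnels come up as vxlan_tunnel0 and vxlan_tunnel1.
    commands.append("set interface l2 xconnect vxlan_tunnel0 vxlan_tunnel1")
    commands.append("set interface l2 xconnect vxlan_tunnel1 vxlan_tunnel0")
    return commands


if __name__ == "__main__":
    peers = [("192.168.1.165", 11), ("192.168.1.143", 12)]
    for line in vxlan_xconnect_commands("192.168.1.223", peers):
        print(line)
```

Note the cross connect is set in both directions, one line per tunnel.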

Ok, now… Let’s exec a ping in the test1 container we created and applied koko to.

[root@koko1 centos]# docker exec -it test1 ping -c 5 10.0.1.2

Should be good to go!

In closing.

Alright, what we’ve done is:

  • Installed koko on two hosts, and ran a container per host
  • Created a vxlan interface inside the container that is switched at VPP
  • Installed VPP on a host, and set up vxlan and cross connects

Excellent! Next up we’re going to take some of these basics, and we’ll demonstrate creating a chain of services using koko and koro.

Any time in your schedule? Try using a custom scheduler in Kubernetes

I’ve recently been interested in the idea of extending the scheduler in Kubernetes. There’s a number of reasons why, but at the top of my list is looking at re-scheduling failed pods based on custom metrics – specifically for high performance and high availability, like we need in telecom. In my search for learning more about it, I discovered the Kube docs for configuring multiple schedulers, and even better – a practical application, a toy scheduler created by the one-and-only-kube-hero Kelsey Hightower. It’s about a year old and Hightower is on his game, so he was using alpha functionality at the time he wrote it. In this article I modernize at least one component to get it to run today. Our goal is to run through the toy scheduler and have it schedule a pod for us. We’ll also dig into Kelsey’s go code for the scheduler a little bit to get an intro to what he’s doing.

Fire up your terminals, and let’s get ready to schedule some pods – with NOT the default scheduler.

What, what’s a scheduler? crond?

Well, not crond, but, part of what makes Kubernetes be Kubernetes is its scheduler. A scheduler, according to Wikipedia, generically speaking is:

[A] method by which work specified by some means is assigned to resources that complete the work. The work may be virtual computation elements such as threads, processes or data flows, which are in turn scheduled onto hardware resources such as processors, network links or expansion cards

So in this case – the “work specified by some means” is our containers (usually Docker containers), and the resources they’re assigned to are our nodes. That’s a big thing that Kube does for us – it assigns our containers to nodes, and makes sure that they’re running.

If you want to read more about exactly what the default scheduler in Kubernetes does, check out this readme file from the kube repos.

Requirements

Simply have a Kubernetes 1.7 up and running for you. 1.6 might work, too. If you don’t have Kube running, may I suggest that you use my kube-ansible playbooks, and follow my article about installing a kube cluster on centos (ignore that it says kube 1.5 – same steps will produce a 1.7 cluster).

Also, I use an all-CentOS 7 lab environment, and while it might not be required, note that it colors the ancillary tools and viewpoint from which I create this tutorial.

We’ll install a few deps, I wound up with a Go version 1.6.3, which appears to work fine, for your reference.

Install our deps

I’m performing these steps on my kube master, feel free to run them wherever’s appropriate for you. You’ll need to install some packages, and you’ll need to be able to use the kubectl utility in order to perform these.

Now, let’s go and install the deps we need:

[centos@kube-master ~]$ sudo yum install -y git golang tmux

Now, make yourself a dir for your go source.

[centos@kube-master ~]$ mkdir -p gocode/src

Clone and build the scheduler

Now let’s clone up Hightower’s code into there.

[centos@kube-master ~]$ cd gocode/src/
[centos@kube-master src]$ git clone https://github.com/kelseyhightower/scheduler.git
[centos@kube-master src]$ cd scheduler/
[centos@kube-master scheduler]$ pwd
/home/centos/gocode/src/scheduler

Alright now that we’re there, first thing we’ll do is build the annotator.

[centos@kube-master scheduler]$ cd annotator/
[centos@kube-master annotator]$ go build
[centos@kube-master annotator]$ ls annotator -lh
-rwxrwxr-x. 1 centos centos 7.8M Jul 21 15:23 annotator

Which will produce a binary for us.

Now, go and build the scheduler proper.

[centos@kube-master annotator]$ cd ../
[centos@kube-master scheduler]$ go build
[centos@kube-master scheduler]$ ls scheduler -lh
-rwxrwxr-x. 1 centos centos 7.7M Jul 21 15:24 scheduler

Go makes it easy, right!?

Start your kubectl proxy

We need to run a kubectl proxy, which is an HTTP proxy to access the kube API – our scheduler here will rely on it.

Run tmux:

[centos@kube-master ~]$ tmux 

This will give you a new screen, in that screen run:

[centos@kube-master ~]$ kubectl proxy

You can exit this screen and let it keep running by hitting ctrl+b then d. To return to the screen execute tmux a.
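With the proxy up, anything on this box can talk plain HTTP to the kube API. As a rough illustration, here’s how you might build the URL for listing pods that have no node assigned yet. It assumes kubectl proxy is on its default listen address of 127.0.0.1:8001, and whether the toy scheduler issues exactly this query is an assumption on my part. The function name is mine.

```python
from urllib.parse import urlencode

# kubectl proxy's default listen address; adjust if you passed --port.
PROXY = "http://127.0.0.1:8001"


def unscheduled_pods_url(base=PROXY):
    """URL that lists pods (all namespaces) whose spec.nodeName is empty."""
    query = urlencode({"fieldSelector": "spec.nodeName="})
    return f"{base}/api/v1/pods?{query}"


if __name__ == "__main__":
    print(unscheduled_pods_url())
```

While the proxy is running you could fetch that URL with curl, or with urllib.request.urlopen, and get a JSON pod list back.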

Run the annotation

Alright, we’re going to create some “prices” for each of our nodes. The scheduler will use this and then start the pods on the node with the lowest price.

[centos@kube-master scheduler]$ cd annotator/
[centos@kube-master annotator]$ ./annotator 
kube-master 0.20
kube-minion-1 0.20
kube-minion-2 0.05
kube-minion-3 1.60

Each time you run the annotator, it’ll generate new prices for you. If you just want to list the prices, list them like so:

[centos@kube-master annotator]$ ./annotator -l
kube-master 0.20
kube-minion-1 0.20
kube-minion-2 0.05
kube-minion-3 1.60

Kick up a pod…

Alright, now create a resource definition yaml file with these contents:

[centos@kube-master scheduler]$ cat ~/nginx.yaml 
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  template:
    metadata:
      #annotations:
      #  "scheduler.beta.kubernetes.io/name": hightower
      labels:
        app: nginx
      name: nginx
    spec:
      schedulerName: hightower
      containers:
        - name: nginx
          image: "nginx:1.11.1-alpine"
          resources:
            requests:
              cpu: "500m"
              memory: "128M"

Hightower had been using the annotation earlier, but this is now core functionality, so what I’ve done differently is use the schedulerName property under the spec in the resource definition. As you can see, it’s schedulerName: hightower (and hightower is set as a constant for the scheduler name in the go code, more on that later).
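To picture what that property buys us: a custom scheduler only cares about pods that name it in spec.schedulerName and that don’t have a spec.nodeName yet. Here’s a minimal sketch of that check over pod objects as plain dictionaries. It’s my own illustration in Python, not a port of Hightower’s go code, and the function name is hypothetical.

```python
SCHEDULER_NAME = "hightower"


def needs_scheduling(pod, scheduler_name=SCHEDULER_NAME):
    """True if the pod asks for our scheduler and has no node assigned yet."""
    spec = pod.get("spec", {})
    return (
        spec.get("schedulerName") == scheduler_name
        and not spec.get("nodeName")
    )


if __name__ == "__main__":
    pending = {"spec": {"schedulerName": "hightower"}}
    placed = {"spec": {"schedulerName": "hightower", "nodeName": "kube-minion-2"}}
    print(needs_scheduling(pending), needs_scheduling(placed))
```

A pod that names a different scheduler, or none at all, is left alone for whoever owns it.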

Now, let’s create this pod:

[centos@kube-master annotator]$ kubectl create -f ~/nginx.yaml 
deployment "nginx" created

We can check and see that this pod won’t schedule, which is what we want for now:

[centos@kube-master annotator]$ watch -n1 kubectl get pods

And you might wanna describe it, too…

[centos@kube-master annotator]$ watch -n1 kubectl describe pod nginx-881608959-gwnll

Cool, good: it shouldn’t have started yet.

Start the scheduler

Feel free to run this in a tmux screen, but, I ran it in its own window.

Fire it up!

[centos@kube-master scheduler]$ ./scheduler 
2017/07/21 15:32:36 Starting custom scheduler...
2017/07/21 15:32:38 Successfully assigned nginx-881608959-vk6t3 to kube-minion-2

Hurray! It scheduled it to kube-minion-2. If you look at our pricing output, you’ll see that it was the lowest priced node when we generated prices. Run a kubectl get pods to double check, and you can pick up the IP address with a kubectl describe pod $the_pod_name and curl it to your heart’s content.

If you want, destroy the pod with a:

[centos@kube-master scheduler]$ kubectl delete -f ~/nginx.yaml 

And generate new prices with ./annotator/annotator and run the scheduler again, and see it schedule it to another place when you kubectl create -f it.

Let’s inspect the toy scheduler go code.

So let’s take a look at the code in the toy scheduler. This is really a gloss-over, but maybe can help point you (and later me!) in the right direction to figure out more about how to use these concepts to our own advantages.

The files we’re interested in are:

  • main.go: The main app which starts a couple handler goroutines
  • processor.go: Where our goroutines live.
  • kubernetes.go: The Kube API meat-and-potatoes
  • bestprice.go: Our metric for scheduling.

(There’s also the ./annotator/annotator.go, which is a small util, feel free to poke at that too)

Generally, we have a main.go which is our handler, it starts up some goroutines that run two methods, both found in the processor.go file:

  • monitorUnscheduledPods()
  • reconcileUnscheduledPods()

These handle the goroutine logic (e.g. working with the wait group), perform a wait operation (I assume for polling for the rest of the logic), and then call the schedulePod() method also in processor.go.

The monitorUnscheduledPods() also calls the method watchUnscheduledPods() from kubernetes.go which is looking for those unscheduled pods for us (looks to be polling, but, there’s some things named “event” which makes me wonder if it has a watch on those events, I’m unsure and I didn’t dig further for now). The watchUnscheduledPods() method returns a channel to the pods it discovers.

When there’s a pod to be scheduled, finally a bind() method is called from kubernetes.go – this calls the binding core in Kubernetes API, which can bind a pod to a node, for example.
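For reference, a binding in the Kubernetes API is a small object posted to the pod’s binding subresource. Here’s a sketch of what such a request could look like, written in Python rather than go. The function name is hypothetical, and I’m showing the general API shape here, not claiming this is byte-for-byte what kubernetes.go sends.

```python
import json


def binding_request(pod_name, node_name, namespace="default"):
    """Return (path, json_body) for a POST that binds a pod to a node."""
    path = f"/api/v1/namespaces/{namespace}/pods/{pod_name}/binding"
    body = {
        "apiVersion": "v1",
        "kind": "Binding",
        "metadata": {"name": pod_name},
        "target": {"apiVersion": "v1", "kind": "Node", "name": node_name},
    }
    return path, json.dumps(body)


if __name__ == "__main__":
    path, body = binding_request("nginx-881608959-vk6t3", "kube-minion-2")
    print(path)
    print(body)
```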

The processor also looks at the bestPrice() method, which is in bestprice.go – this looks at the “prices” for each node and returns the node with the lowest price, which is how we determine which node a pod is going to land on.
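Stripped of the Kube API plumbing, the “lowest price wins” idea is tiny. Here’s a Python sketch of it using the prices the annotator printed earlier. It’s not a translation of bestprice.go, and it skips anything else the real code may do before choosing a node. The function name and tie-breaking rule are my own choices.

```python
def best_price(prices):
    """Return the name of the node with the lowest price.

    prices maps node name -> price. On a tie, the first entry wins.
    """
    if not prices:
        raise ValueError("no nodes to choose from")
    return min(prices, key=prices.get)


if __name__ == "__main__":
    prices = {
        "kube-master": 0.20,
        "kube-minion-1": 0.20,
        "kube-minion-2": 0.05,
        "kube-minion-3": 1.60,
    }
    print(best_price(prices))
```

With those prices it picks kube-minion-2, which matches where the scheduler put our nginx pod.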

BYOB - Bring your own boxen to an OpenShift Origin lab!

Let’s spin up an OpenShift Origin lab today, we’ll be using openshift-ansible with a “BYO” (bring your own) inventory. Or I’d rather say “BYOB” for “Bring your own boxen”. OpenShift Origin is the upstream OpenShift – in short, OpenShift is a PaaS (platform-as-a-service), but one that is built with a distribution of Kubernetes, and in my opinion it’s so valuable because of its strong opinions, which guide you towards some best practices for using Kubernetes for the enterprise. In addition, we’ll use my team’s base-infra-bootstrap which we can use to A. spin up some VMs to use in the lab, and/or B. set up some basics on the host to make sure we can properly install OpenShift Origin (which is the only thing I use that playbook for, to get a baseline OpenShift Origin environment). Our goal today will be to set up an OpenShift Origin cluster with a master and two compute nodes, we’ll verify that it’s healthy – and we’ll deploy a very basic pod.

If you’re itching to get your hands on the keyboard, skip down to “Clone Doug’s base-infra-bootstrap” to omit the intro.

What, exactly, are we going to deploy?

The gist is we’re going to use Ansible from “some device” (in my case, my workstation, and I’d guess yours, too). We’ll then provision a machine to be a “virt-host” – a host for running virtual machines. Then we’ll spin up 3 virtual machines (with libvirt) to run OpenShift on. Those virtual machines are connected to a br0 bridge which will allow these virtual machines to have IP addressing on your LAN. (As opposed to say, a NAT’ed IP address)

architecture diagram

Requirements

In this setup we use a CentOS 7 virtual machine host, you’ll need decent size on it. You might be able to trim down some of these, but, what I’m using is a baremetal node with 16 cores, using 4 cores per VM, 96 gigs of RAM, and I have 1TB spinning disk.

You’ll need at least:

  • 48 gigs of RAM (16 per VM)
  • ~240 gigs of HDD (~80 gigs per VM)
  • 6-8 cores (2 cores per VM, I recommend 4 per VM)
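Those minimums are just the per-VM numbers multiplied by the three VMs we’re going to spin up. If you want to size a different lab, here’s the arithmetic as a throwaway sketch. The names are mine, and it ignores whatever the virt host itself needs on top.

```python
VMS = 3
PER_VM = {"ram_gb": 16, "disk_gb": 80, "cores": 2}


def host_minimums(vms=VMS, per_vm=PER_VM):
    """Multiply each per-VM requirement by the number of VMs."""
    return {name: amount * vms for name, amount in per_vm.items()}


if __name__ == "__main__":
    print(host_minimums())
```

For three VMs that comes to 48 gigs of RAM, 240 gigs of disk, and 6 cores, which is the low end of the list above.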

This walk-through assumes that you have a host (like that) with CentOS 7.3 up and running (and hopefully you have some updated packages and a late kernel, too).

You’ll need a host from which to run Ansible, and you’ll need Ansible installed. Additionally, we’re going to be using OpenShift-Ansible which requires Ansible 2.2.2.0 or greater. This could be the same as your virtual host. Make sure you have SSH keys to your target box.

Additionally – while I use a VM lab, you could definitely spin up baremetal, or some VMs on “the cloud platform of your choosing” (and I hope for your sake, you don’t use one that has vendor lock-in). Just read through and skip the VM provisioning portion.

Limitations

Really – you’ll want a DNS server for your cluster if you’re doing anything bigger than this, and even this setup could benefit from a DNS implementation. I don’t really go there in this implementation.

There are no HA components herein. Those may be extended to this lab environment when the right use-case for the lab comes along.

Additionally, since we’re using a single master node, there won’t be an official load balancer. The load balancer conflicts with some master service, and requires a node dedicated to it. (Although, in theory you can probably schedule pods on that node, too.)

Docker storage driver

One of the bumps in the road I ran into while I was working on this was the Docker storage driver.

OpenShift does some great things for us, and OpenShift-Ansible honors them – one of those things being that it discourages you from using a loopback storage driver.

I followed the instructions for configuring direct-lvm storage for Docker from the Docker documentation.

Mostly though, these are covered in the playbooks, so, if you want, dig into those to see how I sorted it out. It’s worth noting that the most recent Docker versions (the version used here at the time of writing is 1.12.x) make setting up the direct-lvm volumes much easier, and they handle all the volume actions automagically. In short, what I do is dedicate a disk to each VM and then tell Docker to use it.

Clone Doug’s base-infra-bootstrap

I’ll assume now that you’ve got a machine to use that we can spin up virtual machines on, and that you have SSH keys from whatever box you’re going to run ansible on to that host.

I’ve got a few playbooks put together in a repo that’ll help you get some basics on a few hosts to use for spinning up OpenShift Origin with a BYO inventory. Its generic-as-can-be name is base-infra-bootstrap.

Go ahead and clone that.

$ git clone https://github.com/redhat-nfvpe/base-infra-bootstrap.git

Go ahead and then install the requirements…

$ ansible-galaxy install -r requirements.yml

Setup the virtual machine host.

Alright, first thing let’s create an inventory to define where our virtual machine host is. This assumes you have a fresh install of CentOS 7 on your virtual machine host. If you’d like an example, open up the ./inventory/example_virtual/openshift-ansible.inventory file in the clone. Modify the virt_host line (in the first few lines) to have an ansible_host that has the IP (or hostname) of the machine we’re going to provision. In theory your inventory can be this simple:

# Setup this host first, and put the IP here.
virt_host ansible_host=192.168.1.42 ansible_ssh_user=root

[virthosts]
virt_host

Create a file like that on your own, or from the example, and put it @ ./inventory/your.inventory.

But in reality – you’ll probably need to specify the NIC that you use to access the LAN/WAN on that host with:

bridge_physical_nic=enp1s0f1

(e.g. replace enp1s0f1 with eth0 if that’s what you have.)

Additionally (in order for the playbook to discover the IP address of the VMs it creates), you’ll need to specify the CIDR for the network on which that NIC operates…

bridge_network_cidr=192.168.1.0/24
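A quick sanity check you can do yourself: any address a VM picks up on the bridge should fall inside that CIDR. Here’s a minimal sketch using Python’s ipaddress module. The function name is mine, and I’m not claiming this is how the playbook does its discovery. It’s just a way to confirm you typed the right network.

```python
from ipaddress import ip_address, ip_network


def in_bridge_network(ip, cidr):
    """True if the address belongs to the bridge network's CIDR."""
    return ip_address(ip) in ip_network(cidr)


if __name__ == "__main__":
    print(in_bridge_network("192.168.1.183", "192.168.1.0/24"))
    print(in_bridge_network("10.0.1.1", "192.168.1.0/24"))
```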

Now that you have that setup, we can run the virt-host-setup.yml, like so:

$ ansible-playbook -i inventory/your.inventory virt-host-setup.yml

Oh is it coffee time? IT IS COFFEE TIME. Fill up a big mug, and I recommend stocking up on Vermont Coffee Company’s Tres. It’s legit.

In this process we have:

  • Installed dependencies to run VMs with libvirt
  • Spun up 3 VMs (and picked up their IP addresses)

Setup the inventory for the virtual machines (and grab the ssh keys)

Look in the output from the playbook and look for a section called: “Here are the IPs of the VMs”, grab those IPs and add them into the ./inventory/your.inventory file in this section:

# After running the virt-host-setup, then change these to match.
openshift-master ansible_host=192.168.1.183
openshift-node-1 ansible_host=192.168.1.130
openshift-node-2 ansible_host=192.168.1.224

[nodes]
openshift-master
openshift-node-1
openshift-node-2
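If you rebuild the lab a lot and get tired of pasting IPs, that chunk of inventory is easy to generate. Here’s a small sketch that renders it from a name-to-IP mapping. The helper name is hypothetical and it only covers this one section, not the rest of the inventory file.

```python
def inventory_section(hosts, group="nodes"):
    """Render host lines plus a group listing in Ansible INI inventory style.

    hosts maps inventory name -> IP address, in the order to list them.
    """
    lines = [f"{name} ansible_host={ip}" for name, ip in hosts.items()]
    lines.append("")
    lines.append(f"[{group}]")
    lines.extend(hosts)  # iterating a dict yields its keys, the host names
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(inventory_section({
        "openshift-master": "192.168.1.183",
        "openshift-node-1": "192.168.1.130",
        "openshift-node-2": "192.168.1.224",
    }), end="")
```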

Ok, but, that’s no good without grabbing the SSH key to access these. You’ll find the key to them on the virt host, in root’s directory, the file should be here:

$ cat /root/.ssh/id_vm_rsa

Take that file and put it on your ansible machine, and we’ll also add that into the inventory.

Find this section in the inventory, and modify it to match where you put the file (keep the ansible_ssh_user the same, in most cases)

[openshiftnodes:vars]
ansible_ssh_user=centos
ansible_ssh_private_key_file=/home/doug/.ssh/id_openshift_hosts

Modify the virtual machine hosts to get ready for an OpenShift Ansible run.

Cool – now go ahead and run the bootstrap.yml playbook, which will set up these VMs so they’re ready for an openshift-ansible install.

$ ansible-playbook -i inventory/your.inventory bootstrap.yml

There are a few things this does that really help us out so that openshift-ansible can do the magic we need it to do.

  • It installs the correct docker version, and sets direct-lvm storage for Docker
  • It sets up the host files on the machines so that we don’t need DNS

That one should finish in a pretty reasonable amount of time.

Start the OpenShift Ansible run.

In the base-infra-bootstrap clone’s root, you’ll find a file final.inventory which is the inventory we’re going to use for openshift-ansible – except again, we’ll have to replace the IPs in the first three lines of that file. (These will match what you created in the last step for the bootstrap.yml)

Here’s the whole thing in case you need it:

openshift-master ansible_host=192.168.1.51
openshift-node-1 ansible_host=192.168.1.74
openshift-node-2 ansible_host=192.168.1.112

[OSEv3:children]
masters
nodes
etcd
# lb
# nfs

[OSEv3:vars]
ansible_ssh_user=centos
ansible_become=yes
debug_level=2
openshift_deployment_type=origin 
# openshift_release=v3.6
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
ansible_ssh_private_key_file=/root/.ssh/id_vm_rsa
openshift_master_unsupported_embedded_etcd=true 
openshift_disable_check=disk_availability,memory_availability
# openshift_disable_check=docker_storage

[masters]
openshift-master

[etcd]
openshift-master

# [lb]
# openshift-master

[nodes]
# make them unschedulable by adding openshift_schedulable=False to any node that's also a master.
openshift-master openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=true
openshift-node-[1:2] openshift_node_labels="{'region': 'primary', 'zone': 'default'}"

Alright, now, let’s ssh into the virtual machine host, and we’ll find that it’s cloned the openshift-ansible repo.

So move into that directory…

$ cd /root/openshift-ansible/

And put the contents of that final inventory into ./my.inventory

Drum roll please, begin the openshift-ansible run…

Now you can run the openshift ansible playbook like so:

(edit January 23rd 2018: The config playbook moved, so, here’s the two plays it’s replaced with now)

$ ansible-playbook -i my.inventory ./playbooks/prerequisites.yml
$ ansible-playbook -i my.inventory ./playbooks/deploy_cluster.yml

Now, make 10 coffees – and/or wait for your Vermont Coffee Company order to complete and then brew that coffee. This takes a bit.

Verifying the setup.

So, we’ll assume that openshift-ansible completed without a hitch (and if it didn’t? Give a read-through of the error, and give a shot at fixing it, and with that info in hand open up an issue or PR on my bootstrap playbooks). Now, we can look at the node status.

SSH into the master, and run:

[centos@openshift-master ~]$ oc status
[...snip...]
[centos@openshift-master ~]$ oc get nodes
NAME                             STATUS    AGE
openshift-master.example.local   Ready     52m
openshift-node-1.example.local   Ready     52m
openshift-node-2.example.local   Ready     52m

You should have 3 nodes, and you might have noticed something in the ./final.inventory – I’ve told OpenShift that it’s OK to schedule pods on the master. We’re using a lot of resources for this lab, so, might as well make use of the master, too.

Optional: Configure the Dashboard.

If you want to, set a hosts file on your workstation to point openshift-master.example.local at the IP we’ve been using as the inventory IP address. And then point a browser @ https://openshift-master.example.local:8443/ and accept the certs to kick up the dashboard.

You’ll then need to configure the access to the dashboard. You can get a gist of the defaults from the /etc/origin/master/master-config.yaml file on the master:

[root@openshift-master centos]# grep -A12 "oauthConfig" /etc/origin/master/master-config.yaml 
oauthConfig:
  assetPublicURL: https://openshift-master.example.local:8443/console/
  grantConfig:
    method: auto
  identityProviders:
  - challenge: true
    login: true
    mappingMethod: claim
    name: htpasswd_auth
    provider:
      apiVersion: v1
      file: /etc/origin/master/htpasswd
      kind: HTPasswdPasswordIdentityProvider

This lets us know that we’re using htpasswd_auth and that the htpasswd file is @ /etc/origin/master/htpasswd. There’s more info in the official docs.

With this in hand, we can create a user.

[centos@openshift-master ~]$ oc create user dougbtv
user "dougbtv" created

And now let’s add a password for that user.

[centos@openshift-master ~]$ sudo htpasswd -c /etc/origin/master/htpasswd dougbtv
New password: 
Re-type new password: 
Adding password for user dougbtv

Great, now you should be able to login with the user dougbtv (in this example) with the password you set there.

Let’s kick off a pod.

Alright, why don’t we use my all time handy favorite nginx pod!

First, let’s create a new project.

[centos@openshift-master ~]$ oc new-project sample

We’re going to use a public nginx container image, and this one assumes it can run as the user it chooses, so… We’re going to allow this. In your own production setup, you’ll likely massage the users and SCCs to fit a cleaner mold.

So in this case, we’ll add the anyuid SCC to the default service account.

[centos@openshift-master ~]$ oc adm policy add-scc-to-user anyuid -z default

Then, create an nginx.yaml with these contents:

apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

Create the replication controller we’re defining with:

[centos@openshift-master ~]$ oc create -f nginx.yaml 

Watch the pods come up…

[centos@openshift-master ~]$ watch -n1 oc get pods

Should the pod fail to come up, do an oc describe pod nginx-A1B2C3 (replacing the pod name with the one from oc get pods).

Then… We can curl something from it. Here’s a shortcut to get you one of the pod’s IP addresses and curl it.

[centos@openshift-master ~]$ curl -s $(oc describe pod $(oc get pods | tail -n1 | awk '{print $1}') | grep -P "^IP" | awk '{print $2}') | grep -i thank
<p><em>Thank you for using nginx.</em></p>
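If that one-liner is a bit much to read, the middle part of it is just “find the line that starts with IP and take the second column”. Here’s the same idea as a Python sketch working on the text of oc describe pod. The function name and the sample output are made up for illustration, so the pod name and address in it are not from my lab.

```python
import re


def pod_ip(describe_output):
    """Pull the address from the 'IP:' line of `oc describe pod` output."""
    match = re.search(r"^IP:\s*(\S+)", describe_output, flags=re.MULTILINE)
    return match.group(1) if match else None


# Illustrative sample only; real output has many more fields.
SAMPLE_DESCRIBE = (
    "Name:\t\tnginx-a1b2c\n"
    "Node:\t\topenshift-node-1.example.local/192.168.1.130\n"
    "IP:\t\t10.128.0.5\n"
)

if __name__ == "__main__":
    print(pod_ip(SAMPLE_DESCRIBE))
```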

And there you have it!