As of this post, my new Udoo X86 cluster is fully functional and running Kubernetes. One Udoo is the master; three other Udoos are workers. Each of the three workers has a SATA-connected 512GB SSD for storage. There is a fifth Udoo in the cluster that I have yet to put into service.
This is where I need to tell you that I have been a Unix system administrator for 25 years. It hasn’t been my official job for all of that time, but I have always continued to build and maintain Unix systems regardless of what my official role was at the time. In fact, I hold certifications from Sun Microsystems for both system administration and network administration. This isn’t meant to be a brag. Instead, I hope that it will help put my next comments in context.
Using a Kubernetes offering from a cloud provider may be a fantastic experience for a developer. Building a Kubernetes cluster on bare metal is a horrible experience. Maintaining a Kubernetes cluster on bare metal — especially a multi-tenant scenario — would not be for the faint of heart. Let me try to explain.
The cluster I built (and had to rebuild multiple times, from scratch) is based on Kubernetes v1.15.2 and CentOS 7, and relied on the goodwill of countless others online who shared tutorials on how they had installed bare metal K8s. One of the first issues I ran across was that most tutorials were telling people to disable SELinux. “Hmm,” I said to myself, “Is this a ‘requirement’ or a ‘convenience’ people are recommending because they don’t want to concern themselves with the added complexity?” [Refer to You shouldn’t disable SELinux if you need an answer to that rhetorical question.] This brings me to the second problem: the information available online about Kubernetes’ system-level details is extremely sparse, probably due to its relative newness. For this reason, I’m going to claim (without evidence) that 9 out of 10 bare metal Kubernetes clusters are far from production-ready environments, which is fine if their owners are just setting them up as a hobby or for research purposes. Please just don’t expose them to the Internet if you are not going to do the sometimes complex work of hardening them. Please!
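If you want to know whether SELinux is actually in your way before reaching for setenforce 0, a quick check like the one below is a better starting point. This is just a rough sketch I’m including for illustration (the paths and the container-related filter are assumptions on my part, not from any of the tutorials); the idea is to confirm the current mode and pull container-related AVC denials out of the audit log, which you can then turn into a targeted policy module with audit2allow instead of disabling enforcement entirely.

```python
import re

def selinux_mode(path="/sys/fs/selinux/enforce"):
    """Return 'enforcing', 'permissive', or 'disabled' (selinuxfs not mounted)."""
    try:
        with open(path) as f:
            return "enforcing" if f.read().strip() == "1" else "permissive"
    except FileNotFoundError:
        return "disabled"

def container_denials(log="/var/log/audit/audit.log"):
    """Yield AVC denial lines that mention container-related SELinux contexts."""
    pattern = re.compile(r"avc:\s+denied.*container", re.IGNORECASE)
    with open(log) as f:  # reading the audit log requires root
        for line in f:
            if pattern.search(line):
                yield line.rstrip()

if __name__ == "__main__":
    print(f"SELinux mode: {selinux_mode()}")
    for denial in container_denials():
        print(denial)
```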
Once I had Kubernetes up and was able to deploy a “Hello World” Python web app (via Docker) and scale it out to all the workers, I experienced some pretty weird latency. “This could not be how Kubernetes performed, or nobody would use it,” I thought. After rebuilding the cluster from scratch a couple more times, and trying different pod networking models, I ran across a couple of people recommending changes to iptables / firewalld. As with the suggestion that you disable SELinux when installing Kubernetes, I was pretty suspicious about what was being recommended, which was close to “Allow All”. Since my cluster is only for research, I temporarily disabled firewalld to see if the latency situation changed. To my surprise, everything suddenly worked. Although I confirmed that Kubernetes was modifying the iptables rules dynamically as apps were deployed, something with the base configuration was clearly not correct. A few people have pointed to Docker-specific iptables changes as the culprit, but I have not yet verified what is being blocked in order to determine the correct fix.
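For the curious, here is roughly what that kind of deployment looks like when driven from the official Kubernetes Python client. This is a minimal sketch, not the exact app I deployed; the image name, labels, and port are placeholders.

```python
from kubernetes import client, config

# Talk to the cluster using the local kubeconfig (e.g. ~/.kube/config).
config.load_kube_config()
apps = client.AppsV1Api()

# Placeholder container: the "Hello World" Python web app packaged with Docker.
container = client.V1Container(
    name="hello-world",
    image="registry.example.com/hello-world-flask:latest",  # placeholder image
    ports=[client.V1ContainerPort(container_port=8080)],
)

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="hello-world"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "hello-world"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "hello-world"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="default", body=deployment)

# Scale out to one replica per worker (three workers in this cluster).
apps.patch_namespaced_deployment_scale(
    name="hello-world",
    namespace="default",
    body={"spec": {"replicas": 3}},
)
```

The scale call at the end is equivalent to running `kubectl scale deployment hello-world --replicas=3`.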
With my cluster actually working (if not production ready), I had to turn my attention to the complexities of exposing a deployed application to an accessible network. As if I needed more reasons not to run a Kubernetes cluster on my own bare metal, the only real option appeared to be MetalLB. Although this works, and seems to actually work relatively well, it is explicitly not production ready. You’d think it wouldn’t be too much to ask that the web application you just deployed to a handful of K8s workers be accessible from a computer on your network, and not just from the members of your Kubernetes cluster. The fact that a feature to abstract the individual workers behind a single accessible endpoint just isn’t in the “box” should reinforce that Kubernetes was designed for cloud providers, not the enterprise. (No, I’m not overlooking what kube-proxy does.) With that said, you can watch the screen recording of Apache Benchmark against the MetalLB endpoint for a web app I deployed on all three workers in my cluster, including the live log output from each pod, here: K8s Benchmarking.
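For completeness, exposing the app through MetalLB ends up looking like any other Service of type LoadBalancer; MetalLB just takes care of handing out the external address on bare metal. Another minimal sketch with the Kubernetes Python client (names and ports are placeholders, not my actual deployment):

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# A plain LoadBalancer Service; on bare metal, MetalLB is what actually
# assigns the external address from its configured pool.
service = client.V1Service(
    api_version="v1",
    kind="Service",
    metadata=client.V1ObjectMeta(name="hello-world"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "hello-world"},  # matches the Deployment's pods
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
core.create_namespaced_service(namespace="default", body=service)

# Once MetalLB assigns an address, it appears in the service status.
svc = core.read_namespaced_service(name="hello-world", namespace="default")
ingress = svc.status.load_balancer.ingress
print(ingress[0].ip if ingress else "no external IP assigned yet")
```

The external IP that shows up in the service status is the address the Apache Benchmark run was pointed at.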
I’m by no means giving up on my Kubernetes cluster. It now has Ceph providing distributed storage across the three workers’ SATA-attached SSDs, with Rook acting as the orchestrator; MySQL is running on that storage, and I have Prometheus collecting metrics. Grafana is next.
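As an illustration of how that storage gets consumed, a workload like MySQL just requests a PersistentVolumeClaim against the storage class that Rook provisions. One more sketch with the Python client; the storage class name and requested size are assumptions, not necessarily what my cluster uses:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Claim a block volume from the Ceph pool that Rook manages.
# "rook-ceph-block" and "10Gi" are assumptions for the sketch.
pvc = client.V1PersistentVolumeClaim(
    api_version="v1",
    kind="PersistentVolumeClaim",
    metadata=client.V1ObjectMeta(name="mysql-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="rook-ceph-block",
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```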