Manjusaka

Some Thoughts on Kubernetes and Containerization

Recently, there have been many discussions in various groups about Kubernetes and containerization, so here I gather some scattered thoughts into one place. This article represents my personal standpoint and does not represent any commercial views.

Containerization

Currently, a popular viewpoint is to use containers as much as possible. To review this idea, we need to understand the changes that containers have brought us.

Containers undoubtedly bring us many benefits:

  1. They make it easy to keep the development and production environments consistent. In other words, when a developer says "this service works fine on my local machine," the statement finally means something.
  2. They make shipping services more convenient, both in distributing artifacts and in deploying them.
  3. It allows for a certain level of resource isolation and allocation.

So, can we just use containers without thinking? No, we can't. We need to review the drawbacks of containerization:

  1. Container security is a major concern. The most popular container implementation, Docker, essentially relies on cgroups and namespaces (NS) to isolate resources and processes, and all containers share the host kernel. Docker vulnerabilities and container escapes are discovered every year, so we need a systematic mechanism to regulate how containers are used and to keep any security incident within a manageable scope. Image security is another aspect. Many of us program by searching Baidu, CSDN, Google, and Stack Overflow, so there is a real risk that we hit a problem, search for a solution, and copy a Dockerfile directly without knowing what is inside its base image.
  2. Networking is another issue with containers. When we start multiple containers, how do they communicate with each other? In a production environment with more than one machine, how do we ensure stable communication between containers across different hosts?
  3. Container scheduling and operations are also challenges. When a machine is under high load, how do we schedule some containers to other machines? How do we determine if a container is alive? If a container crashes, how do we restart it?
  4. There are also specific details about containers, such as how to build and package images, how to upload them, and how to troubleshoot corner cases.

When making a business decision, we should not choose a technology just because it is advanced or comfortable. We need to measure the ROI of the decision and make a trade-off between its advantages and disadvantages. Regarding containerization, let's consider some common misconceptions:

  1. "We want to use containers for resource isolation!" Then how do containers differ from systemd + cgroup, which is a simpler method? Is containerization really more cost-effective?
  2. "We want to practice DevOps, so we want to use containers!" In fact, DevOps and containerization are not closely related. DevOps is more of a methodology, a set of practices for collaboration within a team. Roughly speaking, it simplifies the distribution and operation of a service through automation, process improvement, and the introduction of SOPs. In other words, practicing DevOps is not just a technical problem but also an institutional one (here's a joke: DevOps developers don't need to write scripts). Traditional operations tools like Ansible and various automated testing methods and frameworks can all be part of DevOps. So why do we need containers? Is it because implementing DevOps with traditional tools costs more?
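To make the first point concrete, here is a minimal sketch of what "systemd + cgroup" could look like without any container runtime. It only builds a `systemd-run` command line that starts a process as a transient service with memory and CPU limits. The unit name, the service binary path, and the limit values are hypothetical, and `MemoryMax` assumes a host using cgroup v2.

```python
import shlex


def systemd_run_command(unit_name, argv, memory_max="512M", cpu_quota="50%"):
    """Build a systemd-run command that starts argv as a transient
    service with cgroup-based memory and CPU limits.

    The defaults are illustrative values, not recommendations.
    """
    return [
        "systemd-run",
        f"--unit={unit_name}",
        "-p", f"MemoryMax={memory_max}",  # hard memory cap (cgroup v2)
        "-p", f"CPUQuota={cpu_quota}",    # share of one CPU's time
        *argv,
    ]


if __name__ == "__main__":
    # Hypothetical service binary and flag, used only for illustration.
    cmd = systemd_run_command("demo-api", ["/usr/local/bin/demo-api", "--port=8080"])
    print(shlex.join(cmd))
```

This gives resource limits only. It does nothing for packaging or environment consistency, which is exactly the trade-off to weigh.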

From these two examples, we can see that when we consider containerization, we must think about what pain points it truly solves, rather than just adopting it because it seems advanced and trendy.

Kubernetes

The aforementioned issues with containerization have led to the emergence of container orchestration systems, with Kubernetes being the representative one. Now, let's discuss Kubernetes.

First, I will set aside the scenario of building a self-managed Kubernetes cluster, because that is not something an ordinary team can handle. Instead, let's focus on using public cloud services. Taking Alibaba Cloud as an example, when we open the page, we see the following images:

[images: screenshots of the Alibaba Cloud console]

Now, let's ask some questions:

  1. What is VPC?
  2. What is the difference between Kubernetes 1.16.9 and 1.14.8?
  3. What are Docker 19.03.5 and Alibaba Cloud Security Sandbox 1.1.0? What is the difference between them?
  4. What is a dedicated network?
  5. What is a virtual switch?
  6. What are network plugins? What are Flannel and Terway? What is the difference between them? When you look through the documentation and find out that Terway is an Alibaba Cloud-customized CNI plugin based on Calico, you might wonder, what is a CNI plugin? What is Calico?
  7. What is Pod CIDR and how do you set it?
  8. What is Service CIDR and how do you set it?
  9. What is SNAT and how do you configure it?
  10. How do you configure security groups?
  11. What is Kube-Proxy? What is the difference between iptables and IPVS? How do you choose?
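To give a feel for what questions 7 and 8 involve, here is a small sketch using Python's standard `ipaddress` module. It checks whether a set of address ranges overlap and how many addresses each one provides. The three ranges are hypothetical values chosen for illustration, and the requirement that all three be disjoint is an assumption that fits an overlay-style network plugin; other plugins may have different rules.

```python
import ipaddress
from itertools import combinations


def overlapping_ranges(ranges):
    """Return the pairs of named CIDR ranges that overlap each other."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan, not taken from any real cluster.
PLAN = {
    "vpc": "192.168.0.0/16",
    "pod": "172.20.0.0/16",
    "service": "172.21.0.0/20",
}

if __name__ == "__main__":
    for name, cidr in PLAN.items():
        size = ipaddress.ip_network(cidr).num_addresses
        print(f"{name:8s} {cidr:18s} {size} addresses")
    print("overlaps:", overlapping_ranges(PLAN))
```

Even this toy check hides real decisions: the Pod range bounds how many Pods the cluster can ever hold, and these ranges are typically fixed once the cluster is created.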

Is this different from what you imagined, where you just click a few buttons? You might say, "A small company doesn't need to worry about these things. We can just use the default settings." Well, if that's the case, why bother with Kubernetes? Okay, let's assume you have deployed it anyway. Now, let's continue and add up the costs.

  1. You need a container registry, right? It's not expensive: the basic version in the China region costs 780 RMB per month.
  2. Do you need to expose services from your cluster? Okay, buy the lowest-specification SLB, the simple type, for 200 RMB per month.
  3. You need to pay for logs every month, right? Let's say you have 20 GB of logs per month, which is not much. That will be 39.1 RMB.
  4. Do you want cluster monitoring? Great, buy it. Let's say you report 500,000 log entries per day. It's not expensive, only 975 RMB per month.

Let's calculate the annual cost for one cluster: (780 + 200 + 39.1 + 975) * 12 = 23929.2 RMB, not including the costs of basic ENIs and ECS instances. It's quite expensive.
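The arithmetic is easy to re-check or adapt to your own numbers. The short sketch below sums the four monthly figures quoted above and scales them to a year. The dictionary keys are names I made up for illustration, and `Decimal` is used only to avoid floating-point rounding on the 39.1 figure.

```python
from decimal import Decimal

# Monthly prices in RMB, copied from the four items above.
MONTHLY_COSTS_RMB = {
    "container_registry": Decimal("780"),
    "slb": Decimal("200"),
    "logs_20gb": Decimal("39.1"),
    "monitoring": Decimal("975"),
}


def annual_total(monthly_costs):
    """Sum the monthly line items and scale the result to 12 months."""
    return sum(monthly_costs.values(), Decimal("0")) * 12


if __name__ == "__main__":
    for name, cost in MONTHLY_COSTS_RMB.items():
        print(f"{name:20s} {str(cost):>8} RMB/month")
    print(f"annual total: {annual_total(MONTHLY_COSTS_RMB)} RMB")
```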

Moreover, running Kubernetes leads to many other problems. For specific details, you can browse the Kubernetes issue tracker.

Conclusion

I wrote this article not to complain or criticize anyone, but to express a viewpoint. I would like to quote a sentence from an article I really like, "The Middle Platform, I Believe in Your Evil" by Deep Kr (https://mp.weixin.qq.com/s/9j3BnR3UqA-lnJDoM5Hrvg):

At the end of last year, Alibaba's Chairman and CEO, Zhang Yong, also said during a lecture at Hupan University: "If a company focuses on building a middle platform, it will die."

I'm not sure whether Xiaoyaozi (Zhang Yong's nickname at Alibaba) really said this, but I agree with it. By the same logic, I believe that if a company pursues advanced technology for its own sake, it will die. After all, technology needs to serve the business, and technical progress largely depends on what the business has accumulated and what it demands.

Well, this is probably the most casual article I have ever written. That's it for now. Back to work.
