Creating a Self Managed Multi-node Kubernetes Cluster Using kubeadm in AWS EC2

Several guides covering various permutations of this task already exist. To avoid reinventing the wheel, I’ll just provide the commands and terse explanations of why certain flags are set. Links for further reading and/or to the official documentation are included as well.

Setup

I’ll be using the following:

  • EC2 Setup
    • 2 Instances (with at least 2 CPUs) using Amazon Linux 2 AMI
  • Kubernetes Setup
    • kubectl / kubelet / kubeadm version – 1.20.1 (latest at time of writing)
    • Container runtime: Docker
    • 1 master/worker, 1 worker (the master’s NoSchedule taint is removed so it can also run pods)
    • Flannel for network overlay

Step 1: Install Container Runtime

Run on all nodes

Note: Docker support (via dockershim) is deprecated as of v1.20 and will be removed in a future release, but it still works fine for this version.

$ sudo yum install -y yum-utils
$ sudo amazon-linux-extras install docker
$ sudo service docker start
$ sudo systemctl enable docker
$ sudo usermod -a -G docker ec2-user

For more info, see https://docs.docker.com/engine/install/centos/
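Optional: the kubeadm preflight checks in Step 6 will warn that Docker is using the “cgroupfs” cgroup driver while “systemd” is recommended. The cluster works either way, but if you want to silence the warning, a minimal sketch of switching Docker to the systemd driver looks like this (the daemon.json path is Docker’s default; this step is not part of the original sequence):

$ cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
$ sudo systemctl restart docker
$ sudo docker info | grep -i cgroup   # should now report "systemd"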

Step 2: Let iptables See Bridged Traffic

Run on all nodes

$ cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sudo sysctl --system

For more info, see https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#letting-iptables-see-bridged-traffic
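The same page of the kubeadm docs also asks you to make sure the br_netfilter kernel module is loaded, otherwise the sysctl settings above have no effect. A quick check and fix, following the official instructions:

$ lsmod | grep br_netfilter          # check whether the module is already loaded
$ sudo modprobe br_netfilter         # load it for the current boot
$ cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF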

Step 3: Install kubeadm, kubelet and kubectl

Run on all nodes

$ cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
# Set SELinux in permissive mode (effectively disabling it)
$ sudo setenforce 0
$ sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
$ sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
$ sudo systemctl enable kubelet.service

For more info, see https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl

Note: There is no need for a separate command to hold the versions of the 3 binaries (unlike “apt-mark hold” on Debian-based OSes) – that is handled by the “exclude=” line in the repo file above, combined with --disableexcludes=kubernetes at install time.
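The install command above pulls whatever version is latest in the repo. If you want to pin all three binaries to 1.20.1 explicitly (this assumes the usual <name>-<version> yum syntax for this repo), and then confirm what got installed:

$ sudo yum install -y kubelet-1.20.1 kubeadm-1.20.1 kubectl-1.20.1 --disableexcludes=kubernetes
$ kubeadm version -o short
v1.20.1
$ kubectl version --client --short
Client Version: v1.20.1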

Step 4: Initialize Cluster

Run on master node

$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
(copy the "kubeadm join" command for the worker node for later)
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config

Flannel needs --pod-network-cidr=10.244.0.0/16 to be set. Refer to its documentation.
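If you forget to copy the join command, or the bootstrap token expires (the default TTL is 24 hours), you can generate a fresh one on the master at any time:

$ sudo kubeadm token create --print-join-command
kubeadm join 172.31.11.72:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>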

Step 5: Deploy Flannel for Overlay Network

Run on master node

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

For more info, see How does Kubernetes and Flannel work?
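Before moving on, it is worth waiting for the flannel and CoreDNS pods to come up; exact pod names will differ on your cluster:

$ kubectl get pods -n kube-system
$ kubectl get nodes    # the master should report STATUS "Ready" once flannel is running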

Step 6: Join Worker Nodes to Cluster

Run on worker node

Use the kubeadm join command from Step 4.

$ sudo kubeadm join 172.31.11.72:6443 --token <TOKEN> \
    --discovery-token-ca-cert-hash sha256:<HASH>
[preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
        [WARNING FileExisting-tc]: tc not found in system path
(process waits indefinitely) 

If your Security Group inbound rules are not set properly, you will notice that the process waits indefinitely after the above output is shown. Step 7 shows how to solve this.

If you manage to join the cluster without issue, go on to Step 8.
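Once the join completes, you can confirm from the master that the worker has registered and eventually reports Ready (the node names below are placeholders):

$ kubectl get nodes
NAME            STATUS   ROLES                  AGE   VERSION
<master-node>   Ready    control-plane,master   10m   v1.20.1
<worker-node>   Ready    <none>                 1m    v1.20.1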

Step 7: Set EC2 Security Group Inbound Rules

If both master and worker nodes use the same security group, you have the option to allow all inbound traffic originating from the same security group.

Otherwise, these are the individual ports that will need to be allowed:

# Kubernetes Control Plane Nodes
6443/tcp (Kubernetes API server)
2379-2380/tcp (etcd server client API)
10250-10252/tcp (kubelet API, kube-scheduler, kube-controller-manager)
# Kubernetes Worker Nodes
10250/tcp (kubelet API)
30000-32767/tcp (NodePort Services)
# Flannel
8285/udp (UDP backend)
8472/udp (VXLAN backend)
# CoreDNS
9153/tcp (metrics port)

See Ports for Kubernetes, Ports for Flannel, and Port for CoreDNS.
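If you prefer the CLI over the console, here is a sketch of the “allow everything from the same security group” option using the AWS CLI (sg-0123456789abcdef0 is a placeholder for your own security group ID):

$ SG_ID=sg-0123456789abcdef0   # placeholder: the security group used by both nodes
$ aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol=-1 --source-group "$SG_ID"

Individual port rules can be added the same way with --protocol tcp/udp and --port.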

Step 8: Remove Master Node Taint to Allow Pods to Be Deployed on It

Run on master node

$ kubectl taint nodes --all node-role.kubernetes.io/master-
node/ip-172-31-10-239.us-east-2.compute.internal untainted
error: taint "node-role.kubernetes.io/master" not found

The error on the last line is expected: it comes from the worker node, which never had the master taint in the first place.

See documentation for more info.
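To confirm the master no longer carries the NoSchedule taint (the node name is a placeholder):

$ kubectl describe node <master-node> | grep -i taints
Taints:             <none>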

And we are done. In a future post, I’ll show how to set up a storage class for local disk storage so that pods can use the instance’s ephemeral disk for their PersistentVolumeClaims.
