Switching the Kubernetes Container Runtime to Containerd
Original post: https://mritd.com/2021/05/29/use-containerd-with-kubernetes/
Environment:
Ubuntu 20.04 x 5 (five hosts)
etcd 3.4.16
Kubernetes 1.21.1
Containerd 1.3.3
1.1 Kernel Tuning
This setup runs kube-proxy in IPVS mode for Services (see the KubeProxyConfiguration further down), so the kernel must have the IPVS modules loaded. The following commands configure the relevant modules to load automatically at boot; reboot after running them, or apply everything immediately as sketched after the sysctl block below.
# Kernel modules
cat > /etc/modules-load.d/50-kubernetes.conf <<EOF
# Load some kernel modules needed by kubernetes at boot
nf_conntrack
br_netfilter
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
EOF
# sysctl
cat > /etc/sysctl.d/50-kubernetes.conf <<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
fs.inotify.max_user_watches=525000
EOF
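If you'd rather not reboot right away, the same settings can be applied in place. This is just a sketch using standard tooling: load each (non-comment) module listed in the drop-in with modprobe, then have sysctl re-read all configuration files:
# Load the modules immediately and re-read the sysctl drop-ins (no reboot needed)
grep -Ev '^(#|$)' /etc/modules-load.d/50-kubernetes.conf | xargs -r -n1 modprobe
sysctl --system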
# check ipvs modules
➜ ~ lsmod | grep ip_vs
ip_vs_sed 16384 0
ip_vs_nq 16384 0
ip_vs_fo 16384 0
ip_vs_sh 16384 0
ip_vs_dh 16384 0
ip_vs_lblcr 16384 0
ip_vs_lblc 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs_wlc 16384 0
ip_vs_lc 16384 0
ip_vs 155648 22 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed
nf_conntrack 139264 1 ip_vs
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
libcrc32c 16384 5 nf_conntrack,btrfs,xfs,raid456,ip_vs
# check sysctl
➜ ~ sysctl -a | grep ip_forward
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0
➜ ~ sysctl -a | grep bridge-nf-call
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
1.2 Installing Containerd
Containerd is already included in Ubuntu 20.04's default official repositories, so installing it is a single apt command:
# The other packages will probably come in handy later, so install them in the same pass
apt install containerd bridge-utils nfs-common tree -y
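A quick sanity check that the daemon came up (plain systemd commands, nothing specific to this setup):
systemctl enable --now containerd
containerd --version
systemctl status containerd --no-pager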
2.1 Installing etcd
For Kubernetes, etcd is the core of the core, so I prefer to install it directly on the host. To make host installs convenient I packaged a few *-pack tool bundles for quick setup.
Install cfssl and etcd:
# Download the installer packages
wget https://github.com/mritd/etcd-pack/releases/download/v3.4.16/etcd_v3.4.16.run
wget https://github.com/mritd/cfssl-pack/releases/download/v1.5.0/cfssl_v1.5.0.run
# Install cfssl and etcd
chmod +x *.run
./etcd_v3.4.16.run install
./cfssl_v1.5.0.run install
➜ ~ cat /etc/cfssl/etcd/etcd-csr.json
{
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "O": "etcd",
      "OU": "etcd Security",
      "L": "Beijing",
      "ST": "Beijing",
      "C": "CN"
    }
  ],
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "localhost",
    "*.etcd.node",
    "*.kubernetes.node",
    "10.0.0.11",
    "10.0.0.12",
    "10.0.0.13"
  ]
}
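The cfssl-pack bundle is expected to take care of CA and certificate generation; if you are doing it by hand, a minimal sketch with the stock cfssl workflow looks like this (the ca-csr.json / ca-config.json file names and the "peer" profile are my assumptions, not something the pack ships verbatim):
cd /etc/cfssl/etcd
# Self-signed CA (assumes a ca-csr.json next to etcd-csr.json)
cfssl gencert -initca ca-csr.json | cfssljson -bare etcd-ca
# etcd certificate signed by that CA (profile name is an assumption)
cfssl gencert \
  -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem \
  -config=ca-config.json -profile=peer \
  etcd-csr.json | cfssljson -bare etcd
This produces etcd-ca.pem, etcd.pem, and etcd-key.pem, matching the file names copied to the masters below.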
# Copy the certificates to the 3 masters
➜ ~ for ip in `seq 1 3`; do scp /etc/cfssl/etcd/*.pem root@10.0.0.1$ip:/etc/etcd/ssl; done
# Copy the config
for ip in `seq 1 3`; do scp /etc/etcd/etcd.cluster.yaml root@10.0.0.1$ip:/etc/etcd/etcd.yaml; done
# Fix ownership
for ip in `seq 1 3`; do ssh root@10.0.0.1$ip chown -R etcd:etcd /etc/etcd; done
# Start etcd on each machine
systemctl start etcd
# To be safe, also run etcdctl endpoint health
➜ ~ etcdctl member list
55fcbe0adaa45350, started, etcd3, https://10.0.0.13:2380, https://10.0.0.13:2379, false
cebdf10928a06f3c, started, etcd1, https://10.0.0.11:2380, https://10.0.0.11:2379, false
f7a9c20602b8532e, started, etcd2, https://10.0.0.12:2380, https://10.0.0.12:2379, false
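etcdctl 3.4 speaks the v3 API by default, and since this cluster serves TLS, the health check needs the certificate flags (paths follow the layout above; the etcd-pack install may already wrap these for you, as the bare member list above suggests):
etcdctl --endpoints=https://10.0.0.11:2379,https://10.0.0.12:2379,https://10.0.0.13:2379 \
  --cacert=/etc/etcd/ssl/etcd-ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  endpoint health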
2.2 Installing kubeadm
For users in mainland China, the aliyun mirror is the recommended source for installing kubeadm:
# kubeadm
apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update
# ebtables and ethtool may be needed by kubelet; they are listed in the official install docs
apt install kubelet kubeadm kubectl ebtables ethtool -y
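Optionally, pin the three packages so a routine apt upgrade can't move the cluster to an unplanned version (standard apt-mark usage, also suggested by the official install docs):
apt-mark hold kubelet kubeadm kubectl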
2.3 Installing kube-apiserver-proxy
kube-apiserver-proxy is an Nginx build of mine with only the layer-4 (stream) proxy compiled in. Its job is to listen on port 6443 on every node, so the control plane is always reachable via 127.0.0.1:6443, and to load-balance across all of the API servers, each of which binds 0.0.0.0:5443:
wget https://github.com/mritd/kube-apiserver-proxy-pack/releases/download/v1.20.0/kube-apiserver-proxy_v1.20.0.run
chmod +x *.run
./kube-apiserver-proxy_v1.20.0.run install
➜ ~ cat /etc/kubernetes/apiserver-proxy.conf
error_log syslog:server=unix:/dev/log notice;

worker_processes auto;

events {
    multi_accept on;
    use epoll;
    worker_connections 1024;
}

stream {
    upstream kube_apiserver {
        least_conn;
        server 10.0.0.11:5443;
        server 10.0.0.12:5443;
        server 10.0.0.13:5443;
    }

    server {
        listen 0.0.0.0:6443;
        proxy_pass kube_apiserver;
        proxy_timeout 10m;
        proxy_connect_timeout 1s;
    }
}
systemctl start kube-apiserver-proxy
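A quick way to confirm the proxy is listening; note the /healthz request will only succeed once the API servers behind it are actually up (curl and ss are stock tools, not part of the pack):
# Is Nginx listening on 6443?
ss -tlnp | grep 6443
# Once the masters are up, this should return "ok"
curl -k https://127.0.0.1:6443/healthz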
2.4 Installing kubeadm-config
kubeadm-config bundles a set of configuration files together with the images kubeadm needs for installation; once installed it automatically configures Containerd, crictl, and so on:
wget https://github.com/mritd/kubeadm-config-pack/releases/download/v1.21.1/kubeadm-config_v1.21.1.run
chmod +x *.run
# The --load flag loads the images kubeadm needs into containerd
./kubeadm-config_v1.21.1.run install --load
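To verify the images actually landed in containerd's k8s.io namespace after --load:
ctr -n k8s.io images ls | grep -E 'k8s.gcr.io|pause'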
2.4.1 containerd Configuration
The Containerd configuration lives at /etc/containerd/config.toml and looks like this:
version = 2
# Root directory for persisted data (images, snapshots, etc.)
root = "/data/containerd"
state = "/run/containerd"
# OOM score for the containerd daemon
oom_score = -999

[grpc]
  address = "/run/containerd/containerd.sock"

[metrics]
  address = "127.0.0.1:1234"

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    # sandbox (pause) image
    sandbox_image = "k8s.gcr.io/pause:3.4.1"
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          # enable the systemd cgroup driver
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
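After editing the config, create the storage root and restart containerd; crictl info should then show the CRI plugin picking up the new settings (this assumes the crictl config from the next section is already in place):
mkdir -p /data/containerd
systemctl restart containerd
crictl info | grep -i cgroup   # should show SystemdCgroup: true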
2.4.2 crictl Configuration
Switching to Containerd means the old docker commands are gone. containerd ships with its own ctr command, and crictl (from the Kubernetes cri-tools project) works against any CRI runtime; its configuration lives in /etc/crictl.yaml:
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
pull-image-on-create: true
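crictl then covers most of what docker ps / docker images used to do; a few common invocations (standard crictl subcommands, the container ID is a placeholder):
crictl ps            # running containers
crictl images        # images visible to the CRI
crictl pods          # pod sandboxes
crictl logs <container-id>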
2.4.3 kubeadm Configuration
There are currently two kubeadm configs: an init config used to bootstrap the first control-plane node, and a join config used by the other nodes to join the masters. The more important init config is as follows:
# /etc/kubernetes/kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
# kubeadm token create
bootstrapTokens:
- token: "c2t0rj.cofbfnwwrb387890"
nodeRegistration:
  # CRI socket (Containerd)
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
localAPIEndpoint:
  advertiseAddress: "10.0.0.11"
  bindPort: 5443
# kubeadm certs certificate-key
certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: "kubernetes"
kubernetesVersion: "v1.21.1"
certificatesDir: "/etc/kubernetes/pki"
# Other components of the current control plane only connect to the apiserver on the current host.
# This is the expected behavior, see: https://github.com/kubernetes/kubeadm/issues/2271
controlPlaneEndpoint: "127.0.0.1:6443"
etcd:
  external:
    endpoints:
    - "https://10.0.0.11:2379"
    - "https://10.0.0.12:2379"
    - "https://10.0.0.13:2379"
    caFile: "/etc/etcd/ssl/etcd-ca.pem"
    certFile: "/etc/etcd/ssl/etcd.pem"
    keyFile: "/etc/etcd/ssl/etcd-key.pem"
networking:
  serviceSubnet: "10.66.0.0/16"
  podSubnet: "10.88.0.0/16"
  dnsDomain: "cluster.local"
apiServer:
  extraArgs:
    v: "4"
    alsologtostderr: "true"
    # audit-log-maxage: "21"
    # audit-log-maxbackup: "10"
    # audit-log-maxsize: "100"
    # audit-log-path: "/var/log/kube-audit/audit.log"
    # audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
    authorization-mode: "Node,RBAC"
    event-ttl: "720h"
    runtime-config: "api/all=true"
    service-node-port-range: "30000-50000"
    service-cluster-ip-range: "10.66.0.0/16"
    # insecure-bind-address: "0.0.0.0"
    # insecure-port: "8080"
    # The fraction of requests that will be closed gracefully (GOAWAY) to prevent
    # HTTP/2 clients from getting stuck on a single apiserver.
    goaway-chance: "0.001"
  # extraVolumes:
  # - name: "audit-config"
  #   hostPath: "/etc/kubernetes/audit-policy.yaml"
  #   mountPath: "/etc/kubernetes/audit-policy.yaml"
  #   readOnly: true
  #   pathType: "File"
  # - name: "audit-log"
  #   hostPath: "/var/log/kube-audit"
  #   mountPath: "/var/log/kube-audit"
  #   pathType: "DirectoryOrCreate"
  certSANs:
  - "*.kubernetes.node"
  - "10.0.0.11"
  - "10.0.0.12"
  - "10.0.0.13"
  timeoutForControlPlane: 1m
controllerManager:
  extraArgs:
    v: "4"
    node-cidr-mask-size: "19"
    deployment-controller-sync-period: "10s"
    experimental-cluster-signing-duration: "8670h"
    node-monitor-grace-period: "20s"
    pod-eviction-timeout: "2m"
    terminated-pod-gc-threshold: "30"
scheduler:
  extraArgs:
    v: "4"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
oomScoreAdj: -900
cgroupDriver: "systemd"
kubeletCgroups: "/system.slice/kubelet.service"
nodeStatusUpdateFrequency: 5s
rotateCertificates: true
evictionSoft:
  "imagefs.available": "15%"
  "memory.available": "512Mi"
  "nodefs.available": "15%"
  "nodefs.inodesFree": "10%"
evictionSoftGracePeriod:
  "imagefs.available": "3m"
  "memory.available": "1m"
  "nodefs.available": "3m"
  "nodefs.inodesFree": "1m"
evictionHard:
  "imagefs.available": "10%"
  "memory.available": "256Mi"
  "nodefs.available": "10%"
  "nodefs.inodesFree": "5%"
evictionMaxPodGracePeriod: 30
imageGCLowThresholdPercent: 70
imageGCHighThresholdPercent: 80
kubeReserved:
  "cpu": "500m"
  "memory": "512Mi"
  "ephemeral-storage": "1Gi"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options here
clusterCIDR: "10.88.0.0/16"
mode: "ipvs"
oomScoreAdj: -900
ipvs:
  minSyncPeriod: 5s
  syncPeriod: 5s
  scheduler: "wrr"
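The token and certificateKey values in this file are of course examples; generate your own with the commands the inline comments reference:
# value for bootstrapTokens[].token
kubeadm token generate
# value for certificateKey (consumed by kubeadm init --upload-certs)
kubeadm certs certificate-key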
# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
  localAPIEndpoint:
    advertiseAddress: "10.0.0.12"
    bindPort: 5443
  certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
  bootstrapToken:
    apiServerEndpoint: "127.0.0.1:6443"
    token: "c2t0rj.cofbfnwwrb387890"
    # Please replace with the "--discovery-token-ca-cert-hash" value printed
    # after the kubeadm init command is executed successfully
    caCertHashes:
    - "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
2.5 Bringing Up the First Master
With the configuration in place, bringing up the first Master node takes a single command:
kubeadm init --config /etc/kubernetes/kubeadm.yaml --upload-certs --ignore-preflight-errors=Swap
2.6 Bringing Up the Other Masters
Once the first Master is up, the remaining Masters join with the join command. Note that caCertHashes in kubeadm-join.yaml must be replaced with the discovery-token-ca-cert-hash value printed when the first Master was brought up (if that output is gone, it can be recomputed; see the openssl sketch after the join command).
kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
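If the init output containing the discovery-token-ca-cert-hash has scrolled away, the hash can be recomputed on the first Master from the cluster CA (this is the standard recipe from the kubeadm docs):
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
  openssl rsa -pubin -outform der 2>/dev/null | \
  openssl dgst -sha256 -hex | sed 's/^.* //'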
2.7 Bringing Up the Other Nodes
Bringing up a worker Node works exactly like joining another Master; the only difference is that the controlPlane section of the config must be commented out:
# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
#controlPlane:
#  localAPIEndpoint:
#    advertiseAddress: "10.0.0.12"
#    bindPort: 5443
#  certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
  bootstrapToken:
    apiServerEndpoint: "127.0.0.1:6443"
    token: "c2t0rj.cofbfnwwrb387890"
    # Please replace with the "--discovery-token-ca-cert-hash" value printed
    # after the kubeadm init command is executed successfully
    caCertHashes:
    - "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  kubeletExtraArgs:
    runtime-cgroups: "/system.slice/containerd.service"
    rotate-server-certificates: "true"
kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
2.8 Other Housekeeping
Because kubelet has certificate rotation enabled, a fresh cluster produces a batch of pending CSRs; just approve them in bulk:
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
To let workloads schedule onto the Master nodes as well, remove the master taint:
kubectl taint nodes --all node-role.kubernetes.io/master-
Day-to-day image handling now goes through ctr; note that the images Kubernetes uses live in the k8s.io namespace:
# List images
ctr images ls
# List Kubernetes images
ctr -n k8s.io images ls
# Import an image
ctr -n k8s.io images import xxxx.tar
# Export an image
ctr -n k8s.io images export kube-scheduler.tar k8s.gcr.io/kube-scheduler:v1.21.1
Related pack repositories:
https://github.com/mritd/cfssl-pack
https://github.com/mritd/etcd-pack
https://github.com/mritd/kube-apiserver-proxy-pack
https://github.com/mritd/kubeadm-config-pack