Deploying a Kubernetes (K8s) Cluster End to End with kubeadm
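Since version 1.24, Kubernetes no longer ships built-in support for Docker (dockershim was removed). Using Docker now requires building and installing cri-dockerd with Go, which makes the communication path more complex, so Docker is not recommended as the container runtime for 1.24 and later.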
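I. Kubernetes cluster deployment methods
Method 1. minikube
Minikube is a tool that quickly runs a single-node Kubernetes locally. It is intended for users who want to try Kubernetes or use it for day-to-day development, and it must not be used in production.
Official site: https://kubernetes.io/docs/setup/minikube/
Method 2. kubeadm
Kubeadm is also a tool; it provides kubeadm init and kubeadm join for quickly deploying a Kubernetes cluster.
Official site: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm/
Method 3. Install directly from the epel-release yum repository. The drawback is that the packaged version is old (1.5).
Method 4. Binary packages
Download the release binaries from the official site and deploy each component by hand to assemble a Kubernetes cluster.
Other open-source tools:
https://docs.kubeoperator.io/kubeoperator-v2.2/introduction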
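II. Deploying a k8s cluster with kubeadm
Official documentation:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
Official documentation for deploying a highly available k8s cluster with kubeadm:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
Note: this article uses CentOS 8.2, the latest Red Hat family release at the time of writing.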
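1. System configuration
1.1 Cluster environment
Number of machines: 3
Operating system: CentOS 8.2
Hostnames: set the hostnames of the three machines to master, node1, and node2 respectively.
Every machine must be able to resolve all three names, for example through /etc/hosts: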
192.168.1.200 master
192.168.1.201 node1
192.168.1.202 node2
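The name resolution step can be scripted so that it is safe to run more than once. The following is a minimal sketch, not part of the original procedure: ensure_host_entry is a hypothetical helper name, and it assumes simple host names without regex metacharacters.

```shell
#!/usr/bin/env bash
# ensure_host_entry FILE IP NAME
# Hypothetical helper: append "IP NAME" to FILE only when NAME is not
# already present as a whole field on a non-comment line.
ensure_host_entry() {
    local file=$1 ip=$2 name=$3
    if ! grep -Eq "^[^#]*[[:space:]]${name}([[:space:]]|\$)" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage on each machine (as root):
#   hostnamectl set-hostname master      # or node1 / node2
#   ensure_host_entry /etc/hosts 192.168.1.200 master
#   ensure_host_entry /etc/hosts 192.168.1.201 node1
#   ensure_host_entry /etc/hosts 192.168.1.202 node2
```

Because the helper checks before appending, rerunning it does not pile up duplicate lines in /etc/hosts.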
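1.2 Disable the firewall at boot
# systemctl disable firewalld
1.3 Permanently disable SELinux
Edit /etc/selinux/config and set SELINUX to disabled. The sed expression below replaces whatever value is currently set (a default install uses enforcing, so a pattern that matches only permissive would change nothing):
# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
# grep '^SELINUX=' /etc/selinux/config
SELINUX=disabled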
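1.4 Turn off system swap
Starting with Kubernetes 1.8, swap must be turned off; otherwise kubelet will not start under the default configuration.
Edit /etc/fstab and comment out the automatic swap mount. After the reboot in step 1.6, use free -m to confirm that swap is off.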
[root@master /]# sed -i 's/.*swap.*/#&/' /etc/fstab
#/dev/mapper/centos-swap swap swap defaults 0 0
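The sed expression above prefixes every line that contains the word swap, so running it twice produces lines starting with ##. A slightly stricter variant is sketched below; comment_swap_entries is a hypothetical helper name, and the sketch assumes GNU sed and the usual whitespace-separated fstab layout in which the third field is the filesystem type.

```shell
#!/usr/bin/env bash
# comment_swap_entries FILE
# Hypothetical helper: comment out only active fstab entries whose
# filesystem type (third field) is "swap". Lines that already start
# with '#' are left untouched, so the function can be run repeatedly.
comment_swap_entries() {
    local file=$1
    sed -i -E '/^[[:space:]]*#/! s/^([^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$file"
}

# Usage (as root):
#   comment_swap_entries /etc/fstab
#   swapoff -a     # turn swap off immediately, without waiting for a reboot
```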
1.5 Check the MAC address and product_uuid
Verify the MAC address and product_uuid are unique for every node
You can get the MAC address of the network interfaces using the command
# ip link
The product_uuid can be checked by using the command
# cat /sys/class/dmi/id/product_uuid
It is very likely that hardware devices will have unique addresses, although some virtual machines may have identical values. Kubernetes uses these values to uniquely identify the nodes in the cluster. If these values are not unique to each node, the installation process may fail.
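With only three machines the values can be compared by eye, but the check is easy to automate. The sketch below is illustrative: find_duplicate_ids is a hypothetical helper, and the usage comment assumes root ssh access from one machine to the others, which the original setup does not require.

```shell
#!/usr/bin/env bash
# find_duplicate_ids
# Hypothetical helper: reads lines of "NODE VALUE" on stdin and prints
# every VALUE that occurs on more than one node. No output means all
# values are unique.
find_duplicate_ids() {
    awk '{ print $2 }' | sort | uniq -d
}

# Usage (assumes ssh access to every node):
#   for h in master node1 node2; do
#       printf '%s %s\n' "$h" "$(ssh "$h" cat /sys/class/dmi/id/product_uuid)"
#   done | find_duplicate_ids
```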
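1.6 Reboot the system
2. Install the software
2.1 Install Docker on all machines
Note: Docker can now be installed on CentOS 8 with exactly the same commands as on CentOS 7. The commands below were only needed on early CentOS 8 releases; CentOS Stream has since updated its dependencies.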
# yum install wget container-selinux -y
# wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
# yum erase runc -y
# rpm -ivh containerd.io-1.2.6-3.3.el7.x86_64.rpm
# update-alternatives --set iptables /usr/sbin/iptables-legacy
Note: the steps above are not needed on CentOS 7.
On CentOS 8 you can now simply run the following commands:
# yum install -y yum-utils device-mapper-persistent-data lvm2
# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# yum makecache
# yum install -y docker-ce
# systemctl enable docker.service && systemctl start docker
2.2 Install kubeadm and kubelet on all machines
Configure the Aliyun yum repository
# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Install kubeadm
# yum makecache
# yum install -y kubelet kubeadm kubectl ipvsadm
Note: to install a specific version of kubeadm, name the version explicitly, for example:
# yum install kubelet-1.16.0-0.x86_64 kubeadm-1.16.0-0.x86_64 kubectl-1.16.0-0.x86_64
# yum install kubelet-1.22.3 kubeadm-1.22.3 kubectl-1.22.3 ipvsadm -y
Configure kernel parameters
# cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
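Load br_netfilter first; the net.bridge.* keys only exist once the module is loaded:
# modprobe br_netfilter
# sysctl --system
The following modules must be loaded again after every reboot (the commands can be added to /etc/rc.local so they run automatically at boot):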
# modprobe ip_vs
# modprobe ip_vs_rr
# modprobe ip_vs_wrr
# modprobe ip_vs_sh
# modprobe nf_conntrack
# lsmod | grep ip_vs
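As an alternative to /etc/rc.local, systemd-based systems load every module listed in /etc/modules-load.d/*.conf at boot. The sketch below writes such a file; write_modules_conf is a hypothetical helper and the file name k8s.conf is an arbitrary choice, not something the original setup prescribes.

```shell
#!/usr/bin/env bash
# write_modules_conf FILE MODULE...
# Hypothetical helper: write one module name per line to FILE,
# replacing any previous content.
write_modules_conf() {
    local file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# Usage (as root):
#   write_modules_conf /etc/modules-load.d/k8s.conf \
#       br_netfilter ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack
```

After the next reboot, lsmod | grep ip_vs should list the modules without any manual modprobe.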
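3. Fetch the images
Important:
All three nodes must download the images.
Change the version numbers in the commands to the versions you are actually installing. Even the latest images may not match what kubeadm expects; in that case, download the versions named in the error message.
Versions change with every release; if initialization fails, the error output states the required versions.
The kubeadm version and the image versions should match.
Use the following command to see which k8s image versions the installed kubeadm expects: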
[root@master ~]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.3
k8s.gcr.io/kube-controller-manager:v1.22.3
k8s.gcr.io/kube-scheduler:v1.22.3
k8s.gcr.io/kube-proxy:v1.22.3
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
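Note: as of July 16, 2022, all images for the latest release, 1.24.3, can be downloaded from the Aliyun mirror.
If coredns cannot be downloaded from the mirror, pull it from Docker Hub instead:
docker pull coredns/coredns:1.8.0
Pull the images from Aliyun and retag them as shown below. These commands were recorded with v1.17.2; substitute the versions reported by kubeadm config images list.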
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.2 k8s.gcr.io/kube-apiserver:v1.17.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.17.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.17.2 k8s.gcr.io/kube-controller-manager:v1.17.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.17.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.17.2 k8s.gcr.io/kube-scheduler:v1.17.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.17.2
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.17.2 k8s.gcr.io/kube-proxy:v1.17.2
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
docker pull coredns/coredns:1.6.5
docker tag coredns/coredns:1.6.5 k8s.gcr.io/coredns:1.6.5
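Typing a pull and a tag command for every image is error-prone when versions change, so the list can be generated from the output of kubeadm config images list. The sketch below only prints the commands, so they can be reviewed before anything runs. mirror_commands is a hypothetical helper, and it assumes the mirror stores every image under its base name (for example coredns:v1.8.4 rather than coredns/coredns:v1.8.4); verify that for your version.

```shell
#!/usr/bin/env bash
# Aliyun repository used in the commands above.
MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers

# mirror_commands
# Hypothetical helper: reads image names on stdin (one per line) and
# prints a docker pull from the mirror plus a docker tag back to the
# original name.
mirror_commands() {
    local image base
    while read -r image; do
        [ -n "$image" ] || continue
        base=${image##*/}    # strip the registry and path, e.g. kube-proxy:v1.22.3
        printf 'docker pull %s/%s\n' "$MIRROR" "$base"
        printf 'docker tag %s/%s %s\n' "$MIRROR" "$base" "$image"
    done
}

# Usage:
#   kubeadm config images list | mirror_commands         # review the commands
#   kubeadm config images list | mirror_commands | sh    # then run them
```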
4. Configure and start kubelet on all nodes
4.1 Configure the pause image version used by kubelet
Get Docker's cgroup driver
# DOCKER_CGROUPS=$(docker info | grep 'Cgroup' |head -1| cut -d' ' -f4)
# echo $DOCKER_CGROUPS
cgroupfs
Configure kubelet's cgroup driver
cat >/etc/sysconfig/kubelet<<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=k8s.gcr.io/pause:3.5"
EOF
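The official recommendation is to change the cgroup driver to systemd (choose either this or the cgroupfs driver above).
Note: in current testing, once the configuration below is applied, the kubelet setting above no longer takes effect; this does not affect the cluster configuration.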
[root@master flannel]# cat /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
[root@master flannel]# systemctl restart docker
[root@master ~]# docker info -f {{.CgroupDriver}}
systemd
[root@master ~]# docker info | grep -i cgroup
Cgroup Driver: systemd
Cgroup Version: 1
4.2 Start kubelet
# systemctl daemon-reload
# systemctl enable kubelet && systemctl start kubelet
Important: if you run systemctl status kubelet at this point, you will see errors like the ones below (the latest release, 1.24.3, no longer reports them):
10月 11 00:26:43 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
10月 11 00:26:43 node1 systemd[1]: Unit kubelet.service entered failed state.
10月 11 00:26:43 node1 systemd[1]: kubelet.service failed.
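Running journalctl -xefu kubelet to read the systemd log shows the real error:
unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
This error resolves itself once kubeadm init has generated the CA certificate, so it can be ignored for now.
In short, kubelet keeps restarting until kubeadm init has been run.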
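5. Initialize the cluster
5.1 Run the initialization on the master node
Important:
When initialization finishes, be sure to record the kubeadm join command printed at the end of the output.
The initialization command is as follows; adjust the version and the apiserver address to match your environment: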
[root@master ~]# kubeadm init --kubernetes-version=v1.22.3 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.26.20 --ignore-preflight-errors=Swap
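With the latest release, 1.24.3, if initialization hangs without reporting any error, add the --image-repository parameter as follows: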
[root@master ~]# kubeadm init --kubernetes-version=v1.22.3 --image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.26.20 --ignore-preflight-errors=Swap
If initialization fails with the following error:
[root@master ~]# kubeadm init --kubernetes-version=v1.24.3 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.26.20 --ignore-preflight-errors=Swap
[init] Using Kubernetes version: v1.24.3
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: E0716 21:24:17.060679 17034 remote_runtime.go:925] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
time="2022-07-16T21:24:17+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution:
[root@master ~]# rm -rf /etc/containerd/config.toml
[root@master ~]# systemctl restart containerd
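A successful initialization produces output like the following (this sample was captured on an older release, so the version strings in it differ):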
# kubeadm init --kubernetes-version=v1.22.3 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.26.20 --ignore-preflight-errors=Swap
[init] Using Kubernetes version: v1.1.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.1. Latest validated version: 18.06
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [192.168.1.200 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [192.168.1.200 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.200]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 19.003093 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master" as an annotation
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: wip0ux.19q3dpudrnyc6q7i
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 192.168.26.190:6443 --token xlrpyg.da2kyug4uxnl7o2h \
--discovery-token-ca-cert-hash sha256:60e577818093721bf34746ff8b086d969a0e89f3ac084dfcdaa240fdeeae8fb6
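The output above is the complete record of the initialization; from it you can see the key steps needed to initialize a Kubernetes cluster by hand.
The key items are:
[kubelet] generates the kubelet configuration file "/var/lib/kubelet/config.yaml"
[certificates] generates the various certificates
[kubeconfig] generates the kubeconfig files
[bootstraptoken] generates the token; record it, because kubeadm join needs it later when adding nodes to the cluster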
5.2 Configure kubectl on the master node
# rm -rf $HOME/.kube
# mkdir -p $HOME/.kube
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# chown $(id -u):$(id -g) $HOME/.kube/config
5.3 List the nodes
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master NotReady master 6m19s v1.13.0
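6. Configure the network plugin
6.1 Download the yaml manifest on the master node
Important: the manifest is updated frequently. Flannel is hosted on GitHub; if the download fails from your network, you may need a proxy. Alternatively, search GitHub for flannel to find the address https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml, open it in a browser, and save it locally.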
# cd ~ && mkdir flannel && cd flannel
# wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
# cat kube-flannel.yml
---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        #image: flannelcni/flannel-cni-plugin:v1.1.0 for ppc64le and mips64le (dockerhub limitations may apply)
        image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        #image: flannelcni/flannel:v0.18.1 for ppc64le and mips64le (dockerhub limitations may apply)
        image: rancher/mirrored-flannelcni-flannel:v0.18.1
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        #image: flannelcni/flannel:v0.18.1 for ppc64le and mips64le (dockerhub limitations may apply)
        image: rancher/mirrored-flannelcni-flannel:v0.18.1
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=ens32
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
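6.2 Edit the kube-flannel.yml manifest
Note: older manifests default to the image quay.io/coreos/flannel:v0.10.0-amd64. If you can pull it, leave the image address alone; otherwise change the image address in the yml to the Aliyun mirror. Every image reference must be changed; the file contains several flannel image lines.
image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64
Note: as of July 16, 2022, the v0.18.1 yml file references images hosted by rancher, which can be pulled directly.
Specify the network interface
Add --iface=<iface-name> to the flanneld startup arguments:
      containers:
      - name: kube-flannel
        image: registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.10.0-amd64 # this image appears on many lines of the file (around lines 172 and 192, among others) and all of them need replacing; as of July 16, 2022, with v0.18.1, the images can be pulled directly and this change is unnecessary
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=ens33 # around line 192 of the file
        - --iface=eth0
The value of --iface is the name of your current network interface; the flag can be repeated to specify multiple interfaces.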
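If the manifest already contains an --iface argument, as the v0.18.1 file shown in section 6.1 does, the interface name can be changed without opening an editor. This is a minimal sketch assuming GNU sed; set_flannel_iface is a hypothetical helper, and it rewrites every existing "- --iface=" line rather than adding a new one.

```shell
#!/usr/bin/env bash
# set_flannel_iface FILE IFACE
# Hypothetical helper: replace the value of every "- --iface=..." list
# item in FILE with IFACE, keeping the original indentation.
set_flannel_iface() {
    local file=$1 iface=$2
    sed -i -E "s/^([[:space:]]*- --iface=).*/\1${iface}/" "$file"
}

# Usage:
#   ip -o link show                                    # list interface names
#   set_flannel_iface ~/flannel/kube-flannel.yml ens33
```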
Apply the manifest
# kubectl apply -f ~/flannel/kube-flannel.yml
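Check the result (the sample output below comes from an older flannel release that ran in kube-system; with the v0.18.1 manifest the flannel pods are in the kube-flannel namespace)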
# kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
coredns-6955765f44-g767b 1/1 Running 0 14m
coredns-6955765f44-l8zzs 1/1 Running 0 14m
etcd-master 1/1 Running 0 14m
kube-apiserver-master 1/1 Running 0 14m
kube-controller-manager-master 1/1 Running 0 14m
kube-flannel-ds-amd64-qjpzg 1/1 Running 0 28s
kube-proxy-zklq2 1/1 Running 0 14m
kube-scheduler-master 1/1 Running 0 14m
# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 14m
# kubectl get svc --namespace kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 15m
Nodes only show the Ready status once the network plugin has been installed and configured.
7. Join all worker nodes to the cluster
Run this on every worker node. The command is the one printed when the master was initialized successfully:
# kubeadm join 192.168.1.200:6443 --token ccxrk8.myui0xu4syp99gxu --discovery-token-ca-cert-hash sha256:e3c90ace969aa4d62143e7da6202f548662866dfe33c140095b020031bff2986
8. Check the cluster
List the pods
Note: after the nodes join the cluster, wait a few minutes before checking.
# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6c66ffc55b-l76bq 1/1 Running 0 16m
coredns-6c66ffc55b-zlsvh 1/1 Running 0 16m
etcd-node1 1/1 Running 0 16m
kube-apiserver-node1 1/1 Running 0 16m
kube-controller-manager-node1 1/1 Running 0 15m
kube-flannel-ds-sr6tq 0/1 CrashLoopBackOff 6 7m12s
kube-flannel-ds-ttzhv 1/1 Running 0 9m24s
kube-proxy-nfbg2 1/1 Running 0 7m12s
kube-proxy-r4g7b 1/1 Running 0 16m
kube-scheduler-node1 1/1 Running 0 16m
If a pod stays in an abnormal 0/1 state and cannot start for a long time, delete it and wait for the cluster to create a replacement:
# kubectl delete pod kube-flannel-ds-sr6tq -n kube-system
pod "kube-flannel-ds-sr6tq" deleted
Check again after the deletion; the status is now normal:
[root@master flannel]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6955765f44-g767b 1/1 Running 0 18m
coredns-6955765f44-l8zzs 1/1 Running 0 18m
etcd-master 1/1 Running 0 18m
kube-apiserver-master 1/1 Running 0 18m
kube-controller-manager-master 1/1 Running 0 18m
kube-flannel-ds-amd64-bsdcr 1/1 Running 0 60s
kube-flannel-ds-amd64-g8d7x 1/1 Running 0 2m33s
kube-flannel-ds-amd64-qjpzg 1/1 Running 0 5m9s
kube-proxy-5pmgv 1/1 Running 0 2m33s
kube-proxy-r962v 1/1 Running 0 60s
kube-proxy-zklq2 1/1 Running 0 18m
kube-scheduler-master 1/1 Running 0 18m
Check the node status again:
[root@master flannel]# kubectl get nodes -n kube-system
NAME STATUS ROLES AGE VERSION
master Ready master 19m v1.17.2
node1 Ready <none> 3m16s v1.17.2
node2 Ready <none> 103s v1.17.2
The cluster configuration is now complete.
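For a scripted check of the same result, the node list can be filtered for anything that is not Ready. The sketch below is an illustration rather than part of the original procedure; not_ready_nodes is a hypothetical helper that parses the plain-text output of kubectl get nodes --no-headers.

```shell
#!/usr/bin/env bash
# not_ready_nodes
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on
# stdin and prints the name of every node whose STATUS column is not
# exactly "Ready". A cordoned node ("Ready,SchedulingDisabled") is
# therefore reported as well.
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Usage (on the master):
#   kubectl get nodes --no-headers | not_ready_nodes
```

An empty result means every node reports Ready.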
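9. Reset the cluster
Reset the kubeadm environment
Reset or remove every node in the cluster, including the master.
Drain the pods from node k8s-node-1 (run on the master):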
[root@k8s-master ~]# kubectl drain k8s-node-1 --delete-local-data --force --ignore-daemonsets
Delete the node (run on the master):
[root@k8s-master ~]# kubectl delete node k8s-node-1
Reset the node (run on the node that was just deleted):
[root@k8s-node-1 ~]# kubeadm reset
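Note 1: the master must also be drained, deleted, and reset. This cost me dearly: the first time, I did not drain and delete the master, and although everything looked normal afterwards, CoreDNS would not work at all. It took a full day to sort out, so do not skip this step.
Note 2: after the reset, the following files must be deleted on the master: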
# rm -rf /var/lib/cni/ $HOME/.kube/config
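10. Regenerate the token
Adding nodes to the cluster after the kubeadm-generated token has expired
When kubeadm init completes, it always prints a token that nodes use to join: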
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 18.16.202.35:6443 --token zr8n5j.yfkanjio0lfsupc0 --discovery-token-ca-cert-hash sha256:380b775b7f9ea362d45e4400be92adc4f71d86793ba6aae091ddb53c489d218c
The default token is valid for 24 hours; once it expires, it can no longer be used.
III. Solution:
[root@node1 flannel]# kubeadm token create
kiyfhw.xiacqbch8o8fa8qj
[root@node1 flannel]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
gvvqwk.hn56nlsgsv11mik6 <invalid> 2018-10-25T14:16:06+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
kiyfhw.xiacqbch8o8fa8qj 23h 2018-10-27T06:39:24+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
[root@node1 flannel]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057
kubeadm join 18.16.202.35:6443 --token kiyfhw.xiacqbch8o8fa8qj --discovery-token-ca-cert-hash sha256:5417eb1b68bd4e7a4c82aded83abc55ec91bd601e45734d6aba85de8b1ebb057
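After a few seconds, the node should appear in the output of kubectl get nodes run on the master.
The method above is tedious; the following command does it in one step: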
kubeadm token create --print-join-command
Second method (--ttl=0 creates a token that never expires):
token=$(kubeadm token generate)
kubeadm token create $token --print-join-command --ttl=0
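IV. Problems
1. Problem 01
Description: a container created in the finished k8s cluster can be reached with curl only from the node it runs on; from any other host, the container's port is unreachable.
1.1 Solution 1: IP forwarding may not be enabled on your system
# vim /etc/sysctl.conf
# Uncomment the next line to enable packet forwarding for IPv4
net.ipv4.ip_forward=1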
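The same edit can be made without an editor. The sketch below is a hedged illustration assuming GNU sed: ensure_sysctl_setting is a hypothetical helper that rewrites an existing (possibly commented-out) line for the key or appends a new one. The dots in the key are treated as regex wildcards, which is harmless for keys like this one.

```shell
#!/usr/bin/env bash
# ensure_sysctl_setting FILE KEY VALUE
# Hypothetical helper: make FILE contain "KEY = VALUE". An existing line
# for KEY, commented out or not, is rewritten in place; otherwise the
# setting is appended.
ensure_sysctl_setting() {
    local file=$1 key=$2 value=$3
    if grep -Eq "^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root):
#   ensure_sysctl_setting /etc/sysctl.conf net.ipv4.ip_forward 1
#   sysctl -p      # reload /etc/sysctl.conf so the setting takes effect now
```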
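1.2 Solution 2:
1.2.1 Open up the network with iptables
Starting with version 1.13, Docker may set the default policy of the iptables FORWARD chain to DROP, which makes pinging Pod IPs on other nodes fail. When this happens, set the policy to ACCEPT manually: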
# iptables -P FORWARD ACCEPT
Also add the following line to /etc/rc.local so that the default policy of the FORWARD chain does not revert to DROP after a node reboot:
# vim /etc/rc.local
sleep 60 && /sbin/iptables -P FORWARD ACCEPT
# chmod +x /etc/rc.d/rc.local
2. Problem 02
Setting up kubectl command completion
# source <(kubectl completion bash)
# echo "source <(kubectl completion bash)" >> ~/.bashrc
Log out of the current shell and log back in for the change in ~/.bashrc to take effect.