1. Environment Preparation
① Server requirements:
• Recommended minimum hardware: 4 CPU cores, 4 GB RAM, 50 GB disk
• The servers should have Internet access, since container images need to be pulled from public registries
② Software environment:

Software     Version
OS           CentOS7.6_x64
Docker       23.0.3
Kubernetes   1.25+
③ Server layout:

Role                         IP                 Components
Master                       192.168.137.101    docker, etcd, nginx, keepalived
node1                        192.168.137.102    docker, etcd, nginx, keepalived
node2                        192.168.137.103    docker, etcd
Load balancer external IP    192.168.137.88 (VIP)
④ Operating system initialization (all nodes):
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config   # permanent
setenforce 0   # temporary
# Disable swap
swapoff -a   # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab   # permanent
# Set the hostname according to the layout above
hostnamectl set-hostname <hostname>
# Add hosts entries on the master
cat >> /etc/hosts << EOF
192.168.137.101 master
192.168.137.102 node1
192.168.137.103 node2
EOF
# Pass bridged IPv4 traffic to iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system   # apply
# Time synchronization
yum install ntpdate -y
ntpdate time.windows.com
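As a quick sanity check, the settings above can be verified on each node (a minimal sketch; if the bridge keys are reported as unknown, load the kernel module first with modprobe br_netfilter):
getenforce                      # expect Permissive now, Disabled after a reboot
free -m | grep -i swap          # expect all zeros
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables   # expect both = 1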
2. Deploy the Load Balancer (Nginx + Keepalived)
① Architecture diagram: clients → VIP 192.168.137.88:16443 (keepalived) → nginx layer-4 proxy → the two kube-apiservers.
② Install the packages (on the active/standby pair: master and node1):
yum install epel-release -y
yum install nginx keepalived -y
yum -y install nginx-all-modules.noarch   # provides the nginx stream module
yum search stream   # optional: confirm the stream module package is available
③ Nginx configuration file (identical on master and backup):
cat > /etc/nginx/nginx.conf << "EOF"
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

# Layer-4 load balancing for the two master apiserver instances
stream {
    log_format main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log /var/log/nginx/k8s-access.log main;
    upstream k8s-apiserver {
        server 192.168.137.101:6443;   # master apiserver IP:PORT
        server 192.168.137.102:6443;   # node1 apiserver IP:PORT
    }
    server {
        listen 16443;   # nginx shares these hosts with the masters, so it must not listen on 6443
        proxy_pass k8s-apiserver;
    }
}

http {
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    server {
        listen 80 default_server;
        server_name _;
        location / {
        }
    }
}
EOF
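Before moving on, it is worth validating the file; nginx -t also confirms the stream module loads:
nginx -t   # expect "syntax is ok" and "test is successful"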
④ Keepalived configuration file (Nginx master)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_MASTER
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33          # change to the actual NIC name
    virtual_router_id 51     # VRRP router ID: identical on master and backup, unique per VRRP instance
    priority 100             # priority: use lower values (90, 80, ...) on the backups
    advert_int 1             # VRRP advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.137.88/24
    }
    track_script {
        check_nginx
    }
}
EOF
Prepare the script referenced in the configuration above that checks whether nginx is running:
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
# Count sockets listening on the nginx proxy port 16443
count=$(ss -antp | grep 16443 | egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
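Once nginx has been started (step ⑥ below), the script can be exercised by hand; this is a simple way to confirm keepalived will see the right exit codes:
/etc/keepalived/check_nginx.sh; echo $?   # 0 while nginx listens on 16443, 1 otherwise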
⑤ Keepalived configuration file (Nginx backup, node1)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_BACKUP
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90     # note the lower priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.137.88/24
    }
    track_script {
        check_nginx
    }
}
EOF
Prepare the same nginx health-check script referenced in this configuration:
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
# Count sockets listening on the nginx proxy port 16443
count=$(ss -antp | grep 16443 | egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
Note: keepalived decides whether to fail over based on the script's exit code (0 = healthy, non-zero = unhealthy).
⑥ Start the services and enable them at boot
systemctl daemon-reload
systemctl start nginx ; systemctl enable nginx
systemctl status nginx
systemctl start keepalived ; systemctl enable keepalived
systemctl status keepalived
⑦ Check the keepalived working state
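The working state boils down to two checks: the VIP must be bound on the current MASTER, and stopping nginx there should move it to the backup within a few seconds (a sketch; ens33 is the interface assumed in the configs above):
ip addr show ens33 | grep 192.168.137.88   # the VIP appears only on the active node
systemctl stop nginx    # on the master: check_nginx.sh now fails and keepalived releases the VIP
ip addr show ens33      # on node1: the VIP should now be listed here
systemctl start nginx   # on the master: the VIP fails back (priority 100 > 90)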
3. Deploy etcd
Etcd is a distributed key-value store that Kubernetes uses for all of its cluster data. A single etcd node is a single point of failure and is not recommended for production, so here we build a 3-node cluster, which tolerates the loss of 1 machine; for production, a 5-node cluster tolerating 2 failures is recommended.
Node name   IP
etcd-1      192.168.137.101
etcd-2      192.168.137.102
etcd-3      192.168.137.103
Note: to save machines, etcd is co-located with the K8s nodes here. It can also be deployed outside the k8s cluster, as long as the apiserver can reach it.
① Prepare the cfssl certificate tool
cfssl is an open-source certificate management tool that generates certificates from JSON files and is easier to use than openssl.
Run this on any one server; the Master node is used here:
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
cp cfssl_linux-amd64 /usr/local/bin/cfssl
cp cfssljson_linux-amd64 /usr/local/bin/cfssljson
cp cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
② Generate the etcd certificates
1. Self-signed certificate authority (CA)
Create a working directory:
mkdir -p ~/etcd_tls
cd ~/etcd_tls
Self-sign the CA:
cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "www": {
        "expiry": "87600h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
EOF
cat > ca-csr.json << EOF
{
  "CN": "etcd CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "Beijing",
      "ST": "Beijing"
    }
  ]
}
EOF
Generate the CA certificate:
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
This produces ca.pem and ca-key.pem.
2. Issue the etcd HTTPS certificate with the self-signed CA
Create the certificate signing request file:
cat > server-csr.json << EOF
{
  "CN": "etcd",
  "hosts": [
    "192.168.137.101",
    "192.168.137.102",
    "192.168.137.103",
    "192.168.137.108"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing"
    }
  ]
}
EOF
Note: the hosts field above must contain the internal cluster IP of every etcd node; not a single one may be missing! To simplify later scale-out, a few spare IPs can be listed as well (such as the unused .108 above).
Generate the certificate:
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
[root@master etcd_tls]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
2023/04/11 15:33:11 [INFO] generate received request
2023/04/11 15:33:11 [INFO] received CSR
2023/04/11 15:33:11 [INFO] generating key: rsa-2048
2023/04/11 15:33:11 [INFO] encoded CSR
2023/04/11 15:33:11 [INFO] signed certificate with serial number 235546312759917573050767132915661129942079532969
2023/04/11 15:33:11 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
This produces server.pem and server-key.pem.
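Optionally verify the SANs baked into the new certificate (a quick check with openssl, which ships with CentOS):
openssl x509 -in server.pem -noout -text | grep -A1 "Subject Alternative Name"
# Expect all four IPs from the hosts field to be listed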
③ Deploy the etcd cluster
Download: https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz
The following is performed on node 1; to keep things simple, all files generated on node 1 will be copied to node 2 and node 3 afterwards.
1. Create the working directory and unpack the binary package
mkdir /opt/etcd/{bin,cfg,ssl} -p
tar zxvf etcd-v3.4.9-linux-amd64.tar.gz
mv etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/
2. Create the etcd configuration file
cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.137.101:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.1.101:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.101:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.101:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.137.101:2380,etcd-2=https://192.168.137.102:2380,etcd-3=https://192.168.137.103:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
• ETCD_NAME: the node name; note that it must be unique within the cluster
• ETCD_INITIAL_CLUSTER_STATE: the state of the cluster being joined; new for a brand-new cluster, existing to join one that already exists
The systemd unit below loads this file through EnvironmentFile; etcd reads the ETCD_* variables directly from its environment.
3. Manage etcd with systemd
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--cert-file=/opt/etcd/ssl/server.pem \
--key-file=/opt/etcd/ssl/server-key.pem \
--trusted-ca-file=/opt/etcd/ssl/ca.pem \
--peer-cert-file=/opt/etcd/ssl/server.pem \
--peer-key-file=/opt/etcd/ssl/server-key.pem \
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \
--logger=zap
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
4. Copy the certificates generated earlier
Copy them to the paths referenced by the configuration:
cp ~/etcd_tls/ca*pem ~/etcd_tls/server*pem /opt/etcd/ssl/
5. Start etcd and enable it at boot
systemctl daemon-reload
systemctl start etcd
systemctl enable etcd
Note: on the first node, start blocks (and may even time out) until at least one more member comes up; for a brand-new cluster this is expected.
6. Copy all files generated on node 1 to node 2 and node 3
scp -r /opt/etcd/ root@192.168.137.102:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.137.102:/usr/lib/systemd/system/
scp -r /opt/etcd/ root@192.168.137.103:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.137.103:/usr/lib/systemd/system/
Then, on node 2 and node 3, change the node name and the current server IP in etcd.conf:
vi /opt/etcd/cfg/etcd.conf
#[Member]
ETCD_NAME="etcd-1"   # change: etcd-2 on node 2, etcd-3 on node 3
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.137.101:2380"   # change to the current server IP
ETCD_LISTEN_CLIENT_URLS="https://192.168.137.101:2379"   # change to the current server IP
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.101:2380"   # change to the current server IP
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.101:2379"   # change to the current server IP
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.137.101:2380,etcd-2=https://192.168.137.102:2380,etcd-3=https://192.168.137.103:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Finally, start etcd on node 2 and node 3 and enable it at boot. If you would rather script the edits, see the sketch below.
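A small sed sketch that applies the per-node changes (NODE_NAME and NODE_IP are illustrative variables; values shown for node 2):
NODE_NAME=etcd-2; NODE_IP=192.168.137.102   # use etcd-3 and .103 on node 3
# Only the four *_LISTEN_*/*_ADVERTISE_* URLs change; ETCD_INITIAL_CLUSTER keeps all three IPs
sed -i \
  -e "s/^ETCD_NAME=.*/ETCD_NAME=\"${NODE_NAME}\"/" \
  -e "/_LISTEN_\|_ADVERTISE_/s/192.168.137.101/${NODE_IP}/" \
  /opt/etcd/cfg/etcd.conf
systemctl daemon-reload && systemctl start etcd && systemctl enable etcd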
7. Check the cluster status
[root@master etcd_tls]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.137.101:2379,https://192.168.137.102:2379,https://192.168.137.103:2379" endpoint health --write-out=table
+------------------------------+--------+-------------+-------+
|           ENDPOINT           | HEALTH |    TOOK     | ERROR |
+------------------------------+--------+-------------+-------+
| https://192.168.137.103:2379 |  true  | 58.508826ms |       |
| https://192.168.137.101:2379 |  true  | 59.326646ms |       |
| https://192.168.137.102:2379 |  true  | 59.667399ms |       |
+------------------------------+--------+-------------+-------+
If anything goes wrong, check the logs first: /var/log/messages or journalctl -u etcd
4. Deploy Docker, kubelet, and Other Components (all nodes)
① Install Docker
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
systemctl enable docker && systemctl start docker
Configure a registry mirror to speed up image pulls:
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker
docker info
② Install cri-dockerd
Kubernetes removed dockershim in 1.24, so the cri-dockerd shim is required to keep using Docker as the container runtime.
https://github.com/Mirantis/cri-dockerd/releases/tag/v0.3.1
rpm -ivh cri-dockerd-0.3.1-3.el7.x86_64.rpm
Point it at the pause image repository:
vi /usr/lib/systemd/system/cri-docker.service
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
systemctl daemon-reload
systemctl enable cri-docker && systemctl start cri-docker
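A quick check that the shim came up and exposed its CRI socket (the same socket path is referenced later in kubeadm-config.yaml):
systemctl is-active cri-docker     # expect: active
ls -l /var/run/cri-dockerd.sock    # the CRI endpoint the kubelet will use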
③ Add the Alibaba Cloud YUM repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
④ Install kubeadm, kubelet, and kubectl
Because releases are frequent, pin the version explicitly:
yum install -y kubelet-1.25.0 kubeadm-1.25.0 kubectl-1.25.0
systemctl enable kubelet
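To confirm the pinned versions landed (the kubelet itself stays inactive until a node joins, which is expected at this point):
kubeadm version -o short   # expect v1.25.0
kubelet --version          # expect Kubernetes v1.25.0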
5. Deploy the Masters
① Initialize Master1
Generate the initialization configuration file:
cat > kubeadm-config.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.137.101
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  name: master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  certSANs:
  - master
  - node1
  - node2
  - 192.168.137.101
  - 192.168.137.102
  - 192.168.137.103
  - 192.168.137.88
  - 127.0.0.1
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.137.88:16443
controllerManager: {}
dns: {}
etcd:
  external:
    endpoints:
    - https://192.168.137.101:2379
    - https://192.168.137.102:2379
    - https://192.168.137.103:2379
    caFile: /opt/etcd/ssl/ca.pem
    certFile: /opt/etcd/ssl/server.pem
    keyFile: /opt/etcd/ssl/server-key.pem
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.25.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
EOF
kubeadm init --config kubeadm-config.yaml
...
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70 \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70
初始化完成后,会有两个join的命令,带有 --control-plane 是用于加入组建多master集群的,不带的是加入节点的。
拷贝kubectl使用的连接k8s认证文件到默认路径:
mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get node
② Initialize node1 (the second master)
Copy the certificates generated on the Master node to node1:
scp -r /etc/kubernetes/pki/ 192.168.137.102:/etc/kubernetes/
Copy the control-plane join command and run it on node1:
kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70 \
--control-plane --cri-socket=unix:///var/run/cri-dockerd.sock
Copy the kubeconfig that kubectl uses to its default path:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get node
Note: the nodes report NotReady at this point because the network plugin has not been deployed yet.
③ Initialize node2
Run this on 192.168.137.103 (node2).
To add a worker node to the cluster, run the kubeadm join command printed by kubeadm init:
kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70 --cri-socket=unix:///var/run/cri-dockerd.sock
Any further nodes are added the same way.
Note: the default token is valid for 24 hours; once it expires it can no longer be used and a new one must be created, which is a one-liner: kubeadm token create --print-join-command
6. Deploy the Network Plugin
Calico is a pure layer-3 data center networking solution and is currently the mainstream network plugin for Kubernetes.
Download the YAML:
wget https://docs.projectcalico.org/manifests/calico.yaml
After downloading, change the Pod network definition (CALICO_IPV4POOL_CIDR) to match the podSubnet passed to kubeadm init above (10.244.0.0/16); a scripted way to do this is sketched below.
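One possible one-liner, assuming the manifest still ships the pool commented out with the default 192.168.0.0/16 (the exact line layout varies between Calico versions, so verify the result):
sed -i -e 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@' \
       -e 's@#   value: "192.168.0.0/16"@  value: "10.244.0.0/16"@' calico.yaml
grep -A1 CALICO_IPV4POOL_CIDR calico.yaml   # verify the pool is active and correct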
After editing the file, deploy:
kubectl apply -f calico.yaml
kubectl get pods -n kube-system
Once all the Calico Pods are Running, the nodes become Ready as well.
[root@master k8s]# kubectl get pods -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS       AGE   IP                NODE
calico-kube-controllers-566654d67d-zr7cp   1/1     Running   3 (28m ago)    22h   10.244.104.5      node2
calico-node-gwx4m                          1/1     Running   2 (29m ago)    22h   192.168.137.102   node1
calico-node-l6vzb                          1/1     Running   1 (22h ago)    22h   192.168.137.101   master
calico-node-zbltl                          1/1     Running   3 (29m ago)    22h   192.168.137.103   node2
coredns-c676cc86f-c5trx                    1/1     Running   1 (29m ago)    23h   10.244.104.4      node2
coredns-c676cc86f-phmsn                    1/1     Running   1 (29m ago)    23h   10.244.104.6      node2
kube-apiserver-master                      1/1     Running   2 (20h ago)    23h   192.168.137.101   master
kube-apiserver-node1                       1/1     Running   2 (29m ago)    23h   192.168.137.102   node1
kube-controller-manager-master             1/1     Running   10 (15h ago)   23h   192.168.137.101   master
kube-controller-manager-node1              1/1     Running   13 (10m ago)   23h   192.168.137.102   node1
kube-proxy-4bv45                           1/1     Running   2 (29m ago)    23h   192.168.137.102   node1
kube-proxy-7b2nq                           1/1     Running   2 (29m ago)    23h   192.168.137.103   node2
kube-proxy-rczp7                           1/1     Running   1 (23h ago)    23h   192.168.137.101   master
kube-scheduler-master                      1/1     Running   10 (13h ago)   23h   192.168.137.101   master
kube-scheduler-node1                       1/1     Running   12 (10m ago)   23h   192.168.137.102   node1
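To block until Calico settles instead of polling by hand, something like this works (assuming the manifest's standard k8s-app=calico-node label):
kubectl -n kube-system wait --for=condition=Ready pod -l k8s-app=calico-node --timeout=300s
kubectl get node   # the nodes should flip to Ready shortly afterwards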
[root@master k8s]# kubectl get node
NAME     STATUS     ROLES           AGE   VERSION
master   NotReady   control-plane   23h   v1.25.0
node1    Ready      control-plane   23h   v1.25.0
node2    Ready      <none>          23h   v1.25.0
This completes the k8s deployment. That said, in real production environments k8s is usually deployed from binaries; this article is only meant to introduce the K8s components and architecture. Message me if you need the scripts and the Word document.