Deploying a Complete Enterprise-Grade K8s Cluster

1. Environment Preparation

① Server requirements:

• Recommended minimum hardware: 4 CPU cores, 4 GB RAM, 50 GB disk

• The servers should ideally have Internet access, since images are pulled from public registries

② Software environment:

Software     Version
OS           CentOS7.6_x64
Docker       23.0.3
Kubernetes   1.25+

③ Server plan:

Role                IP                      Components
master              192.168.137.101         docker, etcd, nginx, keepalived
node1               192.168.137.102         docker, etcd, nginx, keepalived
node2               192.168.137.103         docker, etcd
Load balancer VIP   192.168.137.88 (VIP)


④ OS initialization (all nodes)

# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld

# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config  # permanent
setenforce 0  # temporary

# Disable swap
swapoff -a  # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab  # permanent

# Set the hostname according to the plan (master / node1 / node2)
hostnamectl set-hostname <hostname>

# Add hosts entries on the master
cat >> /etc/hosts << EOF
192.168.137.101 master
192.168.137.102 node1
192.168.137.103 node2
EOF

# Pass bridged IPv4 traffic to iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system  # apply

# Time synchronization
yum install ntpdate -y
ntpdate time.windows.com
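Optionally, a quick sanity check that the settings took effect (if the bridge keys report "No such file or directory", load the kernel module first with modprobe br_netfilter and rerun sysctl --system):

getenforce                                  # Disabled (or Permissive until reboot)
free -h | grep -i swap                      # the Swap line should read 0B
sysctl net.bridge.bridge-nf-call-iptables   # should print = 1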

2. Deploy the Nginx + Keepalived Load Balancer

① Architecture diagram: (image not reproduced here)

② Install the packages (on both the primary and the backup, i.e. master and node1):

yum install epel-release -y
yum install nginx keepalived -y
yum -y install nginx-all-modules.noarch   # provides nginx's stream module
yum search stream                         # optional: confirm the stream module package is present

③ Nginx configuration file (identical on primary and backup):

cat > /etc/nginx/nginx.conf << "EOF"
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

# Layer-4 load balancing for the two kube-apiserver instances
stream {
    log_format  main '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /var/log/nginx/k8s-access.log  main;

    upstream k8s-apiserver {
        server 192.168.137.101:6443;   # master APISERVER IP:PORT
        server 192.168.137.102:6443;   # node1 APISERVER IP:PORT
    }

    server {
        listen 16443;  # nginx shares the host with the apiserver, so it must not listen on 6443 or the ports would clash
        proxy_pass k8s-apiserver;
    }
}

http {
    log_format  main '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen       80 default_server;
        server_name  _;

        location / {
        }
    }
}
EOF
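Before starting nginx, it is worth validating the file with nginx's built-in syntax check:

nginx -t   # should report "syntax is ok" and "test is successful"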

④ keepalived configuration file (nginx primary)

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_MASTER
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33      # change to your actual NIC name
    virtual_router_id 51 # VRRP router ID; unique per instance, same value on primary and backup of this instance
    priority 100         # priority; set backup servers lower, e.g. 90, 80
    advert_int 1         # VRRP advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.137.88/24
    }
    track_script {
        check_nginx
    }
}
EOF

Create the script referenced above that checks whether nginx is running:

cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ss -antp | grep 16443 | egrep -cv "grep|$$")

if [ "$count" -eq 0 ]; then
    exit 1
else
    exit 0
fi
EOF


chmod +x /etc/keepalived/check_nginx.sh

⑤ keepalived configuration file (nginx backup, node1)

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    notification_email {
        acassen@firewall.loc
        failover@firewall.loc
        sysadmin@firewall.loc
    }
    notification_email_from Alexandre.Cassen@firewall.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id NGINX_BACKUP
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 90          # note the lower priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.137.88/24
    }
    track_script {
        check_nginx
    }
}
EOF

Create the same nginx health-check script on this node:

cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ss -antp | grep 16443 | egrep -cv "grep|$$")

if [ "$count" -eq 0 ]; then
    exit 1
else
    exit 0
fi
EOF

chmod +x /etc/keepalived/check_nginx.sh

Note: keepalived decides whether to fail over based on the script's exit code (0 = healthy, non-zero = unhealthy).
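Once nginx is up, the check can also be exercised by hand; the exit code is exactly what keepalived will see:

bash /etc/keepalived/check_nginx.sh; echo $?   # 0 while something listens on 16443, 1 otherwise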

⑥ Start the services and enable them at boot

systemctl daemon-reload
systemctl start nginx ; systemctl enable nginx
systemctl status nginx
systemctl start keepalived ; systemctl enable keepalived
systemctl status keepalived

⑦ Verify that keepalived is working (screenshot not reproduced here)
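A hands-on check, assuming the NIC is ens33 as configured above: the VIP should sit on the primary, and it should move to node1 within a few seconds of nginx dying there.

ip addr show ens33 | grep 192.168.137.88   # on master: the VIP should be listed
systemctl stop nginx                       # on master: simulate a failure
ip addr show ens33 | grep 192.168.137.88   # on node1: the VIP should now appear here
systemctl start nginx                      # on master: the higher priority takes the VIP back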

3. Deploy the etcd Cluster

etcd is a distributed key-value store that Kubernetes uses for all of its data. A single etcd instance is a single point of failure and is not recommended for production, so here we build a 3-node cluster, which tolerates the failure of 1 machine. For production, a 5-node cluster is recommended, tolerating the failure of 2 machines.

Node name   IP
etcd-1      192.168.137.101
etcd-2      192.168.137.102
etcd-3      192.168.137.103

Note: to save machines, etcd is co-located with the K8s nodes here. It can also be deployed outside the K8s cluster, as long as the apiserver can reach it.

① Prepare the cfssl certificate tool

cfssl is an open-source certificate management tool that generates certificates from JSON files and is easier to work with than openssl.

Run this on any one server; the master node is used here.

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
cp cfssl_linux-amd64 /usr/local/bin/cfssl
cp cfssljson_linux-amd64 /usr/local/bin/cfssljson
cp cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
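A quick way to confirm the tools landed on the PATH:

cfssl version   # prints the cfssl version and runtime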

② Generate the etcd certificates

1. Self-signed certificate authority (CA)

Create a working directory:

mkdir -p ~/etcd_tls
cd ~/etcd_tls

Self-sign the CA:

cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "www": {
        "expiry": "87600h",
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ]
      }
    }
  }
}
EOF

cat > ca-csr.json << EOF
{
    "CN": "etcd CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing"
        }
    ]
}
EOF

Generate the certificate:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

This produces ca.pem and ca-key.pem.

2. Sign the etcd HTTPS certificate with the self-signed CA

Create the certificate signing request file:

cat > server-csr.json << EOF
{
    "CN": "etcd",
    "hosts": [
        "192.168.137.101",
        "192.168.137.102",
        "192.168.137.103",
        "192.168.137.108"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing"
        }
    ]
}
EOF

Note: the hosts field must contain the internal communication IP of every etcd node; not one may be missing! You can also list a few spare IPs (like the .108 above) to make later expansion easier.

Generate the certificate:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server

Sample run:

[root@master etcd_tls]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server
2023/04/11 15:33:11 [INFO] generate received request
2023/04/11 15:33:11 [INFO] received CSR
2023/04/11 15:33:11 [INFO] generating key: rsa-2048
2023/04/11 15:33:11 [INFO] encoded CSR
2023/04/11 15:33:11 [INFO] signed certificate with serial number 235546312759917573050767132915661129942079532969
2023/04/11 15:33:11 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").

This produces server.pem and server-key.pem.
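Since a missing SAN is the most common cause of etcd TLS errors later, it may be worth confirming that all four IPs actually made it into the certificate:

cfssl-certinfo -cert server.pem   # the "sans" field should list .101, .102, .103 and .108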

③ Deploy the etcd cluster

Download: https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz

The following is performed on node 1; to keep things simple, all files generated on node 1 will later be copied to nodes 2 and 3.

1. Create the working directory and unpack the binary package

mkdir /opt/etcd/{bin,cfg,ssl} -p
tar zxvf etcd-v3.4.9-linux-amd64.tar.gz
mv etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/

2. Create the etcd configuration file

cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.137.101:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.137.101:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.101:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.101:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.137.101:2380,etcd-2=https://192.168.137.102:2380,etcd-3=https://192.168.137.103:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF

• ETCD_NAME: node name; must be unique within the cluster

• ETCD_INITIAL_CLUSTER_STATE: state of the cluster being joined; new for a brand-new cluster, existing to join one that already exists

3. Manage etcd with systemd

cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
  --cert-file=/opt/etcd/ssl/server.pem \
  --key-file=/opt/etcd/ssl/server-key.pem \
  --trusted-ca-file=/opt/etcd/ssl/ca.pem \
  --peer-cert-file=/opt/etcd/ssl/server.pem \
  --peer-key-file=/opt/etcd/ssl/server-key.pem \
  --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \
  --logger=zap
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

4. Copy the certificates generated earlier

Copy them to the paths referenced in the configuration file:

cp ~/etcd_tls/ca*pem ~/etcd_tls/server*pem /opt/etcd/ssl/

5. Start etcd and enable it at boot (the first member will block waiting for the other two to join; that is expected, so carry on with the next steps)

systemctl daemon-reload
systemctl start etcd
systemctl enable etcd

6. Copy all the files generated on node 1 to nodes 2 and 3

scp -r /opt/etcd/ root@192.168.137.102:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.137.102:/usr/lib/systemd/system/
scp -r /opt/etcd/ root@192.168.137.103:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.137.103:/usr/lib/systemd/system/

Then, on nodes 2 and 3, edit etcd.conf and change the node name and IPs to the current server's (a scripted shortcut is sketched after the annotated file below):

vi /opt/etcd/cfg/etcd.conf

#[Member]
ETCD_NAME="etcd-1"   # change: etcd-2 on node2, etcd-3 on node3
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.137.101:2380"   # change to the current server's IP
ETCD_LISTEN_CLIENT_URLS="https://192.168.137.101:2379" # change to the current server's IP

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.137.101:2380" # change to the current server's IP
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.137.101:2379" # change to the current server's IP
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.137.101:2380,etcd-2=https://192.168.137.102:2380,etcd-3=https://192.168.137.103:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"

Finally, start etcd on both nodes and enable it at boot.
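If you prefer to script those edits, here is a minimal sketch for node2 (swap in etcd-3 and .103 for node3). The substitutions are anchored to the key names, so the ETCD_INITIAL_CLUSTER line, which also contains .101, is left untouched:

sed -i \
  -e 's/^ETCD_NAME="etcd-1"/ETCD_NAME="etcd-2"/' \
  -e '/^ETCD_LISTEN_PEER_URLS/s/137\.101/137.102/' \
  -e '/^ETCD_LISTEN_CLIENT_URLS/s/137\.101/137.102/' \
  -e '/^ETCD_INITIAL_ADVERTISE_PEER_URLS/s/137\.101/137.102/' \
  -e '/^ETCD_ADVERTISE_CLIENT_URLS/s/137\.101/137.102/' \
  /opt/etcd/cfg/etcd.conf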

7. Check cluster health

[root@master etcd_tls]# ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.137.101:2379,https://192.168.137.102:2379,https://192.168.137.103:2379" endpoint health --write-out=table

+------------------------------+--------+-------------+-------+
|           ENDPOINT           | HEALTH |    TOOK     | ERROR |
+------------------------------+--------+-------------+-------+
| https://192.168.137.103:2379 |  true  | 58.508826ms |       |
| https://192.168.137.101:2379 |  true  | 59.326646ms |       |
| https://192.168.137.102:2379 |  true  | 59.667399ms |       |
+------------------------------+--------+-------------+-------+

If anything goes wrong, look at the logs first: /var/log/messages or journalctl -u etcd.

4. Deploy Docker, kubelet, and Related Components (all nodes)

① Install Docker

wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
systemctl enable docker && systemctl start docker

Configure a registry mirror to accelerate image pulls:

cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker
docker info

② Install cri-dockerd

Kubernetes removed dockershim in v1.24, so the cri-dockerd shim is required to keep using Docker as the container runtime. Download the RPM from the release page and install it:

https://github.com/Mirantis/cri-dockerd/releases/tag/v0.3.1

rpm -ivh cri-dockerd-0.3.1-3.el7.x86_64.rpm

Point the pause (infra) image at a domestic mirror by editing the unit's ExecStart line:

vi /usr/lib/systemd/system/cri-docker.service
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
systemctl daemon-reload
systemctl enable cri-docker && systemctl start cri-docker
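A quick check that the CRI socket kubeadm will point at actually exists:

systemctl status cri-docker --no-pager
ls -l /var/run/cri-dockerd.sock   # should exist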

③ Add the Alibaba Cloud YUM repository

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

④ Install kubeadm, kubelet, and kubectl

Versions change frequently, so pin an explicit version:

yum install -y kubelet-1.25.0 kubeadm-1.25.0 kubectl-1.25.0
systemctl enable kubelet

5. Deploy the Masters

① Initialize Master1

Generate the initialization configuration file:

cat > kubeadm-config.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.137.101
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  name: master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  certSANs:
  - master
  - node1
  - node2
  - 192.168.137.101
  - 192.168.137.102
  - 192.168.137.103
  - 192.168.137.88
  - 127.0.0.1
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.137.88:16443
controllerManager: {}
dns: {}
etcd:
  external:
    endpoints:
    - https://192.168.137.101:2379
    - https://192.168.137.102:2379
    - https://192.168.137.103:2379
    caFile: /opt/etcd/ssl/ca.pem
    certFile: /opt/etcd/ssl/server.pem
    keyFile: /opt/etcd/ssl/server-key.pem
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.25.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
EOF

kubeadm init --config kubeadm-config.yaml
...
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70 \
        --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70

After initialization completes, two join commands are printed: the one with --control-plane joins additional masters to the control plane; the one without it joins worker nodes.

Copy the kubeconfig that kubectl uses to its default path:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get node

② Initialize node1 (the second master)

Copy the certificates generated on the master to node1:

scp -r /etc/kubernetes/pki/ 192.168.137.102:/etc/kubernetes/

Copy the control-plane join command and run it on node1:

  kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70 \
        --control-plane  --cri-socket=unix:///var/run/cri-dockerd.sock

Copy the kubeconfig that kubectl uses to its default path:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get node

Note: because the network plugin has not been deployed yet, the nodes will show NotReady.

③ Initialize node2

Run this on 192.168.137.103 (node2).

To add a new worker node to the cluster, execute the kubeadm join command printed by kubeadm init:

kubeadm join 192.168.137.88:16443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:8a300ce8e8b5969d574c6d82fd199880eb60147a559bc08118636b0f0dde3b70  --cri-socket=unix:///var/run/cri-dockerd.sock

Subsequent nodes join the same way.

Note: the default token is valid for 24 hours; once it expires it can no longer be used and a new one must be created. The quickest way is: kubeadm token create --print-join-command

6. Deploy the Network Plugin

Calico is a pure layer-3 data-center networking solution and is currently the mainstream network choice for Kubernetes.

Download the YAML:

wget https://docs.projectcalico.org/manifests/calico.yaml

After downloading, edit the Pod network definition (CALICO_IPV4POOL_CIDR) so it matches the podSubnet set in kubeadm-config.yaml earlier (10.244.0.0/16); see the fragment below.
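The variable ships commented out in the stock manifest; after editing, the env entry on the calico-node container should look roughly like this (keep the manifest's own indentation):

- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"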

After editing the file, deploy it:

kubectl apply -f calico.yaml
kubectl get pods -n kube-system

Once the Calico pods are all Running, the nodes will become Ready as well.

[root@master k8s]# kubectl get pods -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS       AGE   IP                NODE     NOMINATED NODE   READINESS GATES
calico-kube-controllers-566654d67d-zr7cp   1/1     Running   3 (28m ago)    22h   10.244.104.5      node2
calico-node-gwx4m                          1/1     Running   2 (29m ago)    22h   192.168.137.102   node1
calico-node-l6vzb                          1/1     Running   1 (22h ago)    22h   192.168.137.101   master
calico-node-zbltl                          1/1     Running   3 (29m ago)    22h   192.168.137.103   node2
coredns-c676cc86f-c5trx                    1/1     Running   1 (29m ago)    23h   10.244.104.4      node2
coredns-c676cc86f-phmsn                    1/1     Running   1 (29m ago)    23h   10.244.104.6      node2
kube-apiserver-master                      1/1     Running   2 (20h ago)    23h   192.168.137.101   master
kube-apiserver-node1                       1/1     Running   2 (29m ago)    23h   192.168.137.102   node1
kube-controller-manager-master             1/1     Running   10 (15h ago)   23h   192.168.137.101   master
kube-controller-manager-node1              1/1     Running   13 (10m ago)   23h   192.168.137.102   node1
kube-proxy-4bv45                           1/1     Running   2 (29m ago)    23h   192.168.137.102   node1
kube-proxy-7b2nq                           1/1     Running   2 (29m ago)    23h   192.168.137.103   node2
kube-proxy-rczp7                           1/1     Running   1 (23h ago)    23h   192.168.137.101   master
kube-scheduler-master                      1/1     Running   10 (13h ago)   23h   192.168.137.101   master
kube-scheduler-node1                       1/1     Running   12 (10m ago)   23h   192.168.137.102   node1

[root@master k8s]# kubectl get node
NAME     STATUS     ROLES           AGE   VERSION
master   NotReady   control-plane   23h   v1.25.0
node1    Ready      control-plane   23h   v1.25.0
node2    Ready      <none>          23h   v1.25.0

This completes the K8s deployment. Note, though, that in real enterprise production environments K8s is usually deployed from binaries; this article is mainly meant to build familiarity with the K8s components and architecture. DM me if you need the scripts and the Word document.

