1. Prepare the environment
| Role | IP |
|---|---|
| master1, node1 | 10.167.47.12 |
| master2, node2 | 10.167.47.24 |
| master3, node3 | 10.167.47.25 |
| VIP (virtual IP) | 10.167.47.86 |
```bash
# Add hosts entries on the masters
cat >> /etc/hosts << EOF
10.167.47.12 master1
10.167.47.24 master2
10.167.47.25 master3
EOF

# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld

# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config  # permanent
setenforce 0  # temporary

# Disable swap
swapoff -a  # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab  # permanent

# Set the hostname according to the plan above
hostnamectl set-hostname <hostname>

# Time synchronization
cp /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum clean all && yum makecache
yum install ntpdate -y && timedatectl set-timezone Asia/Shanghai && ntpdate time2.aliyun.com
# Add to crontab
crontab -e
0 5 * * * /usr/sbin/ntpdate time2.aliyun.com
# Also sync at boot via /etc/rc.local
vi /etc/rc.local
ntpdate time2.aliyun.com

# ulimit -a shows all current limits; ulimit -n shows the max number of open files.
# A fresh Linux install defaults to 1024, which quickly hits "error: too many open
# files" on a loaded server, so raise it. ulimit -n 65535 takes effect immediately
# but is lost after a reboot (ulimit -SHn 65535 is equivalent; -S soft, -H hard).
# Temporary, lost on reboot:
ulimit -SHn 65535
# Permanent resource limits:
vi /etc/security/limits.conf
# Append the following at the end:
* soft nofile 65536
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
* soft memlock unlimited
* hard memlock unlimited

# Tune kernel parameters
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
fs.may_detach_mounts=1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time=600
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=15
net.ipv4.tcp_max_tw_buckets=36000
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_max_orphans=327680
net.ipv4.tcp_orphan_retries=3
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_max_syn_backlog=16384
net.ipv4.ip_conntrack_max=65536
net.ipv4.tcp_timestamps=0
net.core.somaxconn=16384
EOF
sysctl --system  # apply

# After the reboot, check that the modules are loaded
lsmod | grep --color=auto -e ip_vs -e nf_conntrack
# Reboot
reboot
```
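Before rebooting, it is worth confirming the prep actually stuck. A minimal sanity-check sketch; the expected values are simply the ones configured above:

```bash
#!/bin/bash
# Quick sanity checks for the environment prep above.
getenforce                                    # expect Disabled (or Permissive until reboot)
free -m | awk '/^Swap/ {print $2}'            # expect 0 once swap is off
sysctl -n net.ipv4.ip_forward                 # expect 1
sysctl -n net.bridge.bridge-nf-call-iptables  # expect 1 (needs br_netfilter loaded)
ulimit -n                                     # expect 65536 in a fresh login shell
grep -cE 'master[123]$' /etc/hosts            # expect 3 host entries
```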
2. Deploy keepalived on all master nodes
1. Install the required packages and keepalived
```bash
# Install and enable IPVS
yum install ipvsadm ipset sysstat conntrack libseccomp -y
# Load the modules now (temporary)
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
# Load them at boot (permanent)
cat <<EOF > /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_nq
ip_vs_sed
ip_vs_ftp
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF

# Install haproxy
yum install -y haproxy
cat > /etc/haproxy/haproxy.cfg << EOF
global
    log         127.0.0.1 local0
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    tcp
    log                     global
    option                  tcplog
    option                  dontlognull
    option                  redispatch
    retries                 3
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout check           10s
    maxconn                 3000

# named listener
listen k8s_master
    # port the virtual IP serves on
    bind 0.0.0.0:16443
    mode tcp
    option tcplog
    balance roundrobin
    # the load-balanced masters
    server master1 10.167.47.12:6443 check inter 10000 fall 2 rise 2 weight 1
    server master2 10.167.47.24:6443 check inter 10000 fall 2 rise 2 weight 1
    server master3 10.167.47.25:6443 check inter 10000 fall 2 rise 2 weight 1
EOF

# Enable at boot
systemctl enable haproxy
# Start haproxy
systemctl start haproxy
# Check the status
systemctl status haproxy

# Create the health-check script (quote the EOF marker so the backticks are not
# expanded while the file is written)
mkdir -p /etc/keepalived
cat > /etc/keepalived/check_haproxy.sh << 'EOF'
#!/bin/bash
if [ `ps -C haproxy --no-header | wc -l` == 0 ]; then
    systemctl start haproxy
    sleep 3
    if [ `ps -C haproxy --no-header | wc -l` == 0 ]; then
        systemctl stop keepalived
    fi
fi
EOF
chmod +x /etc/keepalived/check_haproxy.sh

# Install keepalived
yum install -y conntrack-tools libseccomp libtool-ltdl && yum install -y keepalived

# master1 configuration
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id master1
}
vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 3
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0          # change to your NIC name
    virtual_router_id 80
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 111111
    }
    virtual_ipaddress {
        10.167.47.86
    }
    track_script {
        check_haproxy
    }
}
EOF

# master2 configuration
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id master2
}
vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0          # change to your NIC name
    virtual_router_id 80
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 111111
    }
    virtual_ipaddress {
        10.167.47.86
    }
    track_script {
        check_haproxy
    }
}
EOF

# master3 configuration
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id master3
}
vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0          # change to your NIC name
    virtual_router_id 80
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 111111
    }
    virtual_ipaddress {
        10.167.47.86
    }
    track_script {
        check_haproxy
    }
}
EOF

# Start keepalived
systemctl start keepalived.service
# Enable at boot
systemctl enable keepalived.service
# Check the status
systemctl status keepalived.service
# The VIP should now be visible on the MASTER node
ip a s eth0
```
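Before building the cluster on top of the VIP, it is sensible to test the failover once. A rough sketch, assuming the VIP currently sits on master1:

```bash
# On the node that holds the VIP (find it with: ip a s eth0 | grep 10.167.47.86)
systemctl stop keepalived haproxy
# Within a few advert intervals the VIP should move to the next-highest-priority node:
ip a s eth0 | grep 10.167.47.86     # run on master2
# Once the apiservers exist, the balanced port can be probed through the VIP:
curl -k https://10.167.47.86:16443/version
# Restore the services on master1 afterwards; it should preempt the VIP back:
systemctl start haproxy keepalived
```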
3. Install Docker
```bash
# Add the Docker yum repo
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo  # or the https:// variant

# Install
yum install docker-ce-20.10.3 -y
mkdir -p /data/docker
mkdir -p /etc/docker/

# Tip: newer kubelet versions expect systemd, so switch Docker's CgroupDriver to systemd.
# If /etc/docker does not exist, starting Docker creates it automatically.
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://xxxxxxxx.mirror.aliyuncs.com"]
}
EOF

# Tip: pick Docker's data path to suit the server, e.g. /data
# (--graph is deprecated; newer releases use --data-root)
vi /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd --graph=/data/docker

# Reload the unit file and restart
systemctl daemon-reload
systemctl restart docker
systemctl enable docker.service
```
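A quick way to confirm both changes took effect once Docker is back up:

```bash
docker info --format '{{.CgroupDriver}}'    # expect: systemd
docker info --format '{{.DockerRootDir}}'   # expect: /data/docker
docker run --rm hello-world                 # end-to-end test of mirror + runtime
```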
4. Install the Kubernetes components (all nodes)
```bash
# Switch to the Aliyun Kubernetes yum repo
cat <<EOF >> /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# Clean internal Aliyun mirror entries from the base repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo

# List the installable versions
yum list kubeadm.x86_64 --showduplicates | sort -r
# Remove any old versions
yum remove -y kubelet kubeadm kubectl
# Install
yum install kubeadm kubelet kubectl -y
# Enable at boot
systemctl enable kubelet && systemctl start kubelet
```
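The bare `yum install kubeadm kubelet kubectl -y` installs whatever is newest in the repo, which can drift from the `kubernetesVersion: 1.28.2` pinned in the init config below. A safer variant — a sketch, assuming the mirror still carries the 1.28.2-0 packages:

```bash
# Pin the packages to the cluster version used later
yum install -y kubeadm-1.28.2-0 kubelet-1.28.2-0 kubectl-1.28.2-0
# Optionally lock them against accidental `yum update` upgrades
yum install -y yum-plugin-versionlock
yum versionlock add kubeadm kubelet kubectl
systemctl enable kubelet
```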
5. Kubernetes init configuration (master1)
```bash
mkdir /usr/local/kubernetes/manifests -p
cd /usr/local/kubernetes/manifests/

# containerd ships with "cri" in its disabled_plugins list; remove it, save the
# file and restart containerd. Run on all nodes, otherwise kubelet cannot start.
rm /etc/containerd/config.toml
containerd config default > /etc/containerd/config.toml
# then delete "cri" from disabled_plugins in the generated config.toml

crictl config runtime-endpoint /run/containerd/containerd.sock
vi /etc/crictl.yaml
# Edit /etc/crictl.yaml as below; newer versions add image-endpoint
runtime-endpoint: "unix:///run/containerd/containerd.sock"
image-endpoint: "unix:///run/containerd/containerd.sock"  # keep identical to runtime-endpoint
timeout: 10
debug: false
pull-image-on-create: false
disable-pull-on-run: false

systemctl restart containerd

# Reset any previous cluster state
kubeadm reset
#kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.28.2 --service-cidr=10.11.0.0/16 --pod-network-cidr=10.10.0.0/16

# Generate the init configuration file
vi kubeadm-config.yaml
```

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.167.47.12
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  # your own hostname
  name: master1
---
apiServer:
  # HA settings
  extraArgs:
    authorization-mode: "Node,RBAC"
  # hostnames and IPs of every kube-apiserver node, plus the VIP
  certSANs:
  - master1
  - master2
  - master3
  - 10.167.47.12
  - 10.167.47.24
  - 10.167.47.25
  - 10.167.47.86
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
# use the mirror registry
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
# virtual IP and port
controlPlaneEndpoint: "10.167.47.86:16443"
networking:
  dnsDomain: cluster.local
  podSubnet: 10.10.0.0/16
  serviceSubnet: 10.11.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
```

```bash
# Initialize the cluster.
# Pull the pause image on ALL nodes first, otherwise init fails on image pull:
ctr -n k8s.io images pull -k registry.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io images tag registry.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6

kubeadm init --config=kubeadm-config.yaml

# As prompted, set up the kubeconfig so kubectl works:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes
kubectl get pods -n kube-system

# Join the other masters.
# Copy the certificates over first, or the join fails. On master2 and master3:
mkdir -p /etc/kubernetes/pki
mkdir -p /etc/kubernetes/pki/etcd/
# On master1:
scp -rp /etc/kubernetes/pki/ca.* master2:/etc/kubernetes/pki
scp -rp /etc/kubernetes/pki/sa.* master2:/etc/kubernetes/pki
scp -rp /etc/kubernetes/pki/front-proxy-ca.* master2:/etc/kubernetes/pki
scp -rp /etc/kubernetes/pki/etcd/ca.* master2:/etc/kubernetes/pki/etcd
scp -rp /etc/kubernetes/admin.conf master2:/etc/kubernetes/
scp -rp /etc/kubernetes/pki/ca.* master3:/etc/kubernetes/pki
scp -rp /etc/kubernetes/pki/sa.* master3:/etc/kubernetes/pki
scp -rp /etc/kubernetes/pki/front-proxy-ca.* master3:/etc/kubernetes/pki
scp -rp /etc/kubernetes/pki/etcd/ca.* master3:/etc/kubernetes/pki/etcd
scp -rp /etc/kubernetes/admin.conf master3:/etc/kubernetes/
```
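With the certificates in place, master2 and master3 join with the command `kubeadm init` printed, plus the `--control-plane` flag. A sketch with placeholders — take the real token and CA hash from your own init output, or regenerate them:

```bash
# On master1: reprint a join command if the init output is gone
kubeadm token create --print-join-command

# On master2 / master3: join as additional control-plane nodes
# (token and CA hash below are placeholders; substitute your own values)
kubeadm join 10.167.47.86:16443 \
    --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:<hash-from-init-output> \
    --control-plane
# Worker-only nodes run the same command without --control-plane.

mkdir -p $HOME/.kube && cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
```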
6. Install kube-ovn (on master1)
```bash
# Configure containerd registry mirrors first, otherwise the ovn images cannot
# be pulled. Run on all master nodes.
mkdir -p /etc/containerd/certs.d

# Point containerd's cri plugin at the certs.d directory
vi /etc/containerd/config.toml
# append:
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"

# docker hub mirrors (these serve docker.io pulls)
mkdir -p /etc/containerd/certs.d/docker.io
cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
server = "https://docker.io"
[host."https://dockerproxy.com"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
[host."https://docker.m.daocloud.io"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
[host."https://reg-mirror.qiniu.com"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
[host."https://registry.docker-cn.com"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
[host."http://hub-mirror.c.163.com"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
EOF

# registry.k8s.io mirror
mkdir -p /etc/containerd/certs.d/registry.k8s.io
tee /etc/containerd/certs.d/registry.k8s.io/hosts.toml << 'EOF'
server = "https://registry.k8s.io"
[host."https://k8s.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# docker.elastic.co mirror
mkdir -p /etc/containerd/certs.d/docker.elastic.co
tee /etc/containerd/certs.d/docker.elastic.co/hosts.toml << 'EOF'
server = "https://docker.elastic.co"
[host."https://elastic.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# gcr.io mirror
mkdir -p /etc/containerd/certs.d/gcr.io
tee /etc/containerd/certs.d/gcr.io/hosts.toml << 'EOF'
server = "https://gcr.io"
[host."https://gcr.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# ghcr.io mirror
mkdir -p /etc/containerd/certs.d/ghcr.io
tee /etc/containerd/certs.d/ghcr.io/hosts.toml << 'EOF'
server = "https://ghcr.io"
[host."https://ghcr.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# k8s.gcr.io mirror
mkdir -p /etc/containerd/certs.d/k8s.gcr.io
tee /etc/containerd/certs.d/k8s.gcr.io/hosts.toml << 'EOF'
server = "https://k8s.gcr.io"
[host."https://k8s-gcr.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# mcr.microsoft.com mirror
mkdir -p /etc/containerd/certs.d/mcr.microsoft.com
tee /etc/containerd/certs.d/mcr.microsoft.com/hosts.toml << 'EOF'
server = "https://mcr.microsoft.com"
[host."https://mcr.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# nvcr.io mirror
mkdir -p /etc/containerd/certs.d/nvcr.io
tee /etc/containerd/certs.d/nvcr.io/hosts.toml << 'EOF'
server = "https://nvcr.io"
[host."https://nvcr.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# quay.io mirror
mkdir -p /etc/containerd/certs.d/quay.io
tee /etc/containerd/certs.d/quay.io/hosts.toml << 'EOF'
server = "https://quay.io"
[host."https://quay.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# registry.jujucharms.com mirror
mkdir -p /etc/containerd/certs.d/registry.jujucharms.com
tee /etc/containerd/certs.d/registry.jujucharms.com/hosts.toml << 'EOF'
server = "https://registry.jujucharms.com"
[host."https://jujucharms.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

# rocks.canonical.com mirror
mkdir -p /etc/containerd/certs.d/rocks.canonical.com
tee /etc/containerd/certs.d/rocks.canonical.com/hosts.toml << 'EOF'
server = "https://rocks.canonical.com"
[host."https://rocks-canonical.m.daocloud.io"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

systemctl restart containerd

# Verify the mirrors work
ctr i pull --hosts-dir=/etc/containerd/certs.d registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
# or with debug output:
# ctr --debug=true i pull --hosts-dir=/etc/containerd/certs.d registry.k8s.io/sig-storage/csi-provisioner:v3.5.0
# Pull with crictl
crictl --debug=true pull docker.io/library/ubuntu:20.04
crictl images

# Download the install script (may require a proxy)
wget https://raw.githubusercontent.com/kubeovn/kube-ovn/release-1.10/dist/images/install.sh
# Cleanup script, if you ever need to uninstall:
#   https://raw.githubusercontent.com/alauda/kube-ovn/master/dist/images/cleanup.sh

# Edit the parameters in install.sh:
REGISTRY="kubeovn"                 # image registry
VERSION="v1.10.10"                 # image version/tag
POD_CIDR="10.10.0.0/16"            # default subnet CIDR; must not overlap the SVC/NODE/JOIN CIDRs
SVC_CIDR="10.11.0.0/16"            # must match the apiserver service-cluster-ip-range
JOIN_CIDR="100.12.0.0/16"          # CIDR for pod-to-host traffic; must not overlap the SVC/NODE/POD CIDRs
LABEL="node-role.kubernetes.io/control-plane"  # label of the nodes that host the OVN DB
IFACE=""                           # host NIC for the container network; empty = the NIC holding the Node IP
TUNNEL_TYPE="geneve"               # tunnel encapsulation: geneve, vxlan or stt (stt needs a custom ovs kernel module)

# Run the installer
bash install.sh
```
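Once `bash install.sh` finishes, a rough health check. The component names below are those used by the kube-ovn 1.10 manifests; verify them against your copy of install.sh:

```bash
# kube-ovn components are deployed into kube-system
kubectl -n kube-system get pods -o wide | grep -E 'kube-ovn|ovn-central|ovs-ovn'
# Nodes should turn Ready once the CNI is up
kubectl get nodes
# Smoke test: a throwaway pod should get an address from POD_CIDR (10.10.0.0/16)
kubectl run nettest --image=docker.io/library/busybox:1.36 --restart=Never -- sleep 3600
kubectl get pod nettest -o wide
kubectl delete pod nettest
```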
7. Let the master nodes also act as worker nodes
By default, Kubernetes control-plane (master) nodes do not run ordinary pods, because kubeadm puts the following NoSchedule taint on them.

Removing the NoSchedule taint lifts that restriction, as follows (shown for master1; repeat on the other control-plane nodes):
```bash
kubectl taint node master1 node-role.kubernetes.io/control-plane:NoSchedule-
# node/master1 untainted

# Check the result
kubectl describe node master1 | grep Taint

# Or strip the taint from all three nodes in one loop
for node in $(kubectl get nodes --selector='node-role.kubernetes.io/control-plane' | awk 'NR>1 {print $1}'); do
    kubectl taint node $node node-role.kubernetes.io/control-plane-
done
```
Note that letting the control-plane masters double as worker nodes here is for testing only, and is generally not recommended. The control plane is the critical component that manages the whole cluster, scheduling workloads and monitoring node and container health, so also running workloads on it consumes resources, adds latency, and destabilizes the system. Finally, it is also a security risk.
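If dedicated workers are added later and you want to undo the change, the taint can simply be reapplied:

```bash
# Re-apply the NoSchedule taint on every control-plane node
for node in $(kubectl get nodes --selector='node-role.kubernetes.io/control-plane' -o name); do
    kubectl taint $node node-role.kubernetes.io/control-plane:NoSchedule --overwrite
done
```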
8. Install Kuboard (master1)
1. Install helm
```bash
wget https://get.helm.sh/helm-v3.12.3-linux-amd64.tar.gz
tar -zxvf helm-v3.12.3-linux-amd64.tar.gz
cd linux-amd64
cp helm /usr/local/bin/
helm version
```
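helm is not used again in this walkthrough (Kuboard ships as a plain manifest below), but a typical first use looks like this — the repo URL is just the public Bitnami example:

```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo bitnami/nginx | head -5
```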
2. Install Kuboard
Since the Kubernetes dashboard is only reachable from localhost by default, install Kuboard instead.
```bash
# This manifest uses Huawei Cloud's registry instead of Docker Hub for the Kuboard images
wget https://addons.kuboard.cn/kuboard/kuboard-v3-swr.yaml

# Edit the config inside; change:
#   KUBOARD_SERVER_NODE_PORT: '30080'
# to:
#   KUBOARD_ENDPOINT: 'http://your-node-ip-address:30080'

kubectl apply -f kuboard-v3-swr.yaml
```

Browse to http://<masterIP>:30080 and log in with account `admin`, password `Kuboard123`.
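As a final check, confirm Kuboard's pods and NodePort service are up (the kuboard-v3 manifest creates its own `kuboard` namespace; verify the names against your copy of the yaml):

```bash
kubectl -n kuboard get pods -o wide   # all pods should be Running
kubectl -n kuboard get svc            # expect a NodePort mapping to 30080
```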