Deploying High-Availability Rancher with RKE

Song F

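RKE overview:

Rancher Kubernetes Engine (RKE) is a lightweight Kubernetes installer that supports installing Kubernetes on bare-metal and virtualized servers. RKE addresses a common pain point in the Kubernetes community: installation complexity. The rke binary itself runs on several platforms, including macOS, Linux, and Windows.

For details, see: https://docs.rancher.cn/rke/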

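Rancher overview:

Rancher is a container management platform built for organizations that use containers. Rancher simplifies working with Kubernetes: developers can run Kubernetes everywhere (Run Kubernetes Everywhere), meet IT requirements, and empower DevOps teams.

For details, see: https://rancher2.docs.rancher.cn/docs/overview/_index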

Environment:

Operating system            Hostname      IP address      Node                  Role
CentOS 7 1810               nginx-master  192.168.111.21  Nginx primary server  Load balancing
CentOS 7 1810               nginx-backup  192.168.111.22  Nginx backup server   Load balancing
ubuntu-18.04.3-live-server  rke-node1     192.168.111.50  RKE node 1            RKE cluster
ubuntu-18.04.3-live-server  rke-node2     192.168.111.51  RKE node 2            RKE cluster
ubuntu-18.04.3-live-server  rke-node3     192.168.111.52  RKE node 3            RKE cluster

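Preparing the system environment before deployment:

Disable the firewall and SELinux

To keep blocked ports from breaking cluster setup, we disable the firewall and SELinux up front.

  • CentOS:

    systemctl stop firewalld
    systemctl disable firewalld
    setenforce 0
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
  • Ubuntu:

    sudo ufw disable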

Configuring the hosts file:

192.168.111.21 nginx-master
192.168.111.22 nginx-backup
192.168.111.50 rke-node1
192.168.111.51 rke-node2
192.168.111.52 rke-node3
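  • Add the entries above to the hosts file (/etc/hosts) on every machine, and make sure every machine can reach the others by hostname.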
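One way to confirm the hosts entries are in place is a small check run on each machine. The sketch below is illustrative only: the function name missing_hosts is hypothetical, and it only inspects the file; it does not test network reachability.

```shell
# missing_hosts FILE NAME...
# Prints every NAME that has no entry in FILE (a hosts-format file).
missing_hosts() {
    local file=$1 name
    shift
    for name in "$@"; do
        awk -v h="$name" '
            /^[[:space:]]*#/ { next }            # skip comment lines
            {
                for (i = 2; i <= NF; i++) {
                    if ($i ~ /^#/) break         # ignore trailing comments
                    if ($i == h) found = 1
                }
            }
            END { exit(found ? 0 : 1) }
        ' "$file" || echo "$name"
    done
}

# Example:
#   missing_hosts /etc/hosts nginx-master nginx-backup rke-node1 rke-node2 rke-node3
```

An empty result means every hostname has an entry; a follow-up ping to each name would confirm actual connectivity.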

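Required tools:

This installation requires the following CLI tools. Make sure they are installed and available in $PATH.

Install the CLI tools on the RKE nodes, and make sure all three nodes have them installed correctly.

  • kubectl - the Kubernetes command-line tool.

  • rke - Rancher Kubernetes Engine, a CLI for building Kubernetes clusters.

  • helm - the package manager for Kubernetes.

    See the Helm version requirements to choose a Helm version for installing Rancher.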
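The $PATH requirement can be checked with a short helper once the tools are installed. This is a minimal sketch; missing_tools is a hypothetical name, and it only checks that each command resolves, not that its version is suitable.

```shell
# missing_tools NAME...
# Prints every NAME that cannot be found on $PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (run on each RKE node):
#   missing_tools kubectl rke helm
```

No output means all three tools are reachable from the current shell.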

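Installing kubectl:

  • Follow the official Kubernetes documentation for installation. For certain reasons, we use snap here.

    sudo apt-get install snapd
    sudo snap install kubectl --classic # this step is slow; please be patient
    # Verify the installation
    kubectl help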

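Installing RKE:

  • Follow the official Rancher documentation for installation. The binary is downloaded from GitHub and is fairly large, so make sure your network can reach GitHub reliably.

    wget https://github.com/rancher/rke/releases/download/v1.0.8/rke_linux-amd64
    # Move the binary to /usr/local/bin/, rename it to rke, and make it executable
    sudo mv rke_linux-amd64 /usr/local/bin/rke
    sudo chmod +x /usr/local/bin/rke
    # Verify the installation
    rke --version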

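Installing Helm:

  • Follow the official Helm documentation for installation. Helm is the package manager for Kubernetes; Helm v3 or later is required.

    # Download the release archive
    wget https://get.helm.sh/helm-v3.2.1-linux-amd64.tar.gz
    # Unpack it
    tar zxvf helm-v3.2.1-linux-amd64.tar.gz
    # Move the binary to /usr/local/bin/
    sudo mv linux-amd64/helm /usr/local/bin/helm
    # Verify the installation
    helm help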
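Because the Helm version matters here, it can be worth checking the major version explicitly rather than only running helm help. The sketch below assumes the version string has the form printed by Helm 3's `helm version --short` (for example v3.2.1+gfe51cd1); the function names are hypothetical.

```shell
# version_major VERSION
# Prints the major number of a version string such as "v3.2.1+gfe51cd1".
version_major() {
    local v=${1#v}      # drop the leading "v"
    echo "${v%%.*}"     # keep everything before the first "."
}

# is_helm3_or_later VERSION
# Succeeds when the major version is 3 or higher.
is_helm3_or_later() {
    [ "$(version_major "$1")" -ge 3 ]
}

# Example:
#   is_helm3_or_later "$(helm version --short)" || echo "Helm 3 or later is required"
```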

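Creating the Nginx + Keepalived cluster:

Perform these steps on the CentOS nodes.

  • Install Nginx

    # Download the Nginx source archive
    wget http://nginx.org/download/nginx-1.17.10.tar.gz
    # Unpack it
    tar zxvf nginx-1.17.10.tar.gz
    # Install the packages needed for compilation
    yum install -y gcc gcc-c++ pcre pcre-devel zlib zlib-devel openssl openssl-devel libnl3-devel
    # Enter the nginx source directory. We need HTTPS, so the build enables --with-http_ssl_module;
    # --with-stream enables the stream module used later for layer-4 load balancing
    cd nginx-1.17.10
    mkdir -p /usr/local/nginx
    ./configure --prefix=/usr/local/nginx --with-http_ssl_module --with-stream
    # Build and install nginx
    make && make install
    # Create a symlink for the nginx command
    ln -s /usr/local/nginx/sbin/nginx /usr/local/bin/nginx
    # Verify the installation
    nginx -V
    # Start nginx
    nginx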
  • Install Keepalived

    # Download the source archive
    wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
    # Unpack it
    tar zxvf keepalived-2.0.20.tar.gz
    # Build and install keepalived
    cd keepalived-2.0.20
    mkdir /usr/local/keepalived
    ./configure --prefix=/usr/local/keepalived/
    make && make install
    # Set up keepalived as a system service
    cp /usr/local/keepalived/sbin/keepalived /usr/sbin/keepalived
    cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
    touch /etc/init.d/keepalived
    chmod +x /etc/init.d/keepalived
    vim /etc/init.d/keepalived # the contents of this file are shown below
    # Configure keepalived
    mkdir /etc/keepalived/
    cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
    vim /etc/keepalived/keepalived.conf # the contents of keepalived.conf are shown below
    # Start keepalived
    systemctl start keepalived
    systemctl enable keepalived
    # Verify
    systemctl status keepalived
    # keepalived should now be running on both nodes, one as master and one as backup.
    # Running ip addr on the master should show the virtual IP address; the backup should not have it
    # Visit https://192.168.111.20 to verify the configuration
    # Contents of /etc/init.d/keepalived
    #!/bin/sh
    #
    # Startup script for the Keepalived daemon
    #
    # processname: keepalived
    # pidfile: /var/run/keepalived.pid
    # config: /etc/keepalived/keepalived.conf
    # chkconfig: - 21 79
    # description: Start and stop Keepalived

    # Source function library
    . /etc/rc.d/init.d/functions

    # Source configuration file (we set KEEPALIVED_OPTIONS there)
    . /etc/sysconfig/keepalived

    RETVAL=0

    prog="keepalived"

    start() {
        echo -n $"Starting $prog: "
        daemon keepalived ${KEEPALIVED_OPTIONS}
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
    }

    stop() {
        echo -n $"Stopping $prog: "
        killproc keepalived
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
    }

    reload() {
        echo -n $"Reloading $prog: "
        killproc keepalived -1
        RETVAL=$?
        echo
    }

    # See how we were called.
    case "$1" in
        start)
            start
            ;;
        stop)
            stop
            ;;
        reload)
            reload
            ;;
        restart)
            stop
            start
            ;;
        condrestart)
            if [ -f /var/lock/subsys/$prog ]; then
                stop
                start
            fi
            ;;
        status)
            status keepalived
            RETVAL=$?
            ;;
        *)
            echo "Usage: $0 {start|stop|reload|restart|condrestart|status}"
            RETVAL=1
    esac

    exit $RETVAL
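    # Contents of /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived

    global_defs {
        router_id 192.168.111.21 # must be unique on the network; no two nodes may share an id
    }

    vrrp_script chk_nginx { # health-check script that monitors the nginx service
        script "/usr/local/keepalived/check_ng.sh"
        interval 3
    }

    vrrp_instance VI_1 {
        state MASTER # this node is the master; set BACKUP on the backup node
        interface ens33 # network interface to bind to
        virtual_router_id 51 # VRRP group; master and backup must use the same value
        priority 120 # the node with the higher priority becomes master; use a lower value on the backup
        advert_int 1 # VRRP advertisement interval in seconds
        authentication { # authentication
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress { # virtual IP
            192.168.111.20
        }
        track_script { # scripts to run
            chk_nginx
        }
    }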
    # Contents of /usr/local/keepalived/check_ng.sh (make it executable with chmod +x)
    #!/bin/bash
    d=$(date +%Y%m%d_%H:%M:%S)
    n=$(ps -C nginx --no-headers | wc -l)
    if [ "$n" -eq 0 ]; then
        # nginx was built from source above and has no systemd unit, so start the binary directly
        /usr/local/nginx/sbin/nginx
        n2=$(ps -C nginx --no-headers | wc -l)
        if [ "$n2" -eq 0 ]; then
            echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
            systemctl stop keepalived
        fi
    fi
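The keepalived.conf shown above is the master's copy; the backup node needs the same file with a different state, priority, and router_id. One way to derive it is a small sed filter, sketched below. The backup values used here (priority 100 and router_id 192.168.111.22) are assumptions chosen for illustration, and make_backup_conf is a hypothetical name.

```shell
# make_backup_conf
# Reads the master's keepalived.conf on stdin and writes a backup variant to stdout.
# Assumed backup values: state BACKUP, priority 100, router_id 192.168.111.22.
make_backup_conf() {
    sed -e 's/^\([[:space:]]*state[[:space:]]\{1,\}\)MASTER/\1BACKUP/' \
        -e 's/^\([[:space:]]*priority[[:space:]]\{1,\}\)120/\1100/' \
        -e 's/^\([[:space:]]*router_id[[:space:]]\{1,\}\)192\.168\.111\.21/\1192.168.111.22/'
}

# Example (on nginx-master, then copy the result to nginx-backup):
#   make_backup_conf < /etc/keepalived/keepalived.conf > keepalived.backup.conf
```

Everything else, including virtual_router_id, the authentication block, and the virtual IP, stays identical on both nodes.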

Installing docker-ce:

Perform these steps on the RKE nodes.

    # Remove old Docker versions
    sudo apt-get remove docker docker-engine docker.io containerd runc
    # Install prerequisite packages
    sudo apt-get install -y \
        apt-transport-https \
        ca-certificates \
        curl \
        gnupg-agent \
        software-properties-common
    # Add Docker's official GPG key
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    # Add the stable apt repository
    sudo add-apt-repository \
        "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
        $(lsb_release -cs) \
        stable"
    # Install docker-ce
    sudo apt-get update
    sudo apt-get install -y docker-ce docker-ce-cli containerd.io
    # Verify the installation
    docker info
    # Add the current user to the "docker" group. This account is used later in the installation:
    # the SSH user that RKE uses to reach a node must be a member of the docker group on that node
    sudo usermod -aG docker $USER
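Since RKE depends on the SSH user being in the docker group, it helps to confirm the membership before moving on. The helper below is a sketch with a hypothetical name (has_group); it checks a group list such as the output of `id -nG`.

```shell
# has_group GROUP "GROUP LIST"
# Succeeds when GROUP appears in the space-separated list (e.g. the output of `id -nG`).
has_group() {
    local want=$1 g
    shift
    for g in $*; do
        [ "$g" = "$want" ] && return 0
    done
    return 1
}

# Example:
#   has_group docker "$(id -nG "$USER")" || echo "$USER is not in the docker group"
```

Group changes made with usermod apply to new login sessions, so log out and back in before relying on the membership.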

Configuring layer-4 load balancing

Perform these steps on the Nginx nodes.

    # Update the nginx configuration file
    # vim /usr/local/nginx/conf/nginx.conf

    #user nobody;
    worker_processes 4;
    worker_rlimit_nofile 40000;

    events {
        worker_connections 8192;
    }

    stream {
        upstream rancher_servers_http {
            least_conn;
            server 192.168.111.50:80 max_fails=3 fail_timeout=5s;
            server 192.168.111.51:80 max_fails=3 fail_timeout=5s;
            server 192.168.111.52:80 max_fails=3 fail_timeout=5s;
        }
        server {
            listen 80;
            proxy_pass rancher_servers_http;
        }

        upstream rancher_servers_https {
            least_conn;
            server 192.168.111.50:443 max_fails=3 fail_timeout=5s;
            server 192.168.111.51:443 max_fails=3 fail_timeout=5s;
            server 192.168.111.52:443 max_fails=3 fail_timeout=5s;
        }

        server {
            listen 443;
            proxy_pass rancher_servers_https;
        }
    }
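After editing nginx.conf, the running nginx still has the old configuration loaded. A cautious way to apply the change is to test the file first and reload only if the test passes, as in this sketch. The function name reload_if_valid is hypothetical; `nginx -t` and `nginx -s reload` are the standard test and reload invocations.

```shell
# reload_if_valid [NGINX_BIN]
# Tests the configuration with "-t" and sends "-s reload" only when the test succeeds.
reload_if_valid() {
    local nginx_bin=${1:-nginx}
    if ! "$nginx_bin" -t; then
        echo "configuration test failed; not reloading" >&2
        return 1
    fi
    "$nginx_bin" -s reload
}

# Example (on both nginx-master and nginx-backup):
#   reload_if_valid /usr/local/nginx/sbin/nginx
```

Testing first keeps a typo in the stream block from taking the load balancer down on reload.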

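Starting the deployment:

Installing Kubernetes with RKE

  • Set up passwordless SSH login between the RKE nodes

    # Generate an RSA key pair
    ssh-keygen
    # Copy this host's public key to all three nodes to enable passwordless login
    ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.50
    ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.51
    ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.52
    # Note: each node must also register its key with itself; run these commands on all three nodes
    # Verify
    docker@rke-node3:~$ ssh docker@192.168.111.50 # SSH from node3 to node1; no password should be requested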
  • Write the rancher-cluster.yml file

    # vim rancher-cluster.yml
    nodes:
      - address: 192.168.111.50 # node IP
        user: docker # a user that can run docker commands
        role: [controlplane,worker,etcd] # node roles
      - address: 192.168.111.51
        user: docker
        role: [controlplane,worker,etcd]
      - address: 192.168.111.52
        user: docker
        role: [controlplane,worker,etcd]

    services:
      etcd:
        snapshot: true
        creation: 6h # take a snapshot every 6 hours
        retention: 24h # keep snapshots for 24 hours
  • Run RKE to build the Kubernetes cluster

    rke up --config ./rancher-cluster.yml
    # Verify: the following message indicates success.
    # Finished building Kubernetes cluster successfully.
    • Keep the generated files safe

      rancher-cluster.yml: the RKE cluster configuration file.
      kube_config_rancher-cluster.yml: the cluster's kubeconfig file; it contains credentials with full access to the cluster.
      rancher-cluster.rkestate: the Kubernetes cluster state file; it also contains credentials with full access to the cluster.
    • After a successful run, a kube_config_rancher-cluster.yml file is generated in the current directory. Copy it to ~/.kube/kube_config_rancher-cluster.yml and point KUBECONFIG at the copy

      # Run from the user's home directory
      mkdir -p .kube
      cp kube_config_rancher-cluster.yml .kube/
      export KUBECONFIG=$HOME/.kube/kube_config_rancher-cluster.yml
      # Verify
      kubectl get nodes
      NAME             STATUS   ROLES                      AGE     VERSION
      192.168.111.50   Ready    controlplane,etcd,worker   5m47s   v1.17.5
      192.168.111.51   Ready    controlplane,etcd,worker   5m46s   v1.17.5
      192.168.111.52   Ready    controlplane,etcd,worker   5m47s   v1.17.5
    • Check that the cluster Pods are running

      Make sure all required Pods and containers are healthy before continuing:

      • Pods are in the Running or Completed state.

      • For Pods with STATUS Running, READY should show all containers running (for example, 3/3).

      • Pods with STATUS Completed are run-once jobs; for these Pods READY should be 0/1.

      kubectl get pods --all-namespaces

      NAMESPACE       NAME                                      READY   STATUS      RESTARTS   AGE
      ingress-nginx   nginx-ingress-controller-tnsn4            1/1     Running     0          30s
      ingress-nginx   nginx-ingress-controller-tw2ht            1/1     Running     0          30s
      ingress-nginx   nginx-ingress-controller-v874b            1/1     Running     0          30s
      kube-system     canal-jp4hz                               3/3     Running     0          30s
      kube-system     canal-z2hg8                               3/3     Running     0          30s
      kube-system     canal-z6kpw                               3/3     Running     0          30s
      kube-system     kube-dns-7588d5b5f5-sf4vh                 3/3     Running     0          30s
      kube-system     kube-dns-autoscaler-5db9bbb766-jz2k6      1/1     Running     0          30s
      kube-system     metrics-server-97bc649d5-4rl2q            1/1     Running     0          30s
      kube-system     rke-ingress-controller-deploy-job-bhzgm   0/1     Completed   0          30s
      kube-system     rke-kubedns-addon-deploy-job-gl7t4        0/1     Completed   0          30s
      kube-system     rke-metrics-addon-deploy-job-7ljkc        0/1     Completed   0          30s
      kube-system     rke-network-plugin-deploy-job-6pbgj       0/1     Completed   0          30s
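The Pod health criteria can also be applied mechanically instead of by eye. The filter below is a rough sketch (unhealthy_pods is a hypothetical name): it reads the output of `kubectl get pods --all-namespaces --no-headers` and prints only Pods that are neither Completed nor Running with every container ready.

```shell
# unhealthy_pods
# Reads "NAMESPACE NAME READY STATUS ..." lines on stdin and prints the Pods that
# are neither Completed nor Running with all containers ready.
unhealthy_pods() {
    awk '{
        split($3, ready, "/")                     # READY column, e.g. "3/3"
        if ($4 == "Completed") next               # run-once jobs are fine
        if ($4 == "Running" && ready[1] == ready[2]) next
        print $1 "/" $2 " " $3 " " $4
    }'
}

# Example:
#   kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

An empty result means every Pod meets the criteria; anything printed deserves a closer look with kubectl describe before continuing.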

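Installing Rancher

  • Add the Helm chart repository

    helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
  • Create a namespace for Rancher

    kubectl create namespace cattle-system
  • Use Rancher-generated self-signed certificates (issued through cert-manager, installed below)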

    # Install the CustomResourceDefinition resources

    kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.12/deploy/manifests/00-crds.yaml

    # Important:
    # If you are running Kubernetes v1.15 or earlier,
    # add the `--validate=false` flag to the kubectl apply command above.
    # Otherwise you will get a validation error about the
    # x-kubernetes-preserve-unknown-fields field in cert-manager's CustomResourceDefinition resources.
    # This is a benign error caused by the way kubectl performs resource validation.

    # Create a namespace for cert-manager
    kubectl create namespace cert-manager

    # Add the Jetstack Helm repository
    helm repo add jetstack https://charts.jetstack.io

    # Update the local Helm chart repository cache
    helm repo update

    # Install the cert-manager Helm chart
    helm install \
        cert-manager jetstack/cert-manager \
        --namespace cert-manager \
        --version v0.12.0

    # Verify
    kubectl get pods --namespace cert-manager

    NAME                                      READY   STATUS    RESTARTS   AGE
    cert-manager-754d9b75d9-6xbk4             1/1     Running   0          94s
    cert-manager-cainjector-85fbdf788-hthfn   1/1     Running   0          94s
    cert-manager-webhook-76f9b64b45-bmt5z     1/1     Running   0          94s
  • Deploy Rancher

    helm install rancher rancher-stable/rancher \
        --namespace cattle-system \
        --set hostname=rancher.hzqx.com
  • Wait for Rancher to finish rolling out

    kubectl -n cattle-system rollout status deploy/rancher
    Waiting for deployment "rancher" rollout to finish: 0 of 3 updated replicas are available...
    deployment "rancher" successfully rolled out
  • Setup is complete. Visit https://rancher.hzqx.com; the default username and password are both admin.
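The hostname rancher.hzqx.com has to resolve to the load balancer before the URL works in a browser. If no DNS record exists, a hosts entry on the client machine is one option. The sketch below assumes the Keepalived virtual IP 192.168.111.20 is the address clients should use; ensure_hosts_entry is a hypothetical helper that appends the entry only when it is missing.

```shell
# ensure_hosts_entry FILE IP NAME
# Appends "IP NAME" to FILE unless a line for IP already lists NAME.
ensure_hosts_entry() {
    local file=$1 ip=$2 name=$3
    if awk -v ip="$ip" -v n="$name" '
           $1 == ip { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
           END { exit(found ? 0 : 1) }
       ' "$file"; then
        return 0                       # entry already present; nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (needs write access to the file, e.g. run as root):
#   ensure_hosts_entry /etc/hosts 192.168.111.20 rancher.hzqx.com
```

Running it repeatedly is safe, because the entry is written only once.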
