PostgreSQL High Availability in Practice with Pacemaker + Corosync + pcs
Introduction
Among PostgreSQL HA solutions, streaming replication combines good performance, reliability, and low deployment cost, and is the most widely adopted approach.
Among the cluster managers for streaming-replication HA, Pacemaker + Corosync is comparatively mature and reliable.
Features
- Fast failover
- Multi-node clusters
- Synchronous and asynchronous replication
- A read-write VIP and a read-only VIP
Architecture / How It Works
- Pacemaker + Corosync form the cluster foundation: Corosync handles cluster messaging and membership management, Pacemaker handles resource management.
- The cluster resources include PostgreSQL and the VIPs. The Resource Agent (RA) for PostgreSQL is expgsql, which performs PostgreSQL start/stop, monitoring, and failover.
- At initial cluster startup, expgsql compares the xlog positions of all nodes and picks the node with the newest xlog as the Master; the other nodes become Slaves and connect to the Master through the read-write VIP for WAL replication.
- Once the cluster is running, expgsql continuously monitors PostgreSQL's health. When it detects a PostgreSQL resource failure, it reports to Pacemaker, which takes the appropriate action.
  If the PostgreSQL process fails, PostgreSQL is restarted in place and the node's fail-count is incremented. When fail-count reaches 3, the PostgreSQL resource is no longer assigned to that node; if that node was the Master, a Slave is promoted to Master, i.e. a failover is initiated.
- When Corosync detects a node failure (host or network), Pacemaker likewise takes the appropriate action.
  In a multi-node cluster, a partition that does not contain a majority of the members proactively releases all resources inside that partition, including PostgreSQL and the VIPs.
  If the quorate partition has no Master, Pacemaker promotes a Slave to Master, i.e. initiates a failover.
- The expgsql on the Master continuously monitors the replication health of the Slaves; under synchronous replication it designates one Slave as the synchronous Slave.
- When the synchronous Slave fails, the Master's expgsql temporarily switches to asynchronous replication so that writes on the Master are not hung. When the failed Slave recovers, or another healthy Slave exists, it switches back to synchronous replication.
- To prevent split brain and double writes after a cluster partition (a Slave promoted to new Master while the old Master falls back to asynchronous replication), a distributed lock service is used for arbitration. Promoting a Slave to new Master and switching the old Master to asynchronous replication both require acquiring the lock first, so the two cannot happen at the same time. A Master that loses the lock proactively stops its PostgreSQL process to prevent dual masters.
- If the distributed lock service fails while all PostgreSQL nodes are healthy, expgsql ignores the lock service, so cluster service is unaffected. However, if the Master suffers a node failure (note the distinction between node failure and resource failure) during a lock-service outage, the cluster cannot fail over normally.
- Under synchronous replication, only the synchronous Slave is eligible to become the candidate Master; together with the distributed-lock protection this ensures no data is lost after a failover.
- At initial cluster startup and on every failover, a Slave is promoted to Master via pg_ctl promote, which increments the timeline; the Master node name, the timeline, and the xlog position at switchover are recorded in the cluster CIB.
- On cluster restart, the Master node is determined from the information recorded in the CIB, and the timeline is kept unchanged.
- Before starting PostgreSQL, expgsql checks the node's timeline and xlog against the information recorded in the CIB; on conflict it reports an error, which must be repaired manually, e.g. via cls_repair_by_pg_rewind.
- The read-write VIP is bound to the Master node and the read-only VIP to one of the Slaves; applications only access the VIPs and need not care which node they actually reach.
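The initial master election described above can be sketched in shell: compare each node's WAL (xlog) position and promote the node with the newest one. The node names and LSN values below are made-up samples for illustration, not output from this cluster; the real RA reads each node's actual replayed position.

```shell
# Fold an LSN "X/Y" (two hex words) into one integer so LSNs can be compared.
lsn_to_num() {
  hi=${1%/*}
  lo=${1#*/}
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Hypothetical node:LSN pairs; pick the node with the largest LSN.
best_node=""
best_lsn=-1
for entry in "pg-151:0/2D004748" "pg-152:0/2D004640" "pg-153:0/2C000000"; do
  node=${entry%%:*}
  lsn=${entry#*:}
  n=$(lsn_to_num "$lsn")
  if [ "$n" -gt "$best_lsn" ]; then
    best_lsn=$n
    best_node=$node
  fi
done
echo "promote: $best_node"
```

With these sample values pg-151 holds the newest position and would be chosen as Master.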
Common cluster commands
pcs status
pcs resource show
pcs resource create ClusterIP IPaddr2 ip=192.168.0.120 cidr_netmask=32
pcs resource cleanup
pcs resource list
pcs resource restart
pcs resource enable
pcs resource disable
pcs resource delete
crm_mon -Arf -1
Common pg_ctl commands
Usage:
pg_ctl init[db] [-D DATADIR] [-s] [-o OPTIONS]
pg_ctl start [-D DATADIR] [-l FILENAME] [-W] [-t SECS] [-s] [-o OPTIONS] [-p PATH] [-c]
pg_ctl stop [-D DATADIR] [-m SHUTDOWN-MODE] [-W] [-t SECS] [-s]
pg_ctl restart [-D DATADIR] [-m SHUTDOWN-MODE] [-W] [-t SECS] [-s] [-o OPTIONS] [-c]
pg_ctl reload [-D DATADIR] [-s]
pg_ctl status [-D DATADIR]
pg_ctl promote [-D DATADIR] [-W] [-t SECS] [-s]
pg_ctl logrotate [-D DATADIR] [-s]
pg_ctl kill SIGNALNAME PID
Cluster Practice
Environment
OS version
CentOS Linux release 7.9.2009 (Core)
Software versions
PostgreSQL: 14
PCS-related package versions:
[root@k8s-master01 pgsql_cluster]# rpm -qa|grep pacemaker
pacemaker-cli-1.1.23-1.el7_9.1.x86_64
pacemaker-cluster-libs-1.1.23-1.el7_9.1.x86_64
pacemaker-1.1.23-1.el7_9.1.x86_64
pacemaker-libs-1.1.23-1.el7_9.1.x86_64
[root@k8s-master01 pgsql_cluster]# rpm -qa|grep pcs
pcs-0.9.169-3.el7.centos.3.x86_64
[root@k8s-master01 pgsql_cluster]# rpm -qa|grep corosync
corosync-2.4.5-7.el7_9.2.x86_64
corosynclib-2.4.5-7.el7_9.2.x86_64
Address information
192.168.28.151 pg-151
192.168.28.152 pg-152
192.168.28.153 pg-153
192.168.28.41 vip-master
192.168.28.141 vip-slave
Base environment preparation
Configure hostnames [all servers]
cat /etc/hosts
192.168.28.151 pg-151
192.168.28.152 pg-152
192.168.28.153 pg-153
Disable the firewall and SELinux [all servers]
systemctl disable firewalld
systemctl stop firewalld
systemctl status firewalld
sestatus
SELinux status: disabled
Server time synchronization
ntpdate ntp1.aliyun.com
Cluster software installation
Install pcs
yum -y install libsmb*
yum install -y pacemaker pcs corosync
yum install -y autoconf automake libtool
yum install -y docbook-style-xsl
yum install -y gcc-c++ glib2-devel
Note that installation may fail with "Missing Dependency: kernel-headers"; installing kernel-headers resolves the problem, as follows:
wget http://mirror.centos.org/centos/7/updates/x86_64/Packages/kernel-headers-3.10.0-1160.102.1.el7.x86_64.rpm
rpm -ivh kernel-headers-3.10.0-1160.102.1.el7.x86_64.rpm
Update the pacemaker resource-agents [all servers]
wget https://github.com/ClusterLabs/resource-agents/archive/refs/tags/v4.12.0.zip -O resource-agents-4.12.0.zip
unzip resource-agents-4.12.0.zip
cd resource-agents-4.12.0
./autogen.sh
./configure
make && make install
Confirm support for PG12 and above
In /usr/lib/ocf/resource.d/heartbeat/pgsql, around line 1918, the script should contain ocf_version_cmp "$version" "12":
if is_replication || [ "$OCF_RESKEY_rep_mode" = "slave" ]; then
    if [ `printf "$version\n9.1" | sort -n | head -1` != "9.1" ]; then
        ocf_exit_reason "Replication mode needs PostgreSQL 9.1 or higher."
        return $OCF_ERR_INSTALLED
    fi
    ocf_version_cmp "$version" "12"
    rc=$?
    if [ $rc -eq 1 ]||[ $rc -eq 2 ]; then
Start services [all servers]
systemctl start pcsd
systemctl enable pcsd
systemctl enable corosync
systemctl enable pacemaker
Cluster setup
1. Set the hacluster user password [all servers]
echo hacluster|passwd hacluster --stdin
2. Cluster authentication (any node)
pcs cluster auth -u hacluster -p hacluster pg-151 pg-152 pg-153
3. Create and synchronize the cluster configuration (any node)
pcs cluster setup --name pgcluster pg-151 pg-152 pg-153
Note: --force forcibly overwrites any existing cluster configuration.
4. Start the cluster (any node)
# start the cluster
pcs cluster start
# check cluster status
pcs status
Install PostgreSQL 14 [all servers]
Install via yum
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum install -y postgresql14-server
# Install the packages only; do not run postgresql-14-setup initdb here and do
# not enable or start the postgresql-14 service. The data directory is created
# under /data below, and PostgreSQL is started by the cluster software.
Create database directories [all servers]
mkdir -p /data/postgresql/pgdata/pg-5432
mkdir -p /data/postgresql/pg_archive/pg-5432
chown -R postgres:postgres /data/postgresql
chmod 700 /data/postgresql
Configure the postgres user's environment variables [all servers]
su - postgres
-bash-4.2$ cat .bash_profile
[ -f /etc/profile ] && source /etc/profile
PGDATA=/data/postgresql/pgdata/pg-5432
export PGDATA
export PATH=/usr/pgsql-14/bin:$PATH
[ -f /var/lib/pgsql/.pgsql_profile ] && source /var/lib/pgsql/.pgsql_profile
Master node (pg-151) database configuration
- Initialize the database
initdb -D /data/postgresql/pgdata/pg-5432/
- Edit the configuration file and pg_hba
postgresql.conf
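The original does not list the postgresql.conf changes. A minimal sketch of the replication-related settings for this layout might look like the following; the values are illustrative assumptions and should be tuned for your environment:

```ini
listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
wal_keep_size = 512MB          # wal_keep_segments on PG 12 and older
hot_standby = on
archive_mode = on
archive_command = 'cp %p /data/postgresql/pg_archive/pg-5432/%f'
```

The archive path matches the pg_archive directory created earlier; the pgsql RA manages the standby connection settings itself.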
pg_hba.conf (host access control)
local all all trust
host all all 192.168.28.0/24 md5
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication repluser 192.168.28.0/24 md5
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
Create the replication user (master node)
First start the master database
pg_ctl start
Then create the replication user
-bash-4.2$ psql
psql (14.10)
Type "help" for help.
postgres=# CREATE USER repluser WITH REPLICATION PASSWORD 'repluser';
CREATE ROLE
Create the slave nodes (pg-152, pg-153)
pg_basebackup -h pg-151 -U repluser -p 5432 -D /data/postgresql/pgdata/pg-5432/ -X stream -P
Stop the master node (pg-151)
pg_ctl stop
Configure the pgsql cluster
1. Create the cluster_setup.sh script
pcs cluster cib pgsql_cfg
pcs -f pgsql_cfg property set no-quorum-policy="ignore"
pcs -f pgsql_cfg property set stonith-enabled="false"
pcs -f pgsql_cfg resource defaults resource-stickiness="INFINITY"
pcs -f pgsql_cfg resource defaults migration-threshold="1"
pcs -f pgsql_cfg resource create vip-master IPaddr2 \
ip="192.168.28.41" \
nic="eth0" \
cidr_netmask="24" \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="block"
pcs -f pgsql_cfg resource create vip-slave IPaddr2 \
ip="192.168.28.141" \
nic="eth0" \
cidr_netmask="24" \
meta migration-threshold="0" \
op start timeout="60s" interval="0s" on-fail="stop" \
op monitor timeout="60s" interval="10s" on-fail="restart" \
op stop timeout="60s" interval="0s" on-fail="ignore"
pcs -f pgsql_cfg resource create pgsql pgsql \
pgctl="/usr/pgsql-14/bin/pg_ctl" \
psql="/usr/pgsql-14/bin/psql" \
pgdata="/data/postgresql/pgdata/pg-5432" \
config="/data/postgresql/pgdata/pg-5432/postgresql.conf" \
rep_mode="async" \
node_list="pg-151 pg-152 pg-153" \
master_ip="192.168.28.41" \
repuser="repluser" \
primary_conninfo_opt="password=repluser keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
restart_on_promote='true' \
op start timeout="60s" interval="0s" on-fail="restart" \
op monitor timeout="60s" interval="4s" on-fail="restart" \
op monitor timeout="60s" interval="3s" on-fail="restart" role="Master" \
op promote timeout="60s" interval="0s" on-fail="restart" \
op demote timeout="60s" interval="0s" on-fail="stop" \
op stop timeout="60s" interval="0s" on-fail="block" \
op notify timeout="60s" interval="0s"
pcs -f pgsql_cfg resource master msPostgresql pgsql \
master-max=1 master-node-max=1 clone-max=5 clone-node-max=1 notify=true
pcs -f pgsql_cfg resource group add master-group vip-master
pcs -f pgsql_cfg resource group add slave-group vip-slave
pcs -f pgsql_cfg constraint colocation add master-group with master msPostgresql INFINITY
pcs -f pgsql_cfg constraint order promote msPostgresql then start master-group symmetrical=false score=INFINITY
pcs -f pgsql_cfg constraint order demote msPostgresql then stop master-group symmetrical=false score=0
pcs -f pgsql_cfg constraint colocation add slave-group with slave msPostgresql INFINITY
pcs -f pgsql_cfg constraint order promote msPostgresql then start slave-group symmetrical=false score=INFINITY
pcs -f pgsql_cfg constraint order demote msPostgresql then stop slave-group symmetrical=false score=0
pcs cluster cib-push pgsql_cfg
Run the script
chmod +x cluster_setup.sh
sh cluster_setup.sh
2. How to modify the cluster configuration
# Save the current cluster configuration to tmp.xml
cibadmin --query > tmp.xml
# Edit the XML file
vi tmp.xml
# Replace the running cluster configuration with the edited file
cibadmin --replace --xml-file tmp.xml
cibadmin is the low-level administration command for the cluster CIB. It can dump, update, or modify all or part of the CIB, delete the entire CIB, or perform miscellaneous CIB administration operations.
The cluster configuration is a key component of the CIB in a Pacemaker cluster. It determines how the cluster should work and its eventual running state, since only a correct configuration can drive the cluster resources into a normal state. The configuration normally consists of four sections: cluster options (crm_config), cluster nodes (nodes), cluster resources (resources), and resource constraints (constraints), all of which can be inspected with cibadmin --query.
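As a small illustration of the query-edit-replace workflow, the snippet below edits a hand-made toy CIB fragment; it is not real cibadmin output. On a live cluster, tmp.xml would come from cibadmin --query and be pushed back with cibadmin --replace:

```shell
# Toy stand-in for `cibadmin --query > tmp.xml`
cat > tmp.xml <<'EOF'
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <nvpair id="nqp" name="no-quorum-policy" value="ignore"/>
  </cluster_property_set>
</crm_config>
EOF

# The edit step: flip no-quorum-policy from "ignore" to "stop"
sed -i 's/name="no-quorum-policy" value="ignore"/name="no-quorum-policy" value="stop"/' tmp.xml

# On a real cluster you would now push the change back:
#   cibadmin --replace --xml-file tmp.xml
grep -c 'value="stop"' tmp.xml
```

In practice, property changes like this one are usually made with pcs property set instead; direct CIB edits are for cases pcs does not cover.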
3. Cluster status checks
1. Start the cluster and enable it at boot
The database itself needs no auto-start configuration; the cluster software brings it up.
pcs cluster start
pcs cluster enable
2. Check the cluster status
[root@k8s-master01 pgsql_cluster]# pcs status
Cluster name: pgcluster
Stack: corosync
Current DC: pg-153 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Tue Nov 21 16:47:19 2023
Last change: Tue Nov 21 16:38:42 2023 by root via crm_attribute on pg-151
3 nodes configured
7 resource instances configured
Online: [ pg-151 pg-152 pg-153 ]
Full list of resources:
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ pg-151 ]
     Slaves: [ pg-152 pg-153 ]
 Resource Group: master-group
     vip-master (ocf::heartbeat:IPaddr2): Started pg-151
 Resource Group: slave-group
     vip-slave (ocf::heartbeat:IPaddr2): Started pg-152
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
3. Check local cluster and resource status
crm_mon -Afr -1
[root@k8s-master01 pgsql_cluster]# crm_mon -Afr -1
Stack: corosync
Current DC: pg-153 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Tue Nov 21 16:47:54 2023
Last change: Tue Nov 21 16:38:42 2023 by root via crm_attribute on pg-151
3 nodes configured
7 resource instances configured
Online: [ pg-151 pg-152 pg-153 ]
Full list of resources:
Master/Slave Set: msPostgresql [pgsql]
Masters: [ pg-151 ]
Slaves: [ pg-152 pg-153 ]
Resource Group: master-group
vip-master (ocf::heartbeat:IPaddr2): Started pg-151
Resource Group: slave-group
vip-slave (ocf::heartbeat:IPaddr2): Started pg-152
Node Attributes:
* Node pg-151:
+ master-pgsql : 1000
+ pgsql-data-status : LATEST
+ pgsql-master-baseline : 00000000220000A0
+ pgsql-status : PRI
* Node pg-152:
+ master-pgsql : -INFINITY
+ pgsql-data-status : STREAMING|ASYNC
+ pgsql-status : HS:async
* Node pg-153:
+ master-pgsql : -INFINITY
+ pgsql-data-status : STREAMING|ASYNC
+ pgsql-status : HS:async
Migration Summary:
* Node pg-153:
* Node pg-152:
* Node pg-151:
4. Check streaming replication status on the master node
su - postgres
-bash-4.2$ psql
psql (14.10)
Type "help" for help.
postgres=# select client_addr, sync_state from pg_stat_replication;
  client_addr   | sync_state
----------------+------------
 192.168.28.153 | async
 192.168.28.152 | async
(2 rows)
5. Verify the VIPs [all servers]
ip addr
Verify data replication
Create a new table on the master node
postgres=# create table test (id int);
CREATE TABLE
postgres=# insert into test values (1);
INSERT 0 1
postgres=# select * from test;
 id
----
  1
(1 row)
Check on a slave node
-bash-4.2$ psql
psql (14.10)
Type "help" for help.
postgres=# select * from test;
 id
----
  1
(1 row)
Failure simulation
Stop the primary on pg-151; the standby pg-152 is promoted to master
-bash-4.2$ pg_ctl stop
waiting for server to shut down.... done
[root@k8s-master01 pgsql_cluster]# pcs status
Cluster name: pgcluster
Stack: corosync
Current DC: pg-153 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Tue Nov 21 16:59:30 2023
Last change: Tue Nov 21 16:59:29 2023 by root via crm_attribute on pg-152
3 nodes configured
7 resource instances configured
Online: [ pg-151 pg-152 pg-153 ]
Full list of resources:
Master/Slave Set: msPostgresql [pgsql]
Masters: [ pg-152 ]
Slaves: [ pg-153 ]
Stopped: [ pg-151 ]
Resource Group: master-group
vip-master (ocf::heartbeat:IPaddr2): Started pg-152
Resource Group: slave-group
vip-slave (ocf::heartbeat:IPaddr2): Started pg-153
Failed Resource Actions:
* pgsql_monitor_3000 on pg-151 'not running' (7): call=92, status=complete, exitreason='',
last-rc-change='Tue Nov 21 16:58:59 2023', queued=0ms, exec=0ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Note: after the master node goes down, the output of crm_mon -Arf -1 ends with "You have to remove /var/lib/pgsql/tmp/PGSQL.lock file to force start." When the downed master is brought back, this stale lock file must be deleted before the cluster roles can be switched, i.e. run rm -rf /var/lib/pgsql/tmp/PGSQL.lock
To bring the database back up, do not run pg_ctl start; just run pcs resource cleanup.
[root@k8s-master01 pgsql_cluster]# pcs resource cleanup
Cleaned up all resources on all nodes
Waiting for 1 reply from the CRMd. OK
[root@k8s-master01 pgsql_cluster]# pcs status
Cluster name: pgcluster
Stack: corosync
Current DC: pg-153 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Tue Nov 21 17:01:57 2023
Last change: Tue Nov 21 17:01:26 2023 by hacluster via crmd on pg-151
3 nodes configured
7 resource instances configured
Online: [ pg-151 pg-152 pg-153 ]
Full list of resources:
Master/Slave Set: msPostgresql [pgsql]
Masters: [ pg-152 ]
Slaves: [ pg-153 ]
Stopped: [ pg-151 ]
Resource Group: master-group
vip-master (ocf::heartbeat:IPaddr2): Started pg-152
Resource Group: slave-group
vip-slave (ocf::heartbeat:IPaddr2): Started pg-153
Failed Resource Actions:
* pgsql_start_0 on pg-151 'unknown error' (1): call=127, status=complete, exitreason='My data may be inconsistent. You have to remove /var/lib/pgsql/tmp/PGSQL.lock file to force start.',
last-rc-change='Tue Nov 21 17:01:28 2023', queued=1ms, exec=615ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Now you can see that pg-153 and pg-151 are standbys of pg-152:
postgres=# select * from pg_stat_replication;
  pid  | usesysid | usename  | application_name |  client_addr   | client_hostname | client_port |         backend_start         | backend_xmin |   state   |  sent_lsn  | write_lsn  | flush_lsn  | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time
-------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
 13234 |    16384 | repluser | pg-153           | 192.168.28.153 |                 |       60624 | 2023-11-21 08:59:27.197402+00 |        12918 | streaming | 0/2D004748 | 0/2D004748 | 0/2D004748 | 0/2D004748 |           |           |            |             0 | async      | 2023-11-21 09:04:16.90261+00
 31795 |    16384 | repluser | pg-151           | 192.168.28.151 |                 |       44678 | 2023-11-21 09:04:05.741284+00 |        12918 | streaming | 0/2D004748 | 0/2D004748 | 0/2D004748 | 0/2D004748 |           |           |            |             0 | async      | 2023-11-21 09:04:15.822551+00
(2 rows)
References:
PgSQL Replicated Cluster
📌 Repost information
Original author: xxdtb
Repost time: 2026/1/19 17:53:11