MySQL高可用组件之orchestrator

orchestrator是什么

介绍

特点

功能限制

注意事项

orchestrator的工作原理

orchestrator的探测机制

show global status like 'Uptime'
select @@global.hostname, ifnull(@@global.report_host, ''), @@global.server_id, @@global.version, @@global.version_comment, @@global.read_only, @@global.binlog_format, @@global.log_bin, @@global.log_slave_updates
show master status
show global status like 'rpl_semi_sync_%_status'
select @@global.gtid_mode, @@global.server_uuid, @@global.gtid_executed, @@global.gtid_purged, @@global.master_info_repository = 'TABLE', @@global.binlog_row_image
show slave status
select count(*) > 0 and MAX(User_name) != '' from mysql.slave_master_info
show slave hosts
select substring_index(host, ':', 1) as slave_hostname from information_schema.processlist where command IN ('Binlog Dump', 'Binlog Dump GTID')
SELECT SUBSTRING_INDEX(@@hostname, '.', 1) 
  mysql> desc database_instance;
+------------------------------------------+-------------------+------+-----+---------------------+-------------------+
| Field                                    | Type              | Null | Key | Default             | Extra             |
+------------------------------------------+-------------------+------+-----+---------------------+-------------------+
| hostname                                 | varchar(128)      | NO   | PRI | NULL                |                   |
| port                                     | smallint unsigned | NO   | PRI | NULL                |                   |
| last_checked                             | timestamp         | NO   | MUL | CURRENT_TIMESTAMP   | DEFAULT_GENERATED |
| last_attempted_check                     | timestamp         | NO   |     | 1971-01-01 00:00:00 |                   |
| last_check_partial_success               | tinyint unsigned  | NO   |     | NULL                |                   |
| last_seen                                | timestamp         | YES  | MUL | NULL                |                   |
| uptime                                   | int unsigned      | NO   |     | NULL                |                   |
| server_id                                | int unsigned      | NO   |     | NULL                |                   |
| server_uuid                              | varchar(64)       | NO   |     | NULL                |                   |
| version                                  | varchar(128)      | NO   |     | NULL                |                   |
| binlog_server                            | tinyint unsigned  | NO   |     | NULL                |                   |
| read_only                                | tinyint unsigned  | NO   |     | NULL                |                   |
| binlog_format                            | varchar(16)       | NO   |     | NULL                |                   |
| log_bin                                  | tinyint unsigned  | NO   |     | NULL                |                   |
| log_slave_updates                        | tinyint unsigned  | NO   |     | NULL                |                   |
| binary_log_file                          | varchar(128)      | NO   |     | NULL                |                   |
| binary_log_pos                           | bigint unsigned   | NO   |     | NULL                |                   |
| master_host                              | varchar(128)      | NO   | MUL | NULL                |                   |
| master_port                              | smallint unsigned | NO   |     | NULL                |                   |
| slave_sql_running                        | tinyint unsigned  | NO   |     | NULL                |                   |
| slave_io_running                         | tinyint unsigned  | NO   |     | NULL                |                   |
| replication_sql_thread_state             | tinyint           | NO   |     | 0                   |                   |
| replication_io_thread_state              | tinyint           | NO   |     | 0                   |                   |
| has_replication_filters                  | tinyint unsigned  | NO   |     | NULL                |                   |
| oracle_gtid                              | tinyint unsigned  | NO   |     | NULL                |                   |
| master_uuid                              | varchar(64)       | NO   |     | NULL                |                   |
| ancestry_uuid                            | text              | NO   |     | NULL                |                   |
| executed_gtid_set                        | text              | NO   |     | NULL                |                   |
| gtid_purged                              | text              | NO   |     | NULL                |                   |
| gtid_errant                              | text              | NO   |     | NULL                |                   |
| supports_oracle_gtid                     | tinyint unsigned  | NO   |     | NULL                |                   |
| mariadb_gtid                             | tinyint unsigned  | NO   |     | NULL                |                   |
| pseudo_gtid                              | tinyint unsigned  | NO   |     | NULL                |                   |
| master_log_file                          | varchar(128)      | NO   |     | NULL                |                   |
| read_master_log_pos                      | bigint unsigned   | NO   |     | NULL                |                   |
| relay_master_log_file                    | varchar(128)      | NO   |     | NULL                |                   |
| exec_master_log_pos                      | bigint unsigned   | NO   |     | NULL                |                   |
| relay_log_file                           | varchar(128)      | NO   |     | NULL                |                   |
| relay_log_pos                            | bigint unsigned   | NO   |     | NULL                |                   |
| last_sql_error                           | text              | NO   |     | NULL                |                   |
| last_io_error                            | text              | NO   |     | NULL                |                   |
| seconds_behind_master                    | bigint unsigned   | YES  |     | NULL                |                   |
| slave_lag_seconds                        | bigint unsigned   | YES  |     | NULL                |                   |
| sql_delay                                | int unsigned      | NO   |     | NULL                |                   |
| allow_tls                                | tinyint unsigned  | NO   |     | NULL                |                   |
| num_slave_hosts                          | int unsigned      | NO   |     | NULL                |                   |
| slave_hosts                              | text              | NO   |     | NULL                |                   |
| cluster_name                             | varchar(128)      | NO   | MUL | NULL                |                   |
| suggested_cluster_alias                  | varchar(128)      | NO   | MUL | NULL                |                   |
| data_center                              | varchar(32)       | NO   |     | NULL                |                   |
| region                                   | varchar(32)       | NO   |     | NULL                |                   |
| physical_environment                     | varchar(32)       | NO   |     | NULL                |                   |
| instance_alias                           | varchar(128)      | NO   |     | NULL                |                   |
| semi_sync_enforced                       | tinyint unsigned  | NO   |     | NULL                |                   |
| semi_sync_available                      | tinyint unsigned  | NO   |     | 0                   |                   |
| replication_depth                        | tinyint unsigned  | NO   |     | NULL                |                   |
| is_co_master                             | tinyint unsigned  | NO   |     | NULL                |                   |
| replication_credentials_available        | tinyint unsigned  | NO   |     | NULL                |                   |
| has_replication_credentials              | tinyint unsigned  | NO   |     | NULL                |                   |
| version_comment                          | varchar(128)      | NO   |     |                     |                   |
| major_version                            | varchar(16)       | NO   |     | NULL                |                   |
| binlog_row_image                         | varchar(16)       | NO   |     | NULL                |                   |
| last_discovery_latency                   | bigint            | NO   |     | NULL                |                   |
| semi_sync_master_enabled                 | tinyint unsigned  | NO   |     | NULL                |                   |
| semi_sync_master_timeout                 | bigint unsigned   | NO   |     | 0                   |                   |
| semi_sync_master_wait_for_slave_count    | int unsigned      | NO   |     | 0                   |                   |
| semi_sync_master_status                  | tinyint unsigned  | NO   |     | 0                   |                   |
| semi_sync_master_clients                 | int unsigned      | NO   |     | 0                   |                   |
| semi_sync_replica_status                 | tinyint unsigned  | NO   |     | 0                   |                   |
| semi_sync_replica_enabled                | tinyint unsigned  | NO   |     | NULL                |                   |
| gtid_mode                                | varchar(32)       | NO   |     | NULL                |                   |
| replication_group_name                   | varchar(64)       | NO   |     |                     |                   |
| replication_group_is_single_primary_mode | tinyint unsigned  | NO   |     | 1                   |                   |
| replication_group_member_state           | varchar(16)       | NO   |     |                     |                   |
| replication_group_member_role            | varchar(16)       | NO   |     |                     |                   |
| replication_group_members                | text              | NO   |     | NULL                |                   |
| replication_group_primary_host           | varchar(128)      | NO   |     |                     |                   |
| replication_group_primary_port           | smallint unsigned | NO   |     | 0                   |                   |
+------------------------------------------+-------------------+------+-----+---------------------+-------------------+

orchestrator的探测判定

orchestrator的安装和配置

安装环境

安装步骤

每台管理机上执行以下步骤

cd /data/soft/
wget "https://github.com/openark/orchestrator/releases/download/v3.2.6/orchestrator-3.2.6-1.x86_64.rpm"


rpm -ivh  orchestrator-3.2.6-1.x86_64.rpm #这里产生报错
error: Failed dependencies:
        jq >= 1.5 is needed by orchestrator-1:3.2.6-1.x86_64

wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm
yum install -y jq
rpm -ivh epel-release-latest-7.noarch.rpm
rpm -ivh  orchestrator-3.2.6-1.x86_64.rpm

元数据库实例mysql3347上执行

CREATE DATABASE orche;

CREATE USER 'orche_owner'@'%' IDENTIFIED BY '**********';

GRANT ALL PRIVILEGES ON orche.* TO 'orche_owner'@'%';

修改配置文件

cp /usr/local/orchestrator/orchestrator-sample.conf.json /etc/orchestrator.conf.json

 vi /etc/orchestrator.conf.json

配置内容主要以下

"MySQLOrchestratorHost": "元数据库实例ip",
"MySQLOrchestratorPort": 3347,
"MySQLOrchestratorDatabase": "orche",
"MySQLOrchestratorUser": "orche_owner",
"MySQLOrchestratorPassword": "*****元数据库实例密码*****", 
.........
"MySQLTopologyUser": "dba", 
"MySQLTopologyPassword": "****mysql实例管理员密码****",

........
"RaftEnabled": true, 
"BackendDB": "mysql", 
"RaftBind": "dboop01", #这里注意每个节点写本地的ip
"RaftDataDir": "/usr/local/orchestrator", 
"DefaultRaftPort": 9108,
"RaftNodes": [ 
 "dboop01",
 "dboop02",
 "dboop03"
]

启动

cd /usr/local/orchestrator
nohup ./orchestrator -config /etc/orchestrator.conf.json http &

orchestrator的管理和使用

使用Web 界面来管理

使用API方式来管理

命令行工具

cd /usr/local/orchestrator
orchestrator -c clusters
# 提示报错:FATAL Orchestrator configured to run raft ("RaftEnabled": true). All access must go through the web API of the active raft node. You may use the orchestrator-client script which has a similar interface to the command line invocation. You may override this with --ignore-raft-setup
#方法一
orchestrator -c clusters --ignore-raft-setup

#方法二
cd /usr/local/orchestrator/resources/bin
export ORCHESTRATOR_API="http://dboop01:9380/api http://dboop02:9380/api http://dboop03:9380/api"  #这个设置建议加在/etc/profile 中
orchestrator-client -c clusters

orchestrator常用命令

a.查询信息相关

# 列出当前探测的所有MySQL集群
orchestrator-client -c clusters
 
# 列出当前orchestrator探测到的所有MySQL集群别名
orchestrator-client -c clusters-alias

# 列出当前探测到的所有集群可写的master实例
orchestrator-client -c all-clusters-masters
 
 
# 显示整个集群的拓扑结构
orchestrator-client -c topology -i mysql-58:3360
orchestrator-client -c topology -alias mysql-58
    #这两个命令都会返回
    mysql-58:3360     [0s,ok,8.0.22,rw,ROW,>>,GTID]
    + mysql-59:3360   [0s,ok,8.0.22,ro,ROW,>>,GTID]
      + mysql-60:3360 [0s,ok,8.0.22,ro,ROW,>>,GTID]

b.发现和删除实例

# 发现一个实例并拓扑其所在的MySQL集群拓扑结构;
orchestrator-client -c discover -i mysql-59:3360

# 忘记某个实例
orchestrator-client -c forget -i mysql-59:3360

# 忘记整个集群
orchestrator-client -c forget-cluster -alias mysql-58

c.拓扑重构

orchestrator-client -c topology -alias mysql-58
mysql-58:3360     [0s,ok,8.0.22,rw,ROW,>>,GTID]
+ mysql-59:3360   [0s,ok,8.0.22,ro,ROW,>>,GTID]
  + mysql-60:3360 [0s,ok,8.0.22,ro,ROW,>>,GTID]

#将mysql-59下的所有副本全部挂到mysql-58下
orchestrator-client -c relocate-replicas -i mysql-59:3360 -d mysql-58:3360      


orchestrator-client -c topology -alias mysql-58
mysql-58:3360   [0s,ok,8.0.22,rw,ROW,>>,GTID]
+ mysql-59:3360 [0s,ok,8.0.22,ro,ROW,>>,GTID]
+ mysql-60:3360 [0s,ok,8.0.22,ro,ROW,>>,GTID]

 
# 执行take-siblings操作(将mysql-59的所有兄弟副本,都挂载到mysql-59下,把兄弟级点别成子节点)
orchestrator-client -c take-siblings -i mysql-59:3360

orchestrator-client -c topology -alias mysql-58
mysql-58:3360     [0s,ok,8.0.22,rw,ROW,>>,GTID]
+ mysql-59:3360   [0s,ok,8.0.22,ro,ROW,>>,GTID]
  + mysql-60:3360 [0s,ok,8.0.22,ro,ROW,>>,GTID]



# 现在59作为60的主,调换主从关系
# 即将mysql-60作为 mysql-59的master -i指定要操作的实例
orchestrator-client -c take-master -i mysql-60:3360

orchestrator-client -c topology -alias mysql-58
mysql-58:3360     [0s,ok,8.0.22,rw,ROW,>>,GTID]
+ mysql-60:3360   [0s,ok,8.0.22,ro,ROW,>>,GTID]
  + mysql-59:3360 [0s,ok,8.0.22,ro,ROW,>>,GTID]



#可写
orchestrator-client -c set-writeable -i mysql-59:3360


#只读
orchestrator-client -c set-read-only -i mysql-59:3360


# 查看谁最合适做备库
orchestrator -c  get-candidate-replica  -i mysql-58:3360 --ignore-raft-setup



# 搜索实例:search
orchestrator-client -c search -i mysql-3360

# 打印指定实例的主库:which-master
orchestrator-client -c which-master -i mysql-59:3360
:0 如果查询的实例自己是主库,则返回:0 否则返回主实例


# 打印指定实例的从库:which-replicas
orchestrator-client -c which-replicas -i mysql-58:3360

# 打印指定主实例从库异常的列表:which-broken-replicas
orchestrator-client -c which-broken-replicas -i mysql-58:3360


# 给出一个实例或则集群别名,打印出该实例所在集群下的所有其他实例。which-cluster-instances
orchestrator-client -c which-cluster-instances -i mysql-58:3360


# 指定实例/集群名或则所有所在集群的可写实例,:which-cluster-master
orchestrator-client -c which-cluster-master -i mysql-58:3360

# 打印出所有实例:all-clusters-masters,每个集群返回一个
orchestrator-client -c all-clusters-masters

# 打印出所有实例:all-instances
orchestrator-client -c all-instances

#打印出集群中可以作为pt-online-schema-change操作的副本列表:which-cluster-osc-replicas
orchestrator-client -c which-cluster-osc-replicas -i mysql-58:3360

#打印出所有在维护(downtimed)的实例:downtimed
orchestrator-client -c downtimed

orchestrator的自动故障转移

集群初始状态

手动停止主节点(58)

集群状态

重新上线58节点

集群分裂成两个

以上模拟了一次主节点意外down掉后,59节点自动接管的流程

自定义hook脚本

自定义hook脚本的目标

hook的类型

切换shell脚本(范例)

shell脚本 /usr/local/orchestrator/changemaster.sh

#!/bin/bash

failureType=$1
failureCluster=$2
failedHost=$3
failedPort=$4
successorHost=$5
successorPort=$6

#VIP切换、Proxy修改、DNS修改、中间件修改、LVS修改
if [[ $failureType == "DeadMaster" ]]; then	 
echo ' ${failureCluster}发生了${failureType} 故障切换. 原主机(${failedHost}:${failedPort})切换到新主机(${successorHost}:${successorPort})' >> /tmp/recovery.log
elif [[ $isitdead == "DeadIntermediateMasterWithSingleSlaveFailingToConnect" ]]; then	
echo ' ${failureCluster}发生了${failureType} 故障切换. 原主机(${failedHost}:${failedPort})切换到新主机(${successorHost}:${successorPort})' >> /tmp/recovery.log
elif [[ $isitdead == "DeadIntermediateMaster" ]]; then
echo ' ${failureCluster}发生了${failureType} 故障切换. 原主机(${failedHost}:${failedPort})切换到新主机(${successorHost}:${successorPort})' >> /tmp/recovery.log     
fi

# 修改cmdb

#  发送报警信息

修改/etc/orchestrator.conf.json 中的hook定义

  "PostMasterFailoverProcesses": [
    "/usr/local/orchestrator/changemaster.sh {failureType}   {failureCluster} {failedHost} {failedPort} {successorHost} {successorPort} "
  ],

总结

>> Home

51ak

2022/10/17

Categories: mysql mysql复制 raft 高可用 proxy Tags: 原创

《数据库工作笔记》公众号
扫描上面的二维码,关注我的《数据库工作笔记》公众号