Fork me on GitHub

Mysql同步复制报错故障处理

在MySQL主从同步集群部署中,经常会遇到主从不能同步的问题,以下对常见问题及解决办法进行了归纳列举

数据不一致:包括删除失败、主键重复、更新丢失

例如:

#更新丢失
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Update_rows event on table social.test; Can't find record in 'jason', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;the event's master log mysql-bin.000019, end_log_pos 4563
#主键重复
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'social'. Query: 'INSERT INTO `social`.`tool`(`createdat`, `updatedat`) VALUES ('2019-06-05 09:11:46', '2019-06-05 09:11:51')'
#删除失败
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows_v1 event on table social.player_role; Can't find record in 'player_role', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000013, end_log_pos 20064466

在master上,用mysqlbinlog 分析下出错的binlog日志在干什么:

[root@localhost ~]# mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS mysql-bin.000013 | grep -A 10 20064466
#190605  9:12:28 server id 1  end_log_pos 20064466     Delete_rows: table id 442 flags: STMT_END_F
### DELETE FROM `social`.`player_role`
### WHERE
###   @1=1 /* INT meta=0 nullable=0 is_null=0 */
###   @2=1 /* INT meta=0 nullable=0 is_null=0 */
###   @3='2019-06-04 20:10:48' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
###   @4='2019-06-04 20:10:48' /* DATETIME(0) meta=0 nullable=1 is_null=0 */
# at 20064466
#190605  9:12:28 server id 1  end_log_pos 20064493     Xid = 414815
COMMIT/*!*/;
# at 20064493
#190605  9:13:00 server id 1  end_log_pos 20064531     GTID 0-1-9152684 trans
/*!100001 SET @@session.gtid_seq_no=9152684*//*!*/;
BEGIN
/*!*/;
# at 20064531
#190605  9:13:00 server id 1  end_log_pos 20064795     Query    thread_id=1733    exec_time=0    error_code=0
SET TIMESTAMP=1559697180.6860/*!*/;

对于主键重复/删除失败/更新丢失,更新丢失主要是缺少更新的记录,进行补充下即可,主键重复/删除失败此时可以选择忽略错误,方法如下:

mysql> stop slave;
mysql> set global sql_slave_skip_counter=1;
mysql> start slave;
mysql> show slave status\G

字段不一致:包括字段重复、丢失、不够长等

#字段重复
Last_SQL_Errno: 1060
Last_SQL_Error: Error 'Duplicate column name 'del_flag'' on query. Default database: 'social'. Query: 'ALTER TABLE `social`.`player` ADD COLUMN `del_flag` int(11) NULL AFTER `detail`'

对于数据库字段的同步失败,只需要将主从数据库的字段名称、类型等调整一致即可。之后重启slave,查看主从同步是否恢复

主从表不一致

Last_SQL_Errno: 1146
Last_SQL_Error: Error executing row event: 'Table 'social.player_role' doesn't exist'

主从提示缺少表可以进行添加,多余表可以删除,保证主从表的一致即可。之后重启slave,查看主从同步是否恢复

轻轻的我走了,正如我轻轻的来