Panel For Example Panel For Example Panel For Example

How to Prevent Master-Slave Failures After Cloning

Author : Adrian December 16, 2025

1. Background

In a master-slave replication setup, a damaged binlog on the primary prevented the replica from synchronizing data, requiring rebuilding the replica. The clone feature introduced in MySQL 8.0.17 was used for recovery. For two days after recovery, master-replica data conflict errors occurred.

Analysis of the binlog showed that both primary and replica executed the same statement at the same time. The application team confirmed scheduled jobs should only run on the primary.

Investigation found that scheduled events on the cloned replica were also enabled, so both nodes ran the scheduled job simultaneously, causing replication errors. Disabling the scheduled event on the replica resolved the issue.

 

2. Reproducing the issue

Test environment: GreatSQL 8.0.32-24

greatsql> SELECT VERSION(); +-----------+ | VERSION() | +-----------+ | 8.0.32-24 | +-----------+ 1 row in set (0.00 sec)

1. Set up a master-slave environment

Omitted.

2. Create event

greatsql> create database test; greatsql> use test; greatsql> CREATE TABLE `test` ( `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'ID', `now` datetime DEFAULT NULL COMMENT 'Time', PRIMARY KEY (`id`)); greatsql> CREATE EVENT event_test ON SCHEDULE EVERY 1 MINUTE ON COMPLETION PRESERVE ENABLE COMMENT 'Insert records into the test table every one minute' DO INSERT INTO test VALUES(NULL, now());

3. Check event status

On the primary, the event status is ENABLED by default.

greatsql> show events; +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+ | Db | Name | Definer | Time zone | Type | Execute at | Interval value | Interval field | Starts | Ends | Status | Originator | character_set_client | collation_connection | Database Collation | +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+ | test | event_test | root@% | SYSTEM | RECURRING | NULL | 1 | MINUTE | 2023-10-12 17:11:14 | NULL | ENABLED | 1 | utf8mb4 | utf8mb4_unicode_ci | utf8mb4_unicode_ci | +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+ 1 row in set (0.00 sec)

On the replica, the event status is SLAVESIDE_DISABLED by default.

greatsql> show events; +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+--------------------+------------+----------------------+----------------------+--------------------+ | Db | Name | Definer | Time zone | Type | Execute at | Interval value | Interval field | Starts | Ends | Status | Originator | character_set_client | collation_connection | Database Collation | +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+--------------------+------------+----------------------+----------------------+--------------------+ | test | event_test | root@% | SYSTEM | RECURRING | NULL | 1 | MINUTE | 2023-10-12 17:11:14 | NULL | SLAVESIDE_DISABLED | 1 | utf8mb4 | utf8mb4_unicode_ci | utf8mb4_unicode_ci | +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+--------------------+------------+----------------------+----------------------+--------------------+ 1 row in set (0.00 sec)

4. Inspect data

greatsql> select * from test.test; +----+---------------------+ | id | now | +----+---------------------+ | 1 | 2023-08-08 16:39 | | 2 | 2023-08-08 16:39 | | 3 | 2023-08-08 16:39 | +----+---------------------+ 3 rows in set (0.00 sec)

5. Clone the replica

# Install clone plugin on both master and replica

greatsql> install plugin clone soname 'mysql_clone.so';

# Perform clone on the replica

greatsql> set global clone_valid_donor_list='172.17.137.162:6001'; greatsql> clone instance from root@'172.17.137.162':6001 identified by 'greatsql';

6. Reconfigure replication

greatsql> change master to master_user='root',master_password='greatsql',master_host='172.17.137.162',master_port=6001,master_auto_position=1; Query OK, 0 rows affected, 7 warnings (0.04 sec) greatsql> start slave; Query OK, 0 rows affected, 1 warning (0.04 sec)

7. Check replication status

greatsql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for source to send event Master_Host: 172.17.137.162 Master_User: root Master_Port: 6001 Connect_Retry: 60 Master_Log_File: binlog.000001 Read_Master_Log_Pos: 2959 Relay_Log_File: relaylog.000002 Relay_Log_Pos: 395 Relay_Master_Log_File: binlog.000001 Slave_IO_Running: Yes Slave_SQL_Running: No Last_Errno: 1062 Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'e8bf88f9-2acd-11ee-a98a-00163e605c74:8' at master log binlog.000001, end_log_pos 2606. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any. Skip_Counter: 0 Exec_Master_Log_Pos: 2307 Relay_Log_Space: 1242 Until_Condition: None greatsql> select * from performance_schema.replication_applier_status_by_worker limit 1; *************************** 1. row *************************** CHANNEL_NAME: WORKER_ID: 1 THREAD_ID: NULL SERVICE_STATE: OFF LAST_ERROR_NUMBER: 1062 LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'e8bf88f9-2acd-11ee-a98a-00163e605c74:8' at master log binlog.000001, end_log_pos 2606; Could not execute Write_rows event on table test.test; Duplicate entry '5' for key 'test.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 2606 LAST_ERROR_TIMESTAMP: 2023-08-08 16:39.033240 APPLYING_TRANSACTION: e8bf88f9-2acd-11ee-a98a-00163e605c74:8 APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2023-08-08 16:45.795753 APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2023-08-08 16:45.795753 APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2023-08-08 16:39.032510 1 row in set (0.00 sec)

The replica error shows a primary key conflict.

8. Check scheduled event status on the replica

The event on the replica is currently ENABLED.

greatsql> show events; +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+ | Db | Name | Definer | Time zone | Type | Execute at | Interval value | Interval field | Starts | Ends | Status | Originator | character_set_client | collation_connection | Database Collation | +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+ | test | event_test | root@% | SYSTEM | RECURRING | NULL | 1 | MINUTE | 2023-08-08 15:58:45 | NULL | ENABLED | 1 | utf8mb4 | utf8mb4_unicode_ci | utf8mb4_unicode_ci | +------+------------+---------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+ 1 row in set (0.00 sec)

Because the scheduled event on the replica was executing and wrote data, it caused the primary key conflict.

 

9. Resolution

greatsql> alter event event_test DISABLE; Query OK, 0 rows affected (0.01 sec)

Disable the event on the replica, skip the primary key conflict error, and then restart replication.

 

Summary

  1. If the primary runs scheduled jobs, after cloning a replica, disable scheduled jobs on the replica to avoid simultaneous execution that can cause replication failures.
  2. When cloning, if the donor contains replication configuration, the cloned recipient will copy that information and may automatically start replication after the instance restarts. To avoid this, add skip-slave-start to the recipient's configuration to prevent automatic start after reboot.