Handling an Unexpected Halt of a Replica
• When GTID-based replication is in use (gtid_mode=ON), set SOURCE_AUTO_POSITION=1 |
MASTER_AUTO_POSITION=1, which activates GTID auto-positioning for the connection to the
source to automatically identify and retrieve missing transactions. This option is set using a CHANGE
REPLICATION SOURCE TO statement (from MySQL 8.0.23) or CHANGE MASTER TO statement
(before MySQL 8.0.23). If the replica has multiple replication channels, you need to set this option
for each channel individually. For details of how GTID auto-positioning works, see Section 2.3.3,
“GTID Auto-Positioning”. When file position based replication is in use, SOURCE_AUTO_POSITION=1
| MASTER_AUTO_POSITION=1 is not used, and instead the binary log position or relay log position is
used to control where replication starts.
• From MySQL 8.0.27, when GTID-based replication is in use (gtid_mode=ON), set GTID_ONLY=1,
which makes the replica use only GTIDs in the recovery process, and stop persisting binary log and
relay log file names and file positions in the replication metadata repositories. This option is set using
a CHANGE REPLICATION SOURCE TO statement. If the replica has multiple replication channels,
you need to set this option for each channel individually. With GTID_ONLY=1, during recovery, the file
position information is ignored and GTID auto-skip is used to skip transactions that have already been
supplied, rather than identifying the correct file position. This strategy is more efficient provided that you
purge relay logs using the default setting for relay_log_purge, which means only one relay log file
needs to be inspected.
• Set sync_relay_log=1, which instructs the replication receiver thread to synchronize the relay log to
disk after each received transaction is written to it. This means the replica's record of the current position
read from the source's binary log (in the applier metadata repository) is never ahead of the record of
transactions saved in the relay log. Note that although this setting is the safest, it is also the slowest due
to the number of disk writes involved. With sync_relay_log > 1, or sync_relay_log=0 (where
synchronization is handled by the operating system), in the event of an unexpected halt of a replica there
might be committed transactions that have not been synchronized to disk. Such transactions can cause
the recovery process to fail if the recovering replica, based on the information it has in the relay log as
last synchronized to disk, tries to retrieve and apply the transactions again instead of skipping them.
Setting sync_relay_log=1 is particularly important for a multi-threaded replica, where the recovery
process fails if gaps in the sequence of transactions cannot be filled using the information in the relay
log. For a single-threaded replica, the recovery process only needs to use the relay log if the relevant
information is not available in the applier metadata repository.
• Set innodb_flush_log_at_trx_commit=1, which synchronizes the InnoDB logs to disk before
each transaction is committed. This setting, which is the default, ensures that InnoDB tables and the
InnoDB logs are saved on disk so that there is no longer a requirement for the information in the relay
log regarding the transaction. Combined with the setting sync_relay_log=1, this setting further
ensures that the content of the InnoDB tables and the InnoDB logs is consistent with the content of
the relay log at all times, so that purging the relay log files cannot cause unfillable gaps in the replica's
history of transactions in the event of an unexpected halt.
• Set relay_log_info_repository = TABLE, which stores the replication SQL thread position in the
InnoDB table mysql.slave_relay_log_info, and updates it together with the transaction commit to
ensure a record that is always accurate. This setting is the default from MySQL 8.0, and the FILE setting
is deprecated. From MySQL 8.0.23, the use of the system variable itself is deprecated, so omit it and
allow it to default. If the FILE setting is used, which was the default in earlier releases, the information is
stored in a file in the data directory that is updated after the transaction has been applied. This creates a
risk of losing synchrony with the source depending at which stage of processing a transaction the replica
halts at, or even corruption of the file itself. With the setting relay_log_info_repository = FILE,
recovery is not guaranteed.
• Set relay_log_recovery = ON, which enables automatic relay log recovery immediately following
server startup. This global variable defaults to OFF and is read-only at runtime, but you can set it to ON
with the --relay-log-recovery option at replica startup following an unexpected halt of a replica.
191