Skip to content

MySQL snapshot consistency: quiescing the engine before a block snapshot

Why a raw LVM/ZFS snapshot of a MySQL data directory is not application-consistent, and the FLUSH/lock dance that turns it into one — for both InnoDB and mixed-storage-engine setups.

A block-level snapshot of a MySQL data directory — taken via LVM thin, ZFS, btrfs, or a cloud-provider volume snapshot — is by default only crash-consistent, not application-consistent. That distinction is the difference between “the backup recovers cleanly on first try” and “the backup recovers but innodb_force_recovery=4 was needed and we lost two minutes of transactions.” This article walks through the quiesce dance that turns a crash-consistent snapshot into an application-consistent one, the InnoDB-specific path, and the runbook we apply on engagements doing snapshot-based MySQL backups.

How to verify

Before the runbook, confirm engine, GTID mode, and binlog state:

mysql -e "SHOW VARIABLES LIKE 'default_storage_engine';"
mysql -e "SHOW VARIABLES LIKE 'gtid_mode';"
mysql -e "SHOW MASTER STATUS\G"
mysql -e "SHOW ENGINE INNODB STATUS\G" | grep -A 2 "Pending normal aio"
mysql -e "SELECT @@hostname, @@version, @@datadir;"

GTID mode and binlog position are what you’ll record before the snapshot so a restore can resume replication from the right LSN.

What’s happening

MySQL writes to the filesystem through buffered I/O and per-engine logs. At any moment there are:

  • Dirty pages in the InnoDB buffer pool not yet flushed to the tablespace files.
  • Pending writes in the InnoDB redo log (ib_logfile0, ib_logfile1, or #innodb_redo/).
  • Open transactions that have written to undo segments but have not committed.
  • Per-table changes in the binary log that may not be sync’d to disk.

A block snapshot taken at this moment captures whatever is on the storage layer — which is internally inconsistent. InnoDB’s crash recovery can usually fix it on startup (the redo log is the whole point of InnoDB’s ARIES-style recovery), but “usually” is not good enough for backups. The fix is to put MySQL into a known consistent state immediately before the snapshot fires.

For pure InnoDB: FLUSH TABLES WITH READ LOCK (FTWRL) plus a metadata-locked log-position read is the minimum. For mixed engines (MyISAM, Aria), FTWRL is mandatory because those engines have no redo log and no recovery path.

A better path for pure InnoDB is LOCK INSTANCE FOR BACKUP (MySQL 8.0+) — it blocks DDL but allows read traffic, then FLUSH TABLES issues the actual storage-layer flush. The combination gives application-consistent snapshots with much less impact than FTWRL.

The procedure

This is the runbook we automate. Adjust for engine mix and MySQL version.

  1. Open a long-lived MySQL session that will hold the lock for the snapshot duration. Cannot be a connection pool client.

    mysql --host=localhost --user=backup --password
  2. In that session, freeze the engine. MySQL 8.0+ pure InnoDB:

    LOCK INSTANCE FOR BACKUP;
    FLUSH TABLES;

    MySQL 5.7 or mixed-engine setups:

    FLUSH TABLES WITH READ LOCK;
  3. Record the binlog position and GTID set. These are what a restored replica needs to resume replication.

    SHOW MASTER STATUS\G
    -- record File: binlog.000123, Position: 4567, Executed_Gtid_Set: ...
  4. Outside the MySQL session, in a separate shell, take the block snapshot.

    lvcreate -s -n mysql-snap-$(date +%Y%m%dT%H%M%S) vg0/mysql
    # or zfs snapshot tank/mysql@2026-06-01T02:00
    # or btrfs subvolume snapshot -r /var/lib/mysql /snap/mysql-...
    # or cloud-provider snapshot CLI
  5. Immediately release the lock in the MySQL session. Holding FTWRL longer than necessary blocks every writer.

    UNLOCK TABLES;
    -- or for MySQL 8.0+ pure InnoDB: UNLOCK INSTANCE;
  6. Mount the snapshot read-only and copy/stream the data directory to the backup destination.

    mkdir /mnt/mysql-snap
    mount -o ro,nouuid /dev/vg0/mysql-snap-2026-06-01T02:00 /mnt/mysql-snap
    restic -r s3:... backup --tag mysql-prod /mnt/mysql-snap
  7. Drop the snapshot and unmount.

    umount /mnt/mysql-snap
    lvremove -f vg0/mysql-snap-2026-06-01T02:00
  8. Verify the restore in a scratch environment monthly. Restore the snapshot to a fresh MySQL host, start mysqld, watch the error log for [Note] InnoDB: Starting crash recovery. — zero recovery messages mean the quiesce worked.

Common pitfalls

  • Snapshotting without locking. The most common mistake. The recovery on startup may succeed silently but you’ve eaten a chance of corruption that nobody notices until the disaster restore.
  • Locking but not flushing. LOCK INSTANCE FOR BACKUP alone does not flush dirty pages — FLUSH TABLES after it is required.
  • Holding the lock during the entire backup copy. Take the snapshot first, then release the lock, then copy from the snapshot mount. Otherwise you block every writer for the duration of the upload.
  • Forgetting the binlog position. A restored replica needs CHANGE REPLICATION SOURCE TO with the right position; without it you cannot resume async replication and the standby is dead.
  • Mixed engines. InnoDB recovery saves InnoDB. MyISAM has no recovery; if you have any MyISAM tables in the dataset, FTWRL is non-negotiable.
  • InnoDB innodb_flush_log_at_trx_commit=0 or =2. With these settings, even the redo log is not synced on commit, so a snapshot might miss recently-committed transactions even with the lock. We set this to 1 for any cluster doing snapshot backups.

In the engagements we run, snapshot-based MySQL backups are paired with Percona XtraBackup for the hot-streaming path and pgBackRest-equivalent-style tooling. The quiesce runbook above is what we wrap around the snapshot when we need an LVM/ZFS-level consistent capture — for example to feed a BorgBackup tier or a cloud-provider snapshot service. The full operating wrap is at /en/services/managed-operations/.