7.1 Optimizing Backup Performance

This section describes the performance considerations for backup operations with the MySQL Enterprise Backup product. When optimizing and tuning the backup procedure, measure both the raw performance (how long it takes the backup to complete) and the amount of overhead on the database server. When measuring backup performance, consider:

Full or Incremental Backup

After taking a full backup, subsequent backups can be performed more quickly by doing incremental backups, where only the changed data is backed up. For an incremental backup, specify the --incremental or --incremental-with-redo-log-only option to mysqlbackup. See Section 4.1.8, “Incremental Backup Options” for information about these options. For usage instructions for the backup and apply stages of incremental backups, see Section 3.3.2, “Making an Incremental Backup” and Example 5.3, “Applying an Incremental Backup to a Full Backup”.

Compressed Backup

Compressing the backup data before transmitting it to another server involves additional CPU overhead on the database server where the backup takes place, but less network traffic and less disk I/O on the server that is the final destination for the backup data. Consider the load on your database server, the bandwidth of your network, and the relative capacities of the database and destination servers when deciding whether or not to use compression. See Section 3.3.3, “Making a Compressed Backup” and Section 4.1.7, “Compression Options” for information about creating compressed backups.

Compression involves a tradeoff between backup performance and restore performance. In an emergency, the time needed to uncompress the backup data before restoring it might be unacceptable. There might also be storage issues if there is not enough free space on the database server to hold both the compressed backup and the uncompressed data. Thus, the more critical the data is, the more likely that you might choose not to use compression: accepting a slower, larger backup to ensure that the restore process is as fast and reliable as possible.

Single-File Backups

The single-file backup by itself is not necessarily faster than the traditional type of backup that produces a directory tree of output files. Its performance advantage comes from combining different steps that you might otherwise have to perform in sequence, such as combining the backup data into a single output file and transferring it to another server. See Section 4.1.1.4, “Work with Single-File Backups” for the options related to single-file backups, and Section 3.3.5, “Making a Single-File Backup” for usage instructions.

InnoDB Configuration Options Settings

Prior to MySQL 5.5, it was common practice to keep the redo logs fairly small to avoid long startup times when the MySQL server was killed rather than shut down normally. In MySQL 5.5 and higher, the performance of crash recovery is significantly improved. With those releases, you can make your redo log files bigger if that helps your backup strategy and your database workload.

As discussed later, there are a number of reasons why you might prefer to run with the setting innodb_file_per_table=1.

Parallel Backup

The mysqlbackup command can take advantage of modern multicore CPUs and operating system threads to perform backup operations in parallel. See Section 4.1.11, “Performance / Scalability / Capacity Options” for the options to control how many threads are used for different aspects of the backup process. If you see that there is unused system capacity during backups, consider increasing the values for these options and testing whether doing so increases backup performance:

Depending on your operating system, you can measure resource utilization using commands such as top, iostat, sar, dtrace, or a graphical performance monitor. Do not increase the number of read or write threads iowait once the system iowait value reaches approximately 20%.

MyISAM Considerations

IMPORTANT:
  • Although the mysqlbackup command backs up InnoDB tables without interrupting database use, the final stage that copies non-InnoDB files (such as MyISAM tables and .frm files) temporarily puts the database into a read-only state, using the statement FLUSH TABLES WITH READ LOCK. For best backup performance and minimal impact on database processing:

    1. Do not run long SELECT queries or other SQL statements at the time of the backup run.

    2. Keep your MyISAM tables relatively small and primarily for read-only or read-mostly work.

    Then the locked phase at the end of a mysqlbackup run is short (maybe a few seconds), and does not disturb the normal processing of mysqld much. If the preceding conditions are not met in your database application, use the --only-innodb or --only-innodb-with-frm option to back up only InnoDB tables, or use the --no-locking option to back up non-InnoDB files. Note that MyISAM, .frm, and other files copied under the --no-locking setting cannot be guaranteed to be consistent, if they are updated during this final phase of the backup.

  • For a large database, a backup run might take a long time. Always check that mysqlbackup has completed successfully, either by verifying that the mysqlbackup command returned exit code 0, or by observing that mysqlbackup has printed the text mysqlbackup completed OK!.

  • The mysqlbackup command is not the same as the former MySQL Backup open source project from the MySQL 6.0 source tree. The MySQL Enterprise Backup product supersedes the MySQL Backup initiative.

  • Schedule backups during periods when no DDL operations involving tables are running. See Section A.1, “Limitations of mysqlbackup Command” for restrictions on backups at the same time as DDL operations.

Network Performance

For data processing operations, you might know the conventional advice that Unix sockets are faster than TCP/IP for communicating with the database. Although the mysqlbackup command supports the options --protocol=tcp, --protocol=socket, and --protocol=pipe, these options do not have a significant effect on backup or restore performance. These processes involve file-copy operations rather than client/server network traffic. The database communication controlled by the --protocol option is low-volume. For example, mysqlbackup retrieves information about database parameters through the database connection, but not table or index data.

Data Size

If certain tables or databases contain non-critical information, or are rarely updated, you can leave them out of your most frequent backups and back them up on a less frequent schedule. See Section 4.1.9, “Partial Backup Options” for information about the relevant options, and Section 3.3.4, “Making a Partial Backup” for instructions about leaving out data from specific tables, databases, or storage engines. Partial backups are faster because they copy, compress, and transmit a smaller volume of data.

To minimize the overall size of InnoDB data files, consider enabling the MySQL configuration option innodb_file_per_table. This option can minimize data size for InnoDB tables in several ways:

Avoid creating indexes that are not used by queries. Because indexes take up space in the backup data, unnecessary indexes slow down the backup process. (The copying and scanning mechanisms used by mysqlbackup do not rely on indexes to do their work.) For example, it is typically not helpful to create an index on each column of a table, because only one index is used by any query. Because the primary key columns are included in each InnoDB secondary index, it wastes space to define primary keys composed of numerous or lengthy columns, or multiple secondary indexes with different permutations of the same columns.

The Apply-Log Phase

If you store the backup data on a separate machine, and that machine is not as busy the machine hosting the database server, you can offload some postprocessing work (the apply-log phase) to that separate machine. Section 4.1.1.2, “Apply-Log Operations for Existing Backup Data”

There is always a performance tradeoff between doing the apply-log phase immediately after the initial backup (makes restore faster), or postponing it until right before the restore (makes backup faster). In an emergency, restore performance is the most important consideration. Thus, the more crucial the data is, the more important it is to run the apply-log phase immediately after the backup. Either combine the backup and apply-log phases on the same server by specifying the backup-and-apply-log option, or perform the fast initial backup, transfer the backup data to another server, and then perform the apply-log phase using one of the options from Section 4.1.1.2, “Apply-Log Operations for Existing Backup Data”.