Oracle9i Recovery Manager User's Guide
Release 2 (9.2)
Part Number A96566-01
The primary goal of RMAN tuning is to create an adequate flow of data between disk and storage device. Tuning RMAN backup and restore operations involves the following tasks discussed in this chapter:
RMAN backup and restore operations have the following distinct components:
The slowest of these operations is called the bottleneck. RMAN tuning is the task of identifying the bottleneck (or bottlenecks) and attempting to make it more efficient by using RMAN commands, initialization parameter settings, or adjustments to physical media. The key to tuning RMAN is understanding I/O.
RMAN's backup and restore jobs use two types of I/O buffers:
DISK and tertiary storage (usually tape). When performing a backup, RMAN reads input files using disk buffers and writes the output backup file by using either disk or tape buffers. When performing restores, RMAN reverses these roles.
Besides being divided into
sbt, I/O is also divided into synchronous and asynchronous. Synchronous devices only perform one I/O task at a time. Hence, you can easily determine how much time backup jobs require. In contrast to synchronous I/O, asynchronous I/O can perform more than one task at a time.
To tune RMAN effectively, you must thoroughly understand concepts such as synchronous and asynchronous I/O, disk and tape buffers, and channel architecture. When you understand these concepts, then you can learn how to use fixed views to monitor bottlenecks, and use the techniques described in "Improving RMAN Backup Performance" to solve problems.
RMAN I/O uses two different types of buffers: disk and tape. These buffers are typically different sizes. To understand how RMAN allocates disk buffers, you must understand how RMAN multiplexing works, as described in "Multiplexed Backup Sets". Review this section before proceeding.
RMAN multiplexing is the number of files in a backup read simultaneously and then written to the same backup piece. The degree of multiplexing depends on the
FILESPERSET parameter of the
BACKUP command as well as the
MAXOPENFILES parameter of the
For example, assume that you back up two datafiles with one channel. You set
3 and set
8. In this case, the number of files in each backup set is 2 (the lesser of
FILESPERSET and the files read by each channel), and so the level of multiplexing is
2 (the lesser of
MAXOPENFILES and the number of files in each backup set).
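The two "lesser of" steps in this example can be checked with a short calculation. The following is a minimal Python sketch, not RMAN syntax. The function names are hypothetical, and it assumes that the values 3 and 8 in the example are the FILESPERSET and MAXOPENFILES settings, respectively.

```python
def files_in_backup_set(filesperset: int, files_read_by_channel: int) -> int:
    # Files per backup set: the lesser of FILESPERSET and the files read by the channel.
    return min(filesperset, files_read_by_channel)


def multiplexing_level(maxopenfiles: int, files_in_set: int) -> int:
    # Level of multiplexing: the lesser of MAXOPENFILES and the files in each backup set.
    return min(maxopenfiles, files_in_set)


# Two datafiles, one channel; 3 and 8 are the assumed FILESPERSET and MAXOPENFILES values.
example_files_in_set = files_in_backup_set(filesperset=3, files_read_by_channel=2)
example_level = multiplexing_level(maxopenfiles=8, files_in_set=example_files_in_set)
print(example_files_in_set, example_level)
```

Both values come out as 2, which matches the figures given in the example.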
When RMAN backs up from disk, it uses the algorithm described in Table 14-1 to determine how many buffers to allocate and how large to make the buffers.
|If level of multiplexing is . . .||Then . . .|
|Less than or equal to 4||RMAN allocates buffers of size 1 MB so that the total buffer size for all the input files is 16 MB. For example, if|
|Greater than 4 but less than or equal to 8||RMAN allocates disk buffers of size 512 KB so that the total buffer size for all the files is less than 16 MB.|
|Greater than 8||RMAN allocates a fixed 4 disk buffers of 128 KB for each file, so that the total size is 512 KB for each file.|
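The size thresholds in Table 14-1 can be expressed as a small lookup. This is a minimal Python sketch for illustration only. The function name `disk_buffer_size` is hypothetical, and the sketch covers only the per-buffer size, because the table does not state an exact buffer count for every level of multiplexing.

```python
KB = 1024
MB = 1024 * KB


def disk_buffer_size(level: int) -> int:
    """Return the size in bytes of one disk buffer for a given level of multiplexing."""
    if level < 1:
        raise ValueError("level of multiplexing must be at least 1")
    if level <= 4:
        return 1 * MB      # total for all input files is 16 MB
    if level <= 8:
        return 512 * KB    # total for all files stays under 16 MB
    return 128 * KB        # fixed 4 buffers per file


# Above a level of 8, each file gets 4 buffers, so 512 KB per file.
per_file_above_8 = 4 * disk_buffer_size(9)
print(per_file_above_8 // KB, "KB per file")
```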
In the example shown in Figure 14-1, "Disk Buffer Allocation", one channel is backing up four datafiles on a robust striped disk configuration.
MAXOPENFILES is set to
FILESPERSET is set to
4. Hence, the level of multiplexing is 4. So, the total size of the buffers for each datafile is 4 MB.
To calculate the total size of the buffers allocated in a backup set, multiply the total bytes for each datafile by the number of datafiles being concurrently accessed by the channel, and then multiply this number by the number of channels.
Assume that you use one channel to back up four datafiles, and use the settings shown in Figure 14-1. In this case, multiply as follows to obtain the total size of the buffers allocated for the backup:
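As a rough check of this multiplication, the following Python sketch applies the rule to the figures stated for Figure 14-1 (4 MB of buffers for each datafile, four datafiles, one channel). The function name is hypothetical and the sketch is only a worked calculation, not part of RMAN.

```python
MB = 1024 * 1024


def total_disk_buffer_bytes(bytes_per_datafile: int,
                            concurrent_datafiles: int,
                            channels: int) -> int:
    # Bytes per datafile x datafiles read concurrently by a channel x number of channels.
    return bytes_per_datafile * concurrent_datafiles * channels


disk_total = total_disk_buffer_bytes(4 * MB, concurrent_datafiles=4, channels=1)
print(disk_total // MB, "MB")
```

The result is 16 MB, which agrees with the total given in Table 14-1 for a level of multiplexing of 4 or less.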
MAXOPENFILES parameter so that the number of files read simultaneously is just enough to utilize the output device fully. This consideration is especially important when the output device is tape.
If you make a backup to an
sbt device, then Oracle allocates four buffers for each channel for the tape writers (or readers, if doing a restore). Oracle allocates these buffers only if the channel is an
sbt channel. Typically, each tape buffer is 256 KB. To calculate the total size of buffers used during a backup or restore, multiply the buffer size by 4, and then multiply this product by the number of channels.
As illustrated in Figure 14-2, assume that you use one tape channel and each buffer is 256 KB. In this case, the total size of buffers used during a backup is as follows:
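The tape buffer calculation can be sketched the same way. This is a minimal Python illustration using only the numbers stated in the text (four buffers for each channel, 256 KB for each buffer, one channel). The function name is hypothetical.

```python
KB = 1024


def total_tape_buffer_bytes(buffer_bytes: int, channels: int,
                            buffers_per_channel: int = 4) -> int:
    # Buffer size x 4 buffers per sbt channel x number of channels.
    return buffer_bytes * buffers_per_channel * channels


tape_total = total_tape_buffer_bytes(256 * KB, channels=1)
print(tape_total // KB, "KB")  # prints "1024 KB"
```

One tape channel therefore uses 1024 KB (1 MB) of tape buffers.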
RMAN allocates the tape buffers in the SGA or the PGA, depending on whether I/O slaves are used. If you set the initialization parameter
true, then RMAN allocates tape buffers from the SGA or the large pool if the
LARGE_POOL_SIZE initialization parameter is set. If you set the parameter to
false, then RMAN allocates the buffers from the PGA.
If you use I/O slaves, then set the
LARGE_POOL_SIZE initialization parameter to set aside SGA memory dedicated to holding these large memory allocations. Hence, the RMAN I/O buffers do not compete with the library cache for SGA memory.
When RMAN reads or writes data, the I/O is either synchronous or asynchronous. When the I/O is synchronous, a server process can perform only one task at a time. When it is asynchronous, a server process can begin an I/O and then perform other work while waiting for the I/O to complete. It can also begin multiple I/O operations before waiting for the first to complete.
You can set initialization parameters that determine the type of I/O. If you set
true, then the tape I/O is asynchronous. Otherwise, the I/O is synchronous. It is recommended that you always set
Some operating systems support native asynchronous I/O, and Oracle takes advantage of this feature if it is available. On operating systems that do not support native asynchronous I/O, Oracle can simulate it by using special I/O slave processes that are dedicated to performing I/O on behalf of another process. You can control disk I/O slaves by setting the
DBWR_IO_SLAVES parameter to a nonzero value. Oracle allocates four backup disk I/O slaves for any nonzero value of
Figure 14-3 shows synchronous I/O in a backup to tape. The following steps occur:
Figure 14-4 shows asynchronous I/O in a tape backup. The following steps occur:
The following factors affect the speed of the backup to tape:
The tape native transfer rate is the speed of writing to a tape without compression. This speed represents the upper limit of the backup rate. The upper limit of your backup performance should be the aggregate transfer rate of all of your tape drives. If your backup is already performing at that rate, and if it is not using an excessive amount of CPU, then RMAN performance tuning will not help.
The level of tape compression is very important for backup performance. If the tape has good compression, then the sustained backup rate is faster. For example, if the compression ratio is 2:1 and native transfer rate of the tape drive is 6 MB/s, then the resulting backup speed is 12 MB/s.
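The compression example amounts to multiplying the native transfer rate by the compression ratio. The following Python sketch makes that explicit. The function name is hypothetical, and the calculation assumes that the drive is kept streaming and that the data compresses uniformly at the stated ratio.

```python
def sustained_backup_rate(native_mb_per_s: float, compression_ratio: float) -> float:
    # Effective backup speed = native tape transfer rate x compression ratio.
    return native_mb_per_s * compression_ratio


# 2:1 compression on a drive with a 6 MB/s native transfer rate.
example_rate = sustained_backup_rate(6.0, 2.0)
print(example_rate, "MB/s")
```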
One of the most interesting issues for backup performance is tape streaming. Almost all tape drives currently on the market are fixed-speed, streaming tape drives. In other words, these drives can only write data at one speed. As a result, when they run out of data to write to tape, they must slow down and stop. For example, when the drive's buffer empties, the tape is moving so quickly that it actually overshoots and must rewind past the point where it stopped writing.
The physical tape block size can affect backup performance. The block size is the amount of data written by media management software to a tape in one write operation. The common rule is that a larger tape block size leads to a faster backup. Note that physical tape block size is not controlled by RMAN or the Oracle server, but by media management software.
You can set various channel limit parameters that apply to operations performed by the allocated server session in the
You can use these parameters to do the following:
You can specify the channel parameters described in Table 14-2.
Specifies the maximum size of a backup piece. Use this parameter to force RMAN to create multiple backup pieces in a backup set. RMAN creates each backup piece with a size no larger than the value specified in the parameter.
Specifies the bytes/second that RMAN reads on this channel. Use this parameter to set an upper limit for bytes read so that RMAN does not consume excessive disk bandwidth and degrade online performance.
For example, set
Determines the maximum number of input files that a backup or copy can have open at a given time (default value is
BACKUP command lets you set parameters that influence how RMAN selects files for input into backup sets. You can set these parameters to do the following:
You can specify the parameters described in Table 14-3.
Specifies the maximum size in bytes of a backup set. Use this parameter to prevent a single backup set from spanning multiple volumes.
Specifies the maximum number of files to place in a backup set. The default value for
For example, if you have fifty input datafiles and two channels, you can set
Assume the datafiles are located on five disks, each disk supplies data at 10 bytes/second, and the tape drive requires 20 bytes/second to keep streaming. If you set
Control the number of datafiles accessed by a channel by setting
Oracle9i Recovery Manager Reference for
Many factors can affect backup performance. Often, finding the solution to a slow backup is a process of trial and error. To get the best performance for a backup, follow the suggested steps in this section:
Make sure that the
RATE parameter is not set on the
CHANNEL commands, as described in Table 14-2. The
RATE parameter is intended to slow down a backup so that you can run it in the background with as little effect as possible on OLTP operations.
RATE parameter specifies units of bytes/second. Test to find a value that improves performance of your queries while still letting RMAN complete the backup in a reasonable amount of time. Note that
RATE is not designed to increase backup throughput, but to decrease backup throughput so that more disk bandwidth is available for other database operations.
If (and only if) you are backing up to an
sbt device, then set the
BACKUP_TAPE_IO_SLAVES initialization parameter to
true to cause the tape buffers to be allocated from the SGA. You can control the buffer size with the
PARMS parameter on the
BACKUP_TAPE_IO_SLAVES initialization parameter simulates asynchronous tape I/O by spawning an additional process to wait for tape I/O completion, leaving the primary process free to process additional disk blocks while waiting for tape I/O to complete. If you do not set this parameter, then I/O to the tape layer is synchronous, which means no other work can occur until the tape is done writing.
BACKUP_TAPE_IO_SLAVES parameter requires that the buffers for the respective disk or tape I/O be allocated from the shared memory (SGA), so that they can be shared between two processes. Therefore, allocate a large enough SGA size to accommodate this memory usage. If you set the
BACKUP_TAPE_IO_SLAVES parameter, then also set the
If (and only if) your disk does not support asynchronous I/O, then try setting the
DBWR_IO_SLAVES initialization parameter to a nonzero value. Any nonzero value for
DBWR_IO_SLAVES causes a fixed number (four) of disk I/O slaves to be used for backup and restore, which simulates asynchronous I/O. If I/O slaves are used, I/O buffers are obtained from the SGA (or the large pool, if configured).
Set this initialization parameter only if Oracle reports an error in the
alert.log stating that it does not have enough memory and that it will not start I/O slaves. The message looks something like the following:
ksfqxcre: failure to allocate shared memory means sync I/O will be used whenever async I/O to file not supported natively
When attempting to get shared buffers for I/O slaves, Oracle does the following:
If LARGE_POOL_SIZE is set, then Oracle attempts to get memory from the large pool. If this value is not large enough, then Oracle does not try to get buffers from the shared pool.
If LARGE_POOL_SIZE is not set, then Oracle attempts to get memory from the shared pool.
logfile indicating that synchronous I/O is used for this backup.
The memory from the large pool is used for many features, including the shared server (formerly called multi-threaded server), parallel query, and RMAN I/O slave buffers. Configuring the large pool prevents RMAN from competing with other subsystems for the same memory.
Requests for contiguous memory allocations from the shared pool are usually small (under 5 KB) in size. However, it is possible that a request for a large contiguous memory allocation can either fail or require significant memory housekeeping to release the required amount of contiguous memory. Although the shared pool may be unable to satisfy this memory request, the large pool is able to do so. The large pool does not have a least recently used (LRU) list; Oracle does not attempt to age memory out of the large pool.
LARGE_POOL_SIZE initialization parameter to configure the large pool. To see in which pool (shared pool or large pool) the memory for an object resides, query
The Oracle9i formula for setting
LARGE_POOL_SIZE is as follows:
For backups to disk, the tape buffer size is obviously 0, so set
LARGE_POOL_SIZE to 16 MB. For tape backups, the size of a single tape buffer is defined by the RMAN channel parameter
BLKSIZE, which defaults to 256 KB. Assume a case in which you are backing up to two tape drives. If the tape buffer size is 256 KB, then set
LARGE_POOL_SIZE to 18 MB. If you increase
BLKSIZE to 512 KB, then increase
LARGE_POOL_SIZE to 20 MB.
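The three sizing figures in this passage (16 MB for disk, 18 MB and 20 MB for two tape drives) are consistent with a fixed 16 MB plus four tape buffers for each channel. The Python sketch below is an inference from those worked figures, not a quotation of Oracle's formula. It assumes one channel for each tape drive, and the function name is hypothetical.

```python
KB = 1024
MB = 1024 * KB


def large_pool_size_bytes(channels: int, tape_buffer_bytes: int) -> int:
    # Assumed relationship: 16 MB + (4 buffers x channels x tape buffer size).
    return 16 * MB + 4 * channels * tape_buffer_bytes


# Disk backup: tape buffer size is 0, so the result is 16 MB.
pool_disk = large_pool_size_bytes(channels=1, tape_buffer_bytes=0)
# Two tape drives with BLKSIZE at 256 KB and at 512 KB.
pool_256k = large_pool_size_bytes(channels=2, tape_buffer_bytes=256 * KB)
pool_512k = large_pool_size_bytes(channels=2, tape_buffer_bytes=512 * KB)
print(pool_disk // MB, pool_256k // MB, pool_512k // MB)
```

Under these assumptions the sketch reproduces the 16 MB, 18 MB, and 20 MB values given in the text.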
As explained in "Multiplexed Backup Sets", the level of multiplexing is determined by the following factors:
MAXOPENFILES value and the number of files going into each backup set
You should adjust the level of multiplexing to account for the disk configuration on the server. Striped disk configurations involve hardware multiplexing, so that the level of RMAN multiplexing does not need to be as high. For example, consider the following disk configuration scenarios:
For example, if each channel reads fifteen datafiles, and
MAXOPENFILES=8, then you can calculate the level of multiplexing as follows:
If the datafiles are striped across two disks, then the multiplexing level is too high. In this case, you should set
MAXOPENFILES to a lower value such as
When performing a full backup of files that are largely empty, or when performing an incremental backup when few blocks have changed, you may not be able to supply data to the tape fast enough to keep it streaming. In either case, you can improve performance by increasing the level of multiplexing.
A good way to test whether your tape backup performance is slow because of empty files is to try backing up archived logs, which contain nothing but data (in other words, they do not contain empty space).
An incremental backup is an RMAN backup in which only modified blocks are backed up. Incremental backups are not necessarily faster than full backups because Oracle still reads the entire datafile to take an incremental backup. If tape drives are not locally attached, then incremental backups can be faster. You must consider how much bandwidth exists for reading the disks compared to the bandwidth for writing to the tapes. If tape bandwidth is limited compared to disk, then incremental backups may help.
If only a few blocks have changed in an incremental backup, then you need to input many buffers from the datafile before you accumulate enough blocks to fill a buffer and write to tape. Hence, the tape drive may not stream.
If you set the level of multiplexing (as described in "Step 5: Adjust the Level of Multiplexing") to a large value, then you can scan many datafiles in parallel, the output buffers for the tape drive are filled quickly, and you can write them frequently to keep the drive streaming. The
FILESPERSET value should be less than or equal to
MAXOPENFILES. For example, set both parameters to
8, and raise this value if the tape drive does not stream. For an incremental backup,
50 is a good value for the level of multiplexing. For a full or incremental level 0 backup, the level of multiplexing should be a lower value such as
If the tape is not streaming, but the problem is not caused by an incremental backup or by backing up empty files, then you can try adjusting the block size of the tape buffer. You can change the size of each tape buffer using the
PARMS parameter of the
CHANNEL command. If the
BLKSIZE parameter for
PARMS is supported on your platform, then you can set it to the desired size of each buffer. For example, configure an
sbt channel as follows:
A good rule of thumb is to set
BLKSIZE to a value that is a little less than the tape block size of the media manager. What "a little less" means depends on the media manager. For example, if the tape block size is 512 KB and the media manager has a header of size 16 KB, then you can set
Note that it is also a good idea to increase the media management physical tape block size. For example, you do not want to set the
BLKSIZE parameter to 512 KB and leave the physical tape block size as 32 KB.
If none of the previous steps improves backup performance, then try to determine the exact source of the bottleneck. Use the
V$BACKUP_ASYNC_IO views to determine the source of backup or restore bottlenecks and to see detailed progress of backup jobs.
V$BACKUP_SYNC_IO contains rows when the I/O is synchronous to the process (or thread on some platforms) performing the backup.
V$BACKUP_ASYNC_IO contains rows when the I/O is asynchronous. Asynchronous I/O is obtained either with I/O processes or because it is supported by the underlying operating system.
Oracle9i Database Reference for more information about these views
To determine whether your tape is streaming when the I/O is synchronous, query the
EFFECTIVE_BYTES_PER_SECOND column in the
V$BACKUP_ASYNC_IO view. Table 14-4 describes how to use this column.
|If EFFECTIVE_BYTES_PER_SECOND is . . .||Then . . .|
|Less than the raw capacity of the hardware||The tape is not streaming. When the tape is not streaming, the performance drop is typically huge.|
|More than the raw capacity of the hardware||The tape may be streaming, depending on the compression ratio of the data.|
With synchronous I/O, it is difficult to identify specific bottlenecks because all synchronous I/O is a bottleneck to the process. The only way to tune synchronous I/O is to compare the rate (in bytes/second) with the device's maximum throughput rate. If the rate is lower than the rate that the device specifies, then consider tuning this aspect of the backup and restore process. The
DISCRETE_BYTES_PER_SECOND column in the
V$BACKUP_SYNC_IO view displays the I/O rate. Note that if you see data in
V$BACKUP_SYNC_IO, then the problem is that you have not enabled asynchronous I/O or you are not using disk I/O slaves.
Long waits are the number of times the backup or restore process told the operating system to wait until an I/O was complete. Short waits are the number of times the backup or restore process made an operating system call to poll for I/O completion in a nonblocking mode. Ready indicates the number of times when I/O was already ready for use and so there was no need to make an operating system call to poll for I/O completion.
The simplest way to identify the bottleneck is to query
V$BACKUP_ASYNC_IO for the datafile that has the largest ratio for
LONG_WAITS divided by
If you have synchronous I/O but you have set
Oracle9i Database Reference for descriptions of the