Restoring the redundancy of an Oracle ASM disk group after a transient disk path failure can be time consuming. This is especially true if the recovery process requires rebuilding an entire Oracle ASM disk group. Oracle ASM fast mirror resync significantly reduces the time to resynchronize a failed disk in such situations. When you replace the failed disk, Oracle ASM can quickly resynchronize the Oracle ASM disk extents.
To use this feature, the disk group compatibility attributes must be set to 11.1 or higher. For more information, refer to "Disk Group Compatibility".
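One way to check the current compatibility settings is to query V$ASM_DISKGROUP. The following is a minimal sketch, assuming a SQL session connected to the Oracle ASM instance; the column aliases are illustrative only.

```sql
-- COMPATIBILITY reflects the Oracle ASM compatibility setting and
-- DATABASE_COMPATIBILITY reflects the database (RDBMS) compatibility setting.
SELECT name                   AS diskgroup,
       compatibility          AS asm_compat,
       database_compatibility AS rdbms_compat
  FROM V$ASM_DISKGROUP;
```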
Any problems that make a failure group temporarily unavailable are considered transient failures that can be recovered by the Oracle ASM fast mirror resync feature. For example, transient failures can be caused by disk path malfunctions, such as cable failures, host bus adapter failures, controller failures, or disk power supply interruptions.
Oracle ASM fast resync keeps track of pending changes to extents on an offline disk during an outage. The extents are resynced when the disk is brought back online.
By default, Oracle ASM drops a disk 3.6 hours after it is taken offline. You can set the DISK_REPAIR_TIME disk group attribute to delay the drop operation by specifying a time interval in which to repair the disk and bring it back online. The time can be specified in units of minutes (m or M) or hours (h or H). If you omit the unit, then the default unit is hours. The DISK_REPAIR_TIME disk group attribute can be set only with the ALTER DISKGROUP SQL statement and applies only to normal and high redundancy disk groups.
If the attribute is not set explicitly, then the default value (3.6h) applies to disks that have been set to OFFLINE mode without an explicit DROP AFTER clause. Disks taken offline due to I/O errors do not have a DROP AFTER clause.
The default DISK_REPAIR_TIME attribute value is an estimate that should be adequate for most environments. However, ensure that the attribute value is set to the amount of time that you think is necessary in your environment to fix any transient disk error, and during which you are able to tolerate reduced data redundancy.
The elapsed time (since the disk was set to OFFLINE mode) is incremented only when the disk group containing the offline disks is mounted. The REPAIR_TIMER column of V$ASM_DISK shows the amount of time left (in seconds) before an offline disk is dropped. After the specified time has elapsed, Oracle ASM drops the disk. You can override this attribute with the ALTER DISKGROUP OFFLINE DISK statement and the DROP AFTER clause.
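The remaining repair time can be inspected with a query such as the following. This is a minimal sketch, assuming a SQL session connected to the Oracle ASM instance; the choice of columns is illustrative.

```sql
-- List offline disks and the seconds remaining before Oracle ASM drops them.
SELECT group_number,
       disk_number,
       name,
       failgroup,
       mode_status,
       repair_timer   -- seconds left before the disk is dropped
  FROM V$ASM_DISK
 WHERE mode_status = 'OFFLINE';
```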
If a disk is offlined by Oracle ASM because of an I/O (write) error or is explicitly offlined using the ALTER DISKGROUP ... OFFLINE statement without the DROP AFTER clause, then the value specified for the DISK_REPAIR_TIME attribute for the disk group is used.
Altering the DISK_REPAIR_TIME attribute has no effect on disks that are already offline. The new value is used for any disks that go offline after the attribute is updated. You can confirm this behavior by viewing the Oracle ASM alert log.
If an offline disk is taken offline for a second time, then the elapsed time is reset and restarted. If another time is specified with the DROP AFTER clause for this disk, the first value is overridden and the new value applies. A disk that is in OFFLINE mode cannot be dropped with an ALTER DISKGROUP DROP DISK statement; an error is returned if attempted. If the disk must be dropped before the repair time has expired (for example, because the disk cannot be repaired), it can be dropped immediately by issuing a second OFFLINE statement with a DROP AFTER clause specifying 0h or 0m.
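As a sketch of forcing an immediate drop of a disk that is already offline, the statement below re-issues the OFFLINE operation with a zero interval. It assumes the disk group data and disk DATA_001 used elsewhere in this section, and that a zero value such as 0m is accepted by the DROP AFTER clause.

```sql
-- Re-offline the disk with a zero repair interval so that Oracle ASM
-- drops it without waiting for the remaining repair time.
ALTER DISKGROUP data OFFLINE DISK DATA_001 DROP AFTER 0m;
```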
You can use ALTER DISKGROUP to set the DISK_REPAIR_TIME attribute to a specified hour or minute value, such as 4.5 hours or 270 minutes. For example:
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '4.5h';
ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '270m';
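To confirm the value after setting it, the attribute can be read back from V$ASM_ATTRIBUTE. This is a minimal sketch, assuming a SQL session on the Oracle ASM instance and a disk group whose compatibility settings expose attributes in that view.

```sql
-- Show the current DISK_REPAIR_TIME setting for each mounted disk group.
SELECT dg.name AS diskgroup,
       a.name  AS attribute,
       a.value
  FROM V$ASM_DISKGROUP dg
  JOIN V$ASM_ATTRIBUTE a
    ON a.group_number = dg.group_number
 WHERE a.name = 'disk_repair_time';
```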
After you repair the disk, run the SQL statement ALTER DISKGROUP ONLINE DISK. This statement brings the repaired disk back online to enable writes so that no new writes are missed. This statement also starts a procedure to copy all of the extents that are marked as stale from their redundant copies.
If a disk goes offline when the Oracle ASM instance is in rolling upgrade mode, the disk remains offline until the rolling upgrade has ended, and the timer for dropping the disk is stopped until the Oracle ASM cluster is out of rolling upgrade mode. See "Upgrading and Patching Oracle ASM".
Examples of taking disks offline and bringing them online follow.
The following example takes disk DATA_001 offline and drops it after five minutes:
ALTER DISKGROUP data OFFLINE DISK DATA_001 DROP AFTER 5m;
The next example takes the disk DATA_001 offline and drops it after the time period designated by DISK_REPAIR_TIME elapses:
ALTER DISKGROUP data OFFLINE DISK DATA_001;
This example takes all of the disks in failure group FG2 offline and drops them after the time period designated by DISK_REPAIR_TIME elapses. If you used a DROP AFTER clause, then the disks would be dropped after the specified time:
ALTER DISKGROUP data OFFLINE DISKS IN FAILGROUP FG2;
The next example brings all of the disks in failure group FG2 online:
ALTER DISKGROUP data ONLINE DISKS IN FAILGROUP FG2;
This example brings only disk DATA_001 online:
ALTER DISKGROUP data ONLINE DISK DATA_001;
This example brings all of the disks in disk group DATA online:
ALTER DISKGROUP data ONLINE ALL;
Querying the V$ASM_OPERATION view while you run ALTER DISKGROUP ONLINE statements displays the name and state of the current operation that you are performing. For example, the following SQL query shows values in the PASS column during an online operation:
SQL> SELECT GROUP_NUMBER, PASS, STATE FROM V$ASM_OPERATION;

GROUP_NUMBER PASS      STAT
------------ --------- ----
           1 RESYNC    RUN
           1 REBALANCE WAIT
           1 COMPACT   WAIT
An offline operation does not generate a display in a V$ASM_OPERATION view query.
You can set the FAILGROUP_REPAIR_TIME and CONTENT.TYPE disk group attributes. The FAILGROUP_REPAIR_TIME disk group attribute specifies a default repair time for the failure groups in the disk group. The CONTENT.TYPE disk group attribute specifies the type of data expected to be stored in a disk group. You can set these attributes with ASMCA, ASMCMD mkdg, or SQL CREATE and ALTER DISKGROUP statements. For information about disk group attributes, refer to "Managing Disk Group Attributes".
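As a sketch of setting these two attributes with SQL, the statements below use the same SET ATTRIBUTE form shown earlier for disk_repair_time. The disk group name data and the values '12h' and 'data' are illustrative assumptions; these attributes may require higher disk group compatibility settings than DISK_REPAIR_TIME, so check "Managing Disk Group Attributes" before relying on them.

```sql
-- Illustrative values only: set a default repair time for failure groups
-- and declare the expected content type of the disk group.
ALTER DISKGROUP data SET ATTRIBUTE 'failgroup_repair_time' = '12h';
ALTER DISKGROUP data SET ATTRIBUTE 'content.type' = 'data';
```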
The ASMCMD lsop command shows the resync time estimate. There are separate rows in the V$ASM_OPERATION view for the different phases of rebalance: disk resync, rebalance, and data compaction.
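The per-phase rows and their time estimates can also be read from SQL. This is a minimal sketch, assuming a SQL session on the Oracle ASM instance and a release in which V$ASM_OPERATION includes the PASS column; the selected columns are illustrative.

```sql
-- One row per phase (resync, rebalance, compact) of an active operation,
-- with progress counters and the estimated minutes remaining.
SELECT group_number,
       operation,
       pass,
       state,
       power,
       sofar,
       est_work,
       est_minutes
  FROM V$ASM_OPERATION
 ORDER BY group_number, pass;
```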
The ASMCMD online command has a power option to specify the power for the online operation. The SQL ALTER DISKGROUP REPLACE DISK statement also has the power option.
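A sketch of a disk replacement that specifies the power option follows. The disk group data and disk DATA_001 are taken from the earlier examples; the device path '/devices/disk_new' and the power value 3 are hypothetical placeholders.

```sql
-- Replace a failed disk with a new device, keeping the same disk name,
-- and run the operation at power level 3 (illustrative value).
ALTER DISKGROUP data REPLACE DISK DATA_001 WITH '/devices/disk_new' POWER 3;
```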
The ASMCMD chdg command provides the replace option in addition to the add and drop tags. The ASMCMD mkdg command has an additional time parameter (-t) to specify the time to offline a failure group.