7 Understanding Oracle ACFS Advanced Topics
Oracle ACFS advanced topics include discussions about more complex administrative issues.
This appendix discusses Oracle Advanced Cluster File System (Oracle ACFS) advanced topics, including limits, advanced administration, troubleshooting, and patching.
See Also:
Articles available at My Oracle Support (https://support.oracle.com) for information about Oracle ACFS and Oracle ADVM.
This appendix contains the following topics:
- How to Clone a Full Database (non-CDB or CDB) with ACFS Snapshots
- Steps to perform an RMAN sparse backup and restore of a PDB using ACFS fshares
- Steps to perform an RMAN sparse backup and restore of a PDB using an ACFS snapshot
- Oracle ACFS Plug-in Generic Application Programming Interface
- Oracle ACFS Tagging Generic Application Programming Interface
For an overview of Oracle ACFS, see Introducing Oracle ACFS and Oracle ADVM.
Limits of Oracle ACFS
The limits of Oracle ACFS are discussed in this section.
The topics contained in this section are:
Note:
Oracle ACFS does not support hard links on directories.
Oracle ACFS Disk Space Usage
Oracle ACFS supports 256 mounts on 64-bit systems. However, more file systems can be mounted if there is adequate memory.
Oracle ACFS supports 2^40 (1 trillion) files in a file system. More than 4 billion files have been tested. There is no absolute limit to the number of directories in a file system; the limit is based on hardware resources.
Oracle ACFS preallocates large user files to improve performance when writing data. This storage is not returned when the file is closed, but it is returned when the file is deleted. Oracle ACFS also allocates local metadata files as nodes mount the file system for the first time. This storage is approximately 64-128 megabytes per node.
Oracle ACFS also keeps local bitmaps available to reduce contention on the global storage bitmap when searching for free space. This disk space is reported as in use by tools such as the Linux df command even though some space may not actually be allocated yet. This local storage pool can be as large as 128 megabytes per node and can allow space allocations to succeed, even though commands, such as df, report less space available than what is being allocated.
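For example, the following commands compare the two views of space usage (a sketch; the mount point /mnt/acfs is illustrative). The acfsutil info fs command reports Oracle ACFS allocation details that help explain why df output may differ from expectations:
$ df -h /mnt/acfs
$ acfsutil info fs /mnt/acfs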
The maximum sizes that can be allocated to an Oracle ACFS file system are shown in Table 7-1. The storage limits for Oracle ACFS and Oracle ASM are dependent on disk group compatibility attributes.
Table 7-1 Maximum file sizes for Oracle ACFS file systems/Oracle ADVM volumes
Redundancy | Disk Group with COMPATIBLE.ASM < 12.2.0.1 | Disk Group with COMPATIBLE.ASM >= 12.2.0.1
---|---|---
External | 128 TB | 128 TB
Normal | 64 TB | 128 TB
High | 42.6 TB | 128 TB
Note:
Customers with COMPATIBLE.ASM >= 19.0 who want a 1 PB Oracle ADVM volume can set the following disk group attribute to '64'. This attribute only affects newly created volumes:
SQL> alter diskgroup my_dg set attribute 'advm_extent_size_mb'='64';
See Also:
- Oracle Automatic Storage Management Administrator's Guide for information about file size limits and disk group compatibility settings
- Oracle Automatic Storage Management Administrator's Guide for information about storage limits for Oracle ASM files and disk groups
Oracle ACFS Error Handling
Oracle ASM instance failure or forced shutdown while Oracle ACFS or another file system is using an Oracle ADVM volume results in I/O failures. The volumes must be closed and re-opened to access the volume again. This requires dismounting any file systems that were mounted when the local Oracle ASM instance failed. After the instance is restarted, the corresponding disk group must be mounted with the volume enabled followed by a remount of the file system. See "Deregistering, Dismounting, and Disabling Volumes and Oracle ACFS File Systems".
If any file systems are currently mounted on Oracle ADVM volume files, the SHUTDOWN ABORT command should not be used to terminate the Oracle ASM instance without first dismounting those file systems. Otherwise, applications encounter I/O errors and Oracle ACFS user data and metadata being written at the time of the termination may not be flushed to storage before the Oracle ASM storage is fenced. If there is not time to permit the file system to dismount, then you should run two sync(1) commands to flush cached file system data and metadata to persistent storage before issuing the SHUTDOWN ABORT operation.
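The following sequence is a minimal sketch of that recommendation, assuming a Linux node and that the Oracle ASM instance is shut down with SQL*Plus:
$ sync
$ sync
$ sqlplus / as sysasm
SQL> SHUTDOWN ABORT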
Oracle ACFS does not interrupt the operating system environment when a metadata write fails, whether due to Oracle ASM instance failure or storage failure. Instead, Oracle ACFS isolates errors to a specific file system, putting it in an offline error state. The only operation that succeeds on that node for that file system from that point forward is a dismount operation. Another node recovers any outstanding metadata transactions, assuming it can write the metadata out to the storage. It is possible to remount the file system on the offlined node after the I/O condition is resolved.
It might not be possible for an administrator to dismount a file system while it is in the offline error state if there are processes referencing the file system, such as a directory of the file system being the current working directory for a process. To dismount the file system in this case it would be necessary to identify all processes on that node with references to files and directories on the file system and cause them to exit. The Linux fuser or lsof commands list information about processes and open files.
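For example, the following commands identify processes that hold references on the file system (a sketch; the mount point /mnt/acfs is illustrative):
# fuser -vm /mnt/acfs
# lsof /mnt/acfs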
If Oracle ACFS detects inconsistent file metadata returned from a read operation, based on checksum or expected type comparisons, Oracle ACFS takes the appropriate action to isolate the affected file system components and generate a notification that fsck should be run as soon as possible. Each time the file system is mounted, a notification is generated with a system event logger message until fsck is run.
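As a sketch, on Linux the check is run against the Oracle ADVM volume device (the device name /dev/asm/dbvol-123 is illustrative), typically after the file system has been dismounted on all nodes:
# fsck -t acfs /dev/asm/dbvol-123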
Oracle ACFS and NFS
When exporting file systems through NFS on Linux, use the -fsid=num exports option. This option forces the file system identification portion of the file handle used to communicate with NFS clients to be the specified number instead of a number derived from the major and minor number of the block device on which the file system is mounted. You can use any 32-bit number for num, but it must be unique among all the exported file systems. In addition, num must be unique among members of the cluster and must be the same num on each member of the cluster for a given file system. This is needed because Oracle ADVM block device major numbers are not guaranteed to be the same across restarts of the same node or across different nodes in the cluster.
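For example, an /etc/exports entry might look like the following sketch, where the mount point, client specification, and fsid value are illustrative:
/mnt/acfs  *(rw,sync,fsid=101)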
Note:
Oracle ASM Dynamic Volume Manager (Oracle ADVM) volumes and Oracle Advanced Cluster File System (Oracle ACFS) file systems are currently not supported on disk groups that have been created from NFS or Common Internet File System (CIFS) files. However, Oracle ACFS file systems may be exported as NFS or CIFS file systems to network clients in some cases. Samba/CIFS clients on Windows cannot use ACLs when interfacing with Oracle ACFS Linux, Solaris, or AIX servers. When using High Availability NFS for Grid Home Clusters (HANFS), HANFS automatically handles the situation described in the previous paragraph. For information about HANFS, refer to "High Availability Network File Storage for Oracle Grid Infrastructure".
Limits of Oracle ADVM
The limits of Oracle ADVM are discussed in this topic.
The default configuration for an Oracle ADVM volume is 8 columns and a 1 MB stripe width. The default volume extent size is 64 MB.
Setting the number of columns on an Oracle ADVM dynamic volume to 1 effectively turns off striping for the Oracle ADVM volume. Setting the columns to 8 (the default) is recommended to achieve optimal performance with database data files and other files.
On Linux platforms, Oracle ASM Dynamic Volume Manager (Oracle ADVM) volume devices are created as block devices regardless of the configuration of the underlying storage in the Oracle ASM disk group. Do not use raw(8) to map Oracle ADVM volume block devices into raw volume devices.
For information about ASMCMD commands to manage Oracle ADVM volumes, refer to Managing Oracle ADVM with ASMCMD.
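For example, the ASMCMD volcreate command accepts column and stripe-width settings when a volume is created. The following is a sketch only; the disk group, volume name, and size are illustrative, and the exact option spelling should be confirmed against the ASMCMD reference for your release:
ASMCMD> volcreate -G DATA -s 100G --column 8 --width 1M dbvol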
How to Clone a Full Database (non-CDB or CDB) with ACFS Snapshots
ACFS snapshots are sparse, point-in-time copies of the file system, and they can be used to create full database clones as well as clones of PDBs using PDB snapshot cloning when the database is on ACFS (User Interface for PDB Cloning). ACFS snapshots can be used in test and development environments to create quick and space-efficient clones of a test master. This section explains, with an example, the steps required to create a full database clone using ACFS snapshots.
Test setup: We have a single test master CDB called SOURCE that will be cloned. The CDB has ten PDBs named sourcepdb[1-10], and each of them is loaded with an OLTP schema. This is an Oracle Real Application Clusters (Oracle RAC) database, and instances are running on both nodes. The data files, redo logs, and control files are stored in an Oracle ACFS file system mounted at /mnt/dbvol. This file system is created on the DATA disk group. Recovery logs and archive logs are stored on a file system mounted at /mnt/rvol that is created on top of the RECO disk group. Note that ACFS snapshots are contained within the file system and can be accessed through the same mount point.
Oracle highly recommends periodically creating backups of the test master database to provide a recovery method in case of issues.
For more detailed information regarding the configuration of ACFS snapshots on Exadata, see Setting up Oracle Exadata Storage Snapshots.
For more information about different ACFS Snapshot use cases, please see My Oracle Support (MOS) note Oracle ACFS Snapshot Use Cases on Exadata (Doc ID 2761360.1).
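For example, the first step of such a clone procedure is typically to take a writable snapshot of the test master file system. The following is a sketch; the snapshot name clone1 is illustrative:
$ acfsutil snap create -w clone1 /mnt/dbvol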
Steps to Create and Use the Clone
Oracle ACFS Loopback Support
Oracle ACFS supports loopback functionality on the Linux operating system, enabling Oracle ACFS files to be accessed as devices.
An Oracle ACFS loopback device is an operating system pseudo-device that enables an Oracle ACFS file to be accessed as a block device. This functionality can be used with Oracle Virtual Machines (OVM) in support of OVM images, templates, and virtual disks (vdisks) created in Oracle ACFS file systems and presented through Oracle ACFS loopback devices.
Oracle ACFS loopback functionality provides performance gains over NFS. Files can be sparse or non-sparse.
In addition to general loopback support, Oracle ACFS also provides support for loopback direct I/O (DIO) on sparse images.
Oracle ACFS Drivers Resource Management
Oracle ACFS, Oracle ADVM, and OKS drivers are loaded during the start of the Oracle Grid Infrastructure stack, except in an Oracle Restart configuration. The drivers remain loaded until the system is rebooted, at which point, they are loaded again when the Oracle Grid Infrastructure stack restarts.
For information about commands to manage Oracle ACFS, Oracle ADVM, and OKS drivers, refer to "Oracle ACFS Driver Commands".
Oracle ACFS Registry Resource Management
The Oracle ACFS registry resource is supported only for Oracle Grid Infrastructure cluster configurations; it is not supported for Oracle Restart configurations. See "Oracle ACFS and Oracle Restart".
With Oracle ASM 12c Release 1 (12.1), the Oracle ACFS registry uses the standard single file system resource available through the SRVCTL file system interface. For more information, refer to "Oracle ACFS File System Resource Management". Using SRVCTL enables applications to depend on registered file systems, such as for management of the registered file systems using srvctl filesystem. By default, acfsutil registry shows only file systems that are set to be always mounted, with the AUTO_START attribute set to always.
The Oracle ACFS registry requires root privileges to register and delete file systems; however, other users can be entitled to start and stop (mount and unmount) the file systems by use of the user option.
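As a sketch of registering a file system as a Clusterware-managed resource, the following srvctl command assumes an Oracle ADVM volume device named /dev/asm/dbvol-123, a mount point of /mnt/acfs, and a user oracle entitled to mount and unmount it; verify the exact srvctl options against the reference for your release before use:
# srvctl add filesystem -device /dev/asm/dbvol-123 -path /mnt/acfs -user oracle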
Oracle ACFS File System Resource Management
The Oracle ACFS file system resource is supported only for Oracle Grid Infrastructure cluster configurations; it is not supported for Oracle Restart configurations. See "Oracle ACFS and Oracle Restart".
Oracle ASM Configuration Assistant (ASMCA) facilitates the creation of Oracle ACFS file system resources (ora.diskgroup.volume.acfs). During database creation with Database Configuration Assistant (DBCA), the Oracle ACFS file system resource is included in the dependency list of its associated disk group so that stopping the disk group also attempts to stop any dependent Oracle ACFS file systems.
An Oracle ACFS file system resource is typically created for use with application resource dependency lists. For example, if an Oracle ACFS file system is configured for use as an Oracle Database home, then a resource created for the file system can be included in the resource dependency list of the Oracle Database application. This dependency causes the file system and stack to be automatically mounted due to the start action of the database application.
The start action for an Oracle ACFS file system resource is to mount the file system. This Oracle ACFS file system resource action includes confirming that the associated file system storage stack is active and mounting the disk group, enabling the volume file, and creating the mount point if necessary to complete the mount operation. If the file system is successfully mounted, the state of the resource is set to online; otherwise, it is set to offline.
The check action for an Oracle ACFS file system resource verifies that the file system is mounted. It sets the state of the resource to online if mounted; otherwise, the status is set to offline.
The stop action for an Oracle ACFS file system resource attempts to dismount the file system. If the file system cannot be dismounted due to open references, the stop action displays and logs the process identifiers for any processes holding a reference.
Use of the srvctl start and stop actions to manage the Oracle ACFS file system resources maintains their correct resource state.
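For example, a file system resource can be managed with commands like the following sketch, which reuses the illustrative volume device from above; confirm the option names against the srvctl reference for your release:
$ srvctl start filesystem -device /dev/asm/dbvol-123
$ srvctl status filesystem -device /dev/asm/dbvol-123
$ srvctl stop filesystem -device /dev/asm/dbvol-123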
Oracle ACFS and Oracle Restart
Oracle Restart does not support root-based Oracle ACFS resources for this release. Consequently, the following operations are not automatically performed:
- Loading Oracle ACFS drivers
On Linux, drivers are automatically loaded and unloaded at system boot time and system shutdown time. If an action is required while the system is running, or the system is running on other operating system (OS) versions, you can load or unload the drivers manually with the acfsload command. However, if the drivers are loaded manually, then the Oracle ACFS drivers must be loaded before the Oracle Restart stack is started. For more information, refer to acfsload.
- Mounting Oracle ACFS file systems listed in the Oracle ACFS mount registry
The Oracle ACFS mount registry is not supported in Oracle Restart. However, Linux entries in the /etc/fstab file with a valid Oracle ASM device do have the associated volume enabled and are automatically mounted on system startup and unmounted on system shutdown. Note that high availability (HA) recovery is not applied after the file system is mounted; that functionality is a one time action.
A valid fstab entry has the following format:
device mount_point acfs noauto 0 0
For example:
/dev/asm/dev1-123 /mntpoint acfs noauto 0 0
The last three fields in the previous example prevent Linux from attempting to automatically mount the device and from attempting to run other system tools on the device. This action prevents errors when the Oracle ASM instance is not available at times during the system startup. Additional standard fstab syntax options may be added for the file system mount.
Should a mount or unmount operation be required on other OS versions, or after the system is started, you can mount Oracle ACFS file systems manually with the mount command (see the example following this list). For information, refer to Managing Oracle ACFS with Command-Line Tools.
- Mounting resource-based Oracle ACFS database home file systems
The Oracle ACFS resources associated with these actions are not created for Oracle Restart configurations. While Oracle ACFS resource management is fully supported for Oracle Grid Infrastructure configurations, the Oracle ACFS resource-based management actions must be replaced with alternative, sometimes manual, operations in Oracle Restart configurations. During an attempt to use commands, such as srvctl, that register a root-based resource in Oracle Restart configurations, an appropriate error is displayed.
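The following is a sketch of mounting an Oracle ACFS file system manually with the Linux mount command, using the illustrative device and mount point from the fstab example above; run it as root after the volume has been enabled:
# mount -t acfs /dev/asm/dev1-123 /mntpoint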
Oracle ACFS Driver Commands
This section describes the Oracle ACFS driver commands that are used during installation to manage Oracle ACFS, Oracle ADVM, and Oracle Kernel Services Driver (OKS) drivers. These commands are located in the /bin directory of the Oracle Grid Infrastructure home.
acfsload
Purpose
acfsload loads or unloads Oracle ACFS, Oracle ADVM, and Oracle Kernel Services Driver (OKS) drivers.
Syntax
acfsload { start | stop } [ -s ]
acfsload -h displays help text and exits.
Table 7-2 contains the options available with the acfsload command.
Table 7-2 Options for the acfsload command
Option | Description
---|---
start | Loads the Oracle ACFS, Oracle ADVM, and OKS drivers.
stop | Unloads the Oracle ACFS, Oracle ADVM, and OKS drivers.
-s | Operates in silent mode.
Description
You can use acfsload to manually load or unload the Oracle ACFS, Oracle ADVM, and OKS drivers.
Before unloading drivers with the stop option, you must dismount Oracle ACFS file systems and shut down Oracle ASM. For information about dismounting Oracle ACFS file systems, refer to Deregistering, Dismounting, and Disabling Volumes and Oracle ACFS File Systems.
root or administrator privilege is required to run acfsload.
Examples
The following is an example of the use of acfsload to stop (unload) all drivers.
# acfsload stop
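Similarly, the following is a sketch of loading the drivers in silent mode, using the start and -s options shown in Table 7-2:
# acfsload start -s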
acfsdriverstate
Purpose
acfsdriverstate provides information on the current state of the Oracle ACFS, Oracle ADVM, and Oracle Kernel Services Driver (OKS) drivers.
Syntax
acfsdriverstate [-orahome ORACLE_HOME] {installed | loaded | version [-v] | supported [-v] [-k <kernel_rpm_file_path> | <installed_kernel_rpm_package>]} [-s]
acfsdriverstate -h displays help text and exits.
Table 7-3 contains the options available with the acfsdriverstate command.
Table 7-3 Options for the acfsdriverstate command
Option | Description
---|---
-orahome ORACLE_HOME | Specifies the Oracle Grid Infrastructure home in which the user has permission to run the acfsdriverstate command.
installed | Determines whether Oracle ACFS is installed on the system.
loaded | Determines whether the Oracle ADVM, Oracle ACFS, and OKS drivers are loaded in memory.
version | Reports the currently installed version of the Oracle ACFS system software.
supported | Reports whether the system is a supported kernel for Oracle ACFS.
-k kernel_rpm_file_path or installed_kernel_rpm_package | Accepts an arbitrary kernel rpm, either a file or one already installed on the system, and determines whether the current Oracle ACFS installation is supported on the specified kernel. To specify an installed kernel rpm package, provide the package name. To specify a kernel rpm file, verify that the file has read permissions and that the rpm provides the appropriate symbols.
-v | Specifies verbose mode for additional details.
Description
You can use acfsdriverstate to display detailed information on the current state of the Oracle ACFS, Oracle ADVM, and OKS drivers.
Examples
The following is an example of the use of acfsdriverstate.
$ acfsdriverstate version
ACFS-9325: Driver OS kernel version = 3.8.13-13.el6uek.x86_64.
ACFS-9326: Driver build number = 171126.
ACFS-9212: Driver build version = 18.1.0.0 ()..
ACFS-9547: Driver available build number = 171126.
ACFS-9548: Driver available build version = 18.1.0.0 ()..
Oracle ACFS Plug-in Generic Application Programming Interface
Oracle ACFS plug-in operations are supported through a common, operating system (OS) independent file plug-in (C library) application programming interface (API).
The topics contained in this section are:
For more information about Oracle ACFS plug-ins, refer to "Oracle ACFS Plugins".
Oracle ACFS Pre-defined Metric Types
Oracle ACFS provides the ACFSMETRIC1_T and ACFSMETRIC2_T pre-defined metric types.
The ACFSMETRIC1_T metric set is defined for the storage virtualization model. The metrics are maintained as a summary record for either a selected set of tagged files or all files in the file system. Oracle ACFS file metrics include: number of reads, number of writes, average read size, average write size, minimum and maximum read size, minimum and maximum write size, and read cache (VM page cache) hits and misses.
Example:
typedef struct _ACFS_METRIC1 {
  ub2   acfs_version;
  ub2   acfs_type;
  ub4   acfs_seqno;
  ub8   acfs_nreads;
  ub8   acfs_nwrites;
  ub8   acfs_rcachehits;
  ub4   acfs_avgrsize;
  ub4   acfs_avgwsize;
  ub4   acfs_minrsize;
  ub4   acfs_maxrsize;
  ub4   acfs_minwsize;
  ub4   acfs_maxwsize;
  ub4   acfs_rbytes_per_sec;
  ub4   acfs_wbytes_per_sec;
  ub8   acfs_timestamp;
  ub8   acfs_elapsed_secs;
} ACFS_METRIC1;
The ACFSMETRIC2_T is a list of Oracle ACFS write description records containing the fileID, starting offset, size, and sequence number of each write. The sequence number preserves the Oracle ACFS write record order as preserved by the plug-in driver. The sequence number provides a way for applications to order multiple message buffers returned from the API. It also provides detection of dropped write records due to the application not draining the message buffers fast enough through the API.
The write records are contained within multiple in-memory arrays. Each array of records may be fetched with the API with a buffer size currently set to 1 M. At the beginning of the fetched ioctl buffer is a struct which describes the array, including the number of records it contains. The kernel buffers drop the oldest write records if the buffers are filled because the buffers are not being read quickly enough.
Example:
typedef struct _ACFS_FILE_ID {
  ub8   acfs_fenum;
  ub4   acfs_genum;
  ub4   acfs_reserved1;
} ACFS_FILE_ID;

typedef struct _ACFS_METRIC2_REC {
  ACFS_FILE_ID  acfs_file_id;
  ub8           acfs_start_offset;
  ub8           acfs_size;
  ub8           acfs_seq_num;
} ACFS_METRIC2_REC;

typedef struct _ACFS_METRIC2 {
  ub2               acfs_version;
  ub2               acfs_type;
  ub4               acfs_num_recs;
  ub8               acfs_timestamp;
  ACFS_METRIC2_REC  acfs_recs[1];
} ACFS_METRIC2;
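Once a metric type 2 buffer has been fetched, the records can be walked as in the following sketch (illustrative only; it assumes the structures above, a populated buffer, and that stdio.h is available for printf):

void print_write_records(const ACFS_METRIC2 *m2)
{
  ub4 i;
  /* acfs_num_recs gives the number of ACFS_METRIC2_REC entries in this buffer */
  for (i = 0; i < m2->acfs_num_recs; i++) {
    const ACFS_METRIC2_REC *rec = &m2->acfs_recs[i];
    printf("seq %llu fenum %llu offset %llu size %llu\n",
           (unsigned long long)rec->acfs_seq_num,
           (unsigned long long)rec->acfs_file_id.acfs_fenum,
           (unsigned long long)rec->acfs_start_offset,
           (unsigned long long)rec->acfs_size);
  }
}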
Oracle ACFS Plug-in APIs
Purpose
The Oracle ACFS plug-in application programming interface (API) sends and receives messages to and from the local plug-in enabled Oracle ACFS driver from the application plug-in module.
Syntax
sb8 acfsplugin_metrics(ub4 metric_type, ub1 *metrics, ub4 metric_buf_len, oratext *mountp );
sb8 acfsfileid_lookup(ACFS_FILEID file_id, oratext *full_path, oratext *mountp );
Description
The acfsplugin_metrics API is used by an Oracle ACFS application plug-in module to retrieve metrics from the Oracle ACFS driver. The Oracle ACFS driver must first be enabled for plug-in communication using the acfsutil plugin enable command. The selected application plug-in metric type model must match the plug-in configuration defined with the Oracle ACFS plug-in enable command. For information about the acfsutil plugin enable command, refer to "acfsutil plugin enable". The application must provide a buffer large enough to store the metric structures described in "Oracle ACFS Pre-defined Metric Types".
If the provided buffer is NULL and metric_buf_len = 0, the return value is the size required to hold all the currently collected metrics. The application can first query Oracle ACFS to see how big a buffer is required, then allocate a buffer of the necessary size to pass back to Oracle ACFS.
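The following sketch illustrates that query-then-allocate pattern. It assumes the metric type 2 constant is named ACFS_METRIC_TYPE2 (by analogy with ACFS_METRIC_TYPE1 used in the examples later in this section) and that mountp names a plug-in enabled mount point:

sb8 needed, rc;
ub1 *buf;

/* A NULL buffer with length 0 returns the size needed for all
 * currently collected metrics. */
needed = acfsplugin_metrics(ACFS_METRIC_TYPE2, NULL, 0, mountp);
if (needed > 0) {
  buf = malloc((size_t)needed);
  if (buf != NULL) {
    rc = acfsplugin_metrics(ACFS_METRIC_TYPE2, buf, (ub4)needed, mountp);
    if (rc < 0)
      perror("....Receive failure ... ");
    free(buf);
  }
}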
The mount path must be provided to the API to identify the plug-in enabled Oracle ACFS file system that is being referenced.
A nonnegative value is returned for success: 0 for success with no more metrics to collect, 1 to indicate that more metrics are available, or 2 to indicate that no new metrics were collected during the interval. In the case of an error, a negative value is returned and errno is set on Linux environments.
When using metric type #2, the returned metrics include an ACFS_FILE_ID, which contains the fenum and genum pair. In order to translate from the fenum and genum pair to a file path, the application can use acfsfileid_lookup. The application must provide a buffer of length ACFS_FILEID_MAX_PATH_LEN to hold the path. If there are multiple hard links to a file, the returned path is the first one. This is the same behavior as when using acfsutil info id.
System administrator or Oracle ASM administrator privileges are required to send and receive messages to and from the plug-in enabled Oracle ACFS file system driver.
Writing Applications
To use the plug-in API, applications must include the C header file acfslib.h, which defines the API functions and structures.
#include <acfslib.h>
When building the application executable, the application must be linked with the acfs12 library. Check the platform-specific documentation for information about environment variables that must be defined. For example:
export LD_LIBRARY_PATH=${ORACLE_HOME}/lib:${LD_LIBRARY_PATH}
Then when linking, add the -lacfs12 flag.
Examples
In Example 7-1, the command enables an Oracle ACFS file system mounted on /humanresources for the plug-in service.
Example 7-1 Application Plug-in for Storage Visibility: Poll Model
$ /sbin/acfsutil plugin enable -m acfsmetric1 -t HRDATA /humanresources
With this command, the application plug-in polls the Oracle ACFS plug-in enabled driver for summary metrics associated with files tagged with HRDATA. The application code includes the following:
#include <acfslib.h>
...
/* allocate message buffers */
ACFS_METRIC1 *metrics = malloc (sizeof(ACFS_METRIC1));

/* poll for metric1 data */
while (condition) {
  /* read next summary message from ACFS driver */
  if ((rc = acfsplugin_metrics(ACFS_METRIC_TYPE1, (ub1*)metrics, sizeof(*metrics),
                               mountp)) < 0) {
    perror("....Receive failure ... ");
    break;
  }
  /* print message data */
  printf("reads %8llu ", metrics->acfs_nreads);
  printf("writes %8llu ", metrics->acfs_nwrites);
  printf("avg read size %8u ", metrics->acfs_avgrsize);
  printf("avg write size %8u ", metrics->acfs_avgwsize);
  printf("min read size %8u ", metrics->acfs_minrsize);
  printf("max read size %8u ", metrics->acfs_maxrsize);
  ...
  sleep(timebeforenextpoll);
}
In Example 7-2, the command enables an Oracle ACFS file system mounted on /humanresources for the plug-in service.
Example 7-2 Application Plug-in for File Content: Post Model
$ /sbin/acfsutil plugin enable -m acfsmetric1 -t HRDATA -i 5m /humanresources
With this command, every 5 minutes the Oracle ACFS plug-in enabled driver posts file content metrics associated with files tagged with HRDATA. In the application code, the call to acfsplugin_metrics() is blocked until the metrics are posted. The application code includes the following:
#include <acfslib.h>
...
ACFS_METRIC1 *metrics = malloc (sizeof(ACFS_METRIC1));

/* Wait for metric Data */
while (condition) {
  /* Wait for next file content posting from ACFS driver */
  rc = ACFS_PLUGIN_MORE_AVAIL;
  /* A return code of 1 indicates that more metrics are available
   * in the current set of metrics. */
  while (rc == ACFS_PLUGIN_MORE_AVAIL) {
    /* This call blocks until metrics are available. */
    rc = acfsplugin_metrics(ACFS_METRIC_TYPE1, (ub1*)metrics, sizeof(*metrics),
                            mountp);
    if (rc < 0) {
      perror("....Receive failure ... ");
      break;
    } else if (rc == ACFS_PLUGIN_NO_NEW_METRICS) {
      printf("No new metrics available.");
      break;
    }
    if (last_seqno != metrics->acfs_seqno-1) {
      printf("Warning: Unable to keep up with metrics collection.");
      printf("Missed %d sets of posted metrics.",
             (metrics->acfs_seqno-1)-last_seqno);
    }
    /* print message data */
    printf("reads %8llu ", metrics->acfs_nreads);
    printf("writes %8llu ", metrics->acfs_nwrites);
    printf("avg read size %8u ", metrics->acfs_avgrsize);
    printf("avg write size %8u ", metrics->acfs_avgwsize);
    printf("min read size %8u ", metrics->acfs_minrsize);
    printf("max read size %8u ", metrics->acfs_maxrsize);
    ...
    last_seqno = metrics->acfs_seqno;
  }
}
free(metrics);
Example 7-3 Application for Resolving the File Path from a Fenum and Genum Pair
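A minimal sketch of such an application follows. It assumes a plug-in enabled mount point and an ACFS_METRIC2_REC previously obtained from acfsplugin_metrics(); the variable names metrics2 and mountp are illustrative, and the file ID type is shown as ACFS_FILE_ID, matching the structure listing earlier in this section:

#include <acfslib.h>
...
sb8 rc;
oratext full_path[ACFS_FILEID_MAX_PATH_LEN];
oratext *mountp = (oratext *)"/humanresources";

/* file_id comes from a metric type 2 record returned by acfsplugin_metrics() */
ACFS_FILE_ID file_id = metrics2->acfs_recs[0].acfs_file_id;

rc = acfsfileid_lookup(file_id, full_path, mountp);
if (rc < 0) {
  perror("....Lookup failure ... ");
} else {
  /* If the file has multiple hard links, the first path found is returned. */
  printf("file entry resolves to path %s\n", full_path);
}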
Oracle ACFS Tagging Generic Application Programming Interface
Oracle ACFS tagging operations are supported through a common operating system (OS) independent file tag (C library) application programming interface (API).
An Oracle ACFS tagging API demonstration utility is provided. The demo provides instructions to build the utility with a makefile on each supported platform.
On Solaris, Oracle ACFS tagging APIs can set tag names on symbolic link files, but backup and restore utilities do not save the tag names that are explicitly set on the symbolic link files. Also, symbolic link files lose explicitly set tag names if they have been moved, copied, tarred, or paxed.
The following files are included:
- $ORACLE_HOME/usm/public/acfslib.h
- $ORACLE_HOME/usm/demo/acfstagsdemo.c
- $ORACLE_HOME/usm/demo/Makefile
  Linux, Solaris, or AIX makefile for creating the demo utility.
The topics contained in this section are:
Oracle ACFS Tagging Error Values
The following are the values for Linux, Solaris, or AIX errno in case of failure:
- EINVAL – The tag name syntax is invalid or too long.
- ENODATA – The tag name does not exist for this file or directory.
- ERANGE – The value buffer is too small to hold the returned value.
- EACCES – Search permission denied for a directory in the path prefix of path; or the user does not have permission on the file to read tag names.
- ENAMETOOLONG – The file name is too long.
- ENOENT – A component of path does not exist.
acfsgettag
Purpose
Retrieves the value associated with an Oracle ACFS file tag name.
Syntax
sb8 acfsgettag(const oratext *path, const oratext *tagname, oratext *value, size_t size, ub4 flags);
Table 7-4 contains the options available with the acfsgettag
command.
Table 7-4 Options for the acfsgettag command
Option | Description
---|---
path | Specifies a pointer to a file or directory path name.
tagname | Specifies a pointer to a NULL-terminated Oracle ACFS tag name in the format of a valid tag name for regular files and directories.
value | Specifies the memory buffer to retrieve the Oracle ACFS tag value.
size | Specifies the byte size of the memory buffer that holds the returned Oracle ACFS tag value.
flags | Reserved for future use. Must be set to 0.
Description
The acfsgettag library call retrieves the value string of the Oracle ACFS tag name. The return value is the nonzero byte length of the output value string on success or ACFS_TAG_FAIL on failure. For information about operating system-specific extended error information values that may be obtained when an ACFS_TAG_FAIL is returned, refer to "Oracle ACFS Tagging Error Values".
Because Oracle ACFS tag names currently use a fixed value string of 0 (the number zero character with a byte length of one), the value is the same for all Oracle ACFS tag name entries. The size of the value buffer can be determined by calling acfsgettag with a NULL value and 0 size. The library call returns the byte size necessary to hold the value string of the tag name. acfsgettag returns an ENODATA error when the tag name is not set on the file.
Examples
Example 7-4 is an example of the use of the acfsgettag function call.
Example 7-4 Retrieving a file tag value
sb8 rc;
size_t size;
oratext value[2];
const oratext *path = "/mnt/dir1/dir2/file2";
const oratext *tagname = "patch_set_11_1";

size = 1;   /* byte */
memset((void *)value, 0, 2*sizeof(oratext));
rc = acfsgettag(path, tagname, value, size, 0);
if (rc == ACFS_TAG_FAIL)
  /* check errno or GetLastError() to process error returns */
acfslisttags
Purpose
Lists the tag names assigned to an Oracle ACFS file. For additional information, refer to "acfsutil tag info".
Syntax
sb8 acfslisttags(const oratext *path, oratext *list, size_t size, ub4 flags);
Table 7-5 contains the options available with the acfslisttags command.
Table 7-5 Options for the acfslisttags command
Option | Description
---|---
path | Specifies a pointer to a file or directory path name.
list | Specifies a pointer to a memory buffer containing the list of Oracle ACFS tag names.
size | Specifies the size (bytes) of the memory buffer that holds the returned Oracle ACFS tag name list.
flags | Reserved for future use. Must be set to 0.
Description
The acfslisttags library call retrieves all the tag names assigned to an Oracle ACFS file. acfslisttags returns a list of tag names into the list memory buffer. Each tag name in the list is terminated with a NULL. If a file has no tag names, then the list is empty. The memory buffer must be large enough to hold all of the tag names assigned to an Oracle ACFS file.
An application must allocate a buffer and specify a list size large enough to hold all of the tag names assigned to an Oracle ACFS file. An application can optionally obtain the list buffer size needed by first calling acfslisttags with a zero value buffer size and NULL list buffer. The application then checks for nonzero, positive list size return values to allocate a list buffer and call acfslisttags to retrieve the actual tag name list.
On success, the return value is a positive byte size of the tag name list or 0 when the file has no tag names. On failure, the return value is ACFS_TAG_FAIL. For information about operating system-specific extended error information values that may be obtained when an ACFS_TAG_FAIL is returned, refer to "Oracle ACFS Tagging Error Values".
Examples
Example 7-5 is an example of the use of the acfslisttags function call.
Example 7-5 Listing file tags
sb8 listsize;
sb8 listsize2;
const oratext *path = "/mnt/dir1/dir2/file2";
oratext *list;

/* Determine size of buffer to store list */
listsize = acfslisttags(path, NULL, 0, 0);
if (listsize == ACFS_TAG_FAIL)
  /* retrieve the error code and return */

if (listsize) {
  list = malloc(listsize);
  /* Retrieve list of tag names */
  listsize2 = acfslisttags(path, list, listsize, 0);
  if (listsize2 == ACFS_TAG_FAIL)
    /* check errno or GetLastError() to process error returns */
  if (listsize2 > 0)
    /* file has a list of tag names to process */
  else
    /* file has no tag names. */
} else {
  /* file has no tag names. */
}
acfsremovetag
Purpose
Removes the tag name on an Oracle ACFS file.
Syntax
sb8 acfsremovetag(const oratext *path, const oratext *tagname, ub4 flags);
Table 7-6 contains the options available with the acfsremovetag command.
Table 7-6 Options for the acfsremovetag command
Option | Description
---|---
path | Specifies a pointer to a file or directory path name.
tagname | Specifies a pointer to a NULL-terminated Oracle ACFS tag name in the format of a valid tag name for regular files and directories.
flags | Reserved for future use. Must be set to 0.
Description
The acfsremovetag library call removes a tag name on an Oracle ACFS file. The return value is ACFS_TAG_SUCCESS or ACFS_TAG_FAIL. For information about operating system-specific extended error information values that may be obtained when an ACFS_TAG_FAIL is returned, refer to "Oracle ACFS Tagging Error Values".
Examples
Example 7-6 is an example of the use of the acfsremovetag function call.
Example 7-6 Removing file tags
sb8 rc;
const oratext *path = "/mnt/dir1/dir2/file2";
const oratext *tagname = "patch_set_11_1";

rc = acfsremovetag(path, tagname, 0);
if (rc == ACFS_TAG_FAIL)
  /* check errno or GetLastError() to process error returns */
acfssettag
Purpose
Sets the tag name on an Oracle ACFS file. For additional information, refer to "acfsutil tag set".
Syntax
sb8 acfssettag(const oratext *path, const oratext *tagname, oratext *value, size_t size, ub4 flags);
Table 7-7 contains the options available with the acfssettag command.
Table 7-7 Options for the acfssettag command
Option | Description
---|---
path | Specifies a pointer to a file or directory path name.
tagname | Specifies a pointer to a NULL-terminated Oracle ACFS tag name in the format of a valid tag name for regular files and directories.
value | Specifies the memory buffer to set the Oracle ACFS tag value.
size | Specifies the byte size of the Oracle ACFS tag value.
flags | Reserved for future use. Must be set to 0.
Description
The acfssettag library call sets a tag name on an Oracle ACFS file. The return value is ACFS_TAG_SUCCESS or ACFS_TAG_FAIL. For information about operating system-specific extended error information values that may be obtained when an ACFS_TAG_FAIL is returned, refer to "Oracle ACFS Tagging Error Values".
Because Oracle ACFS tag names currently use a fixed value string of 0 (the number zero character with a byte length of one), the value is the same for all Oracle ACFS tag name entries.
Examples
Example 7-7 is an example of the use of the acfssettag function call.
Example 7-7 Setting file tags
sb8 rc;
size_t size;
const oratext *value;
const oratext *path = "/mnt/dir1/dir2/file2";
const oratext *tagname = "patch_set_11_1";

value = "0";   /* zero */
size = 1;      /* byte */
rc = acfssettag(path, tagname, (oratext *)value, size, 0);
if (rc == ACFS_TAG_FAIL)
  /* check errno and GetLastError() to process error returns */
Understanding Oracle ACFS I/O Failure Console Messages
Oracle ACFS logs information for I/O failures in the operating system-specific system event log.
A console message has the following format:
[Oracle ACFS]: I/O failure (error_code) with device device_name during a operation_name op_type. file_entry_num Starting offset: offset. Length of data transfer: io_length bytes. Impact: acfs_type Object: object_type Oper.Context: operation_context Snapshot?: yes_or_no AcfsObjectID: acfs_object_id . Internal ACFS Location: code_location.
The italicized variables in the console message syntax correspond to the following:
- I/O failure
  The operating system-specific error code, in Hex, seen by Oracle ACFS for a failed I/O. This may indicate a hardware problem, or it might indicate a failure to initiate the I/O for some other reason.
- Device
  The device involved, usually the ADVM device file, but under some circumstances it might be a string indicating the device minor number.
- Operation name
  The kind of operation involved: user data, metadata, or paging.
- Operation type
  The type of operation involved: synch read, synch write, asynch read, or asynch write.
- File entry number
  The Oracle ACFS file entry number of the file system object involved, as a decimal number. The acfsutil info fileid tool finds the corresponding file name.
- Offset
  The disk offset of the I/O, as a decimal number.
- Length of I/O
  The length of the I/O in bytes, as a decimal number.
- File system object impacted
  An indication that the file system object involved is either node-local, or is a resource accessed clusterwide. For example: Node or Cluster.
- Type of object impacted
  A string indicating the kind of file system object involved, when possible. For example: Unknown, User Dir., User Symlink, User File, Sys.Dir, Sys.File, or MetaData.
  - Sys.Dir.
    Oracle ACFS-administered directory within the visible namespace
  - Sys.File
    Oracle ACFS-administered file within the visible namespace
  - MetaData
    Oracle ACFS-administered resources outside of the visible namespace
- Operational context
  A higher-level view of what code context was issuing the I/O. This is for use by Oracle Support Services. For example: Unknown, Read, Write, Grow, Shrink, Commit, or Recovery.
- Snapshot
  An indication of whether, if possible to determine, the data involved was from a snapshot. For example: Yes, No, or ?.
- Object type of the file system
  An internal identifier for the type of file system object. For use by Oracle Support Services.
- Location of the code
  An internal identifier of the code location issuing this message. For use by Oracle Support Services.
The following is an example from /var/log/messages in a Linux environment:
[Oracle ACFS]: I/O failure (0xc0000001) with device /dev/sdb during a metadata synch write . Fenum Unknown. Starting offset: 67113984. Length of data transfer: 2560 bytes. Impact: Node Object: MetaData Oper.Context: Write Snapshot?: ? AcfsObjectID: 8 . Internal ACFS Location: 5 .
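When a message of this form reports a usable file entry number, it can be mapped back to a path name with the acfsutil info fileid tool mentioned above. The following is a sketch; the file entry number 42 and mount point /mnt/acfs are illustrative:
# acfsutil info fileid 42 /mnt/acfs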
Configuring Oracle ACFS Snapshot-Based Replication
The requirements for Oracle ACFS snapshot-based replication are discussed in this section.
This section describes how to configure Oracle ACFS snapshot-based replication
available with release 12.2 or higher. As with Oracle ACFS replication installations
before release 12.2, the overall functional goal of snapshot-based replication is to
ensure that updates from a primary cluster are replicated to a standby cluster. However,
the snapshot based replication technology uses snapshots of the primary storage location
and transfers the differences between successive snapshots to the standby storage
location using the standard ssh
command. Oracle ACFS replication
functionality before release 12.2 replicated changes continuously, building on Oracle
networking technologies, notably Network Foundation Technologies (NFT), to ensure
connectivity between the primary and standby clusters.
This change in the design and implementation of Oracle ACFS replication introduces some differences in how replication is configured and used. For example, the use of ssh
requires setting up host and user keys appropriately on the primary and standby nodes where replication is performed.
Oracle ACFS replication also provides role reversal and failover capabilities that you can configure by enabling both the primary cluster and the standby cluster to communicate with each other as required. In role reversal, the standby assumes the role of the primary and the primary becomes the standby. Failover may involve either role reversal or the establishment of a new standby for the new primary to use.
This section contains the following topics:
See Also:
- Oracle ACFS Replication for an overview of Oracle ACFS replication
- Oracle ACFS Command-Line Tools for Replication for information about Oracle ACFS replication commands
- Oracle Automatic Storage Management Administrator's Guide for information about Oracle ASM privileges
Choosing an Oracle ACFS Replication User
The user identity under which replication is performed (the repluser) must be
carefully managed. This is the user that will scan
the primary looking for files to be replicated,
and that will create, update or delete replicated
files on the standby. When ssh
is
in use, ssh
will log in as this
user on the standby node involved in replication.
The user chosen as repluser should have Oracle ASM administrator privileges.
The user specified to the Oracle installer when
the Oracle software was first installed usually
belongs to the needed groups, so it is convenient
to choose this user as the replication user. In
this discussion, the replication user is
identified as repluser; however, you would
replace repluser with the actual user name
that you have selected. For information about
running Oracle ACFS acfsutil
commands, refer to About Using Oracle ACFS Command-Line Tools.
Note:
The same user and group identities must be specified for repluser on both your primary cluster and your standby cluster. Additionally, the mappings between user names and numeric uids, and between group names and numeric gids, must be identical on both the primary cluster and the standby cluster. This is required to ensure that the numeric values are used in the same manner on both clusters because replication transfers only the numeric values from the primary to standby.
Choosing a Transport for Oracle ACFS Replication
Starting with version 23ai of the Oracle Grid Infrastructure software, a new transport has been introduced for communication between the primary and standby replication sites. The new transport is based on Secure Sockets Layer (SSL) instead of Secure Shell (ssh).
On OL8 / X64 platforms, the user can now choose either to use SSL-based replication, or to continue using ssh-based replication. The use of ssh continues to be fully supported. On platforms other than OL8 / X64, ssh is the only transport provided.
The transport to be used is normally chosen when replication is initiated. However, support is provided to update an existing replication relationship from ssh-based to SSL-based.
SSL-Based Replication
SSL-based replication provides new authentication and secure transport mechanisms. Networking using ssh is replaced by Posix sockets. Authentication using ssh host and user keys is replaced by OpenSSL and Oracle wallets. Encryption and message authentication (HMAC) code is replaced by Intel IPPS cryptography primitives, on platforms where they are available.
From a user standpoint, the biggest advantage of using SSL-based replication is simplified configuration. Instead of requiring host and user keys to be set up and distributed for each machine to be used in replication, SSL-based replication depends on establishing a single set of credentials shared between the primary and standby sites. This can be done trivially using command-line options, or (if desired) by manual communication of site credentials.
SSL-based replication will also perform better than ssh-based replication in most contexts.
SSH-Based Replication
The ssh transport in 23ai is the same transport provided by all past releases of snapshot-based replication. An enhancement in 23ai is a key setup assistant, acfsreplssh, that may be used to ease the configuration of host and user keys for machines to be used in replication. The use of acfsreplssh is recommended but not required.
Configuring SSL-Based Oracle ACFS Replication
This topic describes how to configure SSL-based replication in release 23ai or higher on OL8/X64 platforms. The only configuration step needed is to ensure that the two replication sites share a set of credentials, as described below.
Initiating Replication
The use of SSL-based replication is specified with the option -T ssl, given to both acfsutil repl init commands. A corresponding option -T ssh specifies the use of ssh-based replication. The use of ssh is the default, so this latter option is not normally needed.
Credential Definition and Distribution
To use SSL-based replication, the user must distribute credentials to be shared between the primary and standby replication sites. This is analogous to (but much simpler than) setting up host and user keys in configuring ssh-based replication.
The credentials consist of an X.509 certificate and a public-private key pair. Credentials are established on one of the two replication sites, then securely copied to the other site. Thus SSL-based replication uses a shared-secret model for authentication, and both sides of a replication relationship (i.e. the primary and standby sites) use the same credentials. Since credentials are site-wide, all instances of replication active on a given site also use the same credentials. Credentials are created by any of these commands, unless credentials already exist on the local site:
- Explicit creation with acfsutil repl update -o sslCreateCredentials
- Implicit creation via acfsutil repl init standby -T ssl
- Implicit creation via acfsutil repl init primary -T ssl
- Implicit creation as part of updating ssh-based replication to be SSL-based, using acfsutil repl update -T ssl
For example, credentials are created implicitly, if none already exist on the standby site, by this acfsutil repl init standby command:
$ acfsutil repl init standby -T ssl -u repluser /standbyFS
This command will establish credentials if none exist on the standby site, or will re-use credentials that exist already. Once the credentials exist, there are three methods for distributing them to the primary site:
- Maximally secure - export at site A, secure copy to site B, import at site B
- Fairly secure - manual credential sync initiated on the command line
- Fairly secure - automatic credential sync upon authentication failure
Export/import is the most secure method for distributing credentials, and it is the default. The administrator exports the credentials to a file at the local site, copies the file in a secure manner of their choosing ("sneaker net", scp, and so on) to the remote site, and then imports (installs) the credentials at the remote site.
For manual credential syncing, the administrator first enables credential syncing on the standby site:
$ acfsutil repl update -o sslPermitCredentialSync
and then uses this command on the primary site:
$ acfsutil repl update -o sslSyncCredentials -s standby_address
to fetch the credentials from the standby site and import them at the primary site. Lastly, if credential syncing is enabled as above, an automatic credential sync may occur if replication experiences an authentication failure during a replication operation. This is basically the same as manual credential syncing but is done automatically and seamlessly.
By default, credential syncing is disabled and can only occur if the administrator has explicitly enabled it at both sites. If credential syncing is permitted and a credential sync occurs, credential syncing is automatically disabled after the successful sync. This is to prevent a latent "open door" that would allow unintentional syncing.
In this example, the admin chooses the maximally secure (in this case "sneaker net") transfer of the credentials with their manual installation on the remote site. We assume that boston-clu is our standby cluster, and that nashua-clu is our primary cluster.
$ acfsutil repl init standby -T ssl -u repluser /standbyFS
$ acfsutil repl update -o sslExportCredentials=/mySecureThumbDrive/mycredentials
The admin then hand-carries the secure thumb drive from Boston to Nashua. On cluster
nashua-clu, the admin initiates replication with commands like these:
$ acfsutil repl update -o sslImportCredentials=/mySecureThumbDrive/mycredentials
$ acfsutil repl init primary -T ssl -i 10m -s repluser@boston-clu -m /standbyFS /primaryFS
In this example, credentials at a standby site are copied and installed to a primary site at replication initialization time, via an explicit manual request. Once the credential transfer succeeds, credential syncing is automatically disabled and all subsequent interactions are authenticated using the installed credentials.
On the standby cluster boston-clu, the admin enables credential syncing with a command like this one:
$ acfsutil repl update -o sslPermitCredentialSync
On cluster nashua-clu, the admin syncs credentials with commands like these:
$ acfsutil repl update -o sslPermitCredentialSync
$ acfsutil repl update -o sslSyncCredentials -s boston-clu
In this scenario, the second acfsutil repl update
command at
nashua-clu will sync credentials from boston-clu. Once the
credentials are synced, credential syncing is automatically disabled. Then all
future acfsutil repl init
commands will use the credentials
installed at each site.
The acfsutil repl init commands are given -T ssl to specify the transport. Replication is initialized first on the standby:
$ acfsutil repl init standby -T ssl -u repluser /standbyFS
And then on the primary:
$ acfsutil repl init primary -T ssl -i 10m -s repluser@boston-clu -m /standbyFS /primaryFS
In this example, credentials at a standby
site are copied and installed to a primary site at replication initialization time,
implicitly by the acfsutil repl init
commands. Once the credential
transfer succeeds, credential syncing is automatically disabled and all subsequent
interactions are authenticated using the installed credentials.
On the standby cluster boston-clu, the admin initializes replication with a command like this one:
$ acfsutil repl init standby -T ssl -o sslPermitCredentialSync -u repluser /standbyFS
On cluster nashua-clu, the admin initiates replication with a command like
this one:
$ acfsutil repl init primary -T ssl -o sslPermitCredentialSync -i 10m -s repluser@boston-clu -m /standbyFS /primaryFS
In this scenario, the acfsutil repl init primary
command will
encounter an authentication failure internally and will automatically sync
credentials with the boston-clu site, with no human intervention required.
Once the credentials are synced, credential syncing is automatically disabled.
Note that we also demonstrate in this example that if desired the
-o sslPermitCredentialSync
option can be included on the
acfsutil repl init
command lines. This is equivalent to
specifying the option using separate acfsutil repl update
commands.
If the credentials in use by replication have been updated, the updated credentials can be distributed to the remote site by any of the methods mentioned above.
For export/import, the admin uses a command like the following on the local site:
$ acfsutil repl update -o sslExportCredentials=/mySecureThumbDrive/mycredentials
to capture the credentials in a file. Then on the remote site, the file is used with a command like:
$ acfsutil repl update -o sslImportCredentials=/mySecureThumbDrive/mycredentials
to import the credentials.
For manual credential syncing, the admin uses this command on the remote site:
$ acfsutil repl update -o sslPermitCredentialSync
and then on the local site uses the command:
$ acfsutil repl update -o sslSyncCredentials -s remote-address
to fetch the credentials from the remote site and import them at the local site. Once this operation is complete, credential syncing is automatically disabled.
For automatic credential syncing, the admin uses this command on the remote site:
$ acfsutil repl update -o sslPermitCredentialSync
and then on the local site simply waits for replication to discover the update (that is, receive an authentication failure), at which point it will sync credentials, disabling further syncing once this operation is complete.
Updating ssh-Based Replication to Use SSL
An existing ssh-based instance of replication can be updated to use SSL as its transport. Once updated to use SSL, the instance of replication cannot revert to using ssh instead. This update is performed using the acfsutil repl update command, run on both the primary and the standby clusters.
You should update the standby
cluster first, then the primary cluster. Just like when you specify SSL-based
replication when initializing replication with acfsutil repl init
,
you can create and distribute credentials using import / export, manually or
automatically.
In all cases, once the credentials have been synced, SSL takes over as the transport for the replication instance. Until that point, ssh remains in use as the transport.
In this example, the user updates replication with export/import of credentials. First, on the standby cluster boston-clu, the user updates to SSL-based replication and exports the credentials:
$ acfsutil repl update -T ssl /standbyFS
$ acfsutil repl update -o sslExportCredentials=/mySecureThumbDrive/mycredentials
Next, on the primary cluster nashua-clu, the user imports the credentials and
updates to SSL-based replication:
$ acfsutil repl update -o sslImportCredentials=/mySecureThumbDrive/mycredentials
$ acfsutil repl update -T ssl /primaryFS
In this example, the user updates replication using a manual credential sync. First, on the standby cluster boston-clu, the user enables syncing the credentials and updates to SSL-based replication:
$ acfsutil repl update -o sslPermitCredentialSync
$ acfsutil repl update -T ssl /standbyFS
Then, on the primary cluster nashua-clu, the user enables syncing the
credentials, updates to SSL-based replication, and finally performs the credential
sync:
$ acfsutil repl update -o sslPermitCredentialSync
$ acfsutil repl update -o sslSyncCredentials -s boston-clu
$ acfsutil repl update -T ssl /primaryFS
In this example, the user updates replication using an automatic credential sync. First, on the standby cluster boston-clu, the user enables syncing the credentials and updates to SSL-based replication:
$ acfsutil repl update -o sslPermitCredentialSync
$ acfsutil repl update -T ssl /standbyFS
Then, on the primary cluster nashua-clu, the user enables syncing the
credentials and updates to SSL-based replication:
$ acfsutil repl update -o sslPermitCredentialSync
$ acfsutil repl update -T ssl /primaryFS
In this case, replication will discover the need to sync credentials and will perform the operation automatically.
Configuring ssh for Use With Oracle ACFS Replication
This topic describes how to configure ssh for use by Oracle ACFS snapshot-based replication available with release 12.2 or higher. Configuration can be performed in either of two ways:
- You can use the key setup assistant acfsreplssh to ease setting up the needed user and host keys
- Alternatively, you can configure the needed keys manually
ssh must be usable in either direction between the clusters — from the primary cluster to the standby cluster and from the standby to the primary. This means that passwordless operation of ssh must be enabled in both directions – that is, that host and user keys must be defined to allow ssh to connect from the local to the remote cluster with no manual intervention. These are the ssh keys that must be configured before replication is used:
- A public key for repluser, as defined on each node of the local cluster (both the primary and the standby), must appear in the authorized_keys2 file for repluser on each node of the remote cluster (both standby and primary).
- A host key for each node of the remote cluster must appear in the known_hosts file on each node of the local cluster, unless replication is configured with strict host key checking disabled.
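As a manual sketch of that key setup (host names and file paths are illustrative, and acfsreplssh automates these steps), the repluser key pair is generated on a local node, its public key is appended to repluser's authorized_keys2 file on a remote node, and the remote host key is gathered into the local known_hosts file:
$ ssh-keygen -t rsa                      # generate a key pair for repluser
$ cat ~/.ssh/id_rsa.pub | ssh repluser@standby1 'cat >> ~/.ssh/authorized_keys2'
$ ssh-keyscan standby1 >> ~/.ssh/known_hosts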
Configuring ssh Using the Key Setup Assistant
The ssh key setup assistant acfsreplssh
may be used to
configure, verify or delete the host and user keys required when ssh
is
used as the transport for Oracle ACFS snapshot-based replication. This topic describes
how to use acfsreplssh
to configure ssh for replication in one
direction – from the primary to the standby. To configure ssh completely, you must
perform these steps a second time with the primary and standby roles reversed.
To generalize our description, we will speak in terms of the local and the remote cluster. Run the command once with the primary as the local cluster, then a second time with the standby as the local cluster.
Command Synopsis
The acfsreplssh command line looks like this:
acfsreplssh { configure | verify | remove }
    [-v] [-m]
    [-p remote_password]
    [-V remote_vip]
    [-o sshStrictKey=ynvalue]
    { -c remote_cluster | remote_host1 [remote_hostn...] }
where the command-line options have these meanings:
-c remote_cluster
Uses remote_cluster as the name of a network endpoint to contact the remote cluster, then runs the acfsutil cluster info command on the remote cluster to identify the cluster members. Each hostname shown as a member is then processed as if it had been specified as a remote_host directly on the command line.
-m
Runs in “mockup mode” – shows what operations the command would perform if run normally, but does not run any of the operations. May be used with -v to see more details of what would happen.
-o sshStrictKey=ynvalue
Specifies whether or not host keys
must already exist. A ynvalue starting with “y” signifies “yes”; a ynvalue starting with
“n” signifies “no”. For the configure
command, “yes” means that host
keys must already exist (the command will not configure them), while “no” means the
command will accept (and configure) any host key presented by a remote node. For the
verify
command, this option has no effect – a host key must already
exist for all remote nodes. (The verify
command never adds or modifies
keys.) The default value of this option is “no” – that is, the
configure
command will configure any host key presented. See the
-o sshStrictKey
option to the acfsutil repl init
and acfsutil repl update
commands for more information on host key
checking.
-p
remote_password
Specifies remote_password as the password to be used in logging in as
repluser on the remote cluster. Note that specifying the password on the command line in
this way is not secure, and should be done only when there can be no security-related
consequences. If -p is not given, acfsreplssh
will prompt for the
remote password (just once per invocation).
-v
Runs in “verbose mode” – shows details of the command's execution.
-V remote_vip
Specifies remote_vip as the name of a VIP to be associated with the remote_hosts named on the command line.
This means that remote_vip
will be added to the line in the
local known_hosts
file for each remote_host
given on the command line.
Note:
All of the examples below show -p not being specified. That is, they assume that acfsreplssh will prompt for the remote cluster password each time the command is run.
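As an illustration of these options, the following sketch shows a verbose dry run against a remote cluster endpoint (chic26vip is the endpoint name used in the examples later in this topic). Because -m is specified, no keys are actually configured:
$ acfsreplssh configure -m -v -c chic26vip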
The assistant should be run on each node of the local cluster to operate on the host and user keys for a single remote cluster. It should be invoked as repluser. The assistant will output the identity of the invoking user for confirmation, and will then set up keys for that user to allow replication between the node where the assistant is run and the named remote cluster.
To identify the hosts in the remote cluster, do one of the following, as illustrated in the sketch after this list:
- Specify -c remote_cluster to name a network endpoint in the remote cluster, which will be queried to obtain the cluster members, or
- Name each host in the remote cluster as a remote_host on the command line.
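For instance, using the cluster and node names from the example below, the two forms might look like this (a sketch only; chic26n2 is assumed to be the second node of the standby cluster):
$ acfsreplssh configure -c chic26vip
$ acfsreplssh configure chic26n1 chic26n2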
Configuring ssh on each local host
The acfsreplssh configure
command is used to configure the
ssh keys that will be required by replication. The command can be invoked either as part
of setting up replication for the first time, or at some later point, to ensure that all
the keys needed for replication are still present on the clusters involved. Each run of
acfsreplssh
configure
– even multiple successive runs – will add a user or host key
only where needed, preserving existing keys.
On each host, acfsreplssh
will save the current key-related
data for repluser before it modifies the data, unless this has been done already.
That is, any pre-existent copy of the directory ~repluser/.ssh
will be
renamed with the suffix .acfsreplssh.backup
, unless a backup directory
with this suffix already exists. Any directory created will have permissions 0400
(readable by owner only) and will be owned by repluser.
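To confirm on a given node that a backup of the previous key data was taken, you might simply list both directories (a sketch; ownership, permissions, and dates will vary):
$ ls -ld ~repluser/.ssh ~repluser/.ssh.acfsreplssh.backup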
Here’s an example. Let’s say we have a 4-node primary cluster bosc26, with nodes n1 through n4. We also have a 2-node standby cluster chic26,
with nodes n1 and n2. We have a SCAN VIP defined on each cluster,
bosc26vip and chic26vip, and we wish to use the SCAN VIP on each
cluster as the network endpoint for replication. To avoid having to specify each remote
hostname on the command, we use -c
with the VIP name as its value.
To configure the full set of keys required by replication, using the VIP defined on each cluster, run acfsreplssh as follows. First, run this command on each node of bosc26:
$ acfsreplssh configure -V chic26vip -c chic26vip
Next, run this command on each node of chic26:
$ acfsreplssh configure -V bosc26vip -c bosc26vip
The
first command above configures a public key for the current user (which is assumed to be
repluser) on the host within bosc26 where the command is run (if a key
does not exist already), then adds the key to the authorized_keys2
file
for the current user on each host within chic26. Then a host key for each
chic26 host is added to the known_hosts
file for the current
user on the bosc26 host, unless -o sshStrictKey=yes
has been
specified. If that option is given, then all needed host keys must be present already on
the bosc26 host.
The second command performs the same operations, but in the opposite direction. That is, the command configures a public key for the current user on the host within chic26 where the command is run (if a key does not exist already), then adds the key to the authorized_keys2 file for the current user on each host within bosc26. Then a host key for each bosc26 host is added to the known_hosts file for the current user on the chic26 host, unless -o sshStrictKey=yes has been specified. If that option is given, then all needed host keys must be present already on the chic26 host.
With these commands, acfsutil cluster info will be run using chic26vip or bosc26vip, respectively, to determine the members of the remote cluster.
Alternatively, to configure the full set of keys required by replication without using a VIP, run acfsreplssh as follows, specifying -c with one of the remote node names as its value. First, run this command on each node of bosc26:
# acfsreplssh configure -c chic26n1
Next, run this command on each node of chic26:
# acfsreplssh configure -c bosc26n1
With these commands, acfsutil cluster info
will be run on
chic26n1 or bosc26n1, respectively, to determine the members of the
remote cluster.
Verifying ssh configuration on each local host
The acfsreplssh verify
command can be used to verify an
existing set of keys. This command does not add or modify any keys – it simply tries to
use the ones already present. To verify the keys that are present to support a given
instance of replication, use the same form of command line as you used to configure
those keys, just with the verify
keyword specified.
To verify the full set of keys required by replication, using the VIP defined on each cluster, run acfsreplssh as follows. First, to verify the keys needed for replication from the primary cluster to the standby cluster, run this command on each node of bosc26:
# acfsreplssh verify -V chic26vip -c chic26vip
Next,
to verify the keys needed for potential future replication from the current standby to
the current primary, run this command on each node of chic26:
# acfsreplssh verify -V bosc26vip -c bosc26vip
Alternatively, to verify the full set of keys required by replication, without using a
VIP, run acfsreplssh
as follows. First, to verify the keys needed for
replication from the primary cluster to the standby cluster, run this command on each
node of bosc26:
# acfsreplssh verify -c chic26n1
Next, to verify the
keys needed for potential future replication from the current standby to the current
primary, run this command on each node of chic26:
# acfsreplssh verify -c bosc26n1
Removing ssh configuration info on each local host
The acfsreplssh remove
command can be used to remove an
existing set of keys if (for instance) the user updates the value of repluser
with the acfsutil repl update
command. The acfsreplssh
remove
command must be run on each local host where keys are to be removed.
The command will remove keys associated with the named remote cluster, as well. To
remove the keys on a given local host that are present to support an old repluser, use
the same form of command line as you used to configure those keys on that host, just
with the remove
keyword specified instead of
configure
.
For instance, to return to our previous example of replication involving the use of a VIP, you can remove the set of keys that were previously configured, using the VIP defined on each cluster, by running acfsreplssh as follows. First, run this command on each node of bosc26:
$ acfsreplssh remove -V chic26vip -c chic26vip
Next,
run this command on each node of chic26:
$ acfsreplssh remove -V bosc26vip -c bosc26vip
Configuring ssh manually
The procedures in this topic describe the manual steps needed to configure
ssh
for replication in one direction — from the primary to the
standby. To configure ssh
completely, you must perform the instructions
a second time with the primary and standby roles reversed. When you perform the
instructions the first time, complete the steps as written for the primary cluster and
the standby cluster. The second time, reverse the primary and standby roles. Perform the
steps marked as necessary on the primary cluster on your standby cluster and
perform the steps marked as necessary on the standby cluster on your primary
cluster. The procedures that must be performed twice are described in the topics that follow.
After you have completed all the necessary procedures, you can use the instructions described in Validating your ssh-related key configuration to confirm that you have configured ssh
correctly in both directions.
Distributing keys for Oracle ACFS replication
The process of distributing keys for Oracle ACFS replication includes getting a public key from the primary cluster, getting host keys for the standby cluster, ensuring permissions are configured properly for ssh
-related files, configuring sshd
as necessary, and lastly validating the ssh
configuration.
Note:
When creating host keys, ensure that you create keys for both fully-qualified domain hostnames and the local hostnames.
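For example, one way to collect host keys under both name forms is to connect once with each form (a sketch; standby1 and the example.com domain are placeholders):
$ ssh repluser@standby1 date
$ ssh repluser@standby1.example.com date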
Getting a public key for repluser from the primary cluster
A public key for repluser defined on each node of your primary cluster must be known to repluser on each node of your standby cluster.
To make this key known, the directory ~repluser/.ssh
must exist on each standby node. If this directory does not exist, then create it with access only for repluser. Ensure that an ls
command for the .ssh
directory displays output similar to:
repluser@standby $ ls -ld ~/.ssh
drwx------ 2 repluser dba 4096 Jan 27 17:01 .ssh
If a public key file for repluser exists on a given primary node, then add
its contents to the set of keys authorized to log in as repluser on each node
of the standby where replication is run. Append the key to the file
~repluser/.ssh/authorized_keys2
on each standby node, creating
this file if necessary.
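For example, assuming the primary node's public key has been copied to /tmp/primary1_id_rsa.pub on a standby node (a hypothetical path and file name), you could append it and tighten the file permissions as follows:
repluser@standby $ cat /tmp/primary1_id_rsa.pub >> ~/.ssh/authorized_keys2
repluser@standby $ chmod 600 ~/.ssh/authorized_keys2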
If a public key file does not exist, generate a public and private key pair on the primary by running the following command as repluser.
$ ssh-keygen -t rsa
You can press the enter key in response to each prompt issued by the command. Copy the resulting .pub
file to each standby node.
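One way to copy the resulting public key file to a standby node is with scp (a sketch; the destination path is the hypothetical one used in the earlier example, and standby1 is a placeholder hostname):
repluser@primary $ scp ~/.ssh/id_rsa.pub repluser@standby1:/tmp/primary1_id_rsa.pub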
You have the option to share the same public/private key pair for repluser
across all of the nodes in your primary cluster, or to establish a different key
pair for each primary node. If the same public key is valid for repluser
across all nodes in your primary cluster, then only that key must be added to the
file ~repluser/.ssh/authorized_keys2
on each node of your standby
cluster. If each primary node has its own public key for repluser, then all
the public keys must be added to the file. In either case, you can minimize work by
copying the updated authorized_keys2
file on a given node of the
standby to the other nodes of the cluster.
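For instance (a sketch with placeholder node names), after updating the file on standby1 you might copy it to a second standby node:
repluser@standby1 $ scp ~/.ssh/authorized_keys2 repluser@standby2:.ssh/authorized_keys2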
Getting host keys for the standby cluster
A host key for each standby node where replication may run must be known on each
primary node where replication may run. One way to generate the correct key is to
run ssh
manually as repluser from each primary node to each
standby node. If the correct host key is not known already, then a warning displays
and you can enable ssh
to add the key.
The following is an example of obtaining a host key:
[repluser@primary data]$ ssh repluser@standby date The authenticity of host 'standby (10.137.13.85)' can't be established. RSA key fingerprint is 1b:a9:c6:68:47:b4:ec:7c:df:3a:f0:2a:6f:cf:a7:0a. Are you sure you want to continue connecting (yes/no)?
If you respond with yes
, then the ssh
setup is
complete. A host key for host standby is stored in the known_hosts
file (~repluser/.ssh/known_hosts
) on the host primary for
the user repluser.
After the host key setup for standby nodes is complete on a given primary node, you need to perform an additional step if you use a Virtual IP address (VIP) to communicate with your standby cluster. You must add the VIP name or address at the start of each line of the known_hosts
file that refers to a host in the standby cluster. For example, if you use a VIP with the name standby12_vip
, and your known_hosts
file contains the following two lines that refer to your standby:
standby1,10.242.20.22 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3pM2YTd4UUiEWEoCKDGgaTgsmPkQToDrdtU+JtVIq/96muivU BaJUK83aqzeNIQkh+hUULsUdgKoKT5bxrWYqhY6AlTEqNgBHjBrJt9C73BbQd9y48jsc2G+WQWyuI/ +s1Q+hIJdBNMxvMBQAfisPWWUcaIx9Y/JzlPgF6lRP2cbfqAzixDot9fqRrAKL3G6A75A/6TbwmEW07d1zqOv l7ZGyeDYf5zQ72F/V0P9UgMEt/5DmcYTn3kTVGjOTbnRBe4A4lY4rVw5c+nZBDFre66XtORfQgwQB5ztW/Pi 08GYbcIszKoZx2HST9AZxYIAgcrnNYG2Ae0K6QLxxxScP standby2,10.242.20.23 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDIszcjzNtKN03SY8Kl846skFTVP1HF/ykswbmkctEjL6KTWTW+NR U4MGbvkBqqdXxuPCR7aoGO2U3PEOg1UVf3DWUoux8IRvqKU+dJcdTibMFkDAIhTnzb14gZ/lRTjn+GYsuP5 Qz2vgL/U0ki887mZCRjWVL1b5FNH8sXBUV2QcD7bjF98VXF6n4gd5UiIC3jv6l2nVTKDwtNHpUTS1dQAi+1D tr0AieZTsuxXMaDdUZHgKDotjciMB3mCkKm/u3IFoioDqdZE4+vITX9G7DBN4CVPXawp+b5Kg8X9P+08Eehu tMlBJ5lafy1bxoVlXUDLVIIFBJNKrsqBvxxxpS7
To enable the use of the VIP, you would modify these two lines to read as follows:
standby12_vip,standby1,10.242.20.22 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3pM2YTd4UUiEWEoCKDGgaTgsmPkQToDrdtU+JtVIq/96muivU BaJUK83aqzeNIQkh+hUULsUdgKoKT5bxrWYqhY6AlTEqNgBHjBrJt9C73BbQd9y48jsc2G+WQWyuI/ +s1Q+hIJdBNMxvMBQAfisPWWUcaIx9Y/JzlPgF6lRP2cbfqAzixDot9fqRrAKL3G6A75A/6TbwmEW07d1zqOv l7ZGyeDYf5zQ72F/V0P9UgMEt/5DmcYTn3kTVGjOTbnRBe4A4lY4rVw5c+nZBDFre66XtORfQgwQB5ztW/Pi 08GYbcIszKoZx2HST9AZxYIAgcrnNYG2Ae0K6QLxxxScP standby12_vip,standby2,10.242.20.23 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDIszcjzNtKN03SY8Kl846skFTVP1HF/ykswbmkctEjL6KTWTW+NR U4MGbvkBqqdXxuPCR7aoGO2U3PEOg1UVf3DWUoux8IRvqKU+dJcdTibMFkDAIhTnzb14gZ/lRTjn+GYsuP5 Qz2vgL/U0ki887mZCRjWVL1b5FNH8sXBUV2QcD7bjF98VXF6n4gd5UiIC3jv6l2nVTKDwtNHpUTS1dQAi+1D tr0AieZTsuxXMaDdUZHgKDotjciMB3mCkKm/u3IFoioDqdZE4+vITX9G7DBN4CVPXawp+b5Kg8X9P+08Eehu tMlBJ5lafy1bxoVlXUDLVIIFBJNKrsqBvxxxpS7
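One way to make this kind of edit across all standby entries is with a small in-place substitution (a sketch; it assumes GNU sed and that your standby entries begin with the hostnames shown):
$ sed -i 's/^standby1,/standby12_vip,standby1,/; s/^standby2,/standby12_vip,standby2,/' ~repluser/.ssh/known_hosts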
Ultimately, the host key configuration performed on this first node of your primary cluster must be performed on every node in your primary cluster; the result of the above sequence, or an equivalent, must exist on each primary node. One way to minimize the manual effort required to achieve this configuration is to update the known_hosts
file on one node of the primary cluster, then copy the updated file to the other nodes of the cluster.
Note:
By default, replication enables strict host key checking by ssh
, to ensure that the primary node connects to the intended standby node or cluster when it runs ssh
. However, if you are certain that this checking is unneeded, such as the case when the primary and standby clusters communicate over a private network, the use of strict host key checking by ssh
can be disabled. For information about disabling strict host key checking, refer to the -o sshStrictKey=no
option of the acfsutil
repl
init
primary
command. If strict host key checking is disabled, then no host key setup is required. For information about the acfsutil
repl
init
command, refer to acfsutil repl init.
Notes on permissions for ssh-related files
For ssh
to work with the keys you have established, you must ensure
that permissions are set properly on each node for the .ssh
directory for repluser and some of the files the directory contains.
For details on the permissions that should be given to each .ssh
directory and key files within the directory, refer to the documentation for your ssh
implementation, such as the FILES
section of the ssh(1)
manual page.
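As a general sketch of commonly used settings (your ssh implementation's documentation is authoritative, and only files that exist need to be adjusted), the directory and key files might be secured like this:
$ chmod 700 ~repluser/.ssh
$ chmod 600 ~repluser/.ssh/authorized_keys2 ~repluser/.ssh/id_rsa
$ chmod 644 ~repluser/.ssh/known_hosts ~repluser/.ssh/id_rsa.pub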
Notes on sshd configuration
After you begin using replication, ssh
is started frequently to perform replication operations. On some platforms, the ssh
daemon sshd
may be configured to log a message through syslog
or a similar facility each time an ssh
connection is established. To avoid this, the server configuration file /etc/ssh/sshd_config
can be modified to specify a lower frequency of logging. The parameter that controls logging is called LogLevel
. Connection messages are issued at level INFO
. Any lower LogLevel
setting, such as ERROR
, suppresses those messages. For example, you can suppress log messages by adding the following line to the file:
LogLevel ERROR
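After editing /etc/ssh/sshd_config, the change takes effect only when sshd reloads its configuration, for example (a sketch; the exact service name and command vary by platform):
# systemctl reload sshd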
Validating your ssh-related key configuration
After you have established the host and user keys for ssh
on both
your primary and your standby clusters, you can use the command
acfsutil
repl
info
-c
-u
to validate the keys. You run this command as repluser on
each node of each cluster. It takes as arguments all the hostnames or addresses on
the remote cluster that the local cluster may use in the future to perform
replication.
If you are not using a VIP to connect to your remote cluster, then for a given
replication relationship, only one remote hostname or address is provided to
acfsutil
repl
init
primary
. However, if future relationships involve other remote host
addresses, specify the complete set of remote addresses when running the
acfsutil repl info -c -u
command.
If you are using a VIP to connect to your remote cluster, then you should specify the names or host-specific addresses of all remote hosts on which the VIP may be active. Do not specify the VIP name or an address associated with the VIP. When replication uses ssh
to connect to a VIP, the host key returned is the key associated with the host where the VIP is currently active. Only the hostnames or addresses of individual remote nodes are used by ssh
in this situation.
The validation command to run on each node of your primary cluster has the following format:
$ acfsutil repl info -c -u repluser standby1 [standby2 ...] [snap_shot@]primary-mountpoint
In the command, standbyn
specifies the standby cluster hostname or address. The validation command confirms that user repluser can use ssh
to connect to each standby hostname or address given, in the same manner as replication initialization. Use the same command format if you are using a VIP, such as standby12_vip
, to connect to the cluster. Do not specify the name of the VIP.
If you plan to disable strict host key checking, you can skip this checking by adding the -o sshStrictKey=no
option to the command line.
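For example, a validation run that skips strict host key checking might look like the following sketch (option placement may vary; /primaryFS is the mount point used earlier in this appendix, and standby1 and standby2 are placeholder hostnames):
$ acfsutil repl info -c -u repluser -o sshStrictKey=no standby1 standby2 /primaryFS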
After you have confirmed that each node of your primary cluster can connect to all nodes of your standby cluster, run the validation command again. This time run the command on each node of your standby cluster. Specify a hostname or IP address for all nodes of your primary cluster using the following format:
$ acfsutil repl info -c -u repluser primary1 [primary2 ...] [snap_shot@]standby-mountpoint
In the command, primaryn
specifies the primary cluster hostname or address.
Oracle Patching and Oracle ACFS
This section discusses patching with Oracle ACFS in a Grid Infrastructure environment.
Overview of Oracle ACFS Patching
Oracle ACFS is installed as part of Oracle Grid Infrastructure. However, Oracle ACFS runs from various system locations, such as /lib/modules
and /sbin
on Linux.
Oracle ACFS integrates with the Oracle Grid Infrastructure delivery and patch mechanisms: OUI and OPatch. Regardless of the delivery mechanism (Oracle Release, Oracle Patchset, Oracle Release Update, or Oracle One-off), Oracle ACFS content is delivered in patches.
When updating Oracle Grid Infrastructure without Oracle Zero Downtime Grid Infrastructure Patching, Oracle ACFS is also updated in the system locations, ensuring seamless operation of the Oracle Grid software. During the updates (whether Release, Release Update, Patchset, or One-off), the Oracle Clusterware stack is stopped on a local node and services are migrated to other nodes. The Oracle Grid Software is then patched and services are restarted on the local node.
Patching Without Oracle Zero Downtime Oracle Grid Infrastructure Patching
During the patch operation, Oracle ACFS software is updated first in the Grid Home by the OPatch or OUI file placement operation, and then later moved to the appropriate system locations and loaded into memory while Oracle Clusterware is down. The restart of Oracle Clusterware has the side effect of freeing up operating system (OS) kernel references so that Oracle ACFS can be updated in the OS kernel.
Patching With Zero Downtime Oracle Grid Infrastructure Patching
When using Zero Downtime Oracle Grid Infrastructure Patching, only the Oracle Grid Infrastructure user space binaries in the Oracle Grid Home are patched. Commands that run out of the Oracle Grid Home immediately use the latest versions. Oracle Grid Infrastructure components that are installed outside of the Oracle Grid Home, such as the ACFS, AFD, OLFS, and OKA OS system software (OS kernel modules and system tools), are updated in the Grid Home but not installed to the system locations. They continue to run the version that was active before the patch was applied. After patching, the OPatch inventory displays the new patch number, but the running software does not contain these changes; only the software available in the Grid Home does. Until the newly available software is loaded into memory and the accompanying user tools are copied to system locations, the system does not utilize the available fixes that are in the Oracle Grid Infrastructure Home.
To determine which Oracle ACFS system software is running and installed, the following commands can be used:
-
crsctl query driver activeversion -all
This command shows the Active Version of Oracle ACFS on all nodes of the cluster. The Active Version is the version of the Oracle ACFS Driver that is currently loaded and running on the system. This also implicitly indicates the ACFS system tools version. The crsctl query command, available from 18c and onwards, shows data from all nodes of the cluster.
In the following example, 19.4 is available in the Oracle Home, but 19.2 is the current running version. OPatch lsinventory reports 19.4 as the patched version. Oracle Grid Infrastructure OS drivers are only running 19.2.
crsctl query driver activeversion -all
Node Name : node1
Driver Name : ACFS
BuildNumber : 200114
BuildVersion : 19.0.0.0.0 (19.2.0.0.0)
-
crsctl query driver softwareversion -all
This command shows the available Software Version of the Oracle Grid Infrastructure software (and by extension, the available Software Version of the Oracle ACFS software) that is currently installed in the Oracle Grid Home. The crsctl query command, available from 18c and onwards, shows data from all nodes of the cluster.
crsctl query driver softwareversion -all
Node Name : node1
Driver Name : ACFS
BuildNumber : 200628
BuildVersion : 19.0.0.0.0 (19.4.0.0.0)
-
acfsdriverstate version -v
This command shows the full information on the running Oracle ACFS modules on the local node. The ACFS-9548 and ACFS-9547 messages display the version of the Oracle ACFS software that is available in the Oracle Grid Infrastructure home. acfsdriverstate reports on the local node only. Bug numbers are only available when running one-off patches.
acfsdriverstate version -v
ACFS-9325: Driver OS kernel version = 4.1.12-112.16.4.el7uek.x86_64.
ACFS-9326: Driver build number = 200114.
ACFS-9212: Driver build version = 19.0.0.0.0 (19.2.0.0.0).
ACFS-9547: Driver available build number = 200628.
ACFS-9548: Driver available build version = 19.0.0.0.0 (19.2.0.0.0)
ACFS-9549: Kernel and command versions.
Kernel:
  Build version: 19.0.0.0.0
  Build full version: 19.2.0.0.0
  Build hash: 9256567290
  Bug numbers: NoTransactionInformation
Commands:
  Build version: 19.0.0.0.0
  Build full version: 19.2.0.0.0
  Build hash: 9256567290
  Bug numbers: NoTransactionInformation
Updating Oracle Grid Infrastructure Files
Until the Oracle Clusterware stack is stopped and the Oracle ACFS driver modules are updated, Oracle ACFS fixes are not loaded into memory. The process that loads the Oracle ACFS fixes into system memory also installs the required tools for Oracle ACFS operation into system locations.
You can perform one of the following procedures:
-
To load Oracle ACFS fixes into memory and system locations, the following commands must be issued on a node-by-node basis:
crsctl stop crs -f
Stops the CRS stack and all applications on the local node.
root.sh -updateosfiles
Updates Oracle ACFS and other Oracle Grid Infrastructure kernel modules on the system to the latest version.
crsctl start crs -wait
Restarts CRS on the node.
-
Alternatively, if a node reboots with a kernel version change, then newer drivers are automatically loaded and newer system tools installed into the system directories. It is assumed that all nodes in the cluster change kernel versions at the same time.
After one of these events has occurred, the crsctl query driver activeversion and crsctl query driver softwareversion commands report the same information: the loaded and running operating system (OS) software is the same as the latest available in the Oracle Grid Infrastructure Home. You can run other Oracle ACFS version commands as described in Verifying Oracle ACFS Patching.
Verifying Oracle ACFS Patching
When using standard OPatch patches to apply Oracle Release Updates and Patches, the inventory accurately reflects what is installed in the Grid Infrastructure home and on the system. For example:
[grid@racnode1]$ opatch lsinventory
...
..
Oracle Grid Infrastructure 19c    19.0.0.0.0
There are 1 products installed in this Oracle Home.
Interim patches (5) :
Patch 30501910: applied on Sat Mar 07 15:42:08 AEDT 2020
Unique Patch ID: 23299902
Patch description: "Grid Infrastructure Jan 2020 Release Update : 19.4.0.0.200628 (30501910)"
Created on 28 Dec 2019, 10:44:46 hrs PST8PDT
Bugs fixed:
The output in the lsinventory
example lists the OPatch RU and other patches that are applied, as well as the bug numbers and other information. These patches are applied to the Grid Infrastructure home. During normal patching operations, they are also applied to the operating system (OS) locations and loaded into memory, ensuring that Oracle Grid Infrastructure OS system software fixes are in sync with the Grid Infrastructure home. However, when using Zero Downtime Grid Infrastructure Patching, the content for Oracle Grid Infrastructure system software installed on the system, such as Oracle ACFS, is not updated at the same time.
The crsctl
query
driver
and acfsdriverstate
commands can be used to verify whether the installed Oracle Grid Infrastructure system software level is the same as the software level of the Grid Infrastructure home. Refer to the discussion about Zero Downtime Oracle Grid Infrastructure patching in Overview of Oracle ACFS Patching.
For patching and update operations applied without Zero Downtime Oracle Grid Infrastructure Patching, the active and software version should always be the same.
If it is necessary to install updated Oracle Grid Infrastructure OS system software, refer to the procedures in Updating Oracle Grid Infrastructure Files.
After all the Oracle Grid Infrastructure OS system software is updated, the version should be the same as the Opatch lsinventory
output displayed for any patches or updates to the Grid Infrastructure home, in this case, 19.4.0.0.0. Additionally, the Oracle Grid Infrastructure OS system software that is available and active should have the same version number displayed. For example:
Output from the lsinventory command:
Patch description: "Grid Infrastructure Jan 2020 Release Update : 19.4.0.0.0.200628 (30501910)"

crsctl query driver activeversion -all
Node Name : node1
Driver Name : ACFS
BuildNumber : 200628
BuildVersion : 19.0.0.0.0 (19.4.0.0.0)

crsctl query driver softwareversion -all
Node Name : node1
Driver Name : ACFS
BuildNumber : 200628
BuildVersion : 19.0.0.0.0 (19.4.0.0.0)
You can run the acfsdriverstate
version
command for additional Oracle ACFS information on the local node, including information on commands and utilities. For example:
acfsdriverstate version
ACFS-9325: Driver OS kernel version = 4.1.12-112.16.4.el7uek.x86_64.
ACFS-9326: Driver build number = 200628.
ACFS-9212: Driver build version = 19.0.0.0.0 (19.4.0.0.0)
ACFS-9547: Driver available build number = 200628.
ACFS-9548: Driver available build version = 19.0.0.0.0 (19.4.0.0.0).