Setting Up Peering Between the ZFS Storage Appliances
After the physical connection between the
ZFS Storage Appliances has been established, you set them up as
peers using the drSetupService
command in the Service CLI. You run this command from both systems so
that each system operates as the replica of the other system.
The replication parameters for standard storage are mandatory with the setup command. If a Private Cloud Appliance system also includes high-performance storage, add the replication parameters for the high-performance storage pool to the setup command as well.
However, only set up replication for high-performance storage if the high-performance storage pool is actually available on the ZFS Storage Appliances. If it is not, run the setup command again later, after the high-performance storage pool has been configured on the ZFS Storage Appliances.
When you set up the replication interfaces for the disaster recovery service, the system
assumes that the gateway is the first host address in the subnet of the local IP address you
specify. This applies to the replication interfaces for both standard and high-performance
storage. For example, if you specify a local IP address of 10.50.7.31/23, the assumed
gateway is 10.50.6.1. If the actual gateway address is different, you must add it to the
drSetupService command using the gatewayIp and gatewayIpPerf parameters.
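The first-host assumption is straightforward to check before you run the setup command. The following sketch is purely illustrative (Python, not part of the appliance software; the helper name is hypothetical) and computes the gateway address the service would assume for a given localIp value:

```python
import ipaddress

def assumed_gateway(local_ip_cidr: str) -> str:
    """Return the first host address in the subnet of local_ip_cidr.

    This mirrors the documented default: unless gatewayIp or gatewayIpPerf
    is specified, the DR service assumes the gateway is the first host
    address of the localIp subnet.
    """
    iface = ipaddress.ip_interface(local_ip_cidr)
    # The first usable host address is the network address plus one.
    return str(iface.network.network_address + 1)

print(assumed_gateway("10.50.7.31/23"))   # 10.50.6.1
print(assumed_gateway("10.100.33.83/28")) # 10.100.33.81
```

If the value returned for your localIp does not match the real gateway in your data center, include gatewayIp (and gatewayIpPerf for the high-performance interface) in the drSetupService command.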
Optionally, you can also set a maximum number of DR configurations and a retention period for disaster recovery job details.
Setting Up Peering Between the ZFS Storage Appliances Before Build 302-b892153
Note:
Both Private Cloud Appliance racks in the disaster recovery configuration must be running the same version of the system software.
Syntax (entered on a single line):
drSetupService localIp=<primary_system_standard_replication_ip> (in CIDR notation)
    remoteIp=<replica_system_standard_replication_ip>
    localIpPerf=<primary_system_performance_replication_ip> (in CIDR notation)
    remoteIpPerf=<replica_system_performance_replication_ip>
[Optional Parameters:]
    gatewayIp=<local_subnet_gateway_ip> (default: first host IP in localIp subnet)
    gatewayIpPerf=<local_subnet_gateway_ip> (default: first host IP in localIpPerf subnet)
    maxConfig=<number_DR_configs> (default and maximum is 20)
    jobRetentionHours=<hours> (default and minimum is 24)
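Because the command must be entered on a single line, it can help to see how the mandatory and optional parameters combine. The following Python sketch is illustrative only (the builder function is hypothetical; only the parameter names come from the syntax above) and assembles the command string, omitting any optional parameter that is not provided:

```python
def build_dr_setup_command(local_ip, remote_ip,
                           local_ip_perf=None, remote_ip_perf=None,
                           gateway_ip=None, gateway_ip_perf=None,
                           max_config=None, job_retention_hours=None):
    """Assemble a single-line drSetupService command from its parameters.

    localIp and remoteIp are mandatory; the high-performance pool,
    gateway, and tuning parameters are included only when provided.
    """
    parts = ["drSetupService", f"localIp={local_ip}", f"remoteIp={remote_ip}"]
    optional = [
        ("localIpPerf", local_ip_perf),
        ("remoteIpPerf", remote_ip_perf),
        ("gatewayIp", gateway_ip),
        ("gatewayIpPerf", gateway_ip_perf),
        ("maxConfig", max_config),
        ("jobRetentionHours", job_retention_hours),
    ]
    parts += [f"{name}={value}" for name, value in optional if value is not None]
    return " ".join(parts)

print(build_dr_setup_command("10.50.7.31/23", "10.50.7.33",
                             gateway_ip="10.50.7.10"))
```

The printed string matches the shape of the standard-storage example in the next section.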
Examples:
-
With only standard storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.50.7.31/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.33
system 2
PCA-ADMIN> drSetupService \
    localIp=10.50.7.33/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.31
-
With both standard and high-performance storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.50.7.31/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.33 \
    localIpPerf=10.50.7.32/23 gatewayIpPerf=10.50.7.10 remoteIpPerf=10.50.7.34
system 2
PCA-ADMIN> drSetupService \
    localIp=10.50.7.33/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.31 \
    localIpPerf=10.50.7.34/23 gatewayIpPerf=10.50.7.10 remoteIpPerf=10.50.7.32
Important:
When setting up disaster recovery, after you run drSetupService on the first system, you
must wait for the job to complete before running the command on the second system. You can
monitor the job on the first system by running drGetJob jobid=<unique-id>.
The script configures both ZFS Storage Appliances.
After successful configuration of the replication interfaces, you must enable replication over the interfaces you just configured.
Enabling Replication for Disaster Recovery
To enable replication between the two storage appliances, using the interfaces you
configured earlier, re-run the same drSetupService
command from the Service CLI, but this time followed by
enableReplication=True
. You must also provide the
remotePassword
to authenticate with the other storage appliance and
complete the peering setup.
Examples:
-
With only standard storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.50.7.31/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.33 \
    enableReplication=True remotePassword=********
system 2
PCA-ADMIN> drSetupService \
    localIp=10.50.7.33/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.31 \
    enableReplication=True remotePassword=********
-
With both standard and high-performance storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.50.7.31/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.33 \
    localIpPerf=10.50.7.32/23 gatewayIpPerf=10.50.7.10 remoteIpPerf=10.50.7.34 \
    enableReplication=True remotePassword=********
system 2
PCA-ADMIN> drSetupService \
    localIp=10.50.7.33/23 gatewayIp=10.50.7.10 remoteIp=10.50.7.31 \
    localIpPerf=10.50.7.34/23 gatewayIpPerf=10.50.7.10 remoteIpPerf=10.50.7.32 \
    enableReplication=True remotePassword=********
Important:
When enabling replication, after you run drSetupService on the first system, you must wait
for the job to complete before running the command on the second system. You can monitor
the job on the first system by running drGetJob jobid=<unique-id>.
At this stage, the ZFS Storage Appliances in the disaster recovery setup have been successfully peered. The storage appliances are ready to perform scheduled data replication every 5 minutes. The data to be replicated is based on the DR configurations you create. See Managing Disaster Recovery Configurations.
Modifying the ZFS Storage Appliance Peering Setup
After you set up the disaster recovery service and enabled replication between the systems,
you can change the parameters of the peering configuration. You change the service using the
drUpdateService
command in the Service CLI.
Syntax (entered on a single line):
drUpdateService localIp=<primary_system_standard_replication_ip> (in CIDR notation)
    remoteIp=<replica_system_standard_replication_ip>
    localIpPerf=<primary_system_performance_replication_ip> (in CIDR notation)
    remoteIpPerf=<replica_system_performance_replication_ip>
    gatewayIp=<local_subnet_gateway_ip> (default: first host IP in localIp subnet)
    gatewayIpPerf=<local_subnet_gateway_ip> (default: first host IP in localIpPerf subnet)
    maxConfig=<number_DR_configs> (default and maximum is 20)
    jobRetentionHours=<hours> (default and minimum is 24)
Example 1 – Simple parameter change
This example shows how you change the job retention time from 24 to 48 hours and reduce the maximum number of DR configurations from 20 to 12.
PCA-ADMIN> drUpdateService jobRetentionHours=48 maxConfig=12
Command: drUpdateService jobRetentionHours=48 maxConfig=12
Status: Success
Time: 2022-08-11 09:20:48,570 UTC
Data:
  Message = Successfully started job to update DR admin service
  Job Id = ec64cef4-ba68-493d-89c8-22df51553cd8
Use the drShowService
command to check the current configuration. Run the
command to display the configuration parameters before you change them. Run it again
afterward to confirm that the changes have been applied successfully.
PCA-ADMIN> drShowService
Command: drShowService
Status: Success
Time: 2022-08-11 09:23:54,951 UTC
Data:
  Local Ip = 10.50.7.31/23
  Remote Ip = 10.50.7.33
  Replication = ENABLED
  Replication High = DISABLED
  Message = Successfully retrieved site configuration
  maxConfig = 12
  gateway IP = 10.50.7.10
  Job Retention Hours = 48
Example 2 – Replication IP change
There might be network changes in the data center that require you to use different subnets and IP addresses for the replication interfaces configured in the disaster recovery service. This configuration change must be applied in several commands on the two peer systems, and in a specific order. If the systems contain both standard and high-performance storage, as in the following example, change the replication interface settings for both storage types in the same order.
-
Update the local IP and gateway parameters on system 1. Leave the remote IPs unchanged.
PCA-ADMIN> drUpdateService \
    localIp=10.100.33.83/28 gatewayIp=10.100.33.81 \
    localIpPerf=10.100.33.84/28 gatewayIpPerf=10.100.33.81
-
Update the local IP, gateway, and remote IP parameters on system 2.
PCA-ADMIN> drUpdateService \
    localIp=10.100.33.88/28 gatewayIp=10.100.33.81 remoteIp=10.100.33.83 \
    localIpPerf=10.100.33.89/28 gatewayIpPerf=10.100.33.81 remoteIpPerf=10.100.33.84
-
Update the remote IP parameters on system 1.
PCA-ADMIN> drUpdateService \
    remoteIp=10.100.33.88 remoteIpPerf=10.100.33.89
Example 3 – Trusting a New ZFS Storage Appliance Certificate
The following example shows the command to run if the ZFS Storage Appliance certificate on the peer rack is updated. This command retrieves the new certificate from the remote host and adds it to the trust list.
PCA-ADMIN> drUpdateService \
    remoteIp=10.100.33.88 remoteIpPerf=10.100.33.89
Unconfiguring the ZFS Storage Appliance Peering Setup
If a reset has been performed on one or both of the systems in the disaster recovery
solution, and you need to unconfigure the disaster recovery service to remove the entire
peering setup between the ZFS Storage Appliances, use the
drDeleteService
command in the Service CLI.
Caution:
This command requires no other parameters. Be careful when entering it at the
PCA-ADMIN>
prompt, to avoid executing it unintentionally.
You can't unconfigure the disaster recovery service while DR configurations still exist. Proceed as follows:
-
Remove all DR configurations from the two systems that have been configured as replicas for each other.
-
Sign in to the Service CLI on one of the systems and enter the drDeleteService command.
-
Sign in to the Service CLI on the second system and enter the drDeleteService command there as well.
When the disaster recovery service isn't configured, the drShowService
command returns an error.
PCA-ADMIN> drShowService
Command: drShowService
Status: Failure
Time: 2022-08-11 12:31:22,840 UTC
Error Msg: PCA_GENERAL_000001: An exception occurred during processing: Operation failed.
[...]
Error processing dr-admin.service.show response: dr-admin.service.show failed. Service not set up.
Setting Up Peering Between the ZFS Storage Appliances With Build 302-b892153 or Later
Note:
Both Private Cloud Appliance racks in the disaster recovery configuration must be running the same generation of the system software: either both earlier than build 302-b892153, or both build 302-b892153 or later.
Before beginning, the show networkConfig output must have valid entries for the following:
- DNS IP addresses
- Management Node Hostnames
- Management Node IP Addresses
- Free Public IP Addresses
- A valid IP address for the ZFS capacity pool replication endpoint
- sn01-dr1.<rack_name>.<domain_name>
- sn02-dr1.<rack_name>.<domain_name> (if you use a performance pool)
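The endpoint names follow a fixed pattern. As an illustration (a hypothetical helper, not part of the appliance software), the FQDNs that must resolve in DNS can be composed like this:

```python
def replication_endpoints(rack_name: str, domain_name: str, perf_pool: bool = False):
    """Compose the replication endpoint FQDNs that must resolve in DNS.

    sn01-dr1 is the capacity pool endpoint; sn02-dr1 is the performance
    pool endpoint, required only if a performance pool is used.
    """
    names = [f"sn01-dr1.{rack_name}.{domain_name}"]
    if perf_pool:
        names.append(f"sn02-dr1.{rack_name}.{domain_name}")
    return names

print(replication_endpoints("rack1", "example.com", perf_pool=True))
```

Before running drSetupService, confirm that each of these names resolves from the peer rack, for example with a standard DNS lookup tool.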
If DNS mapping is configured with the zone delegation option, these DNS mappings are managed by the Private Cloud Appliance DNS.
To populate the rack core DNS, edit the network configuration:
-
system 1
PCA-ADMIN> edit networkConfig \
    zfsCapacityPoolReplicationEndpoint=10.0.7.31
-
system 2
PCA-ADMIN> edit networkConfig \
    zfsCapacityPoolReplicationEndpoint=10.0.7.32
If DNS mapping is configured with the manual option, these DNS mappings are managed by the data center DNS.
For more information on creating Private Cloud Appliance DNS PTR entries, and DNS management in general, see "Working with Zone Records" in the Networking chapter of the Oracle Private Cloud Appliance User Guide.
Syntax (entered on a single line):
drSetupService localIp=<primary_system_standard_replication_ip> (in CIDR notation)
    remoteHost=<replica_system_standard_replication_fqdn>
    localIpPerf=<primary_system_performance_replication_ip> (in CIDR notation)
    remoteHostPerf=<replica_system_performance_replication_fqdn>
[Optional Parameters:]
    gatewayIp=<local_subnet_gateway_ip> (default: first host IP in localIp subnet)
    gatewayIpPerf=<local_subnet_gateway_ip> (default: first host IP in localIpPerf subnet)
    maxConfig=<number_DR_configs> (default and maximum is 20)
    jobRetentionHours=<hours> (default and minimum is 24)
Examples:
-
With only standard storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.0.7.31/23 gatewayIp=10.0.7.10 remoteHost=sn01-dr1.rack2.example.com
system 2
PCA-ADMIN> drSetupService \
    localIp=10.0.7.33/23 gatewayIp=10.0.7.10 remoteHost=sn01-dr1.rack1.example.com
-
With both standard and high-performance storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.0.7.31/23 gatewayIp=10.0.7.10 remoteHost=sn01-dr1.rack2.example.com \
    localIpPerf=10.0.7.32/23 gatewayIpPerf=10.0.7.10 remoteHostPerf=sn02-dr1.rack2.example.com
system 2
PCA-ADMIN> drSetupService \
    localIp=10.0.7.33/23 gatewayIp=10.0.7.10 remoteHost=sn01-dr1.rack1.example.com \
    localIpPerf=10.0.7.34/23 gatewayIpPerf=10.0.7.10 remoteHostPerf=sn02-dr1.rack1.example.com
Important:
When setting up disaster recovery, after you run drSetupService on the first system, you
must wait for the job to complete before running the command on the second system. You can
monitor the job on the first system by running drGetJob jobid=<unique-id>.
For example:
PCA-ADMIN> drGetJob jobid=<unique-id>
Command: drGetJob jobid=<unique-id>
Status: Success
Time: 2023-08-01 15:26:46,973 UTC
Data:
  Type = setup service
  Job Id = <unique-id>
  Status = Success
  Start Time = 2023-08-01 15:26:28.935479
  Message = job successfully retrieved
Note:
Ensure that the "Success" status message appears in the Data fields, and not only in the Command field.
The script configures both ZFS Storage Appliances.
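When you capture Service CLI output in an automation wrapper, the distinction between the command status and the job status in the Data fields can be checked programmatically. The following is a minimal sketch (illustrative only; it assumes drGetJob output shaped like the example above, and the function name is hypothetical):

```python
def job_succeeded(drgetjob_output: str) -> bool:
    """Return True only if the Data section of drGetJob output reports
    Status = Success, not merely a successful command invocation."""
    in_data = False
    for raw in drgetjob_output.splitlines():
        line = raw.strip()
        if line.startswith("Data:"):
            in_data = True
            continue
        # Only trust the Status field inside the Data section.
        if in_data and line.startswith("Status ="):
            return line.split("=", 1)[1].strip() == "Success"
    return False
```

A wrapper can poll drGetJob with this check before running drSetupService on the second system.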
After successful configuration of the replication interfaces, you must enable replication over the interfaces you configured.
Enabling Replication for Disaster Recovery
To enable replication between the two storage appliances, using the interfaces you
configured earlier, run the same drSetupService
command from the Service CLI, but this time followed by
enableReplication=True
. You must also provide the
remotePassword
to authenticate with the other storage appliance and
complete the peering setup.
Examples:
-
With only standard storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.0.7.31/23 gatewayIp=10.0.7.10 \
    enableReplication=True remotePassword=******** remoteHost=sn01-dr1.rack2.example.com
system 2
PCA-ADMIN> drSetupService \
    localIp=10.0.7.33/23 gatewayIp=10.0.7.10 \
    enableReplication=True remotePassword=******** remoteHost=sn01-dr1.rack1.example.com
-
With both standard and high-performance storage configured:
system 1
PCA-ADMIN> drSetupService \
    localIp=10.0.7.31/23 gatewayIp=10.0.7.10 remoteHost=sn01-dr1.rack2.example.com \
    localIpPerf=10.0.7.32/23 gatewayIpPerf=10.0.7.10 remoteHostPerf=sn02-dr1.rack2.example.com \
    enableReplication=True remotePassword=********
system 2
PCA-ADMIN> drSetupService \
    localIp=10.0.7.33/23 gatewayIp=10.0.7.10 remoteHost=sn01-dr1.rack1.example.com \
    localIpPerf=10.0.7.34/23 gatewayIpPerf=10.0.7.10 remoteHostPerf=sn02-dr1.rack1.example.com \
    enableReplication=True remotePassword=********
Important:
When enabling replication, after you run drSetupService on the first system, you must wait
for the job to complete before running the command on the second system. You can monitor
the job on the first system by running drGetJob jobid=<unique-id>.
At this stage, the ZFS Storage Appliances in the disaster recovery setup have been successfully peered. The storage appliances are ready to perform scheduled data replication every 5 minutes. The data to be replicated is based on the DR configurations you create. See Managing Disaster Recovery Configurations.
Modifying the ZFS Storage Appliance Peering Setup
After you set up the disaster recovery service and enabled replication between the systems,
you can change the parameters of the peering configuration individually. You change the
service using the drUpdateService
command in the Service CLI.
Syntax (entered on a single line):
drUpdateService localIp=<primary_system_standard_replication_ip> (in CIDR notation)
    remoteHost=<replica_system_standard_replication_fqdn>
    localIpPerf=<primary_system_performance_replication_ip> (in CIDR notation)
    remoteHostPerf=<replica_system_performance_replication_fqdn>
    gatewayIp=<local_subnet_gateway_ip> (default: first host IP in localIp subnet)
    gatewayIpPerf=<local_subnet_gateway_ip> (default: first host IP in localIpPerf subnet)
    maxConfig=<number_DR_configs> (default and maximum is 20)
    jobRetentionHours=<hours> (default and minimum is 24)
Example 1 – Simple parameter change
This example shows how you change the job retention time from 24 to 48 hours and reduce the maximum number of DR configurations from 20 to 12.
PCA-ADMIN> drUpdateService jobRetentionHours=48 maxConfig=12
Command: drUpdateService jobRetentionHours=48 maxConfig=12
Status: Success
Time: 2022-08-11 09:20:48,570 UTC
Data:
  Message = Successfully started job to update DR admin service
  Job Id = ec64cef4-ba68-493d-89c8-22df51553cd8
Use the drShowService
command to check the current configuration. Run the
command to display the configuration parameters before you modify them. Run it again
afterward to confirm that your changes have been applied successfully.
PCA-ADMIN> drShowService
Command: drShowService
Status: Success
Time: 2022-08-11 09:23:54,951 UTC
Data:
  Local Ip = 10.0.7.31/23
  Remote Host = sn01-dr1.example.com
  Replication = ENABLED
  Replication High = DISABLED
  Message = Successfully retrieved site configuration
  maxConfig = 12
  gateway IP = 10.0.7.10
  Job Retention Hours = 48
Example 2 – Replication IP change
There might be network changes in the data center that require you to use different subnets and IP addresses for the replication interfaces configured in the disaster recovery service. This configuration change must be applied in several commands on the two peer systems, and in a specific order. If the systems contain both standard and high-performance storage, as in the following example, change the replication interface settings for both storage types in the same order.
-
Update the replication endpoint parameters on system 1.
PCA-ADMIN> edit networkConfig zfsCapacityPoolReplicationEndpoint=10.100.3.83 \
    zfsPerfPoolReplicationEndpoint=10.100.3.84
-
Update the local IP and gateway parameters on system 1. Leave the remote IPs unchanged.
PCA-ADMIN> drUpdateService \
    localIp=10.100.3.83/28 gatewayIp=10.100.3.81 \
    localIpPerf=10.100.3.84/28 gatewayIpPerf=10.100.3.81
-
Update the replication endpoint parameters on system 2.
PCA-ADMIN> edit networkConfig zfsCapacityPoolReplicationEndpoint=10.100.3.88 \
    zfsPerfPoolReplicationEndpoint=10.100.3.89
-
Update the local IP, gateway, and remote host parameters on system 2.
PCA-ADMIN> drUpdateService \
    localIp=10.100.3.88/28 gatewayIp=10.100.3.81 remoteHost=sn01-dr1.rack1.example.com \
    localIpPerf=10.100.3.89/28 gatewayIpPerf=10.100.3.81 remoteHostPerf=sn02-dr1.rack1.example.com
Example 3 – Configuration Without Performance Pool
The following example applies the same four steps to a configuration that uses only the capacity pool and not the performance pool.
-
Update the replication endpoint parameters on system 1.
PCA-ADMIN> edit networkConfig zfsCapacityPoolReplicationEndpoint=10.16.9.43
Command: edit networkConfig zfsCapacityPoolReplicationEndpoint=10.16.9.43
Status: Success
Time: 2023-08-16 12:08:30,585 UTC
JobId: 175b1600-eabe-4a0f-aa45-xxxxxx65599c1
-
Update the local IP parameters on system 1. Leave the remote IPs unchanged. Check that the job has finished successfully.
PCA-ADMIN> drUpdateService localIp=10.16.9.43/12
Command: drUpdateService localIp=10.16.9.43/12
Status: Success
Time: 2023-08-16 12:09:45,137 UTC
Data:
  Message = Successfully started job to update DR admin service
  Job Id = 2844b731-f53c-4d92-850d-xxxxx22b49e3
PCA-ADMIN> drGetJob jobId=2844b731-f53c-4d92-850d-xxxxx22b49e3
Command: drGetJob jobId=2844b731-f53c-4d92-850d-xxxxx22b49e3
Status: Success
Time: 2023-08-16 12:15:19,560 UTC
Data:
  Type = update_service
  Job Id = 2844b731-f53c-4d92-850d-xxxxx22b49e3
  Status = finished
  Start Time = 2023-08-16 12:09:45.017743
  End Time = 2023-08-16 12:15:19.443415
  Result = success
  Message = job successfully retrieved
  Response = Successfully updated DR service
-
Update the replication endpoint parameters on system 2.
PCA-ADMIN> edit networkConfig zfsCapacityPoolReplicationEndpoint=10.16.11.43
Command: edit networkConfig zfsCapacityPoolReplicationEndpoint=10.16.11.43
Status: Success
Time: 2023-08-16 12:22:36,218 UTC
JobId: b7bff723-0237-4a11-9d08-xxxxxd166e1d
-
Update the local IP parameters on system 2. Leave the remote IPs unchanged. Check that the job has finished successfully.
PCA-ADMIN> drUpdateService localIp=10.16.11.43/12
Command: drUpdateService localIp=10.16.11.43/12
Status: Success
Time: 2023-08-16 12:24:54,882 UTC
Data:
  Message = Successfully started job to update DR admin service
  Job Id = 1d6826ac-04db-49f9-aa27-35996f69410a
PCA-ADMIN> drGetJob jobId=1d6826ac-04db-49f9-aa27-xxxxx69410a
Command: drGetJob jobId=1d6826ac-04db-49f9-aa27-xxxxxf69410a
Status: Success
Time: 2023-08-16 12:31:55,828 UTC
Data:
  Type = update_service
  Job Id = 1d6826ac-04db-49f9-aa27-xxxxxf69410a
  Status = finished
  Start Time = 2023-08-16 12:24:54.655686
  End Time = 2023-08-16 12:30:16.461914
  Result = success
  Message = job successfully retrieved
  Response = Successfully updated DR service
Example 4 – Trusting a New ZFS Storage Appliance Certificate
The following example shows the command to run if the ZFS Storage Appliance certificate on the peer rack is updated. This command retrieves the new certificate from the remote host and adds it to the trust list.
PCA-ADMIN> drUpdateService \
    remoteHost=sn01-dr1.rack1.example.com remoteHostPerf=sn02-dr1.rack1.example.com
Unconfiguring the ZFS Storage Appliance Peering Setup
If a reset has been performed on one or both of the systems in a disaster recovery
solution, and you need to unconfigure the disaster recovery service to remove the entire
peering setup between the ZFS Storage Appliances, use the
drDeleteService
command in the Service CLI.
Caution:
This command requires no other parameters. Be careful when entering it at the
PCA-ADMIN>
prompt, to avoid executing it unintentionally.
You cannot unconfigure the disaster recovery service while DR configurations still exist. Proceed as follows:
-
Remove all DR configurations from the two systems that have been configured as replicas for each other.
-
Log in to the Service CLI on one of the systems and enter the drDeleteService command.
-
Log in to the Service CLI on the second system and enter the drDeleteService command there as well.
When the disaster recovery service isn't configured, the drShowService
command returns an error.
PCA-ADMIN> drShowService
Command: drShowService
Status: Failure
Time: 2022-08-11 12:31:22,840 UTC
Error Msg: PCA_GENERAL_000001: An exception occurred during processing: Operation failed.
[...]
Error processing dr-admin.service.show response: dr-admin.service.show failed. Service not set up.