8 Fault Recovery
This chapter provides information about fault recovery for Oracle Communications Cloud Native Core, Converged Policy (Policy) deployment.
8.1 Overview
You must take a database backup and restore it on either the same cluster or a different one. The fault recovery procedures in this chapter run their commands against the Policy database (MySQL NDB Cluster).
Database Model of CNC Policy
- Configuration Data: The configuration data is exclusive to a given site. Each site therefore creates and uses its own logical database to store its configuration data. Using CNC Console and the Configuration Management service, you can configure the data for the respective site only.
- Session Data: The session data is shared across sites. A common logical database is therefore created and used by all sites. The data is replicated across sites to preserve sessions and share them with mated sites. In case of cross-site messaging or a site failure, the shared session data helps maintain continuity of service.
The following image shows the Policy database model in three different sites:
Figure 8-1 Database Model

8.2 Impacted Areas
The following table shares information about the impacted areas during Policy fault recovery:
Scenario | Requires Fault Recovery or re-install of CNE? | Requires Fault Recovery or re-install of cnDBTier? | Requires Fault Recovery or re-install of Policy? | Other |
---|---|---|---|---|
Scenario: Session Database Corruption | No | Yes (restoring cnDBTier from an older backup is the only way to return to the restore point) | No (required only if cnDBTier credentials are changed) | All sites require Fault Recovery. |
Scenario: Site Failure | Yes | Yes | Yes | NA |
8.3 Prerequisites
Before performing any fault recovery procedure, ensure that the following prerequisites are met:
- cnDBTier must be in a healthy state and available on multiple sites along with Policy. To check the cnDBTier status, perform the following steps:
  - Run the following command to ensure that all the nodes are connected:
    ndb_mgm> show
  - Run the following command to check the pod status:
    kubectl get pods -n <namespace>
    If the pod status is Running, then cnDBTier is in a healthy state.
  - Run the following command to check if the replication is up:
    mysql> show slave status\G
    In case there is any error, see the Fault Recovery chapter in Oracle Communications Cloud Native Core, cnDBTier Installation and Upgrade Guide.
  - Run the following command to check which cnDBTier has ACTIVE replication, so that the backup can be taken from it:
    select * from replication_info.DBTIER_REPLICATION_CHANNEL_INFO;
- Automatic backup must be enabled on cnDBTier. Enabling automatic backup helps in achieving the following:
  - Restore a stable version of the network function database
  - Minimize significant loss of data due to upgrade or rollback failures
  - Minimize loss of data due to system failure
  - Minimize loss of data due to data corruption or deletion due to external input
  - Migrate database information for a network function from one site to another
- The following files must be available for fault recovery:
  - Custom values file (occnp-custom-values-<release_number>)
  - Helm charts (occnp-<release_number>.tgz)
  - Secrets and Certificates
  - RBAC resources
Note:
For details on enabling automatic backup, see Fault Recovery section in Oracle Communications Cloud Native Core, cnDBTier (cnDBTier) Installation Guide.
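The pod status and replication checks listed in the prerequisites can be scripted. The following is a minimal sketch, not an Oracle-provided tool: the function names are hypothetical, and it assumes that "replication is up" means both Slave_IO_Running and Slave_SQL_Running report Yes in the output of show slave status\G.

```python
import subprocess


def kubectl_get_pods(namespace: str) -> str:
    """Run `kubectl get pods -n <namespace>` and return its text output."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def pods_all_running(kubectl_output: str) -> bool:
    """Return True only if every pod row shows STATUS Running."""
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        return False  # header only, or no output: nothing to confirm
    header = lines[0].split()
    if "STATUS" not in header:
        return False  # unexpected output format
    status_col = header.index("STATUS")
    return all(row.split()[status_col] == "Running" for row in lines[1:])


def replication_up(slave_status_output: str) -> bool:
    """Parse `show slave status\\G` text; assumes both threads must be Yes."""
    fields = {}
    for line in slave_status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")
```

For example, pods_all_running(kubectl_get_pods("<namespace>")) returns False as soon as one pod reports a status other than Running. Pods that legitimately end in another status, such as completed jobs, would need to be filtered out before the check.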
8.4 Fault Recovery Scenarios
This section describes the fault recovery procedures for various scenarios.
Note:
This chapter describes scenario-based procedures to restore Policy databases only. To restore all the databases that are part of cnDBTier, see the Fault Recovery chapter in Oracle Communications Cloud Native Core, cnDBTier Installation and Upgrade Guide available on My Oracle Support (MOS).
8.4.1 Scenario: Session Database Corruption
This section describes how to recover Policy when its session database is corrupted.
When the session database is corrupted, the databases on all other sites can also become corrupted through data replication. The outcome depends on the replication status after the corruption occurs. If data replication breaks because of the corruption, cnDBTier fails in one or more sites, but not in all of them. If data replication succeeds, the corruption replicates to all the cnDBTier sites and cnDBTier fails in all sites.
The fault recovery procedure covers the following sub-scenarios:
8.4.1.1 When DBTier Failed in All Sites
This section describes how to recover session database when successful data replication corrupts all the cnDBTier sites.
To recover session database, perform the following steps:
- Uninstall Policy Helm charts on all sites. For more information about uninstalling Helm charts, see Oracle Communications Cloud Native Core, Converged Policy Installation and Upgrade Guide available on MOS.
- Perform the cnDBTier fault recovery procedure:
  - Use the auto-data backup file for the restore procedure. For more information about cnDBTier restore, see the Fault Recovery chapter in Oracle Communications Cloud Native Core, cnDBTier Installation and Upgrade Guide available on MOS.
- Install Policy Helm charts. For more information about installing Helm charts, see Oracle Communications Cloud Native Core, Converged Policy Installation, Upgrade and Fault Recovery Guide available on MOS.
Note:
You can also refer to the custom-values.yaml file used at the time of Policy installation for Helm charts installation.
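The ordering of the steps above matters: Policy is removed everywhere before cnDBTier is restored, and reinstalled only afterwards. The following sketch builds that sequence as a plan without executing anything. The Site and Step types, the kube context names, and the assumption that Policy is reinstalled on every site are illustrative and not taken from the product guides. Only the standard helm uninstall and helm install commands are used.

```python
from typing import List, NamedTuple, Optional


class Site(NamedTuple):
    kube_context: str  # hypothetical kubeconfig context for the site
    namespace: str     # namespace where Policy is deployed


class Step(NamedTuple):
    description: str
    command: Optional[List[str]]  # None marks a manual, separately documented step


def session_db_recovery_plan(sites: List[Site], release: str,
                             chart: str, values_file: str) -> List[Step]:
    """Build the ordered plan: uninstall everywhere, restore, then install."""
    plan: List[Step] = []
    for site in sites:
        plan.append(Step(
            f"Uninstall Policy on {site.kube_context}",
            ["helm", "uninstall", release,
             "-n", site.namespace, "--kube-context", site.kube_context],
        ))
    # The cnDBTier restore follows the cnDBTier guide and is not scripted here.
    plan.append(Step("Restore cnDBTier from the automatic backup", None))
    for site in sites:
        plan.append(Step(
            f"Install Policy on {site.kube_context}",
            ["helm", "install", release, chart, "-f", values_file,
             "-n", site.namespace, "--kube-context", site.kube_context],
        ))
    return plan
```

Printing each step before running it, and pausing at the step whose command is None, keeps the manual cnDBTier restore from being skipped.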
8.4.2 Scenario: Site Failure
This section describes how to perform fault recovery when one or more of your sites have a software failure.
8.4.2.1 Single or Multiple Site Failure
Note:
It is assumed that one of the cnDBTier sites is in a healthy state.
Note:
Ensure that all the prerequisites listed in the Prerequisites section are met.
- Uninstall Policy. For more information, see the Uninstalling CNC Policy section in Oracle Communications Cloud Native Core, Converged Policy Installation, Upgrade and Fault Recovery Guide.
- Install a new cluster by performing the Cloud Native Environment (CNE) installation procedure. For more information, see Oracle Communications Cloud Native Core, Cloud Native Environment (CNE) Installation and Upgrade Guide available on My Oracle Support.
- Install cnDBTier, in case replication is down or cnDBTier pods are not up and running. For information about installing cnDBTier, see Oracle Communications Cloud Native Core, cnDBTier Installation and Upgrade Guide.
- Perform the cnDBTier fault recovery procedure:
  - Take a backup from an older healthy site by following the Create On-demand Database Backup procedure in Oracle Communications Cloud Native Core, cnDBTier Installation and Upgrade Guide.
  - Restore the database to the new site by following the Restore Database with Backup procedure in Oracle Communications Cloud Native Core, cnDBTier Installation and Upgrade Guide.
- Install Policy Helm charts. For more information on installing Helm charts, see Oracle Communications Cloud Native Core, Converged Policy Installation, Upgrade and Fault Recovery Guide.
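The site failure procedure has one conditional step: cnDBTier is installed only when replication is down or the cnDBTier pods are not running. A small sketch of that decision, with a hypothetical function name and step wording paraphrased from the list above, might look like this:

```python
from typing import List


def site_failure_recovery_steps(replication_up: bool,
                                dbtier_pods_running: bool) -> List[str]:
    """Return the ordered recovery steps for a failed site."""
    steps = [
        "Uninstall Policy",
        "Install a new cluster using the CNE installation procedure",
    ]
    # cnDBTier is installed only if it is not already healthy on the site.
    if not replication_up or not dbtier_pods_running:
        steps.append("Install cnDBTier")
    steps += [
        "Create an on-demand database backup on a healthy site",
        "Restore the database to the new site from that backup",
        "Install Policy Helm charts",
    ]
    return steps
```

The two flags would come from the replication and pod status checks described in the prerequisites.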