7.4 Fault Recovery Scenarios
This section describes the fault recovery procedures for various scenarios.
7.4.1 Deployment Failure
Restore SCP as described in Restoring SCP.
7.4.2 cnDBTier Corruption
When the database on one site becomes corrupted, the databases on the other sites may also become corrupted through data replication. Whether this happens depends on the replication status after the corruption occurs.
If database corruption interrupts data replication, cnDBTier fails in a single site or in multiple sites, but not in all sites. If data replication continues successfully, the corruption replicates to all cnDBTier sites, and cnDBTier fails in all sites.
To recover cnDBTier when it is corrupted in a single site, multiple sites, or all sites, see Oracle Communications Cloud Native Core, cnDBTier Installation, Upgrade, and Fault Recovery Guide.
7.4.3 SCP Data Corruption
Take a backup of the SCP database (DB) and restore it on a different Network Database (NDB) cluster. This procedure covers on-demand backup and restore of the SCP DB. The commands used in these procedures are provided by MySQL NDB Cluster.
Run the following command to display the status of the cnDBTier cluster nodes:
kubectl -n <namespace> exec <management node pod> -- ndb_mgm -e show
Where:
- <namespace> is the namespace where cnDBTier is deployed.
- <management node pod> is the management node pod of cnDBTier.
Example:
[cloud-user@vcne2-bastion-1 ~]$ kubectl -n scpsvc exec ndbmgmd-0 -- ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=1 @10.233.86.202 (mysql-8.0.22 ndb-8.0.22, Nodegroup: 0, *)
id=2 @10.233.81.144 (mysql-8.0.22 ndb-8.0.22, Nodegroup: 0)
[ndb_mgmd(MGM)] 2 node(s)
id=49 @10.233.81.154 (mysql-8.0.22 ndb-8.0.22)
id=50 @10.233.86.2 (mysql-8.0.22 ndb-8.0.22)
[mysqld(API)] 2 node(s)
id=56 @10.233.81.164 (mysql-8.0.22 ndb-8.0.22)
id=57 @10.233.96.39 (mysql-8.0.22 ndb-8.0.22)
[cloud-user@vcne2-bastion-1 ~]$
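The status output above can also be checked by a script before a backup or restore. The following is a minimal sketch, not part of the documented procedure: the function name count_disconnected_nodes is hypothetical, and the sketch assumes that ndb_mgm lists an unattached node with the text "(not connected" after its id, as MySQL NDB Cluster management clients typically do.

```shell
#!/bin/sh
# Hypothetical helper: count the cluster nodes that `ndb_mgm -e show`
# reports as not connected. Reads the command output on stdin and
# prints the number of matching node lines.
count_disconnected_nodes() {
    # Assumption: a disconnected node appears as
    #   id=<n> (not connected, accepting connect from ...)
    # grep -c prints the number of matching lines; it exits non-zero
    # when that number is 0, so `|| true` keeps the status at success.
    grep -c 'id=[0-9][0-9]*[[:space:]]*(not connected' || true
}

# Intended use (replace the placeholders as described above):
#   kubectl -n <namespace> exec <management node pod> -- ndb_mgm -e show \
#       | count_disconnected_nodes
```

A result of 0 means that every node listed in the output is attached to the cluster, as in the example above. Any other value indicates nodes to investigate first.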
- If the SCP database backup is required, do the following:
- If the SCP database restore is required, do the following:
7.4.4 Single or Multiple Site Failure
This section describes how to perform fault recovery when one, multiple, or all sites experience a software failure.
The following are site failure scenarios:
7.4.4.1 Single or Multiple Site Failure
7.4.4.2 All Sites Failure
- Install a new Kubernetes cluster by performing the Cloud Native Environment (CNE) installation procedure as described in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
- Install cnDBTier as described in Oracle Communications Cloud Native Core, cnDBTier Installation, Upgrade, and Fault Recovery Guide.
- To perform cnDBTier fault recovery, restore the latest backed up data as described in Oracle Communications Cloud Native Core, cnDBTier Installation, Upgrade, and Fault Recovery Guide.
- Restore SCP as described in Restoring SCP.