6 Fault Recovery
This chapter describes the fault recovery procedures for an Oracle Communications Cloud Native Core, Network Exposure Function (NEF) deployment.
6.1 Impacted Areas
Table 6-1 Impacted Areas
Scenario | Requires Fault Recovery or Re-install of CNE | Requires Fault Recovery or Re-install of cnDBTier | Requires Fault Recovery or Re-install of NEF |
---|---|---|---|
Site Failure | Yes | Yes | Yes |
Migration to New Cluster | Yes, if the cluster is not present. | Yes, if cnDBTier is not present. | Yes |
Database Corruption | Yes, if the complete site is down. | Yes, if replication is not up. | Yes |
Deployment Corruption | No | No | Yes |
6.2 Prerequisites
Before performing any fault recovery procedure, ensure that the following prerequisites are met:
- Perform the following steps to ensure that cnDBTier is in a healthy state:
  - Run the following command to check if all the cnDBTier nodes are connected:
    ndb_mgm> show
    Example:
    ndb_mgm> show
    Connected to Management Server at: localhost:1186
    The output must show that all the nodes are connected.
  - Run the following command to check the pod status:
    kubectl get pods -n <namespace>
    All the pods must be in the Running state.
  - Run the following command to check if the DB replication is up and running:
    mysql> show replica status\G
    The values for the Replica_IO_Running and Replica_SQL_Running fields must be "Yes".
    In case of any errors, see Oracle Communications Cloud Native Core, cnDBTier Installation, Upgrade, and Fault Recovery Guide.
- Enable automatic backup on cnDBTier by scheduling regular backups. Regular backups help in fault recovery with the following tasks:
  - Restore a stable version of the network function database
  - Minimize significant loss of data due to upgrade or rollback failures
  - Minimize loss of data due to system failure
  - Minimize loss of data due to data corruption or deletion caused by external input
  - Migrate network function database information from one site to another
- Ensure that the following NF specific files are available for fault recovery:
  - Custom values file (ocnef-custom-values-<release_number>)
  - Helm charts (ocnef-<release_number>.tgz)
  - Secrets and Certificates
  - RBAC resources
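The pod status and replication checks in the prerequisites above can be scripted once the command output has been captured as text. The following Python sketch is illustrative only and is not part of the product tooling. It assumes the default column layout of kubectl get pods (a header row containing NAME and STATUS) and the "Field: value" layout that mysql prints for show replica status\G. The function names and the sample pod names are hypothetical.

```python
from typing import List


def pods_not_running(kubectl_output: str) -> List[str]:
    """Return the names of pods whose STATUS column is not 'Running'.

    Expects the text printed by 'kubectl get pods -n <namespace>' in its
    default table layout (assumption: header row with NAME and STATUS).
    """
    lines = [ln for ln in kubectl_output.splitlines() if ln.strip()]
    if not lines:
        return []  # no table at all, so there is nothing to report
    header = lines[0].split()
    name_idx = header.index("NAME")
    status_idx = header.index("STATUS")
    bad = []
    for ln in lines[1:]:
        cols = ln.split()
        if cols[status_idx] != "Running":
            bad.append(cols[name_idx])
    return bad


def replica_threads_ok(status_output: str) -> bool:
    """Return True only if every Replica_IO_Running and Replica_SQL_Running
    field in 'show replica status\\G' output has the value 'Yes'."""
    io_values, sql_values = [], []
    for ln in status_output.splitlines():
        key, sep, value = ln.partition(":")
        if not sep:
            continue  # row separators and blank lines carry no field
        key, value = key.strip(), value.strip()
        if key == "Replica_IO_Running":
            io_values.append(value)
        elif key == "Replica_SQL_Running":
            sql_values.append(value)
    # Missing fields count as a failure rather than a pass.
    if not io_values or not sql_values:
        return False
    return all(v == "Yes" for v in io_values + sql_values)


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample_pods = (
        "NAME          READY   STATUS    RESTARTS   AGE\n"
        "example-a-0   2/2     Running   0          5d\n"
        "example-b-0   3/3     Pending   0          5d\n"
    )
    sample_replica = (
        "   Replica_IO_Running: Yes\n"
        "  Replica_SQL_Running: Yes\n"
    )
    print("Pods not Running:", pods_not_running(sample_pods))
    print("Replication healthy:", replica_threads_ok(sample_replica))
```

The sketch treats any status other than Running as a failure, which matches the prerequisite as stated. If the namespace contains pods that legitimately end in another state, adjust the check accordingly.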