2 Recovery Scenarios

The DSR disaster recovery procedure falls into six basic categories, determined primarily by the state of the NOAM and SOAM servers.

Note:

In the disaster recovery context, a failed server is a server that has suffered partial or complete software failure to the extent that it cannot restart or be returned to normal operation and requires intrusive activities to redeploy base software.

Table 2-1 DSR disaster recovery procedure

Recovery Servers
Recovery of the entire network from a total outage. Refer to 2.1 Complete Server Outage (All Servers)
  • All NOAM servers failed
  • All SOAM servers failed
  • 1 or more MP servers failed
Recovery of one or more servers with at least one NOAM server intact. Refer to 2.2 Partial Server Outage with One NOAM Server Intact and both SOAMs Failed
  • 1 or more NOAM servers intact
  • All SOAM servers or MP servers failed
Recovery of the NOAM pair with one or more SOAM servers intact. Refer to 2.3 Partial Server Outage with both NOAM Servers Failed and One SOAM Server Intact
  • All NOAM servers failed
  • 1 or more SOAM servers intact
Recovery of one or more servers with at least one NOAM and one SOAM server intact. Refer to 2.4 Partial Server Outage with NOAM and one SOAM Server Intact
  • 1 or more NOAM servers intact
  • 1 or more SOAM servers intact
  • 1 or more MP servers failed
Recovery of the NOAM pair with DR-NOAM available and one or more SOAM servers intact. Refer to 2.5 Partial Server Outage with both NOAM Servers Failed with DR-NOAM Available
  • All NOAM servers failed
  • 1 or more SOAM servers intact
  • DR-NOAM available
Recovery of one or more servers with corrupt databases that cannot be restored via replication from the active parent node. Refer to 2.6 Partial Service Outage with Corrupt Database
  • Server is intact.
  • Database gets corrupted on the server.
  • Replication is occurring to the server with corrupted database.
Recovery Scenario 6: Case 2 (Database Recovery)
  • Server is intact.
  • Database gets corrupted on the server.
  • Latest Database backup of the corrupt server is not present.
  • Replication is inhibited (either manually or because of comcol upgrade barrier).
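
The mapping in Table 2-1 can be read as a decision procedure. The following is a minimal sketch of that reading, not part of the DSR software: the function name, its parameters, and the order of the checks are assumptions made for illustration. It returns the section number to consult, and treats any combination the table does not list (for example, no failures and no corruption) as having no matching scenario.

```python
from typing import Optional


def select_recovery_scenario(
    noam_intact: int,
    soam_intact: int,
    servers_failed: int,
    dr_noam_available: bool = False,
    database_corrupt: bool = False,
) -> Optional[str]:
    """Return the section number of the matching recovery scenario.

    All names here are hypothetical. noam_intact and soam_intact are the
    counts of NOAM and SOAM servers still operating normally;
    servers_failed is the total number of failed servers of any type.
    """
    # All NOAM and all SOAM servers failed: complete outage.
    if noam_intact == 0 and soam_intact == 0:
        return "2.1"
    # All NOAM servers failed, at least one SOAM intact.
    if noam_intact == 0:
        return "2.5" if dr_noam_available else "2.3"
    # At least one NOAM intact, all SOAM servers failed.
    if soam_intact == 0:
        return "2.2"
    # At least one NOAM and one SOAM intact, some other server failed.
    if servers_failed > 0:
        return "2.4"
    # Servers are intact but a database is corrupt.
    if database_corrupt:
        return "2.6"
    # Nothing in Table 2-1 applies.
    return None


if __name__ == "__main__":
    print(select_recovery_scenario(0, 1, 2, dr_noam_available=True))
```

The sketch checks server failures before database corruption because the corrupt-database scenario assumes the server itself is intact; that ordering is a design choice of the example, not something the table states.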

2.1 Complete Server Outage (All Servers)

Scenarios:

  • All NOAM servers failed
  • All SOAM servers failed
  • 1 or more MP servers failed

This is the most severe scenario, in which all the servers in the network have suffered complete software failure. The servers are recovered using OVA images, and database backups are then restored to the active NOAM and SOAM servers.

Database backups are taken from the customer's offsite backup storage locations (assuming backups were performed and stored offsite prior to the outage). If no backup files are available, the only option is to rebuild the entire network from scratch. The network data must then be reconstructed from whatever sources are available, including entering all data manually.

2.2 Partial Server Outage with One NOAM Server Intact and both SOAMs Failed

Scenarios:

  • 1 or more NOAM servers intact.
  • All SOAM servers failed.
  • 1 or more MP servers failed.

This case assumes that at least one NOAM server is intact. All SOAM servers have failed and are recovered using OVA images. The database is restored on the SOAM server, and replication then recovers the databases of the remaining servers.

2.3 Partial Server Outage with both NOAM Servers Failed and One SOAM Server Intact

Scenarios:

  • All NOAM servers failed.
  • 1 or more SOAM servers intact.

The database is restored on the NOAM server, and replication then recovers the databases of the remaining servers.

2.4 Partial Server Outage with NOAM and one SOAM Server Intact

Scenarios:

  • 1 or more NOAM servers intact.
  • 1 or more SOAM servers intact.
  • 1 or more MP servers failed.

The simplest case of disaster recovery is one in which at least one NOAM server and one SOAM server are intact. All failed servers are recovered using base recovery of software. Database replication from the active NOAM and SOAM servers then recovers the database on all servers.

2.5 Partial Server Outage with both NOAM Servers Failed with DR-NOAM Available

Scenarios:

  • All NOAM servers failed.
  • 1 or more SOAM servers intact.
  • DR-NOAM available.

This case assumes a partial outage in which both NOAM servers have failed but a DR-NOAM is available. The DR-NOAM is switched from secondary to primary, after which the failed NOAM servers are recovered.

2.6 Partial Service Outage with Corrupt Database

Case 1: The database is corrupted, the replication channel is inhibited (either manually or because of the comcol upgrade barrier), and a database backup is available.

Case 2: The database is corrupted, but the replication channel is active.
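
The distinction between the two cases depends on the replication state and on whether a backup exists. As a rough illustration only, the selection could be sketched as below; the function and parameter names are hypothetical, and the combination of inhibited replication with no backup is returned as `None` because this section does not describe it.

```python
from typing import Optional


def select_corrupt_db_case(
    replication_inhibited: bool,
    backup_available: bool,
) -> Optional[int]:
    """Return 1 or 2 for the cases in section 2.6, or None if neither fits.

    Hypothetical helper for illustration; not part of the DSR software.
    """
    # Case 2: the replication channel to the server is still active.
    if not replication_inhibited:
        return 2
    # Case 1: replication is inhibited and a database backup exists.
    if backup_available:
        return 1
    # Inhibited replication without a backup is not covered here.
    return None
```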