User Data Repository Alarms, KPIs, and Measurements
Release 12.4
E82598-01

31101 - Database replication to slave failure

Alarm Group:
REPL
Description:
Database replication to a slave database has failed. This alarm is generated when:
  • The replication master finds the replication link is disconnected from the slave.
  • The replication master's link to the replication slave is out of service (OOS), or the replication master cannot get the slave's correct HA state because of a communication failure.
  • The replication mode is relayed in a cluster and either:
    • No nodes are active in the cluster, or
    • None of the nodes in the cluster are getting replication data.
Severity:
Critical
Instance:
May include AlarmLocation, AlarmId, AlarmState, AlarmSeverity, and bindVarNamesValueStr
HA Score:
Normal
Auto Clear Seconds:
300
OID:
comcolDbRepToSlaveFailureNotify

Recovery:

  1. Verify the path for all services on a node by typing path.test -a <toNode> in a command interface.
  2. Test the communication between nodes: type iqt -pE NodeInfo to get the node ID, then type path.test -a <nodeid> to test the paths for all services to that node.
  3. Examine the Platform savelogs on all MPs, SO, and NO by typing sudo /usr/TKLC/plat/sbin/savelogs_plat in the command interface. The plat savelogs are in the /tmp directory.
  4. Check network connectivity between the affected servers.
  5. If there are no issues with network connectivity, contact My Oracle Support (MOS).
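
The command-line portion of the recovery steps can be collected into a small shell sketch. This is an illustration only, not part of the product: the function names run and recover_31101 and the DRY_RUN variable are hypothetical, and the use of ping for the connectivity check in step 4 is an assumption, since the procedure does not name a tool. The iqt, path.test, and savelogs_plat invocations are taken verbatim from the steps above. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of recovery steps 2-4 for alarm 31101.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 on a
# server where these tools exist to actually execute them.

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_31101 NODE_ID PEER_HOST
# Hypothetical helper, not a platform command.
recover_31101() {
    node_id=$1
    peer_host=$2

    # Step 2: list node IDs, then test the paths for all services to the node.
    run iqt -pE NodeInfo
    run path.test -a "$node_id"

    # Step 3: collect the platform savelogs (the procedure says they are
    # placed in the /tmp directory).
    run sudo /usr/TKLC/plat/sbin/savelogs_plat

    # Step 4: basic reachability check. ping is an assumption; the procedure
    # does not name a specific tool.
    run ping -c 3 "$peer_host"
}
```

After sourcing the file, calling recover_31101 <nodeid> <peer-host> prints the four commands in order. Keeping dry-run as the default lets the sequence be reviewed before anything is run on a production server.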