Diameter Signaling Router Alarms and KPIs
Release 8.2
E89015-01

31228 - HA standby offline

Alarm Group:
HA
Description:
High availability standby server is offline.
Severity:
Critical
Instance:
May include AlarmLocation, AlarmId, AlarmState, AlarmSeverity, and bindVarNamesValueStr
HA Score:
Normal
Auto Clear Seconds:
0 (zero)
OID:
comcolHaStandbyOfflineNotify

Recovery:

  1. If the loss of communication between the active and standby servers is caused intentionally by maintenance activity, the alarm can be ignored. It clears automatically when communication between the two servers is restored.
  2. If communication fails at any other time, look for network connectivity issues and contact My Oracle Support (MOS) if needed.
  3. A workaround for this problem is to increase the failCount value for all server groups in the HaCfg table. Increasing it from 5 to 10 should solve the problem. Check with the application team before applying this workaround. On the active NO, run the command iset -ffailCount=10 HaCfg where "1=1".

    Note:

    This command is disruptive and causes active servers in the entire topology to lose service for about one minute while HA is reconfigured. A new server may be selected as active after the change is applied. To limit the disruption, apply the change one server group at a time instead.
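
The network connectivity check in step 2 could be sketched as the small shell helper below. This is only an illustration, not an Oracle-provided procedure: the function name check_peer, the PING_CMD override, and the host name in the usage comment are all hypothetical, and the sketch assumes a standard ping utility that accepts -c is available on the server. Because ICMP may be filtered on some networks, an "unreachable" result is a hint to investigate further rather than proof of a fault.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): report whether the HA peer
# answers three ICMP echo requests.
# PING_CMD is an assumed override hook so a different probe can be swapped in.
check_peer() {
    peer="$1"
    if ${PING_CMD:-ping} -c 3 "$peer" >/dev/null 2>&1; then
        echo "reachable: $peer"
        return 0
    fi
    echo "unreachable: $peer"
    return 1
}

# Example (placeholder host name; substitute the real standby server):
# check_peer standby-server.example.com
```

Run the check from the active server toward the standby server, and again in the opposite direction, since the alarm concerns communication between the two.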