B Automatic Monitoring of Events
This appendix contains overviews of monitored events, GUI and surveillance notifications, and traps.
Introduction
This appendix contains:
- “Overview of Monitored Events”, which describes how the LSMS monitors itself for events and alarms and how it reports them.
- “Overview of GUI Notifications”, which describes the display, format, and logging of notifications that appear on the graphical user interface.
- “Overview of Surveillance Notifications”, which describes the display, format, and logging of Surveillance notifications.
- “Overview of Traps”, which describes the transmission, format, and logging of SNMP traps.
- A listing of all events, in numerical order (see Event Descriptions). For each event, this appendix includes:
  - Explanation of the probable cause for the event
  - Suggested recovery
  - Indication of whether the event results in a GUI notification, Surveillance notification, trap, or some combination of these
Overview of Monitored Events
This section describes the types of events and alarms that the LSMS reports and how the servers report them.
Types of Events and Alarms Reported
The LSMS monitors itself for the types of events and alarms shown in Table B-1. When one of these events occurs, the LSMS does one or more of the following:
- Displays a notification on the graphical user interface (GUI notification)
- Posts a Surveillance notification at a certain frequency to the administration console by default, or to the second serial port if so configured
- Sends a trap to a Network Management System (NMS) if you have installed the optional Remote Monitoring feature
Every GUI notification and Surveillance notification contains its associated event number. Traps contain a trap ID, which is explained in Overview of Traps.
Table B-1 Notification Event Number Categories
Event Number Range | Category | Description |
---|---|---|
0000–1999 | EMS | Events that pertain to an Element Management System (EMS). The EMS is a process that runs on the Multi-Purpose Server (MPS) at a network element. |
2000–3999 | NPAC | Events that pertain to a Number Portability Administration Center (NPAC) |
4000–5999 | Platform and switchover (some of these events do not produce GUI notifications) | Events that pertain to system resources, such as disks, hardware, memory, and central processing unit (CPU) utilization, and to switchover functions |
6000–7999 | Main LSMS processes | Events that pertain to one of the main LSMS processes |
8000–8999 | Applications | Events that pertain to LSMS applications that are feature or application dependent, such as LNP Database Synchronization, Service Assurance, or NPA Split Administration |
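The ranges in Table B-1 lend themselves to a simple lookup. The following Python sketch is illustrative only: the function name `event_category` and the list `EVENT_CATEGORIES` are hypothetical and are not part of the LSMS software. It assumes the event number is supplied as a four-digit string.

```python
from typing import Optional

# Ranges and category names copied from Table B-1.
EVENT_CATEGORIES = [
    (0, 1999, "EMS"),
    (2000, 3999, "NPAC"),
    (4000, 5999, "Platform and switchover"),
    (6000, 7999, "Main LSMS processes"),
    (8000, 8999, "Applications"),
]


def event_category(event_number: str) -> Optional[str]:
    """Return the Table B-1 category for an event number, or None if out of range."""
    number = int(event_number)
    for low, high, category in EVENT_CATEGORIES:
        if low <= number <= high:
            return category
    return None
```

For example, `event_category("0003")` falls in the EMS range, and a number above 8999 matches no category.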
How Servers Report Alarms and Events
The LSMS 9.0 servers perform the following functions to monitor and report events:
- The standby server:
  - Monitors itself only for:
    - Platform events (see Platform Alarms)
    - Switchover-readiness events, such as those that describe database replication or critical network interfaces
  - Controls the appropriate Alarm LED (Critical, Major, or Minor) on the front of the server by illuminating the LED when one or more platform alarms in that category exist and turning off the LED when no platform alarms in that category exist
  - Sends any notification to its Serial Port 3 and logs the notification in its Surveillance log
  - Sends the notification to the active server
- The active server:
  - Monitors itself for both platform events and application events
  - Controls the appropriate Alarm LED (Critical, Major, or Minor) on the front of the server by illuminating the LED when one or more platform alarms in that category exist and turning off the LED when no platform alarms in that category exist
  - Sends all platform events for itself, events reported from the standby server, and appropriate application events for itself to its Serial Port 3 and also logs the event as appropriate in its Surveillance log (some event notifications are reported repeatedly; for more information about which events are reported repeatedly, see the individual event descriptions):
    - Alarms that originate from the active server contain the alarm text with no hostname
    - Alarms that originate from the standby server contain the alarm text preceded by the standby server’s hostname

    Note: Although all events are reported through SNMP traps and all platform alarms are reported through Surveillance notifications, not all application alarms are reported both through the GUI and through Surveillance notifications; for more information about which alarms are reported in which way, see the individual event descriptions.
  - Displays one time on the GUI each platform or application event for itself and each platform event received from the standby server:
    - Alarms that originate from the active server display the alarm text with no hostname
    - Alarms that originate from the standby server display the alarm text preceded by the standby server’s hostname
  - Sends one Simple Network Management Protocol (SNMP) trap for each platform or application event for itself and for each platform event received from the standby server. Each trap contains the IP address of the server from which the notification originated.
Overview of GUI Notifications
Displaying GUI Notifications
GUI notifications are displayed on the GUI only if the GUI is active when the reported event occurs, but all GUI notifications are logged in an appropriate log as described in Logging GUI Notifications. Figure B-1 shows an example of notifications displayed on the GUI.
Figure B-1 GUI Notifications

Format of GUI Notifications
This section describes the general format used for most GUI notifications, as well as additional fields used for GUI event notifications (used to report information only) and for EMS GUI notifications. The formats are expressed as an ordered sequence of variables. Variables are expressed with the name of the variable enclosed by angle brackets; for example, <Severity> indicates a variable for the severity assigned to a GUI notification. Variables Used in GUI Notification Format Descriptions shows the variables used in GUI notification formats.
General Format for GUI Notifications
The format for most GUI notifications is:
[<Severity>]:<Time Stamp> <Event Number> <Message Text String>
In addition, the following types of GUI notifications contain additional fields:
- EMS GUI notifications contain information about the EMS for which they are reporting status (see Format for EMS GUI Notifications)
- Notifications that have the severity EVENT can contain additional event data fields (see Format for GUI Notifications with EVENT Severity)
Format for EMS GUI Notifications
EMS GUI notifications (event numbers in the range 0000–1999) contain a <CLLI> value to indicate the Common Language Location Identifier for the network element where the EMS resides. The format for EMS GUI notifications is:
[<Severity>]:<Time Stamp> <Event Number> <CLLI>: <Message Text String>
Format for GUI Notifications with EVENT Severity
Notifications that have the severity EVENT can contain additional event data fields. The format for GUI notifications with severity EVENT is:
[EVENT]:<Time Stamp> <Event Number> <EventType>:<EventData1>, [<EventData2>],...
Variables Used in GUI Notification Format Descriptions
Table B-2 shows the possible values and meanings for each of the variables shown in format definitions for GUI notifications.
Table B-2 Variables Used in GUI Notifications
Field | Description |
---|---|
<Severity> | Indicates seriousness of event, using both text and color, as listed under "Severity values" after this table |
<Time Stamp> | Indicates time that the event was detected, in format YYYY-MM-DD hh:mm:ss, where the fields are as listed under "Time stamp fields" after this table |
<Event Number> | Four-digit number that identifies the specific GUI notification (also indicates the type of GUI notification, as shown in Table B-1) |
<Message Text String> | Text string (which may contain one or more variables defined in Table B-3) that provides a small amount of information about the event. For more information about the event, look up the corresponding event number in this appendix; for each event number, this appendix shows the text string as it appears in a GUI notification, as well as a more detailed explanation and suggested recovery. |
<CLLI> | Used in all EMS GUI notifications to indicate the Common Language Location Identifier for the network element where the EMS resides |
<EventData1>, [<EventData2>],... | Optional event data fields, as indicated by square brackets around the field, included in GUI notifications with severity [EVENT] |

Severity values:
Text | Color | Meaning |
---|---|---|
[Critical] | Red | Reports a serious condition that requires immediate attention |
[Major] | Yellow | Reports a moderately serious condition that should be monitored, but does not require immediate attention |
[Minor] | Turquoise | Reports a condition of minor significance that should be monitored, but which does not require immediate attention |
[Cleared] | Green | Reports status information or the clearing of a condition that caused previous posting of a [Critical], [Major], or [Minor] notification |
[EVENT] | White | For information only |

Time stamp fields:
Field | Meaning | Possible Values |
---|---|---|
YYYY | Year | Any four digits |
MM | Month | 01 through 12 |
DD | Day | 01 through 31 |
hh | Hour | 00 through 23 |
mm | Minute | 00 through 59 |
ss | Second | 00 through 59 |
Variables Used in Message Text String of GUI Notifications
Table B-3 shows the variables that can appear in the message text of a GUI notification.
Table B-3 Variables Used in Message Text of GUI Notifications
Symbol | Possible Values and Meanings | Number of Characters |
---|---|---|
<PRIMARY\|SECONDARY> | PRIMARY = Primary NPAC; SECONDARY = Secondary NPAC | 7 or 9 |
 | Time, in minutes, between retries of a request sent to an NPAC after it sent a failure response | 1-10 |
 | Number of times the LSMS will retry to recover from a failure response sent by NPAC | 1-10 |
 | Year, month, day, hour, minute, second | 14 |
<NPAC_region_ID> | CA = Canada; MA = MidAtlantic; MW = Midwest; NE = Northeast; SE = Southeast; SW = Southwest; WE = Western; WC = WestCoast | 2 |
Examples of GUI Notifications
Example of General Format GUI Notifications
Following is an example of a general GUI notification (for a description of its format, see General Format for GUI Notifications):
[Critical]:1998-07-05 11:49:56 2012 NPAC PRIMARY-NE Connection Attempt Failed:
Access Control Failure
Example of an EMS GUI Notification
Following is an example of an EMS GUI notification (for a description of its format, see Format for EMS GUI Notifications). In this example, <CLLI> has the value LNPBUICK:
[Critical]:1998-07-05 11:49:56 0003 LNPBUICK: Primary Association Failed
Example of GUI Notification with EVENT Severity Level
Following is an example of a GUI notification with severity [EVENT]. For a description of its format, see Format for GUI Notifications with EVENT Severity:
[EVENT]: 2000-02-05 11:49:56 8069 LNPBUICK: Audit LNP DB Synchronization Aborted
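The formats and examples above can be split into their fields with a regular expression. The following Python sketch is illustrative only: `GuiNotification` and `parse_gui_notification` are hypothetical names, not part of the LSMS software. It assumes that each notification arrives as a single string, that whitespace after the severity colon is optional (as in the [EVENT] example), and that only EMS notifications (event numbers 0000–1999) carry a `<CLLI>:` prefix before the message text.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class GuiNotification(NamedTuple):
    severity: str
    timestamp: datetime
    event_number: str
    clli: Optional[str]  # set only for EMS notifications (0000-1999)
    text: str


# [<Severity>]:<Time Stamp> <Event Number> <rest of line>
_PATTERN = re.compile(
    r"^\[(?P<severity>[^\]]+)\]:\s*"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<event>\d{4})\s+"
    r"(?P<rest>.*)$",
    re.DOTALL,
)


def parse_gui_notification(line: str) -> GuiNotification:
    """Split one GUI notification string into its documented fields."""
    match = _PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a GUI notification: {line!r}")
    event = match["event"]
    rest = match["rest"]
    clli = None
    # Assumption: only EMS notifications put "<CLLI>: " before the message text.
    if int(event) <= 1999 and ": " in rest:
        clli, rest = rest.split(": ", 1)
    stamp = datetime.strptime(match["timestamp"], "%Y-%m-%d %H:%M:%S")
    return GuiNotification(match["severity"], stamp, event, clli, rest)
```

Applied to the EMS example above, this sketch yields severity `Critical`, event number `0003`, CLLI `LNPBUICK`, and the message text `Primary Association Failed`.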
Logging GUI Notifications
When an event that generates a GUI notification occurs, that notification is logged in the file created for those events. Table B-4 shows the log file used for each type of event; in the file names, <mmdd> indicates the month and day the event was logged.
Table B-4 Logs for GUI Notifications
Event Type | Log File |
---|---|
EMS Alarms, NPAC Alarms, and Main LSMS Process Alarms | |
Non-alarm Events | |
For information about the format of the logs and how to view the logs, refer to the Database Administrator's Guide.
Overview of Surveillance Notifications
Surveillance notifications are created by the Surveillance feature. These notifications can report status that is not available through the GUI notifications and report status that can be monitored without human intervention.
Displaying Surveillance Notifications
Surveillance notifications are sent to Serial Port 3 on each server.
Format of Surveillance Notifications
All Surveillance notifications reported on the same server where the event occurred have the following format:
<Event Number>|<Time Stamp>|<Message Text String>
Surveillance notifications that originate on the standby server and are reported on the active server have an additional field that shows the hostname of the server where the event occurred, as shown in the following format:
<Event Number>|<Time Stamp>|<Host Name>|<Message Text String>
Variables Used in Surveillance Notification Format Descriptions
Table B-5 shows the possible values and meanings for each of the variables shown in the format definitions for Surveillance notifications.
Table B-5 Variables Used in Surveillance Notifications
Field | Description |
---|---|
<Event Number> | Four-digit number that identifies the specific Surveillance notification and also indicates the type of Surveillance notification, as shown in Table B-1 |
<Time Stamp> | Indicates time that the event was detected, in format hh:mm Mon DD, YYYY, where the fields are as listed under "Time stamp fields" after this table |
<Host Name> | First seven letters of the name of the host (one of two redundant servers) that noted the event. (In addition, the documentation of the individual event includes information about whether the event is reported by the active server or inactive server, or both servers.) |
<Message Text String> | Text string (which may contain one or more variables defined in Table B-6) that provides a small amount of information about the event. For more information about the event, look up the corresponding event number in this appendix; for each event number, this appendix shows the text string as it appears in a Surveillance notification, as well as a more detailed explanation and suggested recovery. |

Time stamp fields:
Field | Meaning | Possible Values |
---|---|---|
hh | Hour | 00 through 23 |
mm | Minute | 00 through 59 |
Mon | Month | First three letters of month’s name |
DD | Day | 01 through 31 |
YYYY | Year | Any four digits |
Variables Used in Message Text String of Surveillance Notifications
Table B-6 shows the variables that can appear in the message text of a Surveillance notification.
Table B-6 Variables Used in Message Text of Surveillance Notifications
Symbol | Possible Values and Meanings | Number of Characters |
---|---|---|
<CLLI> | Common Language Location Identifier for the network element | 11 |
<PRIMARY\|SECONDARY> | PRIMARY or SECONDARY | 7 or 9 |
 | 0000 = Midwest; 0001 = MidAtlantic; 0002 = Northeast; 0003 = Southeast; 0004 = Southwest; 0005 = Western; 0006 = WestCoast; 0008 = Canada | 4 |
 | IP address of the NPAC | 10 |
 | First 12 characters of process name | 12 |
 | Midwest, MidAtlantic, Northeast, Southeast, Southwest, Western, WestCoast, or Canada | 6 to 12 |
 | Return code | 1 or 2 |
 | System name of machine that implements the Service Assurance Manager | 12 |
 | Name of disk volume | 3 |
 | Name of disk volume | 3 |
Example of a Surveillance Notification
Following is an example of a Surveillance notification:
LSMS8088|14:58 Mar 10, 2000|lsmspri|Notify: sys Admin - Auto Xfer Failure
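Because Surveillance notifications are pipe-delimited, they can be split into fields directly. The following Python sketch is illustrative only: `SurveillanceNotification` and `parse_surveillance_notification` are hypothetical names. It assumes that the message text itself contains no pipe characters, that the time stamp looks like the one in the example above (`14:58 Mar 10, 2000`), and that month abbreviations are in English.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class SurveillanceNotification(NamedTuple):
    event_number: str
    timestamp: datetime
    host_name: Optional[str]  # present only in the four-field format
    text: str


def parse_surveillance_notification(line: str) -> SurveillanceNotification:
    """Split a three-field or four-field Surveillance notification."""
    fields = line.strip().split("|")
    if len(fields) == 3:
        event, stamp, text = fields
        host = None
    elif len(fields) == 4:
        event, stamp, host, text = fields
    else:
        raise ValueError(f"unexpected number of fields: {line!r}")
    # Assumed time stamp layout, taken from the example: hh:mm Mon DD, YYYY
    timestamp = datetime.strptime(stamp, "%H:%M %b %d, %Y")
    return SurveillanceNotification(event, timestamp, host, text)
```

The presence of the host name field is what distinguishes a notification forwarded from the other server from one reported locally.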
Logging Surveillance Notifications
In addition to displaying Surveillance notifications, the Surveillance feature logs all Surveillance notifications in the file survlog.log in the /var/TKLC/lsms/logs directory.
If the LSMS Surveillance feature becomes unable to properly report conditions, it logs the error information in a file, named lsmsSurv.log, in the /var/TKLC/lsms/logs directory on each server’s system disk. When the size of lsmsSurv.log exceeds 1MB, it is copied to a backup file, named lsmsSurv.log.bak, in the same directory. There is only one LSMS Surveillance feature backup log file, which limits the amount of log disk space to approximately 2MB.
Overview of Traps
The optional Remote Monitoring feature provides the capability for the LSMS to report certain events and alarms to a remote location, using the industry-standard Simple Network Management Protocol (SNMP). The LSMS implements an SNMP agent.
Customers can use this feature to cause the LSMS to report events and alarms to another location, which implements an SNMP Network Management System (NMS). An NMS is typically a standalone device, such as a workstation, which serves as an interface through which a human network manager can monitor and control the network. The NMS typically has a set of management applications (for example, data analysis and fault recovery applications).
For more information about the LSMS implementation of an SNMP agent, see “Understanding the SNMP Agent Process”.
SNMP Version 3 Trap PDU Format
An SNMPv3 trap PDU consists of the following fields:
- PDU Type
Specifies the type of PDU (in this case, trap).
- Request ID
Used to associate requests with responses.
- Error Status
Specifies an error or error type in response PDUs only (otherwise set to 0).
- Error Index
Associates an error with a particular object instance in response PDUs only (otherwise set to 0).
- Variable Bindings
Each variable binding contains an object field followed by its value field. The object and value fields together specify information about the event being reported.
SNMP Version 1 Trap PDU Format
Following is an overview of the format of the SNMP version 1 trap request. For more information about SNMP message formats, refer to SNMP, SNMPv2, SNMPv3, and RMON 1 and 2, Third Edition, William Stallings, Addison Wesley, ISBN 0-201-48534-6, 1999.
Each SNMP message consists of the following fields:
- SNMP authentication header, which consists of:
  - Version identifier, used to ensure that both the sender and receiver of the message are using the same version of the SNMP protocol. Currently, the LSMS supports only version 1, which has a version identifier of 0 (zero).
  - Community name, used to authenticate the NMS. The SNMP agent uses this field as a password to ensure that the sender of the message is allowed to access the SNMP agent’s information. The LSMS supports only trap requests, which originate at the LSMS; therefore, this field is not significant.
- Protocol data unit (PDU), which for an SNMPv1 trap request consists of the following fields:
  - PDU Type field, which specifies the type of PDU (in this case, trap).
  - Enterprise field, which identifies the device generating the message. For the LSMS SNMP agent, this field is 323.
  - Agent address field, which contains the IP address of the host that runs the SNMP agent. For the LSMS SNMP agent, this field contains the IP address of the LSMS active server.
  - Generic trap type, which can be set to any value from 0 through 6. Currently, the LSMS supports only the value 6, which corresponds to the enterpriseSpecific type of trap request.
  - Specific trap type, which can be used to identify a specific trap.
  - Time stamp, which indicates how many hundredths of a second have elapsed since the last reinitialization of the host that runs the SNMP agent.
  - One or more variable bindings, each of which contains an object field followed by a value field. The object and value fields together specify information about the event being reported.
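The SNMPv1 trap fields listed above can be pictured as a simple record. The following Python sketch is illustrative only: the class `SnmpV1Trap`, its field names, and the helper `uptime_seconds` are hypothetical, and the sketch neither encodes nor sends a real SNMP PDU. Only the enterprise value 323 and the generic trap type 6 come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict

LSMS_ENTERPRISE = 323    # enterprise field value stated for the LSMS SNMP agent
ENTERPRISE_SPECIFIC = 6  # the only generic trap type the LSMS uses


@dataclass
class SnmpV1Trap:
    agent_address: str   # IP address of the LSMS active server
    specific_trap: int   # identifies the specific trap
    time_stamp: int      # hundredths of a second since reinitialization
    variable_bindings: Dict[str, str] = field(default_factory=dict)
    enterprise: int = LSMS_ENTERPRISE
    generic_trap: int = ENTERPRISE_SPECIFIC


def uptime_seconds(trap: SnmpV1Trap) -> float:
    """Convert the trap time stamp from hundredths of a second to seconds."""
    return trap.time_stamp / 100
```

A time stamp of 4500, for instance, corresponds to 45 seconds since the agent host was last reinitialized.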
Logging SNMP Agent Actions
When the LSMS SNMP agent process starts, stops, or sends a trap request, it logs information about the action in a log file. The log file is named lsmsSNMP.log.<MMDD>, where <MMDD> represents the current month and day. The log file is stored in the directory /usr/TKLC/lsms/logs/snmp.
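The naming rule for the SNMP agent log can be expressed in a few lines. The following Python sketch is illustrative only: the function name `snmp_log_path` is hypothetical, and the sketch assumes that <MMDD> is a zero-padded two-digit month followed by a zero-padded two-digit day.

```python
import posixpath
from datetime import date

SNMP_LOG_DIR = "/usr/TKLC/lsms/logs/snmp"  # directory named in the text


def snmp_log_path(day: date) -> str:
    """Return the SNMP agent log path for the given day (lsmsSNMP.log.<MMDD>)."""
    return posixpath.join(SNMP_LOG_DIR, "lsmsSNMP.log." + day.strftime("%m%d"))
```

For March 10, this produces a file name ending in `lsmsSNMP.log.0310`.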
Table B-7 shows the actions and information logged by the LSMS SNMP agent.
Table B-7 Information Logged by the LSMS SNMP Agent
Action | Information Logged |
---|---|
The SNMP agent starts | Action, followed by day, date, time, and year |
The SNMP agent stops | Action, followed by day, date, time, and year |
The SNMP agent sends a trap request | Trap information, as fields delimited by pipe characters |
Event Descriptions
0001
Explanation
The EMS Ethernet interface has a problem. The ping utility did not receive a response from the interface associated with the EMS.
Recovery
Consult with your network administrator.
Event Details
Table B-8 Event 0001 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | None |
GUI Notification | Text | |
Surveillance Notification | Text | Notify:Sys Admin - EMS interface failure |
Surveillance Notification | Source | Both servers |
Surveillance Notification | Frequency | Every 2.5 minutes as long as condition exists |
Trap | Trap ID | 16 |
Trap | Trap MIB Name | emsInterfaceFailure |
0002
Explanation
The EMS, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text, requires a resynchronization with the LSMS that cannot be accomplished by automatic resynchronization between the LSMS and the EMS.
Recovery
Perform one of the synchronization procedures described in the LNP Database Synchronization User's Guide.
Event Details
Table B-9 Event 0002 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | DB Maintenance Required |
Surveillance Notification | Text | Notify:Sys Admin - NE CLLI=<CLLI> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap | Trap ID | 33 |
Trap | Trap MIB Name | emsRequiresResynchWithLSMS |
0003
Explanation
The LSMS has lost association with the primary EMS of the network element, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text; the association with the secondary EMS is established.
Recovery
Determine why the primary association failed (connectivity problem, EMS software problems, NE software problem, etc.). Correct the problem. Association will be automatically retried.
Event Details
Table B-10 Event 0003 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Major |
GUI Notification | Text | Primary Association Failed |
Surveillance Notification | Text | Notify:Sys Admin - NE CLLI=<CLLI> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap | Trap ID | 5 |
Trap | Trap MIB Name | primaryEMSAssocLostSecEstablished |
0004
Explanation
The LSMS has lost association with the primary EMS of the network element, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text; the association with the secondary EMS is not established.
Recovery
Determine why the primary association failed (connectivity problem, EMS software problems, NE software problem, etc.). Correct the problem, and then reestablish the association with the primary EMS.
Event Details
Table B-11 Event 0004 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | Primary Association Failed |
Surveillance Notification | Text | Notify:Sys Admin - NE CLLI=<CLLI> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap | Trap ID | 36 |
Trap | Trap MIB Name | primaryEMSAssocLostNoSec |
0006
Explanation
The pending queue used to hold transactions to be sent to the EMS/NE, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text, is full. To help ensure that no updates are lost, the eagleagent will abort associations with both the primary EMS and secondary EMS. Updates will be queued in a resynchronization log until the EMS reassociates.
Recovery
Determine why the EMS/NE is not receiving LNP updates, and correct the problem.
Event Details
Table B-12 Event 0006 Details
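Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | All Association(s) Aborted: Pending Queue Full |
Surveillance Notification | Text | None |
Surveillance Notification | Source | |
Surveillance Notification | Frequency | |
Trap | Trap ID | 97 |
Trap | Trap MIB Name | emsAssociationAbortedQueueFull |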
0007
Explanation
The network element, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text, is busy and is sending ‘retry later’ in response to a message sent by the eagleagent. The eagleagent has already tried resending the same message the maximum number of times. The eagleagent has aborted associations with both the primary EMS and secondary EMS.
Recovery
Correct the problem at the network element. When the EMS reconnects with the LSMS, the LSMS will automatically resynchronize the network element’s LNP database.
Event Details
Table B-13 Event 0007 Details
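Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | All Association(s) Aborted: Retries Exhausted |
Surveillance Notification | Text | None |
Surveillance Notification | Source | |
Surveillance Notification | Frequency | |
Trap | Trap ID | 98 |
Trap | Trap MIB Name | emsAssocAbortedMaxResend |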
0008
Explanation
The LSMS has lost association with the secondary EMS, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text. The association with the primary EMS is still up.
Recovery
Determine why the secondary association failed (connectivity problem, EMS software problems, NE software problem, etc.) and then reestablish the association with the secondary EMS.
Event Details
Table B-14 Event 0008 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Major |
GUI Notification | Text | Secondary Association Failed |
Surveillance Notification | Text | Notify:Sys Admin - NE CLLI=<CLLI> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | Once, as soon as condition occurs |
Trap | Trap ID | 130 |
Trap | Trap MIB Name | secondaryEMSAssocLost |
0009
Explanation
The LSMS has established the first association with the network element (NE) which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text. The first association established is called the primary association. This EMS is called the primary EMS.
Recovery
No action required; this notification is for information only.
Event Details
Table B-15 Event 0009 Details
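Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Cleared |
GUI Notification | Text | Primary Association Established |
Surveillance Notification | Text | None |
Surveillance Notification | Source | |
Surveillance Notification | Frequency | |
Trap | Trap ID | 8 |
Trap | Trap MIB Name | primaryEMSAssocEstablished |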
0010
Explanation
The LSMS has established the second association with the network element (NE) which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text. The association is established only if a primary association already exists. This EMS is called the secondary EMS.
Recovery
No action required; this notification is for information only.
Event Details
Table B-16 Event 0010 Details
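Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Cleared |
GUI Notification | Text | Secondary Association Established |
Surveillance Notification | Text | None |
Surveillance Notification | Source | |
Surveillance Notification | Frequency | |
Trap | Trap ID | 134 |
Trap | Trap MIB Name | secondaryEMSAssocEstablished |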
0011
Explanation
The primary association for the EMS/NE, which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text, is either down or is inhibited, such that transactions sent to the primary EMS will not be received by the NE. Transactions are being sent to the secondary EMS instead of the primary EMS.
Recovery
Determine why the primary association failed (connectivity problem, EMS software problem, NE software problem, or other problem). Correct the problem. Association will be automatically retried. When the association is reestablished, it will be a secondary association, and the EMS will be the secondary EMS.
Event Details
Table B-17 Event 0011 Details
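Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Cleared |
GUI Notification | Text | Successful Switchover Occurred to Secondary EMS |
Surveillance Notification | Text | None |
Surveillance Notification | Source | |
Surveillance Notification | Frequency | |
Trap | Trap ID | 139 |
Trap | Trap MIB Name | transactionToSecondary |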
2000
Explanation
The NPAC Ethernet interface has a problem. The ping utility did not receive a response from the interface associated with the NPAC.
Recovery
Consult with your network administrator.
Event Details
Table B-18 Event 2000 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | None |
GUI Notification | Text | |
Surveillance Notification | Text | Notify:Sys Admin - NPAC interface failure |
Surveillance Notification | Source | Both primary and secondary servers |
Surveillance Notification | Frequency | Every 2.5 minutes as long as condition exists |
Trap | Trap ID | 15 |
Trap | Trap MIB Name | npacInterfaceFailure |
2001
Explanation
The association with the NPAC identified by <NPAC_region_ID> has been disconnected by the user.
Recovery
Examine additional GUI notifications to determine whether the LSMS is retrying the association. Follow the recovery actions described for the GUI notification.
Event Details
Table B-19 Event 2001 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | NPAC [<PRIMARY\|SECONDARY>-<NPAC_region_ID>] Connection Disconnected |
Surveillance Notification | Text | Notify:Sys Admin - NPAC=<PRIMARY\|SECONDARY>-<NPAC_region_ID> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | Once, as soon as condition occurs |
Trap | Trap ID | 37 |
Trap | Trap MIB Name | lostNPACAssoc |
2002
Explanation
The LSMS is not able to confirm the physical connectivity with the NPAC, which is specified in the System field on the GUI or is indicated by <NPAC_region_ID> in the Surveillance notification.
Recovery
Check the physical connection between the LSMS and the NPAC. The problem may be in the network, a router, or both.
Event Details
Table B-20 Event 2002 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | LSMS Physical Disconnect with NPAC |
Surveillance Notification | Text | Notify:Sys Admin - NPAC=<NPAC_region_ID> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap | Trap ID | 45 |
Trap | Trap MIB Name | failedNPACConnectivity |
2003
Explanation
The NPAC (PRIMARY or SECONDARY, as indicated) identified by <NPAC_region_ID> rejected the association because it received a message from the LSMS that failed security checks. This can be due to one of the following:
- The CMIP departure time is more than five minutes out of synchronization with the NPAC servers.
- The security key is not valid.
- The CMIP sequence number is out of sequence (messages must be returned to the NPAC in the same order in which they were received).
Recovery
Do the following:
- Log in as lsmsadm to the active server.
- Enter the following command to determine what the LSMS system time is:
  $ date
- Contact the NPAC administrator to determine what the NPAC time is. If the NPAC time is more than five minutes different from the LSMS time, reset the LSMS system time on both servers and on the administration console using one of the procedures described in “Managing the System Clock”.
- After you have verified that the NPAC and LSMS times are within five minutes of each other, cause a different security key to be used by stopping and restarting the regional agent. Enter the following commands, where <region> is the name of the region in which this notification occurred:
  $LSMS_DIR/lsms stop <region>
  $LSMS_DIR/lsms start <region>
- Start the GUI again.
- Attempt to reassociate with the NPAC. For information about associating with an NPAC, refer to the Configuration Guide.
- If the problem persists, contact Oracle Technical Service.
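The five-minute comparison in the recovery steps above amounts to a simple time difference check. The following Python sketch is illustrative only: the function name `clocks_out_of_sync` is hypothetical, and the two times are assumed to have been obtained manually (the LSMS time from the date command and the NPAC time from the NPAC administrator) and expressed in the same time zone.

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)  # tolerance stated in the explanation


def clocks_out_of_sync(
    lsms_time: datetime, npac_time: datetime, max_skew: timedelta = MAX_SKEW
) -> bool:
    """Return True if the two clocks differ by more than max_skew."""
    return abs(lsms_time - npac_time) > max_skew
```

A difference of exactly five minutes is treated as acceptable here, because the text refers to a difference of more than five minutes.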
Event Details
Table B-21 Event 2003 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | NPAC [<PRIMARY\|SECONDARY>-<NPAC_region_ID>] Connection Aborted by PEER: Access Control Failure |
Surveillance Notification | Text | Notify:Sys Admin - NPAC=<PRIMARY\|SECONDARY>-<NPAC_region_ID> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap | Trap ID | 95 |
Trap | Trap MIB Name | npacRejectedAssocAccessCtrlFail |
2004
Explanation
The primary or secondary NPAC, identified by <NPAC_region_ID>, rejected the association because it received data that was not valid.
Recovery
Contact the NPAC administrator.
Event Details
Table B-22 Event 2004 Details
Notification Type | Field | Value |
---|---|---|
GUI Notification | Severity | Critical |
GUI Notification | Text | NPAC [<PRIMARY\|SECONDARY>-<NPAC_region_ID>] Connection Aborted by PEER: Invalid Data Received |
Surveillance Notification | Text | Notify:Sys Admin - NPAC= <PRIMARY\|SECONDARY>-<NPAC_region_ID> |
Surveillance Notification | Source | Active server |
Surveillance Notification | Frequency | As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap | Trap ID | 96 |
Trap | Trap MIB Name | npacRejectedAssocInvalidData |
2005
Explanation
The LSMS has lost association with the primary or secondary NPAC identified by <NPAC_region_ID> because the user aborted the association.
Recovery
Reassociate with the NPAC when the reason for aborting the association no longer exists. For information about associating with an NPAC, refer to the Configuration Guide.
Event Details
Table B-23 Event 2005 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>]-<NPAC_region_ID> Association Aborted by User |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
9 |
Trap MIB Name |
npacAbortByUser |
2006
Explanation
The LSMS did not receive an association response from the NPAC within the timeout period. The LSMS will attempt the association with the NPAC again after an interval that defaults to two minutes, but can be configured to a different value by Oracle.
Recovery
Determine whether there is a network connection problem and/or contact the NPAC administrator to determine whether the NPAC is up and running.
Event Details
Table B-24 Event 2006 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Bind Timed Out - Auto Retry After NPAC_RETRY_INTERVAL |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
As soon as condition occurs, and at two-minute intervals as long as condition exists |
Trap |
|
Trap ID |
100 |
Trap MIB Name |
assocRespNPACTimeout |
2007
Explanation
The NPAC association attempt was rejected by the NPAC, and the LSMS was informed to attempt the NPAC association again to the same NPAC host after an interval that defaults to two minutes, but can be configured to a different value by Oracle.
Recovery
No action required; the LSMS will automatically try to associate again.
Event Details
Table B-25 Event 2007 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Aborted by PEER - Auto Retry Same Host After NPAC_RETRY_INTERVAL |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC=<PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
101 |
Trap MIB Name |
assocRejectedRetrySameHost |
2008
Explanation
The NPAC association attempt was rejected by the NPAC, and the LSMS was informed to attempt the NPAC association again to the other NPAC host after an interval that defaults to two minutes, but can be configured to a different value by Oracle.
Recovery
No action required; the LSMS will automatically try to associate again.
Event Details
Table B-26 Event 2008 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>]-<NPAC_region_ID>- Connection Aborted by PEER - Auto Retry Other Host After NPAC_RETRY_INTERVAL |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
102 |
Trap MIB Name |
assocRejectedRetryOtherHost |
2009
Explanation
A problem exists in the network connectivity. The LSMS will attempt the association with the NPAC again after an interval that defaults to two minutes, but can be configured to a different value by Oracle.
Recovery
Check the network connectivity for errors. Verify the ability to ping the NPAC from the LSMS.
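The connectivity check above can be scripted. The following is a minimal sketch and is not part of the LSMS software: the function name `npac_reachable` and the `PING_CMD` override are hypothetical, and the `-c` packet-count option assumes a Linux-style `ping`.

```shell
# Hypothetical helper (an assumption, not an LSMS command).
# Succeeds if the probe command reports that the host answered.
# PING_CMD may be overridden, for example to select a specific ping binary.
npac_reachable() {
  host=$1
  if "${PING_CMD:-ping}" -c 3 "$host" >/dev/null 2>&1; then
    echo "$host is reachable"
  else
    echo "$host is NOT reachable"
    return 1
  fi
}
```

For example, `npac_reachable <NPAC_host>` could be run on the active server, where `<NPAC_host>` stands for the address of the NPAC that your site has configured.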
Event Details
Table B-27 Event 2009 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Aborted by PROVIDER - Auto Retry Same Host After NPAC_RETRY_INTERVAL |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
103 |
Trap MIB Name |
nwtkProblemRetryNPACAssoc |
2010
Explanation
The LSMS received three consecutive responses from the NPAC with a download status of failure from a recovery action request. The LSMS has aborted the association and will attempt to associate again after a retry interval that defaults to five minutes, but can be configured to a different value by Oracle. The LSMS will retry the recovery action after the association is reestablished.
Recovery
No action required; the LSMS will automatically try to associate again.
Event Details
Table B-28 Event 2010 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Aborted Due to Recovery Failure - Auto Retry After NPAC_RETRY_INTERVAL |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
104 |
Trap MIB Name |
lsmsAbortedNPACassocDowRecFail |
2011
Explanation
The LSMS has disconnected the association with the NPAC region in question due to the lack of a response to heartbeat messages from the LSMS to the NPAC.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-29 Event 2011 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Disconnected by Heartbeat |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
111 |
Trap MIB Name |
lostNPACAssoc |
2012
Explanation
The NPAC (primary or secondary, as indicated) identified by <NPAC_region_ID> rejected the association because of an access control failure. This can be due to one of the following:
-
The OSI Presentation Address is incorrect.
-
The Service Provider ID in the regional configuration file is incorrect.
-
The CMIP departure time is more than five minutes out of synchronization with the NPAC servers.
-
The security key is not valid.
Recovery
Do the following:
-
Verify that the correct PSEL, SSEL, TSEL, and NSAP values have been configured for the OSI Presentation Address (for more information, refer to “Viewing a Configured NPAC Component” in the Configuration Guide). If you need to change the values, use the procedure described in “Modifying an NPAC Component” in the Configuration Guide.
-
Verify that the configured Service Provider ID (SPID) is the same as the SPID assigned by the NPAC. For more information about this configuration file, refer to “Modifying LSMS Configuration Components” in the Configuration Guide.
-
Verify that the configured NPAC_SMS_NAME is the same as the value assigned by the NPAC (this field is case-sensitive). For more information about this configuration file, refer to “Modifying an NPAC Component” in the Configuration Guide.
-
Log in as lsmsadm to the active server.
-
Enter the following command to determine what the LSMS system time is:
$ date
-
Contact the NPAC administrator to determine what the NPAC time is. If the NPAC time is more than five minutes different from the LSMS time, reset the LSMS system time on both servers and on the administration console by performing one of the procedures described in “Managing the System Clock”.
-
After you have verified that the NPAC and LSMS times are within five minutes of each other, cause a different security key to be used by stopping and restarting the regional agent. Enter the following commands, where <region> is the name of the region in which this notification occurred:
$ $LSMS_DIR/lsms stop <region>
$ $LSMS_DIR/lsms start <region>
-
Start the GUI again.
-
Attempt to reassociate with the NPAC.
-
If the problem persists, contact Oracle Technical Service.
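The five-minute comparison in the steps above can be sketched as a small shell function. This is an illustration only: the name `within_skew` is hypothetical, and it assumes that both clock readings have already been converted to seconds since the epoch.

```shell
# Hypothetical helper: succeeds when two clock readings, given as seconds
# since the epoch, differ by no more than max_skew seconds (default 300,
# that is, five minutes).
within_skew() {
  lsms_epoch=$1
  npac_epoch=$2
  max_skew=${3:-300}
  diff=$(( $lsms_epoch - $npac_epoch ))
  # Take the absolute value so the order of the arguments does not matter.
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - $diff ))
  fi
  [ "$diff" -le "$max_skew" ]
}
```

On systems whose `date` command supports the `+%s` format, the LSMS reading could be taken with `date +%s`. The NPAC reading must come from the NPAC administrator, as described above.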
Event Details
Table B-30 Event 2012 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Attempt Failed: Access Control Failure |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
106 |
Trap MIB Name |
assocRejDueToAccessControl |
2014
Explanation
The userInfo value in the cmipUserInfo portion of the NPAC association response CMIP message is not valid.
Recovery
Contact the NPAC administrator to determine why the NPAC is sending an invalid association response.
Event Details
Table B-31 Event 2014 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Attempt Failed: Invalid Data Received |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
108 |
Trap MIB Name |
npacConnFailedCMIP |
2015
Explanation
The NPAC association was terminated gracefully by the NPAC.
Recovery
According to the NANC specifications, this should never occur. If this notification appears, contact the NPAC administrator to find out why the association was unbound.
Event Details
Table B-32 Event 2015 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Disconnected by NPAC |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
109 |
Trap MIB Name |
npacAssocGracefullyTerminated |
2018
Explanation
The LSMS was unable to properly resynchronize (with the NPAC) the data that was lost while the LSMS was not associated with the NPAC.
Recovery
Do the following:
-
Abort the NPAC association (refer to the Configuration Guide).
-
Attempt to reassociate with the NPAC (refer to the Configuration Guide).
-
If the reassociation is not successful, contact the NPAC administrator and Oracle Technical Service.
Event Details
Table B-33 Event 2018 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Recovery Failed |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
112 |
Trap MIB Name |
lsmsDataLostBadResynch |
2019
Explanation
The LSMS data lost during the resynchronization time was not resynchronized properly with the NPAC.
Recovery
Do the following:
-
Abort the NPAC association (refer to the Configuration Guide).
-
Reestablish the NPAC association (refer to the Configuration Guide).
-
Determine whether the notification NPAC <PRIMARY|SECONDARY> Recovery Complete is posted. If notification 2019 reappears instead, perform a resynchronization for a period of time starting one hour before the 2019 notification first appeared, using the GUI (refer to “Resynchronizing for a Defined Period of Time Using the GUI” in the Database Administrator's Guide).
-
If notification 2019 continues to appear, contact Oracle Technical Service.
Event Details
Table B-34 Event 2019 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Recovery Partial Failure |
Surveillance Notification |
|
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Recovery Failure |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
113 |
Trap MIB Name |
badNPACresynchTime |
2020
Explanation
The LSMS aborted the NPAC association because the LSMS received a message from the NPAC that did not have the correct LSMS key signature.
Recovery
Verify that the correct keys are being used by both the NPAC and the LSMS.
Event Details
Table B-35 Event 2020 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Security Violation. Association Aborted. Retrying |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NPAC= <PRIMARY|SECONDARY>-<NPAC_region_ID> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
114 |
Trap MIB Name |
assocAbortedBadKeys |
2021
Explanation
An associate retry timer was in effect. The retry attempt was canceled because a GUI user issued an Associate, Abort or Disconnect request. If an Associate request was issued, the association is attempted immediately.
Recovery
No action required; for information only.
Event Details
Table B-36 Event 2021 Details
GUI Notification |
|
Severity |
Major |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Automatic Association Retry Canceled |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
122 |
Trap MIB Name |
npacAutoAssociationRetryCanceled |
2022
Explanation
Either the LSMS did not receive any response from the NPAC before a timeout expired or the LSMS received a response from the NPAC with a download status of failure from a recovery action request. The NPAC is unable to process the recovery action due to a temporary resource limitation. The LSMS will retry the request for the number of times indicated by <retry_number> with the interval between each retry indicated by <retry_interval> minutes. If recovery is not successful after the indicated number of retries, the LSMS will abort the association and post the following notification:
[Critical]: <Timestamp> 2010: NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Aborted Due to Recovery Failure - Auto Retry After NPAC_RETRY_INTERVAL
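The retry-then-abort behavior described above follows a common pattern, which can be sketched in shell. This is not the LSMS implementation: the function name `retry_action` is hypothetical, and the interval is given in seconds rather than minutes so that the sketch is easy to try out.

```shell
# Hypothetical sketch of "retry N times, then give up".
# Usage: retry_action <retries> <interval_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, or 1 after the last failure.
retry_action() {
  retries=$1
  interval_seconds=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if "$@"; then
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if [ "$attempt" -lt "$retries" ]; then
      sleep "$interval_seconds"
    fi
    attempt=$(( $attempt + 1 ))
  done
  return 1
}
```

A caller would treat a return value of 1 as the point at which, in the LSMS case, the association is aborted and notification 2010 is posted.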
Recovery
No action required; for information only.
Event Details
Table B-37 Event 2022 Details
GUI Notification |
|
Severity |
Major |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Fail/No Response from NPAC Recovery - Auto Retry <retry_number> Times in <retry_interval> Minutes |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
123 |
Trap MIB Name |
npacRecoveryFailureResourceLimit |
2023
Explanation
The NPAC association will be down for the specified period of time (from the first time field shown in the notification to the second time field shown in the notification) due to NPAC-scheduled down time.
Recovery
When the scheduled down time is over, manually reestablish the NPAC association. For information about aborting and reestablishing an association, refer to the Configuration Guide.
Event Details
Table B-38 Event 2023 Details
GUI Notification |
|
Severity |
Major |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] ScheduleDownTime from [<YYYYMMDDhhmmss>] to [<YYYYMMDDhhmmss>] |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
124 |
Trap MIB Name |
npacAssocPeriodDown |
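The two time fields in the event 2023 GUI notification text can be extracted with standard tools. The following is a minimal sketch that assumes the text matches the format shown in Table B-38. The function name `downtime_window` is hypothetical.

```shell
# Hypothetical helper: prints "<from> <to>" (two 14-digit YYYYMMDDhhmmss
# values) taken from an event 2023 notification text passed as the first
# argument. Prints nothing if the text does not match the expected format.
downtime_window() {
  printf '%s\n' "$1" |
    sed -n 's/.*ScheduleDownTime from \[\([0-9]\{14\}\)\] to \[\([0-9]\{14\}\)\].*/\1 \2/p'
}
```

For example, with a made-up region name, `downtime_window 'NPAC [PRIMARY-Midwest] ScheduleDownTime from [20240101020000] to [20240101040000]'` prints `20240101020000 20240101040000`.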
2024
Explanation
An Associate request has been sent to the NPAC after a retry timer expired.
Recovery
No action required; for information only.
Event Details
Table B-39 Event 2024 Details
GUI Notification |
|
Severity |
Major |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Timer Expired - Resending Association Request |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
125 |
Trap MIB Name |
npacAssocRequestSentAfterRetryTimer |
2025
Explanation
The NPAC association was successfully established.
Recovery
No action required; for information only.
Event Details
Table B-40 Event 2025 Details
GUI Notification |
|
Severity |
Cleared |
Text |
NPAC [<PRIMARY|SECONDARY>-<NPAC_region_ID>] Connection Successfully Established |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
7 |
Trap MIB Name |
npacAssocEstablished |
4000
Explanation
The active server has initiated an automatic switchover to the inactive server.
Recovery
No action required; for information only.
Event Details
Table B-41 Event 4000 Details
GUI Notification |
|
Severity |
Event |
Text |
Switchover Initiated |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Switchover initiated |
Source |
Active server |
Frequency |
Once, as soon as condition occurs. |
Trap |
|
Trap ID |
11 |
Trap MIB Name |
switchOverStarted |
4001
Explanation
LSMS service has been switched over.
Recovery
No action required; for information only.
Event Details
Table B-42 Event 4001 Details
GUI Notification |
|
Severity |
Event |
Text |
Switchover complete |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Switchover complete |
Source |
Active server |
Frequency |
Once, as soon as condition occurs. |
Trap |
|
Trap ID |
12 |
Trap MIB Name |
switchOverCompleted |
4002
Explanation
LSMS service could not be switched over to the inactive server; the inactive server was not able to start LSMS service.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-43 Event 4002 Details
GUI Notification |
|
Severity |
Event |
Text |
Switchover Failed |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Switchover Failed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs. |
Trap |
|
Trap ID |
13 |
Trap MIB Name |
switchOverFailed |
4003
Explanation
This notification indicates that the disk controller <controllerId> is out of service and is affecting shared storage. This notification is only valid on E3000 systems.
controllerId= The specific controller number (either 0 or 1).
Recovery
Contact Oracle Technical Service for assistance.
Event Details
Table B-44 Event 4003 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Loss of disk on <controllerId> |
Source |
Either server |
Frequency |
Every 5 minutes as long as condition exists |
Trap |
|
Trap ID |
14 |
Trap MIB Name |
diskContrService |
4004
Explanation
The Ethernet interface used to connect to the application network has a problem. This interface usually connects to network-connected workstations. The ping utility did not receive a response from the interface associated with the application network.
Recovery
Consult with your network administrator.
Event Details
Table B-45 Event 4004 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - APP interface failure |
Source |
Either server |
Frequency |
Every 2.5 minutes as long as condition exists |
Trap |
|
Trap ID |
17 |
Trap MIB Name |
appsInterfaceFailure |
4005
Explanation
This notification indicates that the Ethernet interface used to connect to the ADMINISTRATION network has a problem.
Recovery
Consult with your network administrator.
Event Details
Table B-46 Event 4005 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - ADMIN interface failure |
Source |
Either server |
Frequency |
Every 2.5 minutes as long as condition exists |
Trap |
|
Trap ID |
18 |
Trap MIB Name |
adminInterfaceFailure |
4006
Explanation
This notification indicates that the system disk has lost synchronization, possibly due to a hardware problem.
driveSpecId= disk drive specification.
Recovery
Contact Oracle Technical Service for assistance.
Event Details
Table B-47 Event 4006 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - <driveSpecId> |
Source |
Either server |
Frequency |
Every 5 minutes as long as condition exists |
Trap |
|
Trap ID |
20 |
Trap MIB Name |
systemDiskSynch |
4007
Explanation
Database replication has failed.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-48 Event 4007 Details
GUI Notification |
|
Severity |
Critical |
Text |
DB Repl Err - <dbReplErr> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - DB repl error |
Source |
Both servers |
Frequency |
Every minute as long as condition exists. |
Trap |
|
Trap ID |
21 |
Trap MIB Name |
dataReplError |
4008
Explanation
The database replication process monitor has failed.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-49 Event 4008 Details
GUI Notification |
|
Severity |
Critical |
Text |
DB Proc Mon Err - <dbMonErr> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - DB monitor failure |
Source |
Active server |
Frequency |
Every five minutes as long as condition exists. |
Trap |
|
Trap ID |
22 |
Trap MIB Name |
dbMonitorFail |
4009
Explanation
The server has an internal disk error.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-50 Event 4009 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Internal Disk Error |
Source |
Either server |
Frequency |
Within five minutes of the condition occurring and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
23 |
Trap MIB Name |
internalDiskError |
4010
Explanation
This notification indicates that the hot-spare feature has completed automatic data resynchronization.
Recovery
No action required; this notification is for information only.
Event Details
Table B-51 Event 4010 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - <driveSpecId>-<driveSpecId> |
Source |
Either server |
Frequency |
Once |
Trap |
|
Trap ID |
24 |
Trap MIB Name |
hotSparedDataResynch |
4011
Explanation
This notification indicates that LSMS database replication is delayed.
Recovery
No action required.
Event Details
Table B-52 Event 4011 Details
GUI Notification |
|
Severity |
N/A |
Text |
DB Repl Info |
Surveillance Notification |
|
Text |
Notify:Sys Admin - DB repl info |
Source |
Either server |
Frequency |
Within five minutes of the condition occurring and every minute thereafter as long as condition exists. |
Trap |
|
Trap ID |
25 |
Trap MIB Name |
dataReplInfo |
4012
Explanation
A process specified by <process_name> is utilizing 40 percent or more of the LSMS’s CPU resource and the <second_ID> indicates a specific instance of the process, as follows:
-
When the <process_name> is eagleagent, the <second_ID> specifies the Common Language Location Indicator (CLLI) of the network element
-
When the <process_name> is npacagent, the <second_ID> specifies the name of the region
-
When the <process_name> is not eagleagent or npacagent, the <second_ID> specifies the process ID (PID) of the process.
Recovery
Because this notification is posted every five minutes as long as the condition exists, you may choose to ignore this notification the first time that it appears. However, if this notification is repeated several times in a row, do the following:
-
If the <process_name> is not npacagent, go to step 4. Otherwise, determine whether the npacagent is still using 40% or more of the CPU resource by entering the following command, where <region> can be optionally specified (it is the name of the region as displayed at the end of the notification text):
$ ps -eo pid,pcpu,args | grep npacagent | grep <region>
-
If the npacagent is still using 40% or more of the CPU resource, enter the following commands to stop the npacagent and restart it, where <region> is the name of the NPAC region whose npacagent is using 40% or more of the CPU resource:
$ cd $LSMS_DIR
$ lsms stop <region>
$ lsms start <region>
-
Repeat step 1. If the npacagent you tried to stop is still using 40% or more of the CPU resource, contact Oracle Technical Service.
-
If the <process_name> is not eagleagent, go to step 7. Otherwise, determine whether the eagleagent is still using 40% or more of the CPU resource by entering the following command, where <CLLI> can be optionally specified (it is the name of the network element as displayed at the end of the notification text):
$ ps -eo pid,pcpu,args | grep eagleagent | grep <CLLI>
-
If the eagleagent is still using 40% or more of the CPU resource, enter the following commands to stop the eagleagent and restart it, where <CLLI> is the Common Language Location Indicator (CLLI) of the network element whose eagleagent is using 40% or more of the CPU resource:
$ cd $LSMS_DIR
$ eagle stop <CLLI>
$ eagle start <CLLI>
-
Repeat step 4. If the process you tried to stop is still using 40% or more of the CPU resource, contact Oracle Technical Service.
-
If the <process_name> is not eagleagent or npacagent, contact Oracle Technical Service.
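The `ps` checks in the steps above list every matching process, and the 40 percent comparison is left to the reader. A small filter can make that comparison explicit. This is a sketch only, and the function name `cpu_at_or_above` is hypothetical.

```shell
# Hypothetical filter: reads "pid pcpu args" lines (the output format of
# ps -eo pid,pcpu,args) on standard input and prints only the lines whose
# CPU percentage is at or above the threshold (default 40).
# The ps header line is dropped because "%CPU" evaluates to 0 numerically.
cpu_at_or_above() {
  awk -v threshold="${1:-40}" '$2 + 0 >= threshold + 0 { print }'
}
```

It could be combined with the command from step 1, for example `ps -eo pid,pcpu,args | grep npacagent | cpu_at_or_above 40`.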
Event Details
Table B-53 Event 4012 Details
GUI Notification |
|
Severity |
Major |
Text |
Process [<process_name>-<second_ID>] Utilizing High Percentage of CPU |
Surveillance Notification |
|
Text |
Notify:Sys Admin - [<process_name>-<second_ID>] |
Source |
Either server |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
26 |
Trap MIB Name |
cpuUtilitzationOver39 |
4013
Explanation
The LSMS server with default hostname lsmspri has been inhibited.
Recovery
As soon as possible, start the server by performing the procedure described in “Starting a Server”.
Event Details
Table B-54 Event 4013 Details
GUI Notification |
|
Severity |
Major |
Text |
Primary Server Inhibited |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Primary inhibited |
Source |
Server with default hostname lsmspri |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
27 |
Trap MIB Name |
primaryServerInhibited |
4014
Explanation
The LSMS server with default hostname lsmssec has been inhibited.
Recovery
As soon as possible, start the server by performing the procedure described in “Starting a Server”.
Event Details
Table B-55 Event 4014 Details
GUI Notification |
|
Severity |
Major |
Text |
|
Surveillance Notification |
|
Text |
|
Source |
Server with default hostname lsmssec |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
28 |
Trap MIB Name |
secondaryServerInhibited |
4015
Explanation
A heartbeat link is down.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-56 Event 4015 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Heartbeat failure |
Source |
Both servers |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
29 |
Trap MIB Name |
heartbeatLinkDown |
4016
Explanation
This notification indicates that the Heartbeat 2 link is down.
Recovery
Contact Oracle Technical Service for assistance.
Event Details
Table B-57 Event 4016 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Heartbeat 2 failure |
Source |
Both servers |
Frequency |
Once |
Trap |
|
Trap ID |
30 |
Trap MIB Name |
heartbeatLinkTwoDown |
4017
Explanation
This notification indicates that the LSMS network configuration is incorrect.
Recovery
Customer or field engineers should:
- Verify network configuration and network cabling
- Verify serial configuration and cabling if serial keepalive is configured
- If the problem persists, contact Oracle Technical Service
Event Details
Table B-58 Event 4017 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Network setup error |
Source |
Active server |
Frequency |
Every 5 minutes |
Trap |
|
Trap ID |
31 |
Trap MIB Name |
lsmsNtwkConfigError |
4018
Explanation
This notification indicates that the LSMS network configuration is not supported or recommended.
Recovery
Contact Oracle Technical Service for assistance.
Event Details
Table B-59 Event 4018 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Network setup unsupp |
Source |
Active server |
Frequency |
Every 5 minutes |
Trap |
|
Trap ID |
32 |
Trap MIB Name |
lsmsNtwkConfigNotSupported |
4019
Explanation
This notification indicates that the disk volume specified by diskVolName has exceeded the 95 percent usage threshold.
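To see which volumes are currently above the threshold, the output of `df -P` can be filtered. The following is a minimal sketch, not an LSMS utility. The function name `volumes_over` is hypothetical, and it assumes mount points that contain no spaces.

```shell
# Hypothetical filter: reads POSIX-format df output (df -P) on standard
# input and prints "<mount point> <usage>%" for every volume whose usage
# is strictly greater than the threshold (default 95).
volumes_over() {
  awk -v threshold="${1:-95}" '
    NR > 1 {                      # skip the df header line
      sub(/%/, "", $5)            # strip the percent sign from Capacity
      if ($5 + 0 > threshold + 0) print $6, $5 "%"
    }'
}
```

For example, `df -P | volumes_over 95` lists the volumes that would meet the condition described for this event.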
Recovery
Contact Oracle Technical Service for assistance.
Event Details
Table B-60 Event 4019 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - <diskVolName> |
Source |
Either server |
Frequency |
Every 5 minutes |
Trap |
|
Trap ID |
38 |
Trap MIB Name |
diskVolume95Usage |
4020
Explanation
The server’s swap space has exceeded the critical usage threshold (default = 95%).
Recovery
If the problem persists, contact Oracle Technical Service.
Event Details
Table B-61 Event 4020 Details
GUI Notification |
|
Severity |
Critical |
Text |
Swap space exceeds Critical |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Swap space Critical |
Source |
Either server |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
39 |
Trap MIB Name |
swapSpaceCritical |
4021
Explanation
The LSMS application or system daemon whose name has <process_name> as the first 12 characters is not running.
Recovery
No user action is necessary. The Surveillance process automatically restarts the Service Assurance process (sacw) and the sentryd process automatically restarts other processes.
Event Details
Table B-62 Event 4021 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - <process_name> failed |
Source |
Active server |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
40 |
Trap MIB Name |
lsmsAppsNotRunning |
4022
Explanation
The backup of the LSMS database has completed successfully.
Recovery
No action required; for information only.
Event Details
Table B-63 Event 4022 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
DATABASE backup complete |
Source |
Standby server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
41 |
Trap MIB Name |
backupCompleted |
4023
Explanation
The backup of the LSMS database has failed.
Recovery
Review the backup output to determine why the backup failed, correct the problems, and run the backup script again manually.
Note:
Determine whether the NAS can be reached using the ping command. If the NAS cannot be reached, restart the NAS by turning the power off and then on again. If the NAS can be reached, contact Oracle Technical Service for assistance.
Event Details
Table B-64 Event 4023 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - DATABASE backup failed |
Source |
Standby server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
42 |
Trap MIB Name |
backupFailed |
4024
Explanation
The primary LSMS server (Server 1A) is not providing the LSMS service.
Recovery
No action required; for information only.
Event Details
Table B-65 Event 4024 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Primary not online |
Source |
Both primary and secondary servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
63 |
Trap MIB Name |
primaryServerNotOnline |
4025
Explanation
The standby server is not prepared to take over LSMS service.
Recovery
Contact Oracle Technical Service.
Event Details
Table B-66 Event 4025 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Can't switch to standby |
Source |
Standby server |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
64 |
Trap MIB Name |
standbyNotReadyForSwitchover |
4026
Explanation
The secondary LSMS server (Server 1B) is currently providing the LSMS service.
Recovery
No action required; for information only.
Event Details
Table B-67 Event 4026 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Secondary online |
Source |
Both primary and secondary servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
65 |
Trap MIB Name |
secServerProvidingLSMSService |
4027
Explanation
The standby LSMS server cannot determine the availability of the LSMS service on the active server.
Recovery
Determine whether the other server is working normally. Also, verify that the heartbeat connections (eth2, eth3, and the serial cable) are connected and functioning properly.
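As a first check of the Ethernet heartbeat links named above, the link state of eth2 and eth3 can be inspected from the command line. The sketch below assumes that the Linux ip utility is available on the server; the helper name link_is_up is hypothetical. An UP link state does not by itself prove that heartbeats are being exchanged, and the serial cable must still be checked separately.

```shell
# Hypothetical helper: succeed if the named interface reports "state UP".
link_is_up() {
    ip link show "$1" 2>/dev/null | grep -q "state UP"
}

# Example (run manually on each server):
#   for dev in eth2 eth3; do
#       if link_is_up "$dev"; then echo "$dev: up"; else echo "$dev: not up"; fi
#   done
```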
Event Details
Table B-68 Event 4027 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - Primary status unknown |
Source |
Standby server |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
66 |
Trap MIB Name |
secServerCannotDeterminePrimAvailability |
4028
Explanation
This notification indicates an LSMS mirroring inconsistency.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-69 Event 4028 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - < volume_name > |
Source |
Either server |
Frequency |
Every 5 minutes |
Trap |
|
Trap ID |
169 |
Trap MIB Name |
lsmsMirroringInconsistance |
4029
Explanation
This notification indicates that the LSMS filesystem is not writeable.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-70 Event 4029 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - < fileSystem > |
Source |
Either server |
Frequency |
Every 5 minutes |
Trap |
|
Trap ID |
170 |
Trap MIB Name |
lsmsFilesystemNotWritable |
4030
Explanation
The server’s swap space has exceeded the major usage threshold (default = 80%).
Recovery
If the problem persists, contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-71 Event 4030 Details
GUI Notification |
|
Severity |
Major |
Text |
|
Surveillance Notification |
|
Text |
|
Source |
Both servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
190 |
Trap MIB Name |
swapSpaceWarning |
4031
Explanation
A database replication error that was reported earlier by the 4007 event has now been cleared.
Recovery
No action necessary.
Event Details
Table B-72 Event 4031 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Database Replication cleared - <dbReplErr> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
195 |
Trap MIB Name |
dataReplClear |
4032
Explanation
A database process monitor error that was reported earlier by the 4008 event has now been cleared.
Recovery
No action necessary.
Event Details
Table B-73 Event 4032 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Database Replication cleared - <dbMonErr> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
196 |
Trap MIB Name |
dbMonitorCLear |
4033
Explanation
A count operation on the LSMS database failed, which suggests a corrupt MySQL index.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-74 Event 4033 Details
GUI Notification |
|
Severity |
Critical |
Text |
Database Corrupt Index |
Surveillance Notification |
|
Text |
None |
Source |
Both servers |
Frequency |
Every 30 minutes. |
Trap |
|
Trap ID |
200 |
Trap MIB Name |
dbCorruptIndex |
4034
Explanation
This notification indicates that an invalid snapshot has been detected.
Recovery
See “Clean Up After Failed or Interrupted Snapshot”.
Event Details
Table B-75 Event 4034 Details
GUI Notification |
|
Severity |
Critical |
Text |
Invalid Snapshot - <snapName> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Invalid Snapshot |
Source |
Active server |
Frequency |
Every 30 minutes |
Trap |
|
Trap ID |
201 |
Trap MIB Name |
snapInvalidErr |
4035
Explanation
This notification indicates that the Invalid Snapshot error has been cleared.
Recovery
No action required; this notification is for information only.
Event Details
Table B-76 Event 4035 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Invalid Snapshot cleared - <snapName> |
Surveillance Notification |
|
Text |
Invalid Snapshot cleared - <snapName> |
Source |
Active server |
Frequency |
Every 30 minutes |
Trap |
|
Trap ID |
202 |
Trap MIB Name |
snapInvalidClear |
4036
Explanation
This notification indicates that the snapshot is more than 80% full.
Recovery
No action required; this notification is for information only.
Event Details
Table B-77 Event 4036 Details
GUI Notification |
|
Severity |
Critical |
Text |
Full Snapshot - <snapName> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Full Snapshot |
Source |
Active server |
Frequency |
Every 30 minutes |
Trap |
|
Trap ID |
203 |
Trap MIB Name |
fullSnapshot |
4037
Explanation
This notification indicates that the Snapshot full error is cleared.
Recovery
No action required; this notification is for information only.
Event Details
Table B-78 Event 4037 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Full Snapshot cleared - <snapName> |
Surveillance Notification |
|
Text |
Full Snapshot cleared - <snapName> |
Source |
Active server |
Frequency |
Every 30 minutes |
Trap |
|
Trap ID |
204 |
Trap MIB Name |
fullSnapshotClear |
4038
Explanation
The mate server is down.
Recovery
Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-79 Event 4038 Details
GUI Notification |
|
Severity |
Critical |
Text |
Mate Server Down |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Mate Server Down |
Source |
Both servers |
Frequency |
Every minute as long as condition exists |
Trap |
|
Trap ID |
205 |
Trap MIB Name |
mateServerDown |
4039
Explanation
The mate server is up.
Recovery
No action is required.
Event Details
Table B-80 Event 4039 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Mate Server Up |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Mate Server Up |
Source |
Both servers |
Frequency |
As soon as condition clears |
Trap |
|
Trap ID |
206 |
Trap MIB Name |
mateServerUp |
4100
Explanation
One or more platform alarms in the minor category exist. To determine which minor platform alarms are being reported, see “How to Decode Platform Alarms”. When the active server reports minor platform alarms that originated on the other server, the hostname of the other server is inserted before the alarm string.
Recovery
Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Note:
If you received Event 4100 in response to an snmpget error, contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 to have the NAS SNMP daemon stopped and restarted.
Event Details
Table B-81 Event 4100 Details
GUI Notification |
|
Severity |
Minor |
Text |
Minor Platform Alarm [hostname]: <alarm_string> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - ALM <alarm_string> |
Source |
Both servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
191 |
Trap MIB Name |
minorPlatAlarmMask |
4101
Explanation
All platform alarms in the minor category have been cleared. When the active server reports that all minor platform alarms have cleared on the other server, the hostname of the other server is inserted before the alarm string.
Recovery
No action necessary.
Event Details
Table B-82 Event 4101 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Minor Platform Alarms Cleared |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Minor Plat alrms clear |
Source |
Both servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
197 |
Trap MIB Name |
minorPlatAlarmClear |
4200
Explanation
One or more platform alarms in the major category exist. To determine which major platform alarms are being reported, see “How to Decode Platform Alarms”. When the active server reports major platform alarms that originated on the other server, the hostname of the other server is inserted before the alarm string.
Recovery
Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-83 Event 4200 Details
GUI Notification |
|
Severity |
Major |
Text |
Major Platform Alarm [hostname]: <alarm_string> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - ALM <alarm_string> |
Source |
Both servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
192 |
Trap MIB Name |
majorPlatAlarmMask |
4201
Explanation
All platform alarms in the major category have been cleared. When the active server reports that all major platform alarms have cleared on the other server, the hostname of the other server is inserted before the alarm string.
Recovery
No action necessary.
Event Details
Table B-84 Event 4201 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Major Platform Alarms Cleared |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Major Plat alrms clear |
Source |
Both servers |
Frequency |
Once |
Trap |
|
Trap ID |
198 |
Trap MIB Name |
majorPlatAlarmClear |
4300
Explanation
One or more platform alarms in the critical category exist. To determine which critical platform alarms are being reported, see “How to Decode Platform Alarms”. When the active server reports critical platform alarms that originated on the other server, the hostname of the other server is inserted before the alarm string.
Recovery
Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-85 Event 4300 Details
GUI Notification |
|
Severity |
Critical |
Text |
Critical Platform Alarm [hostname]: <alarm_string> |
Surveillance Notification |
|
Text |
Notify:Sys Admin - ALM <alarm_string> |
Source |
Both servers |
Frequency |
Once |
Trap |
|
Trap ID |
193 |
Trap MIB Name |
criticalPlatAlarmMask |
4301
Explanation
All platform alarms in the critical category have been cleared. When the active server reports that all critical platform alarms have cleared on the other server, the hostname of the other server is inserted before the alarm string.
Recovery
No action necessary.
Event Details
Table B-86 Event 4301 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Critical Platform Alarms Cleared |
Surveillance Notification |
|
Text |
Notify:Sys Admin - Crit Plat alrms clear |
Source |
Both servers |
Frequency |
Once |
Trap |
|
Trap ID |
199 |
Trap MIB Name |
criticalPlatAlarmClear |
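Events 4100 through 4301 pair each platform alarm trap with a matching clear trap. A script that receives trap IDs might translate them as sketched below. The trap IDs and MIB names are taken from Tables B-81 through B-86; the function name platform_trap_name is a hypothetical helper, not part of the LSMS.

```shell
# Map a platform-alarm trap ID (Tables B-81 through B-86) to its MIB name.
platform_trap_name() {
    case "$1" in
        191) echo "minorPlatAlarmMask" ;;
        192) echo "majorPlatAlarmMask" ;;
        193) echo "criticalPlatAlarmMask" ;;
        197) echo "minorPlatAlarmClear" ;;
        198) echo "majorPlatAlarmClear" ;;
        199) echo "criticalPlatAlarmClear" ;;
        *)   echo "unknown"; return 1 ;;   # not a platform-alarm trap
    esac
}

# Example:
#   platform_trap_name 193
```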
6000
Explanation
The eagleagent process has been started.
Recovery
No action required; for information only.
Event Details
Table B-87 Event 6000 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Eagleagent <CLLI> Has Been Started |
Surveillance Notification |
|
Text |
Notify:Sys Admin - <CLLI> started |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
1 |
Trap MIB Name |
eagleAgentStarted |
6001
Explanation
The eagleagent process has been stopped by the eagle script.
Recovery
No action required; for information only.
Event Details
Table B-88 Event 6001 Details
GUI Notification |
|
Severity |
Critical |
Text |
Eagleagent <CLLI> Has Been Stopped by User |
Surveillance Notification |
|
Text |
Notify:Sys Admin - <CLLI> norm exit |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
2 |
Trap MIB Name |
eagleAgentStoppedbyscript |
6002
Explanation
The npacagent for the region indicated by <NPAC_region_ID> has been started.
Recovery
No action required; for information only.
Event Details
Table B-89 Event 6002 Details
GUI Notification |
|
Severity |
Cleared |
Text |
NPACagent Has Been Started |
Surveillance Notification |
|
Text |
Notify:Sys Admin - <NPAC_region_ID> NPACagent started |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
3 |
Trap MIB Name |
NPACAgentStarted |
6003
Explanation
The npacagent for the region indicated by <region> has been stopped using the lsms command.
Recovery
No action required; for information only. To restart the agent, do the following:
-
Log in to the active server as lsmsadm.
-
Enter the following commands to start the npacagent, where <region> is the name of the NPAC region:
$ cd $LSMS_DIR
$ lsms start <region>
Event Details
Table B-90 Event 6003 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPACAgent Has Been Stopped by User |
Surveillance Notification |
|
Text |
Notify:Sys Admin - <NPAC_region_ID> norm exit |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
4 |
Trap MIB Name |
lsmsCommandStoppedNPACAgent |
6004
Explanation
The eagleagent process for the network element identified by <CLLI> has failed. The sentryd process will attempt to restart it.
Recovery
No action required; the sentryd process will attempt to restart the eagleagent process.
Event Details
Table B-91 Event 6004 Details
GUI Notification |
|
Severity |
Critical |
Text |
Eagleagent [<CLLI>] Has Failed |
Surveillance Notification |
|
Text |
Notify:Sys Admin - FAILD: <CLLI> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
74 |
Trap MIB Name |
lsmsEagleAgentFailed |
6005
Explanation
The eagleagent process for the network element identified by <CLLI> has been successfully restarted by the sentryd process.
Recovery
No action required.
Event Details
Table B-92 Event 6005 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - RECOV: <CLLI> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
75 |
Trap MIB Name |
lsmsEagleAgentRestarted |
6006
Explanation
The sentryd process was unable to restart the eagleagent process for the network element identified by <CLLI>.
Recovery
Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-93 Event 6006 Details
GUI Notification |
|
Severity |
Critical |
Text |
Failure Restarting Eagleagent [<CLLI>] |
Surveillance Notification |
|
Text |
Notify:Sys Admin - RFAILD: <CLLI> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
76 |
Trap MIB Name |
failureToRestartEagleAgent |
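The Surveillance notification texts for events 6004, 6005, and 6006 differ only in the keyword that precedes the CLLI (FAILD, RECOV, or RFAILD). A log-watching script might classify them as sketched below. Only the notification text shown in the tables above is matched; the function name sentryd_status, the returned labels, and the CLLI value STPA in the example are illustrative assumptions.

```shell
# Classify the text of a sentryd-related Surveillance notification.
sentryd_status() {
    case "$1" in
        *"- RFAILD: "*) echo "restart-failed" ;;  # event 6006: restart attempt failed
        *"- FAILD: "*)  echo "failed" ;;          # event 6004: process failed
        *"- RECOV: "*)  echo "recovered" ;;       # event 6005: process restarted
        *)              echo "other" ;;
    esac
}

# Example:
#   sentryd_status "Notify:Sys Admin - RECOV: STPA"
```

The RFAILD pattern is tested first and each pattern includes the leading "- ", so a restart failure is never reported as a plain failure.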
6008
Explanation
The npacagent process for the region specified by <NPAC_region_ID> has failed. The sentryd process will attempt to restart it.
Recovery
No action required; the sentryd process will attempt to restart the npacagent process.
Event Details
Table B-94 Event 6008 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPACagent [<NPAC_region_ID>] Failure |
Surveillance Notification |
|
Text |
Notify:Sys Admin - FAILD: <NPAC_region_ID> agent |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
78 |
Trap MIB Name |
NPACagentForRegionFailure |
6009
Explanation
The npacagent process for the region specified by <NPAC_region_ID> has been successfully restarted by the sentryd process.
Recovery
No action required. Any active LSMS GUI processes will automatically reconnect.
Event Details
Table B-95 Event 6009 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - RECOV: <NPAC_region_ID> agent |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
79 |
Trap MIB Name |
NPACagentForRegionRestarted |
6010
Explanation
The sentryd process was unable to restart the npacagent process for the region specified by <NPAC_region_ID>.
Recovery
Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-96 Event 6010 Details
GUI Notification |
|
Severity |
Critical |
Text |
Failure Restarting NPACagent [<NPAC_region_ID>] |
Surveillance Notification |
|
Text |
Notify:Sys Admin - RFAILD: <NPAC_region_ID> agent |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
80 |
Trap MIB Name |
failureToRestartNPACagentRegion |
6020
Explanation
The npacagent process has been stopped due to a fault in accessing the regional database.
Recovery
A database error has occurred. Contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-97 Event 6020 Details
GUI Notification |
|
Severity |
Critical |
Text |
NPACagent Has Been Shut Down - Database Access Error |
Surveillance Notification |
|
Text |
Notify:Sys Admin - <NPAC_region_ID> DB error |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
189 |
Trap MIB Name |
NPACagentStopRegDBaccessFault |
8000
Explanation
The LSMS Surveillance feature is in operation.
Recovery
No action required; for information only.
Event Details
Table B-98 Event 8000 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
|
Source |
Both primary and secondary servers |
Frequency |
Every five minutes as long as condition exists |
Trap |
|
Trap ID |
19 |
Trap MIB Name |
survFeatureOn |
8001
Explanation
The network element resynchronization database contains more than 1 million entries.
Recovery
Each day, as part of a cron job, the LSMS trims the resynchronization database so that it contains 768,000 entries. The occurrence of this event means that more than 232,000 transactions have been received since the last cron job. If this event occurs early in the day, contact the unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
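The 232,000 figure follows directly from the two limits given above. A minimal sketch of the arithmetic, in which the function and variable names are hypothetical and the values are the ones stated in this section:

```shell
# Transactions that can arrive after the daily trim before event 8001 is raised.
resync_headroom() {
    alarm_threshold=1000000   # entries at which event 8001 is reported
    trim_target=768000        # entries left after the daily cron job
    echo $((alarm_threshold - trim_target))
}

resync_headroom   # prints 232000
```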
Event Details
Table B-99 Event 8001 Details
GUI Notification |
|
Severity |
Major |
Text |
ResyncDB Contains 1 Mil Entries |
Surveillance Notification |
|
Text |
Notify:Sys Admin - ResyncDB 1 Mil |
Source |
Active server |
Frequency |
Once |
Trap |
|
Trap ID |
34 |
Trap MIB Name |
resynchLogMidFull |
8003
Explanation
The pending queue, used to hold the transactions to send to the network element (which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text), is over half full.
Recovery
No recovery is required. Informational only.
Event Details
Table B-100 Event 8003 Details
GUI Notification |
|
Severity |
Major |
Text |
EMS Pending Queue Is Half full |
Surveillance Notification |
|
Text |
Notify:Sys Admin - CLLI=<CLLI> |
Source |
Active server |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
43 |
Trap MIB Name |
ensPendingQueueHalfFull |
8004
Explanation
The pending queue, used to hold the transactions to send to the network element (which is indicated in the System field on the GUI or whose CLLI has the value that replaces <CLLI> in the Surveillance notification text), is completely full. The association to that EMS will be broken.
Recovery
No manual recovery required. The LSMS will automatically re-establish the association to the EMS and synchronization will take place.
Event Details
Table B-101 Event 8004 Details
GUI Notification |
|
Severity |
Critical |
Text |
EMS Pending Queue Is Full |
Surveillance Notification |
|
Text |
Notify:Sys Admin - CLLI=<CLLI> |
Source |
Active server |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
Trap |
|
Trap ID |
44 |
Trap MIB Name |
emsPendingQueueMaxReached |
8005
Explanation
There was a data error in a record that prevented the LSMS eagleagent from sending the record to the network element.
Recovery
Both the error and the ignored record are written to the file /var/TKLC/lsms/logs/trace/LsmsTrace.log.<mmdd>, where <mmdd> indicates the month and day the error occurred. Examine the log file for the month and day this error was reported to determine what the error was. Enter the data manually or send it again.
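The trace log name for a given day can be built from the <mmdd> suffix described above. The sketch below assumes a shell on the LSMS server with the standard date utility; the helper name lsms_trace_log is hypothetical.

```shell
# Print the trace log path for a given <mmdd> suffix (defaults to today's date).
lsms_trace_log() {
    echo "/var/TKLC/lsms/logs/trace/LsmsTrace.log.${1:-$(date +%m%d)}"
}

# Example (run manually): page through the log for March 5
#   less "$(lsms_trace_log 0305)"
```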
Event Details
Table B-102 Event 8005 Details
GUI Notification |
|
Severity |
Minor |
Text |
Eagleagent <CLLI> Ignoring Record: <DataError> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
46 |
Trap MIB Name |
eagleAgentIgnoredRecord |
8024
Explanation
The Service Assurance agent has started successfully.
Recovery
No action required; for information only.
Event Details
Table B-103 Event 8024 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
67 |
Trap MIB Name |
serviceAssuranceAgentStarted |
8025
Explanation
Association with the Service Assurance Manager, identified by <Service_Assurance_Manager_Name>, has been established successfully.
Recovery
No action required; for information only.
Event Details
Table B-104 Event 8025 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - <Service_Assurance_Manager_Name> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
68 |
Trap MIB Name |
establishServAssuranceMgrAssoc |
8026
Explanation
Association with the Service Assurance Manager, identified by <Service_Assurance_Manager_Name>, has been stopped or disconnected.
Recovery
Contact the Service Assurance system administrator to determine the cause of the disconnection, then have the Service Assurance system administrator reassociate the Service Assurance Manager to the Service Assurance Agent.
Event Details
Table B-105 Event 8026 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - <Service_Assurance_Manager_Name> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
69 |
Trap MIB Name |
servAssuranceMgrAssocBroken |
8027
Explanation
The Service Assurance agent is not currently running.
Recovery
No action required; the Service Assurance agent should be restarted automatically.
Event Details
Table B-106 Event 8027 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
70 |
Trap MIB Name |
servAssuranceAgentNotRunning |
8030
Explanation
This notification indicates that the LSMS is not able to confirm physical connectivity with the DCM.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-107 Event 8030 Details
GUI Notification |
|
Severity |
Critical |
Text |
EBDA Physical Connection Lost |
Surveillance Notification |
|
Text |
Notify:Sys Admin - NE=< NE CLLI > EBDA conn lost |
Source |
Active server |
Frequency |
Every 5 minutes |
Trap |
|
Trap ID |
73 |
Trap MIB Name |
noPhysicalConnectivityToDCM |
8037
Explanation
The OSI process has failed. The sentryd process will attempt to restart it.
Recovery
No action required; the sentryd process will attempt to restart the failed process.
Event Details
Table B-108 Event 8037 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - FAILD: OSI |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
88 |
Trap MIB Name |
osiDaemonFailure |
8038
Explanation
The OSI process has been successfully restarted by the sentryd process.
Recovery
No action required. The sentryd process will attempt to restart the npacagent processes for all active regions. Any active LSMS GUI processes will automatically reconnect.
Event Details
Table B-109 Event 8038 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - RECOV: OSI |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
89 |
Trap MIB Name |
osiDaemonRestarted |
8039
Explanation
The sentryd process was not able to restart the OSI process.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66.
Event Details
Table B-110 Event 8039 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - RFAILD: OSI |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
90 |
Trap MIB Name |
osiDaemonRestartFailure |
8040
Explanation
The Surveillance feature has detected that the sentryd process is no longer running.
Recovery
No action required; the LSMS HA software will attempt to restart the sentryd process.
Event Details
Table B-111 Event 8040 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - FAILD: sentryd |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
91 |
Trap MIB Name |
sentrydFailure |
8041
Explanation
This notification indicates that the surveillance process has detected that the Legacy lddAgent process has restarted and all functionality has resumed.
Recovery
No action required; this notification is for information only.
Event Details
Table B-112 Event 8041 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - RECOV: lddAgent legacy |
Source |
Both servers |
Frequency |
Once, as soon as the condition occurs |
Trap |
|
Trap ID |
92 |
Trap MIB Name |
IddAgentRestarted |
8042
Explanation
This notification indicates that the surveillance process has detected that the SCPMS lddAgent process has restarted and all functionality has resumed.
Recovery
No action required; this notification is for information only.
Event Details
Table B-113 Event 8042 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
Notify:Sys Admin - RECOV: lddAgent scpms |
Source |
Both servers |
Frequency |
Once, as soon as the condition occurs |
Trap |
|
Trap ID |
93 |
Trap MIB Name |
scpmsIddAgentRestarted |
8044
Explanation
This notification indicates that the LDD SCPMS Confirmation of Arrival message retry attempts have been exhausted. The MQSeries interface is not operational or network connectivity to the remote system is lost.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-114 Event 8044 Details
GUI Notification |
|
Severity |
Critical |
Text |
LDD SCPMS COA Retry Attempts Exhausted |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
116 |
Trap MIB Name |
scpmsMqSeriesFault |
8045
Explanation
This notification indicates that the LDD SCPMS system has not provided a response within the time limit specified by the LDD_SCP_SYSTEM_RESPONSE_TIMEOUT configuration parameter. The SCPMS system is not active.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-115 Event 8045 Details
GUI Notification |
|
Severity |
Critical |
Text |
LDD SCPMS Response Retry Attempts Exhausted |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
117 |
Trap MIB Name |
scpmsNotActive |
8046
Explanation
This notification indicates that the LDD Legacy Confirmation of Arrival message retry attempts have been exhausted.
The MQSeries interface is not operational or network connectivity to the remote system is lost.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-116 Event 8046 Details
GUI Notification |
|
Severity |
Critical |
Text |
LDD SCPMS COA Retry Attempts Exhausted |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
118 |
Trap MIB Name |
legacyMqSeriesFault |
8047
Explanation
This notification indicates that the LDD Legacy system has not provided a response within the time limit specified by the LDD_SCP_SYSTEM_RESPONSE_TIMEOUT configuration parameter. The SCPMS system is not active.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-117 Event 8047 Details
GUI Notification |
|
Severity |
Critical |
Text |
LDD Legacy Response Retry Attempts Exhausted |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
119 |
Trap MIB Name |
scpmsLegacyNotActive |
8048
Explanation
This notification indicates that a connection could not be made to the MQSeries local queue manager. The local queue manager is not started or operational.
Recovery
Contact unresolvable-reference.html#GUID-646F2C79-C167-4B5A-A8DF-7ED0EAA9AD66 for assistance.
Event Details
Table B-118 Event 8048 Details
GUI Notification |
|
Severity |
Critical |
Text |
Unable to Connect to Queue Manager: < queueMgrName > |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
120 |
Trap MIB Name |
mqSeriesQueueManagerNotActive |
8049
Explanation
The EMS/NE has rejected the NPANXX GTT creation, deletion, or modification transaction, and the NPANXX value in the transaction could not be determined.
Recovery
Look in the transaction log file, /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD, and locate the NE’s response to the NPANXX GTT command to determine why the command failed. Re-enter the NPANXX GTT data correctly, which will cause the LSMS to try the command again.
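The transaction log path for a particular network element and day can be assembled from the pattern given above. The sketch below assumes that MMDD is the two-digit month followed by the two-digit day, as for the trace log suffix; the helper name lsms_trans_log and the CLLI value STPA in the example are hypothetical.

```shell
# Print the transaction log path for a CLLI and an MMDD suffix
# (the suffix defaults to today's date).
lsms_trans_log() {
    echo "/var/TKLC/lsms/logs/$1/LsmsTrans.log.${2:-$(date +%m%d)}"
}

# Example (run manually): page through the March 5 log for network element STPA
#   less "$(lsms_trans_log STPA 0305)"
```

The same helper applies to the Recovery steps for events 8050, 8051, and 8052, which refer to the same log file.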
Event Details
Table B-119 Event 8049 Details
GUI Notification |
|
Severity |
Major |
Text |
<CLLI>: NPANXX GTT <type_of_operation> Failed |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
126 |
Trap MIB Name |
npanxxGTTValueNotFound |
8050
Explanation
The EMS/NE has rejected the NPANXX GTT creation, deletion, or modification transaction for the specified NPANXX value.
Recovery
Look in the transaction log file, /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD, and locate the NE’s response to the NPANXX GTT command to determine why the command failed. Re-enter the NPANXX GTT data correctly, which will cause the LSMS to try the command again.
Event Details
Table B-120 Event 8050 Details
GUI Notification |
|
Severity |
Major |
Text |
<CLLI>: NPANXX GTT <type_of_operation> Failed for NPANXX <NPANXX_value> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
127 |
Trap MIB Name |
npanxxGTTValueRejected |
8051
Explanation
The EMS/NE has rejected the Override GTT creation, deletion, or modification transaction, and the LRN value in the transaction could not be determined.
Recovery
Look in the transaction log file, /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD, and locate the NE’s response to the Override GTT command to determine why the command failed. Re-enter the Override GTT data correctly, which will cause the LSMS to try the command again.
Event Details
Table B-121 Event 8051 Details
GUI Notification |
|
Severity |
Major |
Text |
<CLLI>: Override GTT <type_of_operation> Failed |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
128 |
Trap MIB Name |
overrideGTTValueNotFound |
8052
Explanation
The EMS/NE has rejected the Override GTT creation, deletion, or modification transaction for the specified LRN value.
Recovery
Look in the transaction log file, /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD, and locate the NE’s response to the Override GTT command to determine why the command failed. Re-enter the Override GTT data correctly, which will cause the LSMS to try the command again.
Event Details
Table B-122 Event 8052 Details
GUI Notification |
|
Severity |
Major |
Text |
<CLLI>: Override GTT <type_of_operation> Failed for LRN <LRN_value> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
129 |
Trap MIB Name |
overrideGTTValueRejected |
8053
Explanation
The LSMS was not able to complete the automatic synchronization with the EMS/NE. Possible reasons include:
-
The network failed temporarily but not long enough to cause the association with the EMS to fail.
-
The EMS/NE rejected the data because it is busy updating its databases.
Recovery
Verify the connection between the LSMS and the EMS; then reinitialize the MPS. If this notification appears again, perform one of the bulk download procedures in the LNP Database Synchronization User's Guide.
Event Details
Table B-123 Event 8053 Details
GUI Notification |
|
Severity |
Major |
Text |
Short Synchronization Failed |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
131 |
Trap MIB Name |
unableToCompleteAutoResynch |
8054
Explanation
The LSMS has started its automatic synchronization with the EMS/NE.
Recovery
No action required; for information only.
Event Details
Table B-124 Event 8054 Details
GUI Notification |
|
Severity |
Major |
Text |
Short Synchronization Started |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
132 |
Trap MIB Name |
autoResynchNEStarted |
8055
Explanation
The automatic resynchronization of databases after an outage between the LSMS and the NPAC has completed successfully.
Recovery
No action required; for information only.
Event Details
Table B-125 Event 8055 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Recovery Complete |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
133 |
Trap MIB Name |
dbResynchCompleted |
8059
Explanation
The LSMS has completed its automatic synchronization with the EMS/NE.
Recovery
No action required; for information only.
Event Details
Table B-126 Event 8059 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Short Synchronization Complete |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
138 |
Trap MIB Name |
emsShortSynchCompleted |
8060
Explanation
The EMS pending queue used to hold the transactions to send to the EMS/NE identified by <CLLI> in the Surveillance notification has fallen sufficiently below the halfway full point.
Recovery
No action required; for information only.
Event Details
Table B-127 Event 8060 Details
GUI Notification |
|
Severity |
Cleared |
Text |
EMS Pending Queue Less Than Half Full |
Surveillance Notification |
|
Text |
Notify:Sys Admin - CLLI=<CLLI> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
141 |
Trap MIB Name |
pendingQueueHalfFull |
8061
Explanation
The EMS pending queue used to hold the transactions to send to the EMS/NE identified by <CLLI> in the Surveillance notification has fallen sufficiently below the full point.
Recovery
No action required; for information only.
Event Details
Table B-128 Event 8061 Details
GUI Notification |
|
Severity |
Cleared |
Text |
EMS Pending Queue No Longer Full |
Surveillance Notification |
|
Text |
Notify:Sys Admin - CLLI=<CLLI> |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
142 |
Trap MIB Name |
pendingQueueNotFull |
8062
Explanation
This notification indicates that physical connection has been restored with the DCM.
Recovery
No action required; for information only.
Event Details
Table B-129 Event 8062 Details
GUI Notification |
|
Severity |
Cleared |
Text |
EBDA Physical Connection Restored |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
143 |
Trap MIB Name |
dcmConnectionRestored |
8063
Explanation
This notification indicates that the connection to the MQSeries local queue manager has been established following an outage.
Recovery
No action required; for information only.
Event Details
Table B-130 Event 8063 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Connected to Queue Manager: < queueMgrName > |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
144 |
Trap MIB Name |
connToMqSeriesQueueMngrRest |
8064
Explanation
The specified NPA-NXX is opened for portability starting at the value of the <EffectiveTimestamp> field.
Recovery
No action required; for information only.
Event Details
Table B-131 Event 8064 Details
GUI Notification |
|
Severity |
Event |
Text |
New NPA-NXX: SPID [<SPID>], NPANXX [<NPANXX>], TS [<EffectiveTimestamp>] |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
145 |
Trap MIB Name |
npaNxxOpenedForPortabilityAtTS |
8065
Explanation
The first telephone number in the specified NPA-NXX is ported starting at the value of the <EffectiveTimestamp> field.
Recovery
No action required; for information only.
Event Details
Table B-132 Event 8065 Details
GUI Notification |
|
Severity |
Event |
Text |
First use of NPA-NXX: SPID [<SPID>], NPANXX [<NPANXX>], TS [<EffectiveTimestamp>] |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
146 |
Trap MIB Name |
npaNxxPortedAtTS |
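A script that reads GUI notification text can tell events 8064 and 8065 apart and extract their fields. The following is a minimal sketch based only on the text formats shown in Tables B-131 and B-132; the function name, the returned dictionary keys, and the sample values are hypothetical, and the timestamp is kept as an uninterpreted string because its format is not specified here.

```python
import re
from typing import Optional

# Pattern built from the GUI notification text of events 8064 and 8065.
_NPANXX_RE = re.compile(
    r"^(?P<kind>New NPA-NXX|First use of NPA-NXX): "
    r"SPID \[(?P<spid>[^\]]*)\], "
    r"NPANXX \[(?P<npanxx>[^\]]*)\], "
    r"TS \[(?P<ts>[^\]]*)\]$"
)


def parse_npanxx_notification(text: str) -> Optional[dict]:
    """Return the event number and fields, or None if the text does not match."""
    match = _NPANXX_RE.match(text.strip())
    if match is None:
        return None
    event = 8064 if match.group("kind") == "New NPA-NXX" else 8065
    return {
        "event": event,
        "spid": match.group("spid"),
        "npanxx": match.group("npanxx"),
        "timestamp": match.group("ts"),  # left as text; format not assumed
    }


# Sample values below are invented for illustration.
sample = "First use of NPA-NXX: SPID [1234], NPANXX [919460], TS [20240307120000]"
print(parse_npanxx_notification(sample))
```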
8066
Explanation
An audit of the network element identified by <CLLI> has begun.
Recovery
No action required; for information only.
Event Details
Table B-133 Event 8066 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Audit LNP DB Synchronization Started |
Surveillance Notification |
|
Text |
NE <CLLI> Audit started |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
147 |
Trap MIB Name |
ebdaAuditActive |
8067
Explanation
An audit of the network element identified by <CLLI> has completed successfully.
Recovery
No action required; for information only.
Event Details
Table B-134 Event 8067 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Audit LNP DB Synchronization Completed |
Surveillance Notification |
|
Text |
NE <CLLI> Audit completed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
148 |
Trap MIB Name |
ebdaAuditSuccess |
8068
Explanation
An audit of the network element identified by <CLLI> has failed.
Recovery
Inspect the log file /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD for details as to the cause of the error. After clearing the cause of the error, start the audit again.
Event Details
Table B-135 Event 8068 Details
GUI Notification |
|
Severity |
Critical |
Text |
Audit LNP DB Synchronization Failed |
Surveillance Notification |
|
Text |
NE <CLLI> Audit failed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
149 |
Trap MIB Name |
ebdaAuditFailure |
8069
Explanation
The user aborted an audit of the network element identified by <CLLI> before it had completed.
Recovery
No action required; for information only.
Event Details
Table B-136 Event 8069 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Audit LNP DB Synchronization Aborted |
Surveillance Notification |
|
Text |
NE <CLLI> Audit aborted |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
150 |
Trap MIB Name |
ebdaAuditAbortedByUser |
8070
Explanation
A reconcile has started at the completion of an audit.
Recovery
No action required; for information only.
Event Details
Table B-137 Event 8070 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Reconcile LNP DB Synchronization Started |
Surveillance Notification |
|
Text |
NE <CLLI> Reconcile started |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
151 |
Trap MIB Name |
ebdaReconcileActive |
8071
Explanation
A reconcile, which was performed at the end of an audit, has completed.
Recovery
No action required; for information only.
Event Details
Table B-138 Event 8071 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Reconcile LNP DB Synchronization Complete |
Surveillance Notification |
|
Text |
NE <CLLI> Reconcile completed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
152 |
Trap MIB Name |
ebdaReconcileSuccess |
8072
Explanation
A reconcile, which was performed at the end of an audit, has failed before it completed.
Recovery
Inspect the log file /var/TKLC/lsms/logs/<CLLI>/LsmsAudit.log.MMDD for details as to the cause of the error. After clearing the cause of the error, start the reconcile again.
Event Details
Table B-139 Event 8072 Details
GUI Notification |
|
Severity |
Critical |
Text |
Reconcile LNP DB Synchronization Failed |
Surveillance Notification |
|
Text |
NE <CLLI> Reconcile failed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
153 |
Trap MIB Name |
ebdaReconcileFailure |
8073
Explanation
The user has stopped a reconcile before it completed.
Recovery
No action required; for information only.
Event Details
Table B-140 Event 8073 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Reconcile LNP DB Synchronization Aborted |
Surveillance Notification |
|
Text |
NE <CLLI> Reconcile aborted |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
154 |
Trap MIB Name |
ebdaReconcileAbortedByUser |
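The Surveillance notifications for events 8066 through 8073 share the form NE <CLLI> <operation> <status>. The following is a minimal sketch of mapping such a line back to its event number, using only the texts listed in Tables B-133 through B-140. The function name is hypothetical, and the sketch assumes that a CLLI contains no whitespace.

```python
import re
from typing import Optional, Tuple

# (operation, status) -> event number, taken from the Surveillance texts above.
AUDIT_EVENTS = {
    ("Audit", "started"): 8066,
    ("Audit", "completed"): 8067,
    ("Audit", "failed"): 8068,
    ("Audit", "aborted"): 8069,
    ("Reconcile", "started"): 8070,
    ("Reconcile", "completed"): 8071,
    ("Reconcile", "failed"): 8072,
    ("Reconcile", "aborted"): 8073,
}

# Assumption: the CLLI is a single token without spaces.
_LINE_RE = re.compile(
    r"^NE (?P<clli>\S+) (?P<op>Audit|Reconcile) "
    r"(?P<status>started|completed|failed|aborted)$"
)


def classify_surveillance(text: str) -> Optional[Tuple[int, str]]:
    """Return (event number, CLLI) for a matching line, else None."""
    match = _LINE_RE.match(text.strip())
    if match is None:
        return None
    event = AUDIT_EVENTS[(match.group("op"), match.group("status"))]
    return event, match.group("clli")


# "STPA" is a hypothetical CLLI.
print(classify_surveillance("NE STPA Audit failed"))
```

Of these events, only 8068 and 8072 call for recovery action; the others are informational.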
8078
Explanation
A bulk download is currently running.
Recovery
No action required; for information only.
Event Details
Table B-141 Event 8078 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Bulk Load LNP DB Synchronization Started |
Surveillance Notification |
|
Text |
NE <CLLI> Bulk load started |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
159 |
Trap MIB Name |
ebdaBulkLoadActive |
8079
Explanation
A bulk download has completed successfully.
Recovery
No action required; for information only.
Event Details
Table B-142 Event 8079 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Bulk Load LNP DB Synchronization Complete |
Surveillance Notification |
|
Text |
NE <CLLI> Bulk load completed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
160 |
Trap MIB Name |
ebdaBulkLoadSuccess |
8080
Explanation
A bulk download has failed before it completed.
Recovery
Inspect the log file /var/TKLC/lsms/logs/<CLLI>/LsmsBulkLoad.log.MMDD for details as to the cause of the error. After clearing the cause of the error, start the bulk download again.
Event Details
Table B-143 Event 8080 Details
GUI Notification |
|
Severity |
Critical |
Text |
Bulk Load LNP DB Synchronization Failed |
Surveillance Notification |
|
Text |
NE <CLLI> Bulk load failed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
161 |
Trap MIB Name |
ebdaBulkLoadFailure |
8081
Explanation
The user has stopped a bulk download before it completed.
Recovery
No action required; for information only.
Event Details
Table B-144 Event 8081 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Bulk Load LNP DB Synchronization Aborted |
Surveillance Notification |
|
Text |
NE <CLLI> Bulk load aborted |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
162 |
Trap MIB Name |
ebdaBulkLoadAbortedByUser |
8082
Explanation
A user-initiated resynchronization is currently running.
Recovery
No action required; for information only.
Event Details
Table B-145 Event 8082 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Re-sync LNP DB Synchronization Started |
Surveillance Notification |
|
Text |
NE <CLLI> Re-sync started |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
163 |
Trap MIB Name |
ebdaResyncActive |
8083
Explanation
A user-initiated resynchronization has completed successfully.
Recovery
No action required; for information only.
Event Details
Table B-146 Event 8083 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Re-sync LNP DB Synchronization Complete |
Surveillance Notification |
|
Text |
NE <CLLI> Re-sync completed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
164 |
Trap MIB Name |
ebdaResyncSuccess |
8084
Explanation
A user-initiated resynchronization has failed before it completed.
Recovery
Inspect the contents of the file /var/TKLC/lsms/logs/<CLLI>/LsmsResync.log.MMDD to determine the cause of the error. After clearing the cause of the error, start the user-initiated resynchronization again.
Event Details
Table B-147 Event 8084 Details
GUI Notification |
|
Severity |
Critical |
Text |
Re-sync LNP DB Synchronization Failed |
Surveillance Notification |
|
Text |
NE <CLLI> Re-sync failed |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
165 |
Trap MIB Name |
ebdaResyncFailure |
8085
Explanation
The user has stopped a user-initiated resynchronization before it completed.
Recovery
No action required; for information only.
Event Details
Table B-148 Event 8085 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Re-sync LNP DB Synchronization Aborted |
Surveillance Notification |
|
Text |
NE <CLLI> Re-sync aborted |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
166 |
Trap MIB Name |
ebdaResyncAbortedByUser |
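For a monitoring script, the bulk load and resynchronization events 8078 through 8085 can be collected into a single lookup. The following is a minimal sketch that copies the trap IDs, MIB names, and failure log file names from Tables B-141 through B-148 and the associated recovery text; the type and function names are hypothetical.

```python
from typing import NamedTuple, Optional


class TrapInfo(NamedTuple):
    trap_id: int
    mib_name: str
    log_file: Optional[str]  # log to inspect on failure, None when informational


# Event number -> trap details, transcribed from the event tables above.
EBDA_TRAPS = {
    8078: TrapInfo(159, "ebdaBulkLoadActive", None),
    8079: TrapInfo(160, "ebdaBulkLoadSuccess", None),
    8080: TrapInfo(161, "ebdaBulkLoadFailure", "LsmsBulkLoad.log.MMDD"),
    8081: TrapInfo(162, "ebdaBulkLoadAbortedByUser", None),
    8082: TrapInfo(163, "ebdaResyncActive", None),
    8083: TrapInfo(164, "ebdaResyncSuccess", None),
    8084: TrapInfo(165, "ebdaResyncFailure", "LsmsResync.log.MMDD"),
    8085: TrapInfo(166, "ebdaResyncAbortedByUser", None),
}


def needs_recovery(event_number: int) -> bool:
    """True when the event has a failure log to inspect."""
    info = EBDA_TRAPS.get(event_number)
    return info is not None and info.log_file is not None


print(EBDA_TRAPS[8080].mib_name, needs_recovery(8080))
```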
8086
Explanation
This notification indicates that the Sprint lddAgent has failed to communicate with the Sprint Legacy System.
Recovery
No action required; for information only.
Event Details
Table B-149 Event 8086 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
FAILED:IddAgent legacy |
Source |
Both servers |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
167 |
Trap MIB Name |
sprintIddAgentCommFailureLegSys |
8087
Explanation
This notification indicates that the Sprint lddAgent has failed to communicate with the Sprint SCPMS System.
Recovery
Contact technical support for assistance.
Event Details
Table B-150 Event 8087 Details
GUI Notification |
|
Severity |
None |
Text |
|
Surveillance Notification |
|
Text |
FAILED:IddAgent scpms |
Source |
Both servers |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
168 |
Trap MIB Name |
sprintIddAgentCommFailureScpmsSys |
8088
Explanation
A scheduled file transfer has failed.
Recovery
Inspect the error log file /var/TKLC/lsms/logs/aft/aft.log.MMDD for details as to the cause of the error.
Event Details
Table B-151 Event 8088 Details
GUI Notification |
|
Severity |
Major |
Text |
Automatic File Transfer Failure - See Log for Details |
Surveillance Notification |
|
Text |
Notify:Sys Admin- Auto xfer Failure |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
171 |
Trap MIB Name |
automaticFileTransferFeatureFailure |
8089
Explanation
An NPA-NXX split activation completed successfully.
Recovery
No action required; for information only.
Event Details
Table B-152 Event 8089 Details
GUI Notification |
|
Severity |
Cleared |
Text |
Activate Split Successful OldNPA=<old_NPA> NewNPA=<new_NPA> NXX=<NXX> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
10 |
Trap MIB Name |
npaSplitActOk |
8090
Explanation
An NPA-NXX split activation failed.
Recovery
Perform an audit and reconcile of NPA Split information at the network element.
Event Details
Table B-153 Event 8090 Details
GUI Notification |
|
Severity |
Critical |
Text |
Activate Split Failed OldNPA=<old_NPA> NewNPA=<new_NPA> NXX=<NXX> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
172 |
Trap MIB Name |
npaSplitActFailed |
8091
Explanation
At least one active NPA-NXX split is past its end date and needs to be deleted.
Recovery
Do the following:
-
View all split objects (for information, refer to the Database Administrator's Guide) to determine which objects have end dates that have already passed.
-
Delete the objects whose end dates have passed (for information, refer to the Database Administrator's Guide).
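The first step above amounts to filtering split objects by end date. The following is a minimal sketch of that comparison, assuming the split data has already been exported into records; the Split record and its field names are hypothetical and do not reflect how the LSMS stores split objects.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class Split:
    # Hypothetical record layout for an exported NPA-NXX split object.
    old_npa: str
    new_npa: str
    nxx: str
    end_date: date


def splits_past_end_date(splits: Iterable[Split], today: date) -> List[Split]:
    """Return the splits whose end date is strictly before today."""
    return [s for s in splits if s.end_date < today]


# Invented sample data for illustration.
splits = [
    Split("919", "984", "460", date(2024, 1, 31)),
    Split("919", "984", "461", date(2024, 12, 31)),
]
for s in splits_past_end_date(splits, date(2024, 6, 1)):
    print(s.old_npa, s.new_npa, s.nxx, s.end_date.isoformat())
```

The deletion itself is still performed through the procedure in the Database Administrator's Guide.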
Event Details
Table B-154 Event 8091 Details
GUI Notification |
|
Severity |
Major |
Text |
Active Splits Are Past Their End Dates |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
173 |
Trap MIB Name |
activeSplitsPastEndDates |
8092
Explanation
This notification indicates that the LDD SCPMS agent is switching from the primary to the backup SCPMS system.
Recovery
No action required; this notification is for information only.
Event Details
Table B-155 Event 8092 Details
GUI Notification |
|
Severity |
Critical |
Text |
LDD SCPMS Agent Switching from Primary to Backup SCPMS System |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
174 |
Trap MIB Name |
lddScpmsAgentSwitchToBackupScpms |
8093
Explanation
This notification indicates that the LDD SCPMS agent is switching from the backup to the primary SCPMS system.
Recovery
No action required; this notification is for information only.
Event Details
Table B-156 Event 8093 Details
GUI Notification |
|
Severity |
Critical |
Text |
LDD SCPMS Agent Switching from Backup to Primary SCPMS System |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
175 |
Trap MIB Name |
lddScpmsAgentSwitchFromBackupToPrim |
8094
Explanation
This notification indicates the LDD SCPMS current system is primary SCPMS.
Recovery
No action required; this notification is for information only.
Event Details
Table B-157 Event 8094 Details
GUI Notification |
|
Severity |
Cleared |
Text |
LDD SCPMS Current System is Primary SCPMS |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
176 |
Trap MIB Name |
lddScpmsPrimary |
8095
Explanation
This notification indicates the LDD SCPMS current system is backup SCPMS.
Recovery
No action required; this notification is for information only.
Event Details
Table B-158 Event 8095 Details
GUI Notification |
|
Severity |
Cleared |
Text |
LDD SCPMS Current System is Backup SCPMS |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
177 |
Trap MIB Name |
lddScpmsBackup |
8096
Explanation
The EMS/NE has rejected the NPANXX Split operation indicated by <operation>, and the NPANXX value in the transaction could not be determined.
Recovery
Look in the transaction log file, /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD, and locate the NE’s response to the NPANXX Split command to determine why the command failed. Delete and re-enter the NPANXX Split data correctly, which will cause the LSMS to try the command again.
Event Details
Table B-159 Event 8096 Details
GUI Notification |
|
Severity |
Major |
Text |
<CLLI>: NPANXX Split <operation> Failed |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
178 |
Trap MIB Name |
EmsNeRejNpaNxxSplitNotDetermined |
8097
Explanation
The EMS/NE has rejected the NPANXX Split operation indicated by <operation> for the indicated NPANXX value.
Recovery
Look in the transaction log file, /var/TKLC/lsms/logs/<CLLI>/LsmsTrans.log.MMDD, and locate the NE’s response to the NPANXX Split command to determine why the command failed. Delete and re-enter the NPANXX Split data correctly, which will cause the LSMS to try the command again.
Event Details
Table B-160 Event 8097 Details
GUI Notification |
|
Severity |
Major |
Text |
<CLLI>: NPANXX Split <operation> Failed for New NPANXX <NPANXX> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
179 |
Trap MIB Name |
EmsNeRejectedNpaNxxSplit |
8098
Explanation
The LSMS is not able to confirm the physical connectivity with the directly connected query server identified by <hostname>. The problem may be one of the following:
-
Physical connectivity issues between the LSMS and the directly connected query server.
-
The query server host name is not associated with the appropriate Internet Protocol (IP) address in the /etc/hosts file.
-
The IP address specified for the special replication user for the query server is incorrect.
-
The proper TCP/IP ports are not open in the firewall(s) between the LSMS and the query servers.
Recovery
-
Check the physical connectivity of the LSMS to the query server.
-
Check that the query server host name is associated with the corresponding Internet Protocol (IP) address in the /etc/hosts file.
-
Verify that the IP address for the query server is correct. Display the IP address of all configured query servers by using the $LSMS_TOOLS_DIR/lsmsdb -c queryservers command.
-
Verify that the firewall TCP/IP port configuration is set correctly for both the LSMS and query servers directly connected to the LSMS (refer to Appendix A, “Configuring the Query Server,” of the Configuration Guide for information about port configuration for firewall protocol filtering).
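The host name check in the steps above can be scripted. The following is a minimal sketch that looks up a host name in text using the standard hosts-file layout (an address followed by one or more names, with # starting a comment). The function name, the sample host names, and the sample addresses are hypothetical; in practice the text would be read from /etc/hosts.

```python
from typing import List


def addresses_for_host(hosts_text: str, hostname: str) -> List[str]:
    """Return every address that the hosts-file text associates with hostname."""
    addresses = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if hostname in names:
            addresses.append(address)
    return addresses


# Invented sample content; a real check would read /etc/hosts instead.
sample = """\
127.0.0.1     localhost
192.168.1.20  queryserver1 qs1   # hypothetical query server entry
"""
print(addresses_for_host(sample, "queryserver1"))
```

An empty result means that the host name has no entry, which corresponds to the second cause listed in the explanation. The address found can then be compared with the output of the lsmsdb command in the third step.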
Event Details
Table B-161 Event 8098 Details
GUI Notification |
|
Severity |
Major |
Text |
|
Surveillance Notification |
|
Text |
|
Source |
Active Server |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
SNMP Trap |
|
Trap ID |
180 |
Trap MIB Name |
physicalConnectivityWithQueryServerLost |
8099
Explanation
The query server identified by <hostname> does not have a replication connection established with the LSMS. The problem may be one of the following:
-
Query server cannot establish a connection with the master.
-
Query server not properly configured to connect to the master.
-
A query that succeeded on the master failed on the query server.
-
The binary log(s) that are needed by the query server to resynchronize itself to its master no longer exist.
-
Data on the query server does not agree with what is on the master when the binary log was started.
-
Replication was stopped at the query server by a user.
Recovery
-
At the query server, perform the following substeps:
-
Start the MySQL command line utility on the slave server:
# cd /opt/mysql/mysql/bin
# mysql -u root -p
Enter password: <Query Server's MySQL root user password>
-
Determine whether the query server is running by entering the following command and looking at the Slave_IO_Running and Slave_SQL_Running column values.
mysql> SHOW SLAVE STATUS \G;
-
If the Slave_IO_Running and Slave_SQL_Running column values show that the slave is not running, verify the query server's /usr/mysql1/my.cnf option file (refer to “MySQL Replication Configuration for Query Servers,” in Appendix A, “Configuring the Query Server,” of the Configuration Guide) and check the error log (/usr/mysql1/<hostname>.err) for messages.
-
If the Slave_IO_Running and Slave_SQL_Running column values show that the slave (query server) is running, enter the following command to verify whether the slave established a connection with the master (LSMS or another query server acting as a master/slave).
mysql> SHOW PROCESSLIST;
Find the thread with the system user value in the User column and none in the Host column, and check the State column. If the State column says “connecting to master,” verify that the master hostname is correct, that the DNS is properly set up, whether the master is actually running, and whether it is reachable from the slave (refer to Appendix A, “Configuring the Query Server,” of the Configuration Guide for information about port configuration for firewall protocol filtering if the master and slave are connecting through a firewall).
-
If the slave was running, but then stopped, enter the following command:
mysql> SHOW SLAVE STATUS;
Look at the output. This error can happen when some query that succeeded on the master fails on the slave, but this situation should never happen while the replication is active if you have taken a proper snapshot of the master and never modify the data on the slave outside of the slave thread.
-
However, if this is not the case, or if the failed items are not needed and there are only a few of them, try the following:
-
First see if there is some stray record in the way on the query server. Understand how it got there, then delete it from the query server database and run start slave.
-
If the above does not work or does not apply, try to understand if it would be safe to make the update manually (if needed) and then ignore the next query from the LSMS.
-
If you have decided you can skip the next query, enter one of the following command sequences:
-
To skip a query that uses AUTO_INCREMENT or LAST_INSERT_ID(), enter:
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=2;
mysql> start slave;
Queries that use AUTO_INCREMENT or LAST_INSERT_ID() take two events in the binary log of the master.
-
Otherwise, enter:
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
mysql> start slave;
-
If you are sure the query server database started out perfectly in sync with the LSMS database, and no one has updated the tables involved outside of the slave thread, contact technical support so you will not have to do the above steps again.
-
If all else fails, read the error log, /usr/mysql1/<hostname>.err. If the log is big, run the following command on the slave:
grep -i slave /usr/mysql1/<hostname>.err
(There is no generic pattern to search for on the master, as the only errors it logs are general system errors. If it can, the master will send the error to the slave when things go wrong.)
-
If the error log on the slave conveys that it could not find a binary log file, this indicates that the binary log files on the master have been removed (purged). Binary logs are periodically purged from the master to prevent them from growing unbounded and consuming large amounts of disk resources. However, if a query server was not replicating and one of the binary log files it wants to read is purged, it will be unable to replicate once it comes up. If this occurs, the query server is required to be reset with another snapshot of data from the master or another query server (see “Reload a Query Server Database from the LSMS” and “Reload a Query Server Database from Another Query Server”).
-
When you have determined that there is no user error involved, and replication still either does not work at all or is unstable, contact technical support.
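The check of the Slave_IO_Running and Slave_SQL_Running values in the procedure above can be automated once the output of SHOW SLAVE STATUS \G has been captured as text. The following is a minimal sketch, assuming the usual vertical layout of one "name: value" pair per line; the function names and the sample output are illustrative only.

```python
from typing import Dict


def parse_vertical_output(output: str) -> Dict[str, str]:
    """Collect "name: value" pairs from vertical (\\G) mysql output."""
    fields = {}
    for line in output.splitlines():
        name, separator, value = line.partition(":")
        if separator:  # row-header lines contain no colon and are skipped
            fields[name.strip()] = value.strip()
    return fields


def replication_running(output: str) -> bool:
    """True only when both replication threads report Yes."""
    fields = parse_vertical_output(output)
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
    )


# Abbreviated, invented sample of captured output.
sample = """\
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
           Slave_IO_Running: Yes
          Slave_SQL_Running: No
"""
print(replication_running(sample))
```

When the function returns False, continue with the option file and error log checks described in the procedure.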
Event Details
Table B-162 Event 8099 Details
GUI Notification |
|
Severity |
Major |
Text |
Query Server <hostname> Replication Connection Lost |
Surveillance Notification |
|
Text |
Query Server=<hostname> Replication Conn Lost |
Source |
Active Server |
Frequency |
As soon as condition occurs, and at five-minute intervals as long as condition exists |
SNMP Trap |
|
Trap ID |
181 |
Trap MIB Name |
queryServerConnectionWithLsmsLost |
8100
Explanation
The SV/NPB storage database has exceeded the configured percent usage threshold.
Recovery
Contact technical support.
Event Details
Table B-163 Event 8100 Details
GUI Notification |
|
Severity |
Event |
Text |
SV/NPB Storage Exceeds <%> percent |
Surveillance Notification |
|
Text |
Notify:Sys Admin - SV/NPB threshold % |
Source |
Both servers |
Frequency |
Every 5 minutes after condition occurs |
Trap |
|
Trap ID |
194 |
Trap MIB Name |
svNpbPercentUsage |
8101
Explanation
This event indicates that the SV/NPB storage database usage is below the configured percent usage threshold.
Recovery
No action is required.
Event Details
Table B-164 Event 8101 Details
GUI Notification |
|
Severity |
Cleared |
Text |
SV/NPB storage falls below <%> percent |
Surveillance Notification |
|
Text |
Notify: Sys Admin - SV/NPB cleared |
Source |
Both servers |
Frequency |
As soon as condition clears |
Trap |
|
Trap ID |
207 |
Trap MIB Name |
svNpbBelowLimit |
8102
Explanation
The event number present in the untilClear filter list is cleared. The event number is removed from the untilClear filter list.
Recovery
No action is required.
Event Details
Table B-165 Event 8102 Details
GUI Notification |
|
Severity |
Event |
Text |
<Event number> in the untilClear filter list, event clear received at <%s> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
None |
Trap MIB Name |
8103
Explanation
The alarm filter counter has reached its limit; the counter will start again from one.
Recovery
No action is required.
Event Details
Table B-166 Event 8103 Details
GUI Notification |
|
Severity |
Event |
Text |
Counter associated with event <event number> exceeds limit <%s>. Resetting counter. |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
None |
Trap MIB Name |
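The wrap-around behavior reported by event 8103 can be pictured as a counter that restarts at one once its limit has been reached. The following is a minimal sketch of that idea only; the class name, the limit value, and the exact point at which the LSMS resets its counter are assumptions.

```python
class FilterCounter:
    """Counts occurrences of one event and wraps to 1 after reaching the limit."""

    def __init__(self, limit: int) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.value = 0

    def increment(self) -> bool:
        """Count one occurrence; return True when the counter was reset."""
        if self.value >= self.limit:
            # Assumption: the occurrence that would exceed the limit becomes count 1.
            self.value = 1
            return True
        self.value += 1
        return False


counter = FilterCounter(limit=3)  # hypothetical limit
for _ in range(4):
    wrapped = counter.increment()
    print(counter.value, wrapped)
```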
8104
Explanation
The event number present in the untilTimeout filter list is cleared. The event number is removed from the untilTimeout filter list.
Recovery
No action is required.
Event Details
Table B-167 Event 8104 Details
GUI Notification |
|
Severity |
Event |
Text |
<Event number> in the untilTimeout filter list, event timeout at <%s> |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
None |
Trap MIB Name |
8105
Explanation
The log capture started by the user has failed.
Recovery
Contact technical support.
Event Details
Table B-168 Event 8105 Details
GUI Notification |
|
Severity |
Minor |
Text |
Logs Capture Failed |
Surveillance Notification |
|
Text |
None |
Source |
|
Frequency |
|
Trap |
|
Trap ID |
None |
Trap MIB Name |
8106
Explanation
The MySQL Port has been updated. The LSMS application must be restarted.
Recovery
The application must be restarted. Restart the LSMS application first on the active server and then on the standby server. For more information, refer to the Configuration Guide.
Event Details
Table B-169 Event 8106 Details
GUI Notification |
|
Severity |
Event |
Text |
MySQL Port changed from <%s> to <%s>. LSMS application restart required. |
Surveillance Notification |
|
Text |
Notify: Sys Admin - LSMS restart required |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
208 |
Trap MIB Name |
mysqlPortUpdated |
8107
Explanation
The MySQL Port has been updated. The Query Server configuration needs to be updated with the new MySQL port.
Recovery
Configure the Query Server with the updated MySQL port. For more information, refer to the Configuration Guide.
Event Details
Table B-170 Event 8107 Details
GUI Notification |
|
Severity |
Event |
Text |
MySQL Port changed from <%s> to <%s>. Query Server configuration updated required. |
Surveillance Notification |
|
Text |
Notify: Sys Admin - QS updated required |
Source |
Active server |
Frequency |
Once, as soon as condition occurs |
Trap |
|
Trap ID |
209 |
Trap MIB Name |
queryServerResetConfiguration |
8108
Explanation
At least one of the connected Query Servers is out of sync, and the binary logs cannot be purged without user confirmation.
Recovery
Log in as root and enter the following command:
pruneBinaryLogs -force
Event Details
Table B-171 Event 8108 Details
GUI Notification |
|
Severity |
Minor |
Text |
Automatic purging of binary logs cannot be done. User confirmation required. |
Surveillance Notification |
|
Text |
Notify: Sys Admin - Purge need confirmation |
Source |
Both servers |
Frequency |
Every 45 minutes |
Trap |
|
Trap ID |
210 |
Trap MIB Name |
purgeConfirmRequired |
8109
Explanation
Disk usage is reaching the capacity threshold, and an automatic purge of binary logs is imminent.
Recovery
No action is required.
Event Details
Table B-172 Event 8109 Details
GUI Notification | |
Severity | Minor |
Text | Disk usage reaching <%> percent. Purging of binary logs is imminent. |
Surveillance Notification | |
Text | Notify: Sys Admin - Purging is imminent |
Source | Both servers |
Frequency | Every 45 minutes |
Trap | |
Trap ID | 211 |
Trap MIB Name | purgeImminent |
8110
Explanation
Logs capture has been started by the user.
Recovery
No action is required.
Event Details
Table B-173 Event 8110 Details
GUI Notification | |
Severity | Cleared |
Text | Logs Capture Started |
Surveillance Notification | |
Text | None |
Source | |
Frequency | |
Trap | |
Trap ID | None |
Trap MIB Name | |
8111
Explanation
The logs capture started by the user completed successfully.
Recovery
No action is required.
Event Details
Table B-174 Event 8111 Details
GUI Notification | |
Severity | Minor |
Text | Logs Captured Successfully |
Surveillance Notification | |
Text | None |
Source | |
Frequency | |
Trap | |
Trap ID | None |
Trap MIB Name | |
8112
Explanation
The cron job was unable to restart syscheck automatically.
Recovery
Contact the Customer Care Center.
Event Details
Table B-175 Event 8112 Details
GUI Notification | |
Severity | Event |
Text | Failed to restart syscheck services |
Surveillance Notification | |
Text | None |
Source | |
Frequency | |
Trap | |
Trap ID | None |
Trap MIB Name | |
8116
Explanation
The HTTP protocol is enabled but secure HTTP (HTTPS) is recommended.
Recovery
For information on configuring the protocols, see Starting an Web-Based LSMS GUI Session.
Event Details
Table B-176 Event 8116 Details
GUI Notification | |
Severity | Event |
Text | HTTP is enabled and it is recommended to use HTTPS. |
Surveillance Notification | |
Text | None |
Source | |
Frequency | |
Trap | |
Trap ID | None |
Trap MIB Name | |
8117
Explanation
HTTP is disabled and HTTPS is enabled.
Recovery
No recovery required; only HTTPS is enabled now.
Event Details
Table B-177 Event 8117 Details
GUI Notification | |
Severity | Event |
Text | Only HTTPS is enabled now. |
Surveillance Notification | |
Text | None |
Source | |
Frequency | |
Trap | |
Trap ID | None |
Trap MIB Name | |
8118
Explanation
Both HTTP and HTTPS are enabled, but using only HTTPS is recommended.
Recovery
For information on configuring the protocols, see Starting an Web-Based LSMS GUI Session.
Event Details
Table B-178 Event 8118 Details
GUI Notification | |
Severity | Event |
Text | Both HTTP and HTTPS are enabled and it is recommended to use HTTPS. |
Surveillance Notification | |
Text | None |
Source | |
Frequency | |
Trap | |
Trap ID | None |
Trap MIB Name | |
Additional Trap Information
Trap Id | Trap MIB Name | Notification Description | Trap variables def | Retry Interval | Severity | Event Num | GUI Event Text | Pair Event Num |
---|---|---|---|---|---|---|---|---|
25 | dataReplInfo | This notification indicates that database replication is delayed. | eventNbr = Oracle specific unique identifier for event notification. This eventNbr field can be used to reference Oracle documentation. dbReplInfo = Info message from database replication. | Every 5 mins | event_notif_event | 4011 | DB Repl Info - %s | 0 |
201 | snapInvalidErr | This notification indicates that the Invalid Snapshot has been detected. | eventNbr = Oracle specific unique identifier for event notification. This eventNbr field can be used to reference Oracle documentation. snapName = Name of the invalid snapshot. | Every 30 mins | event_notif_critical | 4034 | Invalid Snapshot - %s | 4035 |
203 | snapFullErr | This notification indicates that the Snapshot is greater than 80% full. | eventNbr = Oracle specific unique identifier for event notification. This eventNbr field can be used to reference Oracle documentation. snapName = Name of the invalid/hanging snapshot. | Every 30 mins | event_notif_critical | 4036 | Full Snapshot - %s | 4037 |
Trap Id | Trap MIB Name | Notification Description | Frequency | Source | Clearing behavior |
---|---|---|---|---|---|
212 | resyncStartTrap | The trap is sent by the LSMS to NMS when the LSMS is about to start resynchronization | Every time when starting a resynchronization with a NMS | /vobs/lsms/apps/snmp/ lsmsSNMPResyncHandler.pl | None |
213 | resyncStopTrap | The trap is sent by the LSMS to NMS when resynchronization is complete | Every time when a resynchronization with a NMS is complete | /vobs/lsms/apps/snmp/ lsmsSNMPResyncHandler.pl | None |
214 | resyncRejectTrap | The trap is sent by the LSMS to NMS when a resynchronization request is rejected by LSMS | Every time when a resynchronization request is initialized while an existing resynchronization is still being processed | /vobs/lsms/apps/snmp/ lsmsSNMPResyncHandler.pl | None |
215 | resyncRequiredTrap | The trap is sent by the LSMS to NMS when the LSMS is rebooted or LSMS is started | Every time when LSMS is rebooted or restarted | /vobs/lsms/apps/snmp/ lsmsSNMPResyncHandler.pl | None |
216 | heartBeatTrap | The trap is sent by the LSMS to NMS periodically to indicate that the LSMS is up | Per the configured value in seconds (0, 5-7200), where 0 indicates the heartbeat trap is disabled. | /vobs/lsms/apps/snmp/ lsmsSnmpHeartbeatSender.pl | None |
217 | lsmsAlarmTrapV3 | The trap will indicate that the following information is for a particular event | Every v3 trap message sent to NMS will carry this OID | /vobs/lsms/apps/snmp/ lsmsSNMPResyncHandler.pl | None |
218 | resyncErrCode | errorCode = 0, Resynchronization completed successfully. errorCode = 1, Resynchronization aborted by NMS. errorCode = 2, Resynchronization already in progress for the NMS. errorCode = 3, Resynchronization Aborted, Database error occurred. errorCode = 4, Resynchronization not in progress. | Every time when either resyncStopTrap or resyncRejectTrap sent to NMS | /vobs/lsms/apps/snmp/ lsmsSNMPResyncHandler.pl | None |
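The resyncErrCode values and the heartbeat interval range listed in the table above can be captured in a small lookup, for example in an NMS-side script that interprets received traps. The following Python sketch is illustrative only: the names `RESYNC_ERROR_CODES`, `describe_resync_error`, and `is_valid_heartbeat_interval` are hypothetical and are not part of the LSMS software.

```python
# Meanings of the errorCode variable carried by resyncErrCode (trap 218),
# copied from the table above.
RESYNC_ERROR_CODES = {
    0: "Resynchronization completed successfully.",
    1: "Resynchronization aborted by NMS.",
    2: "Resynchronization already in progress for the NMS.",
    3: "Resynchronization Aborted, Database error occurred.",
    4: "Resynchronization not in progress.",
}


def describe_resync_error(code):
    """Return the documented text for an errorCode, or a fallback string."""
    return RESYNC_ERROR_CODES.get(code, f"Unknown errorCode {code}")


def is_valid_heartbeat_interval(seconds):
    """Check a heartbeat interval against the documented range.

    The table gives the range as (0, 5-7200) seconds, where 0 disables
    the heartbeat trap.
    """
    return seconds == 0 or 5 <= seconds <= 7200
```

A script that receives resyncStopTrap or resyncRejectTrap could pass the accompanying errorCode to `describe_resync_error` to log a readable message.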
Platform Alarms
This section describes the following:
How Platform Alarms Are Reported
Each server runs syscheck periodically and reports any problems found through platform alarms. The severity of platform alarms is one of the following:
- Critical, reported through event 4300
- Major, reported through event 4200
- Minor, reported through event 4100
When one or more problems in a given category have been found, the server reports one corresponding event notification to its Surveillance log and its Serial Port 3. If the server is not the active server, it also sends the event notification to the active server. The active server reports its own platform events to its own Surveillance log and to its Serial Port 3, and also sends an SNMP trap and displays a GUI notification for either its own platform events or the non-active server’s platform events.
Each of the events 4100, 4200, and 4300 contains a 16-character hexadecimal bitmasked string that indicates all of the platform events in that category that currently exist. To decode which platform events exist, use the procedure described in “How to Decode Platform Alarms”.
Each time the combination of platform events in a given category changes, a new event is reported. Following is an example of how platform events are reported:
-
At first, only one major platform event is reported on the standby server. A 4200 event with the alarm number of the event is reported.
-
One minute later, another platform event exists on the standby server (and the first one still exists). Another 4200 event is reported, with a bitmasked string that indicates both of the platform events that exist.
-
One minute later, another platform event exists on the standby server (and the previous ones still exist). Another 4200 event is reported, with a bitmasked string that indicates all of the platform events that exist.
-
One minute later, the first platform event is cleared. Another 4200 event is reported, with a bitmasked string that indicates the two platform events that still exist.
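The reporting sequence in the example above can be illustrated with a short Python sketch that splits a bitmasked string into individual alarm numbers and compares two successive reports. It assumes that the first hexadecimal digit identifies the category (the alarm numbers in this appendix begin with 3 for major and 5 for minor alarms) and that the remaining 15 digits form the bitmask. The function names are hypothetical, and the documented decoding procedure remains the authoritative method.

```python
def decode_platform_alarms(alarm_string):
    """Return the single-bit alarm numbers set in a 16-character alarm string."""
    if len(alarm_string) != 16:
        raise ValueError("expected a 16-character hexadecimal string")
    category = alarm_string[0]        # assumed category digit, for example 3 or 5
    mask = int(alarm_string[1:], 16)  # remaining 15 digits treated as a bitmask
    alarms = []
    bit = 1
    while bit <= mask:
        if mask & bit:
            # Rebuild the 16-character alarm number for this single bit.
            alarms.append(f"{category}{bit:015X}")
        bit <<= 1
    return alarms


def diff_alarms(previous, current):
    """Return (raised, cleared) alarm numbers between two reported strings."""
    before = set(decode_platform_alarms(previous))
    after = set(decode_platform_alarms(current))
    return sorted(after - before), sorted(before - after)


# Two major alarms present, then the first one clears.
print(decode_platform_alarms("3000000000000003"))
print(diff_alarms("3000000000000003", "3000000000000002"))
```

In this sketch, the string 3000000000000003 decodes to alarms 3000000000000001 and 3000000000000002, and the second call reports that 3000000000000001 has cleared.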
How to Decode Platform Alarms
Use the following procedure to determine all the platform alarms that exist in a given category:
Platform Alarms
Platform errors are grouped by category and severity. The categories are listed from most to least severe:
Table B-179 shows the alarm numbers and alarm text for all alarms generated by the MPS platform. The order within a category is not significant. Some of the alarms described are not available with specific configurations.
Table B-179 Platform Alarms
Alarm Recovery Procedures
This section provides recovery procedures for the MPS, listed by alarm category and Alarm Code (alarm data string) within each category.
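For the entries in this section that list both an OID and an Alarm ID, the numeric suffix of each appears to equal the bit position of the alarm data string plus one (for example, 3000000000000100 has bit 8 set and is listed with OID suffix 9 and Alarm ID TKSPLATMA9). The Python sketch below encodes that observed pattern. It is an inference from the listed entries, not a documented rule, and the function name `describe_alarm` is hypothetical.

```python
# OID prefixes as they appear in the major and minor alarm entries below.
MAJOR_OID_PREFIX = "1.3.6.1.4.1.323.5.3.18.3.1.2"
MINOR_OID_PREFIX = "1.3.6.1.4.1.323.5.3.18.3.1.3"


def describe_alarm(alarm_number):
    """Derive severity, Alarm ID, and OID from a single-bit alarm number.

    Assumes the first digit is the category (3 = major, 5 = minor) and
    that exactly one bit is set in the remaining 15 hexadecimal digits.
    """
    category = alarm_number[0]
    mask = int(alarm_number[1:], 16)
    if mask == 0 or mask & (mask - 1):
        raise ValueError("expected exactly one bit set")
    index = mask.bit_length()  # bit position + 1
    if category == "3":
        severity, tag, prefix = "Major", "TKSPLATMA", MAJOR_OID_PREFIX
    elif category == "5":
        severity, tag, prefix = "Minor", "TKSPLATMI", MINOR_OID_PREFIX
    else:
        raise ValueError("unsupported category digit")
    return {
        "severity": severity,
        "alarm_id": f"{tag}{index}",
        "oid": f"{prefix}.{index}",
    }


print(describe_alarm("3000000000000100"))
```

Treat the result as a cross-check against the values printed in each alarm entry rather than as a replacement for them.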
Major Platform Alarms
Major platform alarms involve hardware components, memory, and network connections.
3000000000000001 – Server fan failure
Alarm Type: TPD
Description: This alarm indicates that a fan in the EAGLE fan tray in the EAGLE shelf where the E5-APP-B is "jacked in" is either failing or has failed completely. In either case, there is a danger of component failure due to overheating.
Severity: Major
OID: TpdFanErrorNotify 1.3.6.1.4.1.323.5.3.18.3.1.2.1
Alarm ID: TKSPLATMA13000000000000001
Recovery
Note:
3000000000000002 - Server Internal Disk Error
This alarm indicates that the server is experiencing issues replicating data to one or more of its mirrored disk drives. This could indicate that one of the server disks has failed or is approaching failure.
Recovery
- Run syscheck in Verbose mode.
- Call the Customer Care Center and provide the system health check output.
3000000000000008 - Server Platform Error
This alarm indicates a major platform error such as a corrupt system configuration or missing files, or indicates that syscheck itself is corrupt.
Recovery
- Run syscheck in Verbose mode.
- Call the Customer Care Center and provide the system health check output.
3000000000000010 - Server File System Error
This alarm indicates that syscheck was unsuccessful in writing to at least one of the server file systems.
Recovery
- Call the Customer Care Center for assistance.
3000000000000020 - Server Platform Process Error
This alarm indicates that either fewer than the minimum number of instances of a required process are running or too many instances of a required process are running.
Recovery
- Contact the Customer Care Center for recovery procedures.
3000000000000080 - Server Swap Space Shortage Failure
This alarm indicates that the server’s swap space is in danger of being depleted. This is usually caused by a process that has allocated a very large amount of memory over time.
Note:
In order for this alarm to clear, the underlying failure condition must be consistently undetected for a number of polling intervals. Therefore, the alarm may continue to be reported for several minutes after corrective actions are completed.
Recovery
- Call the Customer Care Center for assistance.
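When gathering information for a support call about swap space, it can help to record how much swap is in use. The Python sketch below computes the percentage from text in the format of the Linux `/proc/meminfo` file. It is a diagnostic illustration only: the function names are hypothetical, and this appendix does not state the threshold at which syscheck raises the alarm.

```python
def swap_used_percent(meminfo_text):
    """Compute swap usage (percent) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])  # value in kB
    total = values.get("SwapTotal", 0)
    free = values.get("SwapFree", 0)
    if total == 0:
        return 0.0  # no swap configured
    return 100.0 * (total - free) / total


def read_swap_used_percent(path="/proc/meminfo"):
    """Read the given file (Linux only) and return swap usage in percent."""
    with open(path) as handle:
        return swap_used_percent(handle.read())
```

Calling `read_swap_used_percent()` on the server at intervals shows whether swap usage is still growing.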
3000000000000100 - Server provisioning network error
Alarm Type: TPD
Note:
The interface identified as eth01 on the hardware is identified as eth91 by the software (in syscheck output, for example).
Severity: Major
OID: TpdProvNetworkErrorNotify 1.3.6.1.4.1.323.5.3.18.3.1.2.9
Alarm ID: TKSPLATMA93000000000000100
Recovery
- Check the physical network connectivity between the LSMS and the NAS.
- Contact the Customer Care Center.
3000000000001000 - Server Disk Space Shortage Error
This alarm indicates that one of the following conditions has occurred:
-
A file system has exceeded a failure threshold, which means that more than 90% of the available disk storage has been used on the file system.
-
More than 90% of the total number of available files have been allocated on the file system.
-
A file system has a different number of blocks than it had when installed.
Recovery
3000000000002000 - Server Default Route Network Error
This alarm indicates that the default network route of the server is experiencing a problem. Running syscheck in Verbose mode will provide information about the type of problem.
Caution:
When changing the network routing configuration of the server, verify that the modifications will not impact the method of connectivity for the current login session. The route information must be entered correctly and set to the correct values. Incorrectly modifying the routing configuration of the server may result in total loss of remote network access.
Recovery
3000000000004000 - Server Temperature Error
Alarm Type: TPD
Description: The internal temperature within the server is unacceptably high.
Severity: Major
OID: TpdTemperatureErrorNotify 1.3.6.1.4.1.323.5.3.18.3.1.2.15
Alarm ID: TKSPLATMA153000000000004000
Recovery
3000000000008000 - Server Mainboard Voltage Error
This alarm indicates that at least one monitored voltage on the server mainboard is not within the normal operating range.
Recovery
- Contact the Customer Care Center for assistance.
3000000000010000 - Server Power Feed Error
This alarm indicates that one of the power feeds to the server has failed.
Recovery
3000000000020000 - Server Disk Health Test Error
This alarm indicates that the hard drive has failed or failure is imminent.
Recovery
- Immediately contact the Customer Care Center for assistance with a disk replacement.
3000000000040000 - Server Disk Unavailable Error
This alarm indicates that the smartd service is not able to read the disk status because the disk has other problems that are reported by other alarms. This alarm appears only while a server is booting.
Recovery
- Perform the recovery procedures for the other alarms that accompany this alarm.
3000000000080000 - Device Error
This alarm indicates that the offboard storage server has a problem with its disk volume filling.
Recovery
- Call the Customer Care Center for assistance.
3000000000100000 - Device Interface Error
This alarm indicates that the IP bond is either not configured or not functioning.
Recovery
- Call the Customer Care Center for assistance.
3000000400000000 - Multipath device access link problem
Alarm Type: TPD
Description: One or more "access paths" of a multipath device are failing or are not healthy, or the multipath device does not exist.
Severity: Major
OID: TpdMpathDeviceProblemNotify 1.3.6.1.4.1.323.5.3.18.3.1.2.35
Alarm ID: TKSPLATMA353000000400000000
Recovery
- The Customer Care Center should do the following:
- Contact the Customer Care Center.
3000000800000000 – Switch Link Down Error
This alarm indicates that the switch is reporting that the link is down. The link that is down is reported in the alarm. For example, port 1/1/2 is reported as 1102.
Recovery Procedure:
- Verify cabling between the offending port and remote side.
- Verify networking on the remote end.
- If the problem persists, contact the Customer Care Center to verify port settings on both the server and the switch.
3000001000000000 - Half-open Socket Limit
Alarm Type: TPD
Description: This alarm indicates that the number of half open TCP sockets has reached the major threshold. This problem is caused by a remote system failing to complete the TCP 3-way handshake.
Severity: Major
OID: tpdHalfOpenSocketLimit 1.3.6.1.4.1.323.5.3.18.3.1.2.37
Alarm ID: TKSPLATMA373000001000000000
Recovery
3000002000000000 - Flash Program Failure
Alarm Type: TPD
Description: This alarm indicates there was an error while trying to update the firmware flash on the E5-APP-B cards.
Severity: Major
OID: tpdFlashProgramFailure 1.3.6.1.4.1.323.5.3.18.3.1.2.38
Alarm ID: TKSPLATMA383000002000000000
Recovery
3000004000000000 - Serial Mezzanine Unseated
Alarm Type: TPD
Description: This alarm indicates the serial mezzanine board was not properly seated.
Severity: Major
OID: tpdSerialMezzUnseated 1.3.6.1.4.1.323.5.3.18.3.1.2.39
Alarm ID: TKSPLATMA393000004000000000
Recovery
3000000008000000 - Server HA Keepalive Error
This alarm indicates that the heartbeat process has failed to receive a heartbeat packet within the timeout period.
Recovery
- Determine if the mate server is currently operating. If the mate server is not operating, attempt to restore it to operation.
- Determine if the keepalive interface is operating.
- Determine if heartbeat is running (service TKLCha status).
- Call the Customer Care Center for assistance.
3000000010000000 - DRBD block device can not be mounted
This alarm indicates that DRBD is not functioning properly on the local server. The DRBD state (disk state, node state, or connection state) indicates a problem.
Recovery
- Call the Customer Care Center for assistance.
3000000020000000 - DRBD block device is not being replicated to peer
This alarm indicates that DRBD is not replicating to the peer server. Usually this alarm indicates that DRBD is not connected to the peer server. A DRBD Split Brain may have occurred.
Recovery
- Determine if the mate server is currently operating.
- Call the Customer Care Center for assistance.
3000000040000000 - DRBD peer needs intervention
This alarm indicates that DRBD is not functioning properly on the peer server. DRBD is connected to the peer server, but the DRBD state on the peer server is either unknown or indicates a problem.
Recovery
- Call the Customer Care Center for assistance.
Minor Platform Alarms
Minor platform alarms involve disk space, application processes, RAM, and configuration errors.
5000000000000001 - Server Disk Space Shortage Warning
This alarm indicates that one of the following conditions has occurred:
-
A file system has exceeded a warning threshold, which means that more than 80% (but less than 90%) of the available disk storage has been used on the file system.
-
More than 80% (but less than 90%) of the total number of available files have been allocated on the file system.
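The warning thresholds above, together with the 90% failure threshold described for alarm 3000000000001000, can be sketched in Python as follows. This is an illustration of the thresholds only, not the syscheck implementation. The function names are hypothetical, and the handling of a value of exactly 90% is an assumption because the text does not specify it.

```python
import os
import shutil


def classify_usage(percent_used):
    """Classify a usage percentage against the documented thresholds."""
    if percent_used > 90:
        return "error"    # more than 90%: Server Disk Space Shortage Error
    if percent_used > 80:
        return "warning"  # more than 80%: Server Disk Space Shortage Warning
    return "ok"


def filesystem_status(path="/"):
    """Return (space_status, file_status) for the file system holding path.

    Uses shutil.disk_usage for storage and os.statvfs (Unix only) for the
    number of allocated files (inodes).
    """
    usage = shutil.disk_usage(path)
    space_percent = 100.0 * usage.used / usage.total
    stats = os.statvfs(path)
    if stats.f_files:
        file_percent = 100.0 * (stats.f_files - stats.f_ffree) / stats.f_files
    else:
        file_percent = 0.0  # file system does not report a file count
    return classify_usage(space_percent), classify_usage(file_percent)
```

Running `filesystem_status("/var")`, for example, reports both the storage and the file-count status for that file system.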
Recovery
5000000000000002 - Server Application Process Error
This alarm indicates that either fewer than the minimum number of instances of a required process are running or too many instances of a required process are running.
Recovery
5000000000000004 - Server Hardware Configuration Error
Recovery
- Run syscheck in verbose mode.
- Call the Customer Care Center for assistance.
5000000000000008 - Server RAM Shortage Warning
This alarm indicates one of two conditions:
- Less memory than the expected amount is installed.
- The system is swapping pages in and out of physical memory at a fast rate, indicating a possible degradation in system performance.
This alarm may not clear immediately when conditions fall below the alarm threshold. Conditions must be below the alarm threshold consistently for the alarm to clear. The alarm may take up to five minutes to clear after conditions improve.
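The clearing behavior described above, where an alarm clears only after the condition has stayed below the threshold for several consecutive checks, can be modeled with a simple counter. The Python sketch below is a generic illustration. The class name `DebouncedAlarm` and the default of five clean polls are assumptions, since this appendix does not give the actual number of polling intervals.

```python
class DebouncedAlarm:
    """Alarm that raises immediately but clears only after N clean polls."""

    def __init__(self, clear_after=5):
        self.clear_after = clear_after  # hypothetical number of clean polls
        self.active = False
        self._clean_polls = 0

    def poll(self, condition_detected):
        """Record one polling result and return whether the alarm is active."""
        if condition_detected:
            self.active = True
            self._clean_polls = 0  # any detection restarts the count
        elif self.active:
            self._clean_polls += 1
            if self._clean_polls >= self.clear_after:
                self.active = False
                self._clean_polls = 0
        return self.active


alarm = DebouncedAlarm(clear_after=3)
history = [alarm.poll(seen) for seen in (True, False, False, True, False, False, False)]
print(history)
```

In this run the alarm stays active through the first six polls, because the detection on the fourth poll restarts the count, and it clears only on the seventh.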
Recovery
- Call the Customer Care Center for assistance.
5000000000000020 - Server Swap Space Shortage Warning
This alarm indicates that the swap space available on the server is less than expected. This is usually caused by a process that has allocated a very large amount of memory over time.
Note:
In order for this alarm to clear, the underlying failure condition must be consistently undetected for a number of polling intervals. Therefore, the alarm may continue to be reported for several minutes after corrective actions are completed.
Recovery
- Call the Customer Care Center for assistance.
5000000000000040 - Server Default Router Not Defined
This alarm indicates that the default network route is either not configured or the current configuration contains an invalid IP address or hostname.
Caution:
When changing the server’s network routing configuration it is important to verify that the modifications will not impact the method of connectivity for the current login session. It is also crucial that this information not be entered incorrectly or set to improper values. Incorrectly modifying the server’s routing configuration may result in total loss of remote network access.
Recovery
- To define the default router:
5000000000000080 – Server temperature warning
Alarm Type: TPD
Description: This alarm indicates that the internal temperature within the server is outside of the normal operating range. A server Fan Failure may also exist along with the Server Temperature Warning.
Severity: Minor
OID: tpdTemperatureWarningNotify 1.3.6.1.4.1.323.5.3.18.3.1.3.8
Alarm ID: TKSPLATMI85000000000000080
Recovery
5000000000000100 - Server Core File Detected
This alarm indicates that an application process has failed and debug information is available.
Recovery
- Run syscheck in verbose mode.
- Run savelogs to gather system information (see Saving Logs Using the LSMS GUI or Command Line).
- Contact the Customer Care Center.
5000000000000200 - Server NTP Daemon Not Synchronized
This alarm indicates that the NTP daemon (background process) has been unable to locate a server to provide an acceptable time reference for synchronization.
Severity: Minor
Alarm ID: TKSPLATMI10
Recovery
5000000000000400 - Server CMOS Battery Voltage Low
This alarm indicates that the CMOS battery voltage is below the expected value. It is an early warning of CMOS battery end-of-life failure, which will cause problems if the server is powered off.
Recovery
5000000000000800 - Server Disk Self Test Warning
This alarm indicates that a non-fatal disk issue exists, such as a sector that cannot be read.
Recovery
5000000000001000 - Device Warning
This alarm indicates that either an snmpget cannot be performed on the configured SNMP OID or the returned value failed the specified comparison operation.
Recovery
- Run syscheck in Verbose mode.
- Call the Customer Care Center for assistance.
5000000000002000 - Device Interface Warning
This alarm can be generated by either an SNMP trap or an IP bond error. If syscheck is configured to receive SNMP traps, this alarm indicates that an SNMP trap was received with the set state. If syscheck is configured for IP bond monitoring, this alarm can mean that a slave device is not operating, a primary device is not active, or syscheck is unable to read bonding information from interface configuration files.
Recovery
- Run syscheck in Verbose mode.
- Call the Customer Care Center for assistance.
5000000000004000 - Server Reboot Watchdog Initiated
This alarm indicates that the server has been rebooted due to a hardware watchdog.
Recovery
5000000000008000 - Server HA Failover Inhibited
This alarm indicates that the server has been inhibited and HA failover is prevented from occurring.
Recovery
- Call the Customer Care Center for assistance.
5000000000010000 - Server HA Active To Standby Transition
This alarm indicates that the server is in the process of transitioning HA state from Active to Standby.
Recovery
- Call the Customer Care Center for assistance.
5000000000020000 - Server HA Standby To Active Transition
This alarm indicates that the server is in the process of transitioning HA state from Standby to Active.
Recovery
- Call the Customer Care Center for assistance.
5000000000040000 - Platform Health Check Failure
This alarm indicates a syscheck configuration error.
Recovery
- Call the Customer Care Center for assistance.
5000000000080000 - NTP Offset Check Failure
This alarm indicates that time on the server is outside the acceptable range or offset from the NTP server. The alarm message provides the offset value of the server from the NTP server and the offset limit set for the system by the application.
Alarm Type: TPD
Severity: Minor
Alarm ID: TKSPLATMI20
Recovery
- Call the Customer Care Center for assistance.
5000000000100000 - NTP Stratum Check Failure
This alarm indicates that NTP is syncing to a server, but the stratum level of the NTP server is outside the acceptable limit. The alarm message provides the stratum value of the NTP server and the stratum limit set for the system by the application.
Recovery
- Call the Customer Care Center for assistance.
5000000020000000 – Server Kernel Dump File Detected
Alarm Type: TPD
Description: This alarm indicates that the kernel has crashed and debug information is available.
Severity: Minor
OID: 1.3.6.1.4.1.323.5.3.18.3.1.3.30
Alarm ID: TKSPLATMI305000000020000000
Recovery
5000000040000000 – TPD Upgrade Failed
Alarm Type: TPD
Description: This alarm indicates that a TPD upgrade has failed.
Severity: Minor
OID: tpdServerUpgradeFailDetectedNotify 1.3.6.1.4.1.323.5.3.18.3.1.3.31
Alarm ID: TKSPLATMI315000000040000000
Recovery
5000000080000000 - Half Open Socket Warning Limit
Alarm Type: TPD
Description: This alarm indicates that the number of half open TCP sockets has reached the warning threshold. This problem is caused by a remote system failing to complete the TCP 3-way handshake.
Severity: Minor
OID: tpdHalfOpenSocketWarningNotify 1.3.6.1.4.1.323.5.3.18.3.1.3.32
Alarm ID: TKSPLATMI325000000080000000
Recovery
- Run syscheck.
- Contact the Customer Care Center and provide the system health check output.
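To see how many half open sockets exist when this alarm is reported, the connection table can be inspected directly. The Python sketch below counts entries in the SYN_RECV state (state code 03) in text that follows the Linux `/proc/net/tcp` format. It is a diagnostic illustration only: the function names are hypothetical, and the thresholds that syscheck applies are not stated in this appendix.

```python
SYN_RECV = "03"  # TCP state code for SYN_RECV in /proc/net/tcp


def count_half_open(proc_net_tcp_text):
    """Count sockets in SYN_RECV state in /proc/net/tcp-style text."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Fields: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == SYN_RECV:
            count += 1
    return count


def read_half_open_count(path="/proc/net/tcp"):
    """Read the given file (Linux only) and count half open IPv4 sockets."""
    with open(path) as handle:
        return count_half_open(handle.read())
```

A count that keeps rising usually points to a remote system that sends connection requests without completing the handshake, which matches the cause described for this alarm.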
5000000000200000 - SAS Presence Sensor Missing
Recovery
- Call the Customer Care Center for assistance with a replacement server.
5000000000400000 - SAS Drive Missing
This alarm indicates that the number of drives configured for this server is not being detected.
Recovery
- Call the Customer Care Center to determine if the alarm is caused by a failed drive or failed configuration.
5000000000800000 - DRBD failover busy
This alarm indicates that a DRBD sync is in progress from the peer server to the local server. The local server is not ready to be the primary DRBD node because its data is not current.
Recovery
- Wait for approximately 20 minutes, then check if the DRBD sync has completed. A DRBD sync should take no more than 15 minutes to complete.
- If the alarm persists longer than this time interval, call the Customer Care Center for assistance.
5000000001000000 - HP disk resync
This alarm indicates that the HP disk subsystem is currently resyncing after a failed or replaced drive, or after another change in the configuration of the HP disk subsystem. The output of the message will include the disk that is resyncing and the percentage complete. This alarm eventually clears after the resync of the disk is completed. The time to clear is dependent on the size of the disk and the amount of activity on the system.
Recovery
- Run syscheck in Verbose mode.
- If the percent recovering is not updating, wait at least 5 minutes between subsequent runs of syscheck, then call the Customer Care Center with the syscheck output.
Saving Logs Using the LSMS GUI or Command Line
During some corrective procedures, it may be necessary to provide Oracle Communications with information about the LSMS for help in clearing an alarm. These log files aid the Customer Care Center when troubleshooting the LSMS.
Use the following procedure to save logs using menu selections from the LSMS GUI.