Sun N1 System Manager 1.3 Troubleshooting Guide

Chapter 1 General Information

This chapter provides information about N1 System Manager operational processes that can assist you in troubleshooting.

In this manual, the term manageable server is used for a server that is accessible by the N1 System Manager network, but has not yet been discovered by the N1 System Manager. A managed server is a server that has been successfully discovered by the N1 System Manager and is subsequently managed by the N1 System Manager.

Note: The topics in this chapter and subsequent chapters are organized alphabetically.

DHCP Service Conflict With N1 Grid Service Provisioning System

If you are using both the N1 System Manager and the Sun N1 Service Provisioning System with the OS provisioning plug-in, you must choose which product to use for OS deployment for a given target set of servers. You must then ensure that the DHCP service supplied by the other product is manually shut down (as the root user) using operating system commands. Failure to shut the service down might result in unreliable OS deployment operations as well as potential network-related problems.

Discovery and Routers

Discovery of manageable servers works across routers if the network services used by the discovery process are not blocked by a firewall. Network services used by the discovery process can include SSH, IPMI, Telnet and SNMP.
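Before running discovery across a router, it can help to confirm that the relevant services are reachable from the management server. The following is a minimal sketch, not part of the N1 System Manager, that tests TCP reachability only. The function names are hypothetical, and the port numbers are the standard well-known ports for SSH (22) and Telnet (23); a given site might use different ports. IPMI and SNMP normally run over UDP (ports 623 and 161), so a TCP connect test cannot verify them and they are left out.

```python
import socket

# Assumption: standard well-known TCP ports. IPMI (UDP 623) and
# SNMP (UDP 161) are UDP services and are not covered by this check.
TCP_SERVICES = {"ssh": 22, "telnet": 23}


def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


def check_services(host, services=TCP_SERVICES, timeout=3.0):
    """Return a dict mapping each service name to its reachability."""
    return {
        name: check_tcp_port(host, port, timeout)
        for name, port in services.items()
    }
```

For example, calling check_services with the address of a manageable server's management port returns a dictionary such as {"ssh": True, "telnet": False}. A False result means only that the connection could not be opened from this host. It does not by itself show whether a firewall, a router, or the server is the cause.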

Identifying Hardware and OS Threshold Breaches

If the value of a monitored hardware health attribute or OS resource utilization attribute breaches a threshold value, an event log is created immediately to record the breach. The event log is available from the browser interface. A symbol also appears in the monitored data table in the browser interface to indicate that a threshold has been breached, as shown in the graphic at To Retrieve Threshold Values for a Server in Sun N1 System Manager 1.3 Discovery and Administration Guide.

Alternatively, use the show log command to verify that the event log has been generated:

N1-ok> show log
Id            Date                       Severity    Subject     Message
10            2005-11-22T01:45:02-0800   WARNING     Sun_V20z_XG041105786
A critical high threshold was violated for server Sun_V20z_XG041105786: Attribute cpu0.vtt-s3 Value 1.32

13            2005-11-22T01:50:08-0800   WARNING     Sun_V20z_XG041105786
A normal low  threshold was violated for server Sun_V20z_XG041105786: Attribute cpu0.vtt-s3 Value 1.2

If monitoring traps are lost, a particular threshold status may not be refreshed for up to 30 hours, although the overall status can still be refreshed every 10 minutes.
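When the output of show log is long, threshold events can be picked out with a short script. The following is a minimal sketch that assumes the message wording matches the sample output shown above; the function name and field names are hypothetical, and other releases might word the message differently. It operates on text that has already been captured from the command and does not call the N1 System Manager itself.

```python
import re

# Pattern derived from the sample messages above. The \s+ tolerates the
# variable spacing seen in "normal low  threshold".
THRESHOLD_RE = re.compile(
    r"A\s+(?P<level>\w+)\s+(?P<direction>high|low)\s+threshold\s+was\s+violated\s+"
    r"for\s+server\s+(?P<server>[^\s:]+):\s+"
    r"Attribute\s+(?P<attribute>\S+)\s+Value\s+(?P<value>[-+]?\d+(?:\.\d+)?)"
)

# Sample text copied from the show log output in this section.
SAMPLE_LOG = """\
10            2005-11-22T01:45:02-0800   WARNING     Sun_V20z_XG041105786
A critical high threshold was violated for server Sun_V20z_XG041105786: Attribute cpu0.vtt-s3 Value 1.32

13            2005-11-22T01:50:08-0800   WARNING     Sun_V20z_XG041105786
A normal low  threshold was violated for server Sun_V20z_XG041105786: Attribute cpu0.vtt-s3 Value 1.2
"""


def find_threshold_events(log_text):
    """Return one dict per threshold message found in log_text."""
    events = []
    for match in THRESHOLD_RE.finditer(log_text):
        event = match.groupdict()
        event["value"] = float(event["value"])  # numeric reading
        events.append(event)
    return events
```

Applied to SAMPLE_LOG, the function returns two events for server Sun_V20z_XG041105786 and attribute cpu0.vtt-s3: a critical high violation with value 1.32 and a normal low violation with value 1.2.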

N1 System Manager Cannot Be Used to Manage System Management Servers

Do not use the N1 System Manager to manage servers on which system management software is installed, such as Sun Management Center, Sun Control Station, or any other system management application, including the N1 System Manager itself.

Regenerating Security Keys

The N1 System Manager uses strong encryption techniques and common agent container security keys to ensure secure communication between the management server and each managed server.

The security keys used by the N1 System Manager must be identical across all servers. Under normal operation, the security keys can be left in their default configuration. You should regenerate the security keys if any of the following cases occur:

In each of the above cases, the security keys must be regenerated, and the N1 System Manager management daemon restarted, as described in To Regenerate Common Agent Container Security Keys.