Sun N1 System Manager 1.3 Discovery and Administration Guide

Full OS Monitoring (With Thresholds)

When you run the add server feature command with the osmonitor and agentip keywords, you also use the agentssh keyword to provide the credentials for accessing the monitored server's operating system through ssh. See To Add the OS Monitoring Feature for additional details. This procedure is important for OS health monitoring but not for monitoring hardware health or network reachability.

Adding the OS monitoring feature by using the add server feature command with the osmonitor keyword provides support for both OS monitoring and base management, and enables monitoring by default. After that, you can disable and enable monitoring by using the set server command. See Enabling and Disabling Monitoring for more information.

The OS monitoring feature provides all the basic monitoring data that comes with the base management feature. See Base Management (Basic OS Monitoring) for details about the base management feature. In addition, the OS monitoring feature provides support for threshold monitoring. The OS monitoring feature allows you to set specific thresholds for individual monitored servers, or for groups of monitored servers, at the command line by using the set command. See Setting Threshold Values for details. For information about thresholds, see Monitoring Threshold Values.

Platform OS interface data, including all attribute data, is retrieved from the server's operating system by using ssh and SNMP.

A complete list of OS health attributes is provided in Table 6–2. Associated supported thresholds are also provided.

Table 6–2 All OS Health Attributes

Attribute Name         Description                                                                  Supported Threshold (Warning)  Supported Threshold (Critical)
---------------------  ---------------------------------------------------------------------------  -----------------------------  ------------------------------
cpustats.loadavg1min   System load expressed as average number of queued processes over 1 minute    warninghigh                    criticalhigh
cpustats.loadavg5min   System load expressed as average number of queued processes over 5 minutes   warninghigh                    criticalhigh
cpustats.loadavg15min  System load expressed as average number of queued processes over 15 minutes  warninghigh                    criticalhigh
cpustats.pctusage      Percentage of overall CPU usage                                              warninghigh                    criticalhigh
cpustats.pctidle       Percentage of CPU idle                                                       warninglow                     criticallow
memusage.pctmemused    Percentage of memory in use                                                  warninghigh                    criticalhigh
memusage.pctmemfree    Percentage of memory free                                                    warninglow                     criticallow
memusage.mbmemused     Memory in use in MB                                                          warninghigh                    criticalhigh
memusage.mbmemfree     Memory free in MB                                                            warninglow                     criticallow
memusage.kbswapused    Swap space in use in KB                                                      warninghigh                    criticalhigh
memusage.mbswapfree    Free swap space in MB                                                        warninglow                     criticallow
memusage.pctswapfree   Percentage of free swap space                                                warninglow                     criticallow
fsusage.pctused        Percentage of file system space in use                                       warninghigh                    criticalhigh
fsusage.kbspacefree    File system free space in KB                                                 warninghigh                    criticalhigh
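
The difference between the high and low threshold pairs in Table 6–2 can be illustrated with a small Python sketch. It is not part of N1 System Manager. The function name classify, the returned strings, and the comparison rule (a high threshold is crossed at or above its value, a low threshold at or below its value) are assumptions made for illustration only, as are the sample threshold values.

```python
from typing import Optional


def classify(value: float,
             warninghigh: Optional[float] = None,
             criticalhigh: Optional[float] = None,
             warninglow: Optional[float] = None,
             criticallow: Optional[float] = None) -> str:
    """Return 'critical', 'warning', or 'ok' for one sampled attribute value.

    Assumption: a high threshold is crossed when value >= threshold and a
    low threshold is crossed when value <= threshold. A threshold left as
    None is treated as not set and is never crossed.
    """
    # Critical thresholds are checked first so they take precedence.
    if criticalhigh is not None and value >= criticalhigh:
        return "critical"
    if criticallow is not None and value <= criticallow:
        return "critical"
    if warninghigh is not None and value >= warninghigh:
        return "warning"
    if warninglow is not None and value <= warninglow:
        return "warning"
    return "ok"


# cpustats.pctusage supports the high pair; memusage.pctmemfree the low pair.
print(classify(85.0, warninghigh=80.0, criticalhigh=95.0))  # warning
print(classify(4.0, warninglow=10.0, criticallow=5.0))      # critical
```

In this sketch, an attribute whose thresholds are all left unset always evaluates to ok, which mirrors the idea that an attribute without active thresholds raises no alarm.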

You can filter OS health monitoring information for all servers by using the show server command, where the second oshealth is the health state to filter on:


N1-ok> show server oshealth oshealth

See show server in Sun N1 System Manager 1.3 Command Line Reference Manual for details of possible values of the oshealth filters. For more information and a graphic explaining filtering servers by health state, see To View Failed Managed Servers.

The health of an OS resource can be shown as unknown if the server is reachable but the agent for the monitoring feature cannot be contacted on SNMP port 161. The health of an OS resource can be shown as unreachable if the server itself is unreachable, for example, because it is in standby mode. See also Understanding the Differences Between Unreachable and Unknown States for Managed Servers.
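
The distinction between the two states can be summarized as a small decision function. The following Python sketch is illustrative only and is not how N1 System Manager is implemented. The function name os_health_state and its two boolean inputs are hypothetical, and how reachability is actually probed is left out.

```python
from typing import Optional

SNMP_PORT = 161  # port on which the monitoring agent is contacted, per the text


def os_health_state(server_reachable: bool,
                    agent_answers_snmp: bool) -> Optional[str]:
    """Return 'unreachable', 'unknown', or None for a managed server.

    None means that neither special state applies, so OS health would be
    derived from the monitored attribute data instead.
    """
    if not server_reachable:
        # For example, the server is in standby mode.
        return "unreachable"
    if not agent_answers_snmp:
        # The server responds, but the agent cannot be contacted on SNMP_PORT.
        return "unknown"
    return None


print(os_health_state(server_reachable=False, agent_answers_snmp=False))  # unreachable
print(os_health_state(server_reachable=True, agent_answers_snmp=False))   # unknown
```

Server reachability is checked first because an unreachable server cannot answer SNMP requests either, so the agent check is only meaningful once the server itself responds.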

If you are not interested in the values of some attributes, you can disable the threshold severity for those attributes. Doing so prevents nuisance alarms. Example 6–9 shows how to disable a threshold severity.