Full OS Monitoring (With Thresholds) (Sun N1 System Manager 1.3 Discovery and Administration Guide)

Full OS Monitoring (With Thresholds)

When you run the add server feature command with the osmonitor and agentip keywords, you also use the agentssh keyword to provide the credentials for accessing the monitored server's operating system through ssh. See To Add the OS Monitoring Feature for additional details. This procedure is needed for OS health monitoring but not for monitoring hardware health or network reachability.

Adding the OS monitoring feature by using the add server feature command with the osmonitor keyword provides support for both OS monitoring and base management, and enables monitoring by default. After that, you can disable and re-enable monitoring by using the set server command. See Enabling and Disabling Monitoring for more information.

The OS monitoring feature provides all the basic monitoring data that comes with the base management feature. See Base Management (Basic OS Monitoring) for details about the base management feature. In addition, the OS monitoring feature provides support for threshold monitoring. The OS monitoring feature allows you to set specific thresholds for individual monitored servers, or for groups of monitored servers, at the command line by using the set command. See Setting Threshold Values for details. For information about thresholds, see Monitoring Threshold Values.

Platform OS interface data, including all attribute data, is retrieved from the server's operating system by using ssh and SNMP.

A complete list of OS health attributes is provided in Table 6–2. Associated supported thresholds are also provided.

Table 6–2 All OS Health Attributes


Attribute Name	Description	Supported Warning Threshold	Supported Critical Threshold
`cpustats.loadavg1min`	System load expressed as average number of queued processes over 1 minute	`warninghigh`	`criticalhigh`
`cpustats.loadavg5min`	System load expressed as average number of queued processes over 5 minutes	`warninghigh`	`criticalhigh`
`cpustats.loadavg15min`	System load expressed as average number of queued processes over 15 minutes	`warninghigh`	`criticalhigh`
`cpustats.pctusage`	Percentage of overall CPU usage	`warninghigh`	`criticalhigh`
`cpustats.pctidle`	Percentage of CPU idle	`warninglow`	`criticallow`
`memusage.pctmemused`	Percentage of memory in use	`warninghigh`	`criticalhigh`
`memusage.pctmemfree`	Percentage of memory free	`warninglow`	`criticallow`
`memusage.mbmemused`	Memory in use in MB	`warninghigh`	`criticalhigh`
`memusage.mbmemfree`	Memory free in MB	`warninglow`	`criticallow`
`memusage.kbswapused`	Swap space in use in KB	`warninghigh`	`criticalhigh`
`memusage.mbswapfree`	Free swap space in MB	`warninglow`	`criticallow`
`memusage.pctswapfree`	Percentage of free swap space	`warninglow`	`criticallow`
`fsusage.pctused`	Percentage of file system space in use	`warninghigh`	`criticalhigh`
`fsusage.kbspacefree`	File system free space in KB	`warninglow`	`criticallow`
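
The relationship between the high and low thresholds in Table 6–2 can be illustrated with a short Python sketch. This is not N1 System Manager code. The `THRESHOLDS` dictionary, the `evaluate` function, the numeric limits, and the result names `ok`, `warning`, and `critical` are all hypothetical. The sketch also assumes that a threshold is breached when the value is equal to or beyond the limit, which the guide does not state.

```python
from typing import Dict, Optional

# Hypothetical threshold settings for two attributes from Table 6-2.
# The numeric values are illustrative only, not product defaults.
THRESHOLDS: Dict[str, Dict[str, Optional[float]]] = {
    "cpustats.pctusage": {"warninghigh": 80.0, "criticalhigh": 95.0},
    "memusage.pctmemfree": {"warninglow": 20.0, "criticallow": 5.0},
}


def evaluate(attribute: str, value: float) -> str:
    """Classify a sampled value as 'critical', 'warning', or 'ok'.

    Assumptions: a high threshold is breached when value >= limit,
    a low threshold is breached when value <= limit, and a missing
    or None limit means that severity is disabled.
    """
    limits = THRESHOLDS.get(attribute, {})
    # Check the most severe level first so it takes precedence.
    for severity in ("critical", "warning"):
        high = limits.get(severity + "high")
        low = limits.get(severity + "low")
        if high is not None and value >= high:
            return severity
        if low is not None and value <= low:
            return severity
    return "ok"


print(evaluate("cpustats.pctusage", 85.0))    # above warninghigh only
print(evaluate("memusage.pctmemfree", 3.0))   # below criticallow
```

Attributes that measure consumption (for example, `cpustats.pctusage`) pair naturally with high thresholds, while attributes that measure remaining capacity (for example, `memusage.pctmemfree`) pair with low thresholds.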

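You can filter OS health monitoring information for all servers by using the show server command. In the following command, the first oshealth is the keyword and the second oshealth is a placeholder for the health state to filter by: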

N1-ok> show server oshealth oshealth

See show server in the Sun N1 System Manager 1.3 Command Line Reference Manual for the possible values of the oshealth filter. For more information and a graphic that explains how to filter servers by health state, see To View Failed Managed Servers.

The health of an OS resource can be shown as unknown if the server is reachable but the agent for the monitoring feature cannot be contacted on SNMP port 161. The health of an OS resource can be shown as unreachable if the server is unreachable due to, for example, being in standby mode. See also Understanding the Differences Between Unreachable and Unknown States for Managed Servers.
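
The distinction between the unknown and unreachable states can be summarized as a small decision function. The Python sketch below is illustrative only. The function name `os_resource_health`, its two boolean inputs, and the `monitored` return value are hypothetical and do not correspond to any N1 System Manager interface.

```python
def os_resource_health(server_reachable: bool, agent_reachable: bool) -> str:
    """Return the OS resource health state implied by reachability.

    server_reachable: the managed server responds on the network
                      (False when, for example, it is in standby mode).
    agent_reachable:  the monitoring agent answers on SNMP port 161.
    """
    if not server_reachable:
        # The server itself cannot be contacted.
        return "unreachable"
    if not agent_reachable:
        # The server is up, but the monitoring agent cannot be contacted.
        return "unknown"
    # Hypothetical label: health is then derived from attribute values.
    return "monitored"


print(os_resource_health(server_reachable=True, agent_reachable=False))
print(os_resource_health(server_reachable=False, agent_reachable=False))
```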

If you are not interested in the values of some attributes, you can disable the threshold severity for those attributes, which prevents nuisance alarms. Example 6–9 shows how to disable a threshold severity.