Sun N1 System Manager 1.3 Discovery and Administration Guide

OS Health Monitoring

The N1 System Manager can monitor OS health at two distinct levels:

Base management

This feature provides support for basic OS monitoring. The base management feature also provides support for OS updates and remote command execution. For more information, see Base Management (Basic OS Monitoring).

Full OS Monitoring

This feature provides support for basic OS monitoring, and provides support for threshold monitoring. For more information, see Full OS Monitoring (With Thresholds).

Supported Operating Systems for OS Monitoring

All of the operating systems listed in Manageable Server Requirements in Sun N1 System Manager 1.3 Site Preparation Guide can be monitored by the N1 System Manager, with the exception of Microsoft Windows.


Note –

OS monitoring of managed servers running Microsoft Windows is not possible in this release.


For supported versions of the Solaris operating system:

When choosing which distribution groups to install, note that Entire Distribution plus OEM support must be chosen. All other distribution groups do not contain the necessary packages to support OS monitoring using the N1 System Manager.

For supported versions of the Red Hat Linux operating system:

When choosing which distribution groups to install, note that Everything must be chosen. All other distribution groups do not contain the necessary packages to support OS monitoring using the N1 System Manager.

For supported versions of the SUSE operating system:

When choosing which distribution groups to install, note that Default Installation must be chosen. All other distribution groups do not contain the necessary packages to support OS monitoring using the N1 System Manager.
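The distribution group requirements above can be summarized as a simple lookup. The following Python sketch is illustrative only and is not part of the N1 System Manager. The dictionary keys (solaris, redhat, suse) and the function name are hypothetical labels chosen for this example; only the distribution group names come from the text above.

```python
# Required distribution group per OS family, as stated in this section.
# The keys are hypothetical short labels chosen for this sketch.
REQUIRED_DISTRIBUTION_GROUP = {
    "solaris": "Entire Distribution plus OEM support",
    "redhat": "Everything",
    "suse": "Default Installation",
}


def supports_os_monitoring(os_family: str, distribution_group: str) -> bool:
    """Return True if the installed distribution group is the one this
    section names as required for OS monitoring."""
    required = REQUIRED_DISTRIBUTION_GROUP.get(os_family.lower())
    if required is None:
        # OS families without an entry here (for example, Microsoft Windows)
        # are treated as not monitorable in this sketch.
        return False
    return distribution_group.strip().lower() == required.lower()
```

For example, supports_os_monitoring("suse", "Default Installation") returns True, while any other SUSE distribution group returns False.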

Base Management (Basic OS Monitoring)

To add base management, use the add server feature command with the basemanagement and agentip keywords, and supply the credentials for ssh access to the monitored server's operating system with the agentssh keyword. See To Add the Base Management Feature for additional details. This procedure is important for basic OS health monitoring but not for monitoring hardware health or network reachability.

Adding the base management feature by using the add server feature command with the basemanagement keyword provides support for base management and enables monitoring by default. After that, you can disable and re-enable monitoring by using the set server command. See Enabling and Disabling Monitoring for more information.

The base management feature provides basic OS monitoring, but does not provide support for monitoring of thresholds. For the monitoring of thresholds, the full OS monitoring feature must be added. See Full OS Monitoring (With Thresholds) for details.

With the base management feature, statistics related to the central processing unit (CPU) are provided, as is data related to memory, swap usage, and file systems. For the purposes of monitoring, system load data, memory usage, and swap usage data can be categorized as follows:

For more information about these monitored attributes, see Table 6–2.

The base management feature also provides support for remote command execution. See Issuing Remote Commands on Servers and Server Groups for details. In addition, the base management feature provides support for OS updates. See Chapter 5, Managing Packages, Patches, and RPMs, in Sun N1 System Manager 1.3 Operating System Provisioning Guide for information about OS updates.

Full OS Monitoring (With Thresholds)

To add full OS monitoring, use the add server feature command with the osmonitor and agentip keywords, and supply the credentials for ssh access to the monitored server's operating system with the agentssh keyword. See To Add the OS Monitoring Feature for additional details. This procedure is important for OS health monitoring but not for monitoring hardware health or network reachability.

Adding the OS monitoring feature by using the add server feature command with the osmonitor keyword provides support for both OS monitoring and base management, and enables monitoring by default. After that, you can disable and re-enable monitoring by using the set server command. See Enabling and Disabling Monitoring for more information.

The OS monitoring feature provides all the basic monitoring data that comes with the base management feature. See Base Management (Basic OS Monitoring) for details about the base management feature. In addition, the OS monitoring feature provides support for threshold monitoring. The OS monitoring feature allows you to set specific thresholds for individual monitored servers, or for groups of monitored servers, at the command line by using the set command. See Setting Threshold Values for details. For information about thresholds, see Monitoring Threshold Values.

All attribute data, including platform OS interface data, is retrieved from the server's operating system by using ssh and SNMP.

A complete list of OS health attributes is provided in Table 6–2. Associated supported thresholds are also provided.

Table 6–2 All OS Health Attributes

Attribute Name         Description                                                                   Warning Threshold  Critical Threshold

cpustats.loadavg1min   System load expressed as average number of queued processes over 1 minute     warninghigh        criticalhigh
cpustats.loadavg5min   System load expressed as average number of queued processes over 5 minutes    warninghigh        criticalhigh
cpustats.loadavg15min  System load expressed as average number of queued processes over 15 minutes   warninghigh        criticalhigh
cpustats.pctusage      Percentage of overall CPU usage                                               warninghigh        criticalhigh
cpustats.pctidle       Percentage of CPU idle                                                        warninglow         criticallow
memusage.pctmemused    Percentage of memory in use                                                   warninghigh        criticalhigh
memusage.pctmemfree    Percentage of memory free                                                     warninglow         criticallow
memusage.mbmemused     Memory in use in MB                                                           warninghigh        criticalhigh
memusage.mbmemfree     Memory free in MB                                                             warninglow         criticallow
memusage.kbswapused    Swap space in use in KB                                                       warninghigh        criticalhigh
memusage.mbswapfree    Free swap space in MB                                                         warninglow         criticallow
memusage.pctswapfree   Percentage of free swap space                                                 warninglow         criticallow
fsusage.pctused        Percentage of file system space in use                                        warninghigh        criticalhigh
fsusage.kbspacefree    File system free space in KB                                                  warninghigh        criticalhigh
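The difference between the high and low threshold types listed in Table 6–2 can be illustrated with a short Python sketch. This is a conceptual model only, not the N1 System Manager's implementation. The function name, the returned severity strings, the inclusive comparisons, and the use of None to represent a threshold that is not set are all assumptions made for this example.

```python
from typing import Dict, Optional


def evaluate(value: float, thresholds: Dict[str, Optional[float]]) -> str:
    """Return 'critical', 'warning', or 'ok' for one sampled attribute value.

    thresholds maps threshold names from Table 6-2 (warninghigh,
    criticalhigh, warninglow, criticallow) to numeric limits.
    A missing entry or None means the threshold is not set (assumption).
    """

    def crossed(name: str) -> bool:
        limit = thresholds.get(name)
        if limit is None:
            return False  # threshold not set, so it can never be crossed
        if name.endswith("high"):
            return value >= limit  # high thresholds: rising values are bad
        return value <= limit      # low thresholds: falling values are bad

    # Check the more severe level first so that it takes precedence.
    if crossed("criticalhigh") or crossed("criticallow"):
        return "critical"
    if crossed("warninghigh") or crossed("warninglow"):
        return "warning"
    return "ok"
```

With hypothetical limits, evaluate(85, {"warninghigh": 80, "criticalhigh": 95}) models a cpustats.pctusage sample that has crossed only the warning limit, and evaluate(5, {"warninglow": 10, "criticallow": 3}) models a memusage.pctmemfree sample that has dropped below the warning limit.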

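You can filter OS health monitoring information for all servers by using the show server command with the oshealth filter. In the following command, the second oshealth stands for the health state value to filter on: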


N1-ok> show server oshealth oshealth

See show server in Sun N1 System Manager 1.3 Command Line Reference Manual for details of possible values of the oshealth filters. For more information and a graphic explaining filtering servers by health state, see To View Failed Managed Servers.

The health of an OS resource can be shown as unknown if the server is reachable but the agent for the monitoring feature cannot be contacted on SNMP port 161. The health of an OS resource can be shown as unreachable if the server is unreachable due to, for example, being in standby mode. See also Understanding the Differences Between Unreachable and Unknown States for Managed Servers.
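The distinction between the unknown and unreachable states can be expressed as a small decision function. The following Python sketch only restates the rule described above. The function name, the parameter names, and the "monitored" return value (a stand-in for whatever health state the monitored attributes would otherwise produce) are hypothetical.

```python
def os_health_state(server_reachable: bool, agent_reachable: bool) -> str:
    """Classify OS health visibility for one managed server.

    server_reachable: the server itself can be reached.
    agent_reachable: the monitoring agent answers on SNMP port 161.
    """
    if not server_reachable:
        # For example, the server is in standby mode.
        return "unreachable"
    if not agent_reachable:
        # The server is up, but the monitoring agent cannot be contacted.
        return "unknown"
    # Placeholder: the real state would come from the monitored attributes.
    return "monitored"
```

Note that the server check comes first: when the server cannot be reached at all, the state of the agent is irrelevant and the result is unreachable.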

If you are not interested in the values of some attributes, you can disable the threshold severity for monitoring of those attributes, which prevents nuisance alarms. Example 6–9 shows how to disable a threshold severity.