Sun N1 System Manager 1.3 Discovery and Administration Guide

Chapter 6 Monitoring Servers and Server Groups

This chapter explains what monitoring means in the context of the N1 System Manager and describes how to monitor servers that are managed by the N1 System Manager. It also provides procedures for enabling and disabling monitoring, and for managing monitoring thresholds, from the command line.

This chapter also contains information about managing jobs and event log entries, and about setting up notifications.

Some procedures are also possible using the browser interface. These procedures are provided in the Sun N1 System Manager browser interface help.

Introduction to Monitoring

Monitoring in the Sun N1 System Manager software enables you to track changes to specific attributes in specific managed objects. Managed objects include server hardware elements, operating systems, file systems, and networks. Attributes are the monitored elements, about which data is obtained and delivered by the N1 System Manager software. Attributes are associated with three main areas: hardware health, operating system health, and network reachability.

For a managed server or server group, hardware health, operating system health, and network connectivity are all monitored by the management server. All comparisons and verifications for monitoring are performed by the N1 System Manager. Managed nodes are used only to access data about their health or network reachability.

Introduction to Events and Notifications

Monitoring is tied to the broadcasting of events for each managed server or server group. Events are generated when certain conditions related to attributes occur. For information about events and when they occur, see Managing Event Log Entries. Monitoring data is stored as events in the N1 System Manager database rather than in log files.

If monitoring is enabled for a managed server, each event causes a notification to be emitted from the N1 System Manager for that event. Notification rules can be created to notify staff about events that happen with managed servers. See Setting Up Event Notifications for details.

Monitoring Using SNMP

An SNMP agent that is used for data retrieval is provided in the N1 System Manager software.


Note –

The default SNMP port for the agent for the monitoring feature is port 161. Changing the port number from the default is not supported in this release.


The SNMP agent is deployed when operating systems are provisioned on to servers that are managed by the N1 System Manager. The N1 System Manager passively listens for the traps generated by the SNMP agent whenever there is a threshold breach. In case the traps generated by the SNMP agent are lost, the N1 System Manager also performs two types of polling-based monitoring as a backup:

Hardware Health Monitoring

The hardware health of managed servers is monitored by the N1 System Manager. Sensors provided in the hardware of managed servers are used by the N1 System Manager to monitor temperature, voltage, and fan speed. For information about supported hardware, see Manageable Server Requirements in Sun N1 System Manager 1.3 Site Preparation Guide. For a managed server's hardware health to be monitored by the N1 System Manager, the managed server must have a service processor.

Sensor data is retrieved from the service processor for SPARC devices through the Advanced Lights Out Manager (ALOM) interface. Sensor data is retrieved through the Intelligent Platform Management Interface (IPMI) for x64 servers.


Note –

Managed servers that use ALOM do not send data to the management server by use of traps. Instead, managed servers that use ALOM send management data by email. To ensure that the management server collects data from these servers, the management server runs its own email server on port 25.


The following characteristics of a managed server's hardware can be monitored:


Note –

The N1 System Manager does not monitor RAID controller states.


All details for a managed server's hardware health, where available, are displayed in the hardware monitoring table on the Server Details page of the browser interface, and in the Event Log.

Table 6–1 Hard Disk and Memory Failure Monitoring

Type                                        Disk Monitoring  Memory Failure Monitoring
ALOM servers: Netra 240 and Netra 440       None             None
ALOM servers: Sun Fire V210, V240 and V440  None             None
ALOM servers: Sun Fire T1000 and T2000      None             None
IPMI server: Sun Fire X2100                 None             None
ILOM servers: X4100 and X4200               Yes              Yes
IPMI servers: Sun Fire V20z and V40z        None             Yes

A detailed list of hardware health sensors is provided in the documentation that accompanies your hardware.

You can view filtered hardware health monitoring information for all servers by using the show server command:


N1-ok> show server hardwarehealth hardwarehealth

See show server in Sun N1 System Manager 1.3 Command Line Reference Manual for details of possible values of the hardwarehealth filters. For more information and a graphic explaining filtering servers by health state, see To View Failed Managed Servers.

The locator lights for Sun Fire X2100, X4100, and X4200 servers can be switched on or off using the N1 System Manager. To switch a managed server's locator light on or off, use the set server command:


N1-ok> set server server locator locator-state

The locator-state value can be either on or off. For a group of servers, use the set group command with the group's name.

Hardware Memory Problems on Sun Fire V20z and V40z Managed Servers

Memory problems on the Sun Fire V20z and V40z managed servers are handled differently by the N1 System Manager. Sun Fire V20z and V40z memory problems, if they occur, are detected by polling through the managed server's service processor.

A memory error has occurred on a Sun Fire V20z or V40z server if all of the following are true:

If a memory error has occurred, see the example on how to correct it. To avoid false warning statuses in the future, the service processor's event log must be cleared after the defective memory has been replaced or repaired.


Example 6–1 Examining Memory Errors on Sun Fire V20z or V40z Managed Servers

If a memory error has occurred on a Sun Fire V20z or V40z managed server, log in to the server's service processor.


# ssh -l admin 10.0.3.2

Enter the password and check the managed server's status.


# sp get status

Check the service processor's event log.


# sp get events
ID Last Update      Component Severity      Message
1  01/01/1970 00:02 SP        informational SP localhost.localdomain IP is now set to 0.0.0.0
2  01/01/1970 18:47 SP        informational SP localhost.localdomain IP is now set to 0.0.0.0
3  01/01/1970 18:47 SP        informational SP localhost.localdomain IP is now set to 10.0.3.2 

Clear the service processor's event log.


# sp delete event -a
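The event listing shown in this example can also be inspected with a script before the log is cleared. The following Python sketch parses output in the column layout of the sample above and prints any entry whose severity is not informational. It is an illustration only: the function name parse_sp_events is hypothetical, the column layout is assumed from the sample listing, and the source does not show what severity values a memory error produces.

```python
# Sample text copied from the listing in this example.
SAMPLE = """\
ID Last Update      Component Severity      Message
1  01/01/1970 00:02 SP        informational SP localhost.localdomain IP is now set to 0.0.0.0
2  01/01/1970 18:47 SP        informational SP localhost.localdomain IP is now set to 0.0.0.0
3  01/01/1970 18:47 SP        informational SP localhost.localdomain IP is now set to 10.0.3.2
"""


def parse_sp_events(text):
    """Parse 'sp get events' style output into a list of dictionaries.

    Assumes the layout of the sample: ID, date, time, component,
    severity, then a free-form message. Lines that do not start with a
    numeric ID (such as the header) are skipped.
    """
    events = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        event_id, date, time, component, severity, message = fields
        events.append({
            "id": int(event_id),
            "last_update": date + " " + time,
            "component": component,
            "severity": severity,
            "message": message.strip(),
        })
    return events


# Print only the entries that are not purely informational.
for event in parse_sp_events(SAMPLE):
    if event["severity"] != "informational":
        print(event["id"], event["last_update"], event["message"])
```

With the sample listing, every entry is informational, so the loop prints nothing.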

Hardware Sensor Attributes

For x64 servers, the management server software obtains the list of hardware sensor attributes to monitor through IPMI from the service processor of the server. For servers running the SPARC architecture, the ALOM interface is used. The list of hardware sensor attributes can vary from server to server, and between firmware versions. A sample listing for some servers and firmware versions is provided in this section. The attributes depend on the server type and on the number of CPUs that the server has.

To receive notifications for events from discrete sensors, create a notification rule and subscribe to the Ereport.Physical.ThresholdExceeded topic, as described in Setting Up Event Notifications.

For Sun Fire X4100 and Sun Fire X4200 servers, refer to the hardware documentation to see the monitored hardware sensors.

For Sun Fire X2100 servers, only sensors describing fan speed, voltage, and temperature are used to retrieve data. Here is a list of sensors that are monitored for SP firmware version 4.11:


DDR 2.6V
CPU Core Voltage
VCC 3.3V
VCC 5V
VCC 12V
Battery Volt
CPU TEMP
SYS TEMP
CPU FAN
SYSTEM FAN3
SYSTEM FAN1
SYSTEM FAN2

For X2100 servers with SP firmware versions earlier than version 4.11, CPU Core Voltage was called CPU Voltage.

OS Health Monitoring

OS health can be monitored by the N1 System Manager.

Two distinct levels of OS Monitoring are possible with the N1 System Manager. These are as follows:

Base management

This feature provides support for basic OS monitoring. The base management feature also provides support for OS updates and remote command execution. For more information, see Base Management (Basic OS Monitoring).

Full OS Monitoring

This feature provides support for basic OS monitoring, and provides support for threshold monitoring. For more information, see Full OS Monitoring (With Thresholds).

Supported Operating Systems for OS Monitoring

All of the operating systems listed in Manageable Server Requirements in Sun N1 System Manager 1.3 Site Preparation Guide can be monitored by the N1 System Manager, with the exception of Microsoft Windows.


Note –

OS monitoring of managed servers running Microsoft Windows is not possible in this release.


For supported versions of the Solaris operating system:

When choosing which distribution group to install, you must choose Entire Distribution plus OEM support. No other distribution group contains the packages needed to support OS monitoring using the N1 System Manager.

For supported versions of the Red Hat Linux operating system:

When choosing which distribution group to install, you must choose Everything. No other distribution group contains the packages needed to support OS monitoring using the N1 System Manager.

For supported versions of the SUSE operating system:

When choosing which distribution group to install, you must choose Default Installation. No other distribution group contains the packages needed to support OS monitoring using the N1 System Manager.

Base Management (Basic OS Monitoring)

When you run the add server feature command with the basemanagement and agentip keywords, you add support for base management. With the agentssh keyword, you provide the credentials that are used to access the monitored server's operating system through ssh. See To Add the Base Management Feature for additional details. This procedure is important for basic OS health monitoring but not for monitoring hardware health or network reachability.

Adding the base management feature by using the add server feature command with the basemanagement keyword provides support for base management and enables monitoring by default. After that, monitoring can be disabled and enabled by use of the set server command. See Enabling and Disabling Monitoring for more information.

The base management feature provides basic OS monitoring, but does not provide support for monitoring of thresholds. For the monitoring of thresholds, the full OS monitoring feature must be added. See Full OS Monitoring (With Thresholds) for details.

With the base management feature, statistics related to the central processor unit (CPU) are provided, as is data related to memory, swap usage, and file systems. For the purposes of monitoring, system load data, memory usage, and swap usage data can be categorized as follows:

For more information about these monitored attributes, see Table 6–2.

The base management feature also provides support for remote command execution. See Issuing Remote Commands on Servers and Server Groups for details. In addition, the base management feature provides support for OS updates. See Chapter 5, Managing Packages, Patches, and RPMs, in Sun N1 System Manager 1.3 Operating System Provisioning Guide for information about OS updates.

Full OS Monitoring (With Thresholds)

When you run the add server feature command with the osmonitor and agentip keywords, you add support for OS monitoring. With the agentssh keyword, you provide the credentials that are used to access the monitored server's operating system through ssh. See To Add the OS Monitoring Feature for additional details. This procedure is important for OS health monitoring but not for monitoring hardware health or network reachability.

Adding the OS monitoring feature by using the add server feature command with the osmonitor keyword provides support for both OS monitoring and base management, and enables monitoring by default. After that, monitoring can be disabled and enabled by use of the set server command. See Enabling and Disabling Monitoring for more information.

The OS monitoring feature provides all the basic monitoring data that comes with the base management feature. See Base Management (Basic OS Monitoring) for details about the base management feature. In addition, the OS monitoring feature provides support for threshold monitoring. The OS monitoring feature allows you to set specific thresholds for individual monitored servers, or for groups of monitored servers, at the command line by using the set command. See Setting Threshold Values for details. For information about thresholds, see Monitoring Threshold Values.

Platform OS interface data and all attribute data are retrieved from the server's operating system by using ssh and SNMP.

A complete list of OS health attributes is provided in Table 6–2. Associated supported thresholds are also provided.

Table 6–2 All OS Health Attributes

Attribute Name         Description                                                                  Warning Threshold  Critical Threshold
cpustats.loadavg1min   System load expressed as average number of queued processes over 1 minute    warninghigh        criticalhigh
cpustats.loadavg5min   System load expressed as average number of queued processes over 5 minutes   warninghigh        criticalhigh
cpustats.loadavg15min  System load expressed as average number of queued processes over 15 minutes  warninghigh        criticalhigh
cpustats.pctusage      Percentage of overall CPU usage                                              warninghigh        criticalhigh
cpustats.pctidle       Percentage of CPU idle                                                       warninglow         criticallow
memusage.pctmemused    Percentage of memory in use                                                  warninghigh        criticalhigh
memusage.pctmemfree    Percentage of memory free                                                    warninglow         criticallow
memusage.mbmemused     Memory in use in MB                                                          warninghigh        criticalhigh
memusage.mbmemfree     Memory free in MB                                                            warninglow         criticallow
memusage.kbswapused    Swap space in use in KB                                                      warninghigh        criticalhigh
memusage.mbswapfree    Free swap space in MB                                                        warninglow         criticallow
memusage.pctswapfree   Percentage of free swap space                                                warninglow         criticallow
fsusage.pctused        Percentage of file system space in use                                       warninghigh        criticalhigh
fsusage.kbspacefree    File system free space in KB                                                 warninghigh        criticalhigh
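
As a rough illustration of how the high and low threshold pairs in Table 6–2 differ, the following Python sketch classifies one sampled value. It is not the N1 System Manager implementation. The function name, the status labels, the numeric limits, and the use of inclusive comparisons are all assumptions made for illustration.

```python
def classify(value, warning, critical, direction):
    """Return 'critical', 'warning', or 'ok' for one sampled value.

    direction is 'high' for attributes that use warninghigh and
    criticalhigh (for example cpustats.pctusage), and 'low' for
    attributes that use warninglow and criticallow (for example
    memusage.pctmemfree). The status labels are hypothetical.
    """
    if direction == "high":
        # High thresholds are breached when the value rises to the limit.
        if value >= critical:
            return "critical"
        if value >= warning:
            return "warning"
    elif direction == "low":
        # Low thresholds are breached when the value falls to the limit.
        if value <= critical:
            return "critical"
        if value <= warning:
            return "warning"
    else:
        raise ValueError("direction must be 'high' or 'low'")
    return "ok"


# Hypothetical limits: warn at 80% CPU usage, critical at 95%.
print(classify(85.0, warning=80.0, critical=95.0, direction="high"))  # warning
# Hypothetical limits: warn at 20% free memory, critical at 5%.
print(classify(3.0, warning=20.0, critical=5.0, direction="low"))     # critical
```

The point of the sketch is only that a high threshold guards against a value growing too large, while a low threshold guards against a value shrinking too far.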

You can filter OS health monitoring information for all servers by using the show server command:


N1-ok> show server oshealth oshealth

See show server in Sun N1 System Manager 1.3 Command Line Reference Manual for details of possible values of the oshealth filters. For more information and a graphic explaining filtering servers by health state, see To View Failed Managed Servers.

The health of an OS resource can be shown as unknown if the server is reachable but the agent for the monitoring feature cannot be contacted on SNMP port 161. The health of an OS resource can be shown as unreachable if the server is unreachable due to, for example, being in standby mode. See also Understanding the Differences Between Unreachable and Unknown States for Managed Servers.

If you are not interested in the values of some attributes, you can disable the threshold severity for monitoring of those attributes. Doing so prevents nuisance alarms. Example 6–9 shows how to disable a threshold severity.

Network Reachability Monitoring

All management interfaces of managed servers and all platform interfaces are monitored by default by the N1 System Manager. Platform interfaces include the service processor's management interface, such as eth0, and data network interfaces, such as eth1 or eth2.

Reachability is verified for Linux servers and servers running the Solaris OS by using an ICMP ping to the interface IP address.

The reachability of all network interfaces is verified at regular intervals. The monitoring of network reachability is based on the IP address. If any monitored IP address is unreachable, an event is generated.
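
The per-address check described above can be pictured with a short Python sketch. It is illustrative only: the function name, the event dictionary layout, and the probe callable are assumptions, and a stand-in probe is used here in place of a real ICMP ping.

```python
def unreachable_events(interfaces, probe):
    """Return one event record per monitored IP address that fails the probe.

    interfaces maps an interface name to its IP address.
    probe is any callable that takes an IP address and returns True when
    the address answers. In a real deployment it would wrap an ICMP ping.
    The event record layout is hypothetical.
    """
    return [
        {"interface": name, "ip": ip, "event": "unreachable"}
        for name, ip in interfaces.items()
        if not probe(ip)
    ]


# Stand-in probe: pretend that only 192.168.10.10 answers.
def fake_probe(ip):
    return ip == "192.168.10.10"


monitored = {"eth0": "192.168.1.1", "eth1": "192.168.10.10"}
for event in unreachable_events(monitored, fake_probe):
    print(event["interface"], event["ip"], event["event"])
```

Because the check is keyed on the IP address, a server with several monitored interfaces can produce an event for one interface while its other interfaces remain reachable.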

You can filter information for all servers by using the show server command with the appropriate parameters to view monitoring information. See show server in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

Understanding the Differences Between Unreachable and Unknown States for Managed Servers

Distinguishing between the unreachable and unknown states for managed servers is important.


N1-ok> show server hardwarehealth unreachable

This command lists all managed servers that are unreachable. Any managed server returned in the output of this command is unreachable due to a network problem: the server cannot be contacted about its hardware health status. The ping command to the server is unsuccessful. This behavior does not necessarily mean that the server is not transmitting hardware health status information. The server could be in standby mode.


N1-ok> show server hardwarehealth unknown

This command lists all managed servers that are not returning any information about hardware health status. The ping command might be successful but servers returned in the output of this command are not returning any hardware health information. The agent for the monitoring feature could not be contacted on port 161.


N1-ok> show server power unreachable

This command lists all managed servers that are unreachable. Any server returned in the output of this command is unreachable due to a network problem: the server cannot be contacted about its power status. The ping command to the server is unsuccessful. This behavior does not necessarily mean that the server is not transmitting power status information. The server could be in standby mode.


N1-ok> show server power unknown

This command lists all managed servers that are not returning any information about power status. The ping command might be successful but servers returned in the output of this command are not returning any power status information. The agent for the monitoring feature could not be contacted on port 161.


N1-ok> show server oshealth unreachable

This command lists all managed servers that are unreachable. Any server returned in the output of this command is unreachable due to a network problem: the server cannot be contacted about its OS health. The ping command to the server is unsuccessful. This behavior does not necessarily mean that the server is not transmitting OS health information. The server could be in standby mode.


N1-ok> show server oshealth unknown

This command lists all managed servers that are not returning any information about OS health. The ping command might be successful but servers returned in the output of this command are not returning any OS health information. The agent for the monitoring feature could not be contacted on port 161.
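
The distinction between the two states described in this section can be summarized in a few lines of Python. This is a sketch of the decision only, not of how the N1 System Manager implements it. The function name and the label "reported" (meaning that the state reported by the agent applies) are assumptions.

```python
def health_state(ping_ok, agent_ok):
    """Classify a managed server as unreachable, unknown, or reported.

    ping_ok:  True if the ping to the server succeeds.
    agent_ok: True if the monitoring agent answers on SNMP port 161.
    """
    if not ping_ok:
        # Network problem or standby mode: nothing can be asked of the server.
        return "unreachable"
    if not agent_ok:
        # The server answers ping, but the agent cannot be contacted.
        return "unknown"
    # Hypothetical label: the health state returned by the agent applies.
    return "reported"


print(health_state(ping_ok=False, agent_ok=False))  # unreachable
print(health_state(ping_ok=True, agent_ok=False))   # unknown
```

The ping result is checked first, so a server that cannot be pinged is always classified as unreachable, regardless of the agent.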

Supporting OS Monitoring

Before full monitoring of a managed server can be enabled, OS monitoring must be supported for that server. OS Monitoring is supported for a server when the base management and OS monitoring features are installed on the server.

The base management and OS monitoring features are installed when a managed server's OS is installed or updated by use of the load group or load server commands. See load group in Sun N1 System Manager 1.3 Command Line Reference Manual and load server in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Note –

If the load server or load group command is used to install software on the managed server, and the managed server's networktype attribute is set to dhcp, the feature attribute cannot be used. Therefore, if you want to load the base management and OS monitoring features when loading an OS with the load server or load group commands, set the networktype attribute to static.

If you set the networktype attribute to dhcp, every time the server reboots you have to change the agent IP address as explained in To Modify the Agent IP for a Server.


The base management and OS monitoring features can also be installed or updated when the add server command is used, as explained in Adding and Upgrading Base Management and OS Monitoring Features.

If the OS monitoring feature is not installed and you use the set server monitored command to enable monitoring, only hardware health monitoring is enabled. OS monitoring is not enabled if the set server monitored command is executed without the OS monitoring feature first being installed. See Enabling and Disabling Monitoring for more information.

Adding and Upgrading Base Management and OS Monitoring Features

The base management and OS monitoring features provide support for monitoring and patching the OS profiles installed on managed servers, and for executing remote commands. This section describes how to add the base management and OS monitoring features, modify supported attributes, remove feature support, and upgrade the base management and OS monitoring features to the latest versions.

Adding the OS monitoring feature provides support for OS monitoring and enables monitoring by default. You can subsequently enable and disable monitoring by using the set server command as explained in Enabling and Disabling Monitoring.

You can add the OS monitoring feature to a server that already has the base management feature added. Alternatively, you can add the OS monitoring feature to a server with a newly loaded OS, in which case the base management feature is added automatically. The OS monitoring feature is used for full OS health monitoring and inventory management. For more information, see Full OS Monitoring (With Thresholds).

The add server feature osmonitor command creates an Add OS Monitoring Support job. You can submit multiple, overlapping add server feature osmonitor commands and have them run in parallel. However, you should limit the number of overlapping Add OS Monitoring Support jobs to a maximum of 15. For more information about jobs, see Managing Jobs.

This section describes the following tasks:


Note –

Many of the tasks in this section require credentials to be entered at the command line. The credentials are those of the manageable server and not those of the service processor.


To Add the Base Management Feature

This procedure describes how to add the base management feature on a server with a newly deployed OS. The base management feature is used to enable remote command execution and package deployment, and provides basic OS monitoring. For more information about base management, see Base Management (Basic OS Monitoring).


Note –

Uninstallation of the base management feature is not supported.


The agent IP used in this procedure is the IP address of the managed server's data network interface to be monitored by the management server. The interface can be eth1/bge1 or eth0/bge0, but usually is eth0/bge0. For more information on the server's agent IP address, see To Modify the Agent IP for a Server.


Note –

You can add the base management feature automatically as part of the load server or load group commands. See load server in Sun N1 System Manager 1.3 Command Line Reference Manual or load group in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    Note –

    The SSH user account that is used in the following command must have root privileges on the remote machine:



    N1-ok> add server server feature basemanagement agentip agentip agentssh username/password
    

    An Add Base Management Support job is started.

    The necessary packages and scripts are added. See add server in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  3. After successful completion of the Add Base Management Support job, type the following command:


    N1-ok> show server server
    

    The Base Management Supported field should appear with OK as the value.

Next Steps

To Add the OS Monitoring Feature

To Add the OS Monitoring Feature

This procedure describes how to add the OS monitoring feature on a server.

If you submit add server feature commands by using a script, see Example 6–4 for an example.


Note –

You can add the OS monitoring feature automatically as part of the load server or load group commands. See load server in Sun N1 System Manager 1.3 Command Line Reference Manual or load group in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. To add the OS monitoring feature, perform one of the following actions:

    • If you have not added the base management feature, type the following command:


      Note –

      The SSH user account that is used in the following command must have root privileges on the remote machine.



      N1-ok> add server server feature osmonitor agentip agentip agentssh username/password
      
    • If you have already added the base management feature, type the following command:


      Note –

      You cannot specify the agent IP or SSH credentials when adding OS monitoring support to a server that has base management support.



      N1-ok> add server server feature osmonitor
      

    An Add OS Monitoring Support job starts.

    See add server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

  3. Track the Add OS Monitoring Support job to completion.

    After the job completes successfully, the Servers table on the System Dashboard tab appears with values for OS Usage and OS Resource Health.

    Verify that the OS monitoring feature is supported by issuing the show server command. Output for the server appears with the OS Monitoring Supported value as OK.


    Note –

    It can take 5-7 minutes before all OS monitoring data is fully initialized. You may see that CPU idle is at 0.0%, which causes a Failed Critical status with OS usage. This Failed Critical status clears 5-7 minutes after adding or upgrading the OS monitoring feature.


    If no monitoring data is available for the server, see Monitoring Problems in Sun N1 System Manager 1.3 Troubleshooting Guide.

    If the managed server's IP address changes, use the set server command again before enabling or disabling monitoring.


Example 6–2 Adding the OS Monitoring Feature to Managed Servers Discovered by SP-Based Discovery

The following example shows how to add the OS monitoring feature to a server that had an OS installed prior to being discovered through SP-Based discovery.


N1-ok> add server 192.168.1.1 feature osmonitor 
agentip 192.168.10.10 agentssh admin/admin

The agentip parameter specifies the IP address of the managed server's data network interface to be monitored by the management server. The ssh user name admin and password admin are used for root access authentication.

The following example of the show command shows how to verify that the OS monitoring feature was added successfully to a server that had an OS installed prior to being discovered through its SP.


N1-ok> show server 192.168.1.1
Name        Hardware  Hardware Health Power  OS Usage  OS Resource Health
192.168.1.1 V20z      Good            On     Solaris   Good

See SP-Based Discovery for details about this method of discovering servers.



Example 6–3 Adding the OS Monitoring Feature to Servers Discovered by OS-Based Discovery

The following example shows how to add the OS monitoring feature to a server that had an OS installed before being discovered by OS-based discovery.


N1-ok> add server 192.168.1.1 feature osmonitor 
agentip 192.168.10.10 agentssh admin/admin

The agentip parameter specifies the IP address of the managed server's data network interface to be monitored by the management server. The ssh user name admin and password admin are used for root access authentication.

The following example of the show command shows how to verify that the OS monitoring feature was added successfully to a server that had an OS installed prior to being discovered by OS-based discovery.


N1-ok> show server 192.168.1.1
Name        Hardware  Hardware Health Power  OS Usage  OS Resource Health
192.168.1.1 V20z      Good            On     Solaris   Good

See OS-Based Discovery for details about this method of discovering servers.



Example 6–4 Scripting OS Monitoring Support

The following example script issues multiple add server feature commands on servers that do not have the base management feature support:


n1sh add server 10.0.0.10 feature=osmonitor agentip 10.0.0.110 agentssh root/admin &
n1sh add server 10.0.0.11 feature=osmonitor agentip 10.0.0.111 agentssh root/admin &
n1sh add server 10.0.0.12 feature=osmonitor agentip 10.0.0.112 agentssh root/admin &
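
The script above starts every command at once. When many servers are involved, the commands can be launched in groups so that no more than 15 Add OS Monitoring Support jobs overlap, which is the limit recommended earlier in this chapter. The following Python sketch shows one way to do that. It is illustrative only: the helper names are hypothetical, the server list is invented, and the sketch assumes that each n1sh invocation does not return until its job has finished, which this guide does not confirm.

```python
import subprocess

MAX_PARALLEL_JOBS = 15  # overlap limit recommended in this chapter


def batches(items, size=MAX_PARALLEL_JOBS):
    """Yield consecutive slices of items, each with at most size entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_command(server, agent_ip, credentials):
    """Build the n1sh argument list in the form used by the script above."""
    return ["n1sh", "add", "server", server, "feature=osmonitor",
            "agentip", agent_ip, "agentssh", credentials]


def run_in_batches(targets, launch=subprocess.Popen):
    """Start one batch of commands, wait for all of them, then continue.

    Assumption: waiting for the n1sh process is equivalent to waiting
    for the job that it submitted.
    """
    for group in batches(targets):
        processes = [launch(build_command(*target)) for target in group]
        for process in processes:
            process.wait()


# Hypothetical list of 40 servers: (server, agent IP, user/password).
targets = [("10.0.0.%d" % (10 + i), "10.0.0.%d" % (110 + i), "root/admin")
           for i in range(40)]
print([len(group) for group in batches(targets)])  # [15, 15, 10]
```

Calling run_in_batches(targets) on the management server would then issue the commands group by group. The call is left out of the sketch so that the listing can be read and tested without n1sh installed.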

Troubleshooting

Adding the OS monitoring feature might fail due to stale SSH entries on the management server. If the add server feature osmonitor agentip command fails and no true security breach has occurred, remove the known_hosts file or the specific entry in the file that corresponds to the managed server. Then, retry the add server feature osmonitor agentip command. See To Update the ssh_known_hosts File in Sun N1 System Manager 1.3 Troubleshooting Guide for details.

The problem of stale SSH entries on the management server can be avoided if, during the n1smconfig configuration process, you modify SSH policies by accepting changed or unknown host keys. Accepting changed or unknown host keys carries a security risk but avoids the problem of stale SSH entries on the management server. For more information, see To Configure the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

Adding the OS monitoring feature will fail if you specify the agent IP or the SSH credentials in the add server feature osmonitor command when running it on servers that already have the base management feature support. To solve this problem, issue the add server feature osmonitor command without specifying values for the agent IP or for the SSH credentials.
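
The rule in the preceding paragraph can be encoded in a small helper that builds the command string and refuses the combinations that would fail. This Python sketch is illustrative only. The function name and its parameters are hypothetical, and the command forms are the ones shown in To Add the OS Monitoring Feature.

```python
def add_osmonitor_command(server, has_base_management,
                          agent_ip=None, credentials=None):
    """Return the add server command that suits the server's current state.

    has_base_management: True if the base management feature is already
    supported on the server. In that case the agent IP and the SSH
    credentials must be omitted. Otherwise both are required.
    """
    parts = ["add", "server", server, "feature", "osmonitor"]
    if has_base_management:
        if agent_ip is not None or credentials is not None:
            raise ValueError(
                "do not specify agentip or agentssh when base management "
                "is already supported")
        return " ".join(parts)
    if agent_ip is None or credentials is None:
        raise ValueError("agentip and agentssh are both required")
    parts += ["agentip", agent_ip, "agentssh", credentials]
    return " ".join(parts)


print(add_osmonitor_command("192.168.1.1", True))
print(add_osmonitor_command("192.168.1.1", False,
                            agent_ip="192.168.10.10",
                            credentials="admin/admin"))
```

Whether the server already has base management support can be read from the Base Management Supported field in the show server output.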

To Remove the OS Monitoring Feature

This command can remove the OS monitoring feature at two levels. If you do not specify the uninstall keyword, the OS monitoring feature remains installed on the managed server, but the feature is no longer supported and the server's OS can no longer be monitored with the N1 System Manager. If you specify the uninstall keyword, the OS monitoring feature is completely uninstalled from the managed server and, consequently, is no longer supported.

In either case, after the feature is removed, the OS resource health state for the server becomes uninitialized.

After you remove a feature, provided you used the recommended procedure, you can always use the add server command to add it back again. The Base Management Supported and OS Monitoring Supported fields in the show server output provide the current status on a server's features.


Note –

Do not manually remove the OS monitoring feature by attempting to delete the agent. Doing so will make it impossible to reinstall or reutilize the OS monitoring feature. Instead, to remove the OS monitoring feature, use the remove server feature procedure as described.


Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Remove the OS monitoring feature.


    N1-ok> remove server server feature osmonitor [uninstall]
    

    The necessary packages and scripts are removed. See remove server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

ProcedureTo Remove the Base Management Feature

The OS monitoring feature must be removed before the base management feature can be removed. See To Remove the OS Monitoring Feature for details.

When you remove the base management feature, the feature is uninstalled from the managed server and it is no longer supported.

After you remove a feature, provided you used the recommended procedure, you can always use the add server command to add it back again. The Base Management Supported and OS Monitoring Supported fields in the show server output provide the current status on a server's features.


Note –

Do not manually remove the base management feature by attempting to delete the agent. Doing so will make it impossible to reinstall or reutilize the base management feature. Instead, to remove the base management feature, use the remove server feature procedure as described.


Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Remove the base management feature.


    N1-ok> remove server server feature basemanagement
    

    The necessary packages and scripts are removed. See remove server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

ProcedureTo Modify the Agent IP for a Server

This procedure describes how to modify the agent IP for a server. The agent IP is the IP address of the managed server's network interface, which is to be monitored by the management server. This interface is usually the server's provisioning network interface. The agent IP is not the same as the server's management network IP address.

The following graphic shows the agent IP address for a server from the results table of a job, displayed in the Jobs tab. The graphic distinguishes the agent IP address for the server from the server's IP address.

The graphic highlights a job step in the Jobs tab and distinguishes the agent IP address for the server from the server's IP address.

Note –

If you change the managed server's IP address and credentials or manually remove some services outside the N1 System Manager, the enabling of the services will not succeed. Arbitrary changes to the OS outside of the N1 System Manager require a rediscovery and subsequent addition of the base and OS management features.


When the load server or load group command is used to install software on the managed server, the managed server's networktype attribute could be set to dhcp. This setting means that the server uses DHCP to get its provisioning network IP address. If the system reboots and obtains an IP address different from the one that was used for the agentip parameter in the load or add server command, then the following features may not work:

In this case, use the set server server agentip command to correct the server's agent IP address as shown in this procedure.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Run the following command:


    N1-ok> set server server agentip IP
    

    The agent IP is modified. See set server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax. This operation touches the managed server.
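
    The DHCP scenario described in this procedure can be sketched as a small check. This is a minimal illustration, not part of the N1 System Manager: the function name, its arguments, and the sample addresses are hypothetical, and how you obtain the current address of the provisioning interface is left to your environment. Only the command form comes from step 2.

```python
def agentip_correction(server, recorded_ip, current_ip):
    """Return the N1-ok command that corrects a stale agent IP, or None.

    All three arguments are hypothetical inputs: the managed server's name,
    the agent IP that the management server has on record, and the address
    that the provisioning interface holds after the DHCP lease changed.
    """
    if recorded_ip == current_ip:
        return None  # The recorded agent IP is still valid.
    # Command form taken from step 2 of this procedure.
    return f"set server {server} agentip {current_ip}"


# Sample values are made up for illustration.
print(agentip_correction("serv1", "10.0.0.21", "10.0.0.37"))
```

    The sketch only builds the command text. You still type the resulting command at the N1-ok prompt.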

ProcedureTo Modify the Secure Shell Credentials for the Management Features of a Server

This procedure describes how to modify the Secure Shell (SSH) credentials for the base management and OS monitoring features for a managed server. These management SSH credentials are required by or used in many N1 System Manager commands including add server, set server, load server, start server, load group, and start group. These credentials, specifically for the base management and OS monitoring features for a managed server and referred to by the examples in this chapter as agentssh credentials, are not the same as the SSH credentials required for the server's management network IP address.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details. You need to have an SSH login and password for this step. Default SSH login/password pairs are provided in SP-Based Discovery.

  2. Run the following command:


    Note –

    The SSH user account that is used in the following command must have root privileges on the remote machine.



    N1-ok> set server server agentip IP agentssh username/password
    

    The agentssh user name and password are modified. See set server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

ProcedureTo Modify the SNMP Credentials for the Management Features of a Server

This procedure describes how to modify the management feature SNMP credentials for a server. These credentials allow the N1 System Manager to communicate with the Sun Management Center SNMP agent and apply specifically to the base management and OS monitoring features for a managed server. The examples in this chapter refer to them as agentsnmp credentials. They are not the same as the SNMP credentials required for the server's management network IP address.

See Introduction to Monitoring for more information about the SNMP agents for OS monitoring in the N1 System Manager.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Run the following command to specify the SNMP credentials on a server:


    N1-ok> set server server agentsnmp agentsnmp
    

    The SNMP credentials are modified. See set server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

    This set server operation does not actually touch the managed server. It just synchronizes the data on the management server itself.

ProcedureTo Modify the SNMPv3 Credentials for the Management Features of a Server

This procedure describes how to modify the management feature SNMPv3 credentials for a server. These credentials allow the N1 System Manager to communicate with the Sun Management Center SNMP agent and apply specifically to the base management and OS monitoring features for a managed server. The examples in this chapter refer to them as agentsnmpv3 credentials. They are not the same as the SNMP credentials required for the server's management network IP address.

See Introduction to Monitoring for more information about the SNMP agents for OS monitoring in the N1 System Manager.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Run the following command to specify the SNMPv3 credentials on a server:


    N1-ok> set server server agentsnmpv3 agentsnmpv3
    

    The SNMPv3 credentials are modified. See set server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

    This set server operation does not actually touch the managed server. It just synchronizes the data on the management server itself.

ProcedureTo Manually Uninstall the Linux OS Monitoring Feature

After successful completion of this procedure, the OS monitoring feature is unsupported for the managed server.

Steps
  1. Log in to the managed server as root.

  2. Type the following command:


    # /etc/rc.d/rc3.d/S99es_agent stop
    
  3. Issue the following command and follow the prompts.


    # /opt/SUNWsymon/sbin/es-uninst
    

    The agent is uninstalled.

  4. Manually remove the feature.


    # rpm -e n1sm-linux-agent
    

    The feature is removed.

  5. Remove directories related to the feature.


    # rm -rf /var/opt/SUNWsymon
    

    The directories are removed.

ProcedureTo Manually Uninstall the Solaris OS Monitoring Feature

After successful completion of this procedure, the OS monitoring feature will be unsupported for the managed server.

Steps
  1. Log in to the managed server as root.

  2. Stop the agent.


    # /etc/rc3.d/S81es_agent stop
    
  3. Run the uninstaller.


    # /var/tmp/solx86-agent-installer/disk1/x86/sbin/es-uninst -X
    
  4. Remove the packages.

    For the Solaris OS running on the SPARC architecture:


    # pkgrm SUNWn1smsparcag-1-2
    

    For the Solaris OS running on the x86 architecture:


    # pkgrm SUNWn1smx86ag-1-2
    
  5. Remove associated directories.


    # /bin/rm -rf /opt/SUNWsymon
    # /bin/rm -rf /var/opt/SUNWsymon
    

    The directories are removed.

ProcedureTo Upgrade the Base Management Feature on a Server

This procedure describes how to upgrade the base management feature on a server. This procedure is necessary only after you upgrade the N1 System Manager from a previous release, for managed servers that still run the earlier version of the base management feature included in N1 System Manager 1.1 or Sun Management Center 3.5.1. This procedure is for individual servers. You can also upgrade the base management feature on multiple servers at once. See Chapter 3, Upgrading the Sun N1 System Manager Software, in Sun N1 System Manager 1.3 Installation and Configuration Guide for details.


Note –

If the server was freshly installed using the load server or load group commands from the latest version of the N1 System Manager, and the feature subcommand was used with the update keyword, this procedure is not necessary.


Use the add server feature basemanagement command with the upgrade keyword to upgrade a managed server to a new version from the existing base management feature.

If you submit add server feature commands by using a script, see Example 6–4 for an example.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. To upgrade the base management feature, type the following command:


    N1-ok> add server server feature basemanagement upgrade
    

    An Add Base Management Support job starts.

    See add server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

  3. Track the Add Base Management Support job to completion.

    After the job completes successfully, the show server command output for the server appears with the Base Management Supported value as OK. In addition, the Base Management Supported column on the Server Details page is marked as Yes. See Enabling and Disabling Monitoring for a graphic that shows this.

Troubleshooting

Upgrading the base management feature might fail due to stale SSH entries on the management server. If the add server feature basemanagement upgrade command fails and no true security breach has occurred, remove the known_hosts file or the specific entry in the file that corresponds to the managed server. Then, retry the add server feature basemanagement upgrade command. See To Update the ssh_known_hosts File in Sun N1 System Manager 1.3 Troubleshooting Guide for details.

The problem of stale SSH entries on the management server can be avoided if, during the n1smconfig configuration process, you modify SSH policies by accepting changed or unknown host keys. Accepting changed or unknown host keys carries a security risk but avoids the problem of stale SSH entries on the management server. For more information, see To Configure the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

ProcedureTo Upgrade the OS Monitoring Feature on a Server

This procedure describes how to upgrade the OS monitoring feature on a server. This procedure is necessary only after you upgrade the N1 System Manager from a previous release, for managed servers that still run the earlier version of the OS monitoring feature included in N1 System Manager 1.1 or Sun Management Center 3.5.1. This procedure is for individual servers. You can also upgrade the OS monitoring feature on multiple servers at once. See Chapter 3, Upgrading the Sun N1 System Manager Software, in Sun N1 System Manager 1.3 Installation and Configuration Guide for details.


Note –

If the server was freshly installed using the load server or load group commands from the latest version of the N1 System Manager, and the feature subcommand was used with the update keyword, this procedure is not necessary.


Use the add server feature osmonitor command with the upgrade keyword to upgrade the existing base management and OS monitoring features on a managed server to the new version.

If you submit add server feature commands by using a script, see Example 6–4 for an example.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. To upgrade the OS monitoring feature, type the following command:


    N1-ok> add server server feature osmonitor upgrade
    

    A Modify OS Monitoring Support job starts. Note that this command also upgrades the base management feature.

    See add server in Sun N1 System Manager 1.3 Command Line Reference Manual for details about command syntax.

  3. Track the Add OS Monitoring Support job to completion.

    After the job completes successfully, the Servers table on the System Dashboard tab appears with values for OS Usage and OS Resource Health.

    Verify that the OS monitoring feature is supported by issuing the show server command. Output for the server appears with the OS Monitoring Supported value as OK.


    Note –

    It can take 5-7 minutes before all OS monitoring data is fully initialized. You may see that CPU idle is at 0.0%, which causes a Failed Critical status with OS usage. This should clear up within 5-7 minutes after adding or upgrading the OS monitoring feature.


Troubleshooting

Upgrading the OS monitoring feature might fail due to stale SSH entries on the management server. If the add server feature osmonitor upgrade command fails and no true security breach has occurred, remove the known_hosts file or the specific entry in the file that corresponds to the managed server. Then, retry the add server feature osmonitor upgrade command. See To Update the ssh_known_hosts File in Sun N1 System Manager 1.3 Troubleshooting Guide for details.

The problem of stale SSH entries on the management server can be avoided if, during the n1smconfig configuration process, you modify SSH policies by accepting changed or unknown host keys. Accepting changed or unknown host keys carries a security risk but avoids the problem of stale SSH entries on the management server. For more information, see To Configure the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

Upgrading the OS monitoring feature fails if you specify the agent IP or the SSH credentials in the add server feature osmonitor upgrade command when you run it on servers that already support the base management feature. To solve this problem, issue the add server feature osmonitor upgrade command without specifying values for the agent IP or for the SSH credentials.

Enabling and Disabling Monitoring

Monitored file system and OS health data for a managed server is not available unless an operating system is deployed on the managed server, and the OS monitoring feature has been installed.

Once the OS monitoring feature is installed on a server, monitoring is enabled by default. For information on installing the OS monitoring feature on a server, see Supporting OS Monitoring.

Use the set server monitored command to enable or disable monitoring, as described in the procedures in this section. If the OS monitoring feature is not installed on a server or on every server in a group, using the set server monitored command enables only hardware monitoring for the server or group of servers.

The following graphic shows a section of the Server Details page. The server is powered on, an OS has been installed and the base management and OS monitoring features are supported. Monitoring is enabled for the server.

The graphic shows a section of the Server Details page. Monitoring is shown as enabled; the base management and OS monitoring features are highlighted.

Disabling monitoring by use of the set server monitored command does not remove the monitoring support provided by the OS monitoring feature, which remains installed on the server. However, disabling monitoring by the set server monitored command disables both hardware health and OS health monitoring.
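
The combinations described in this section can be summarized in a short sketch. It is an illustration only: the function and its argument names are hypothetical, and it models just the hardware health and OS health behavior stated above, not network reachability.

```python
def active_monitoring(monitored, os_feature_installed):
    """Return the kinds of monitoring that are in effect for a server.

    monitored mirrors the value given to the set server monitored command.
    os_feature_installed states whether the OS monitoring feature is
    installed. Both argument names are hypothetical.
    """
    if not monitored:
        # Disabling monitoring turns off hardware and OS health monitoring,
        # even though the OS monitoring feature stays installed.
        return set()
    active = {"hardware health"}
    if os_feature_installed:
        active.add("OS health")
    return active


print(sorted(active_monitoring(True, True)))
print(sorted(active_monitoring(True, False)))
print(sorted(active_monitoring(False, True)))
```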

ProcedureTo Monitor a Managed Server or a Managed Server Group

The following procedure describes how to use the command line to enable the monitoring of hardware health and operating system health of a managed server or a group of managed servers. Hardware health and OS health monitoring are both enabled with this command, provided that the OS monitoring feature has been installed on the server or the server group. If the OS monitoring feature has not been installed on the managed server or a group of managed servers, then only hardware health monitoring is enabled.


Note –

It can take up to one minute for monitoring to be enabled after running the command in this procedure.


Before You Begin

To enable the management agent IP and security credentials on a managed server named server, add the management features on the managed server or a group of managed servers as explained in Supporting OS Monitoring.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Set the monitored attribute to true.

    • Use the set server command.


      N1-ok> set server server monitored true
      

      In this procedure, server is the name of the managed server that you want to monitor.

    • For a group of managed servers, set the monitored attribute to true by using the set group command.


      N1-ok> set group group monitored true
      

      This command is executed for the group of managed servers that you have already named. See set group in Sun N1 System Manager 1.3 Command Line Reference Manual for details. In this procedure, group is the name of the group of managed servers that you want to monitor.

  3. View the details to determine if monitoring is enabled.

    • View the managed server details.


      N1-ok> show server server
      
    • For a server group, view the managed server group details to determine if monitoring is enabled for each managed server in the group.


      N1-ok> show group group
      

    Detailed monitoring information appears in the output. Information is displayed about hardware health, OS health and network reachability. OS health monitoring threshold values are also displayed. Monitoring threshold values are explained in Monitoring Threshold Values.

ProcedureTo Disable Monitoring for a Managed Server or a Managed Server Group

The following procedure describes how to use the command line to disable the monitoring of hardware health and operating system health of a managed server or a group of managed servers. Hardware health and OS health monitoring are both disabled with this command, provided that the OS monitoring feature has been added.


Note –

It can take up to one minute for monitoring to be disabled after running the command in this procedure.


You might want to disable monitoring of a hardware component to perform maintenance tasks without generating events.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Set the monitored attribute to false.

    • Use the set server command.


      N1-ok> set server server monitored false
      

      In this example, server is the name of the managed server that you want to stop monitoring. Executing this command disables monitoring of the server. With monitoring of a managed server disabled, the violation of threshold values by attributes related to that managed server does not generate events.

    • For a server group, set the monitored attribute to false by using the set group command.


      N1-ok> set group group monitored false
      

      This command is executed for the group of managed servers that you have already named. See set group in Sun N1 System Manager 1.3 Command Line Reference Manual for details. In this procedure, group is the name of the group of managed servers for which you want to disable monitoring.

  3. View the details to determine that monitoring is disabled.

    • View the managed server details.

      The output shows that monitoring is disabled.


      N1-ok> show server server
      

      If you are not interested in the values of some OS health attributes, you can disable the threshold severity for the monitoring of those attributes, while continuing to monitor other OS health attributes. This action prevents annoyance alarms. Example 6–9 shows how to accomplish this task. For general information about threshold values, see Monitoring Threshold Values. You can also remove the OS health monitoring feature. See To Remove the OS Monitoring Feature.

    • For a group of managed servers, view the managed server group details to determine if monitoring is disabled for each managed server in the group.


      N1-ok> show group group
      

Default States of Monitoring

The default status of monitoring in the Sun N1 System Manager for discovered servers and initialized operating systems is as follows:

Default status of hardware monitoring

When a managed server or other hardware is discovered, monitoring of the managed server or other hardware is enabled by default. Before a manageable server can be monitored, however, it must be discovered and correctly registered with the N1 System Manager. This process is described in Chapter 4, Discovering Manageable Servers. The monitoring of hardware sensors is enabled by default for all managed servers. If a server is deleted and then rediscovered, all states related to that managed server for the purposes of monitoring are lost, regardless of whether monitoring was enabled or disabled for that server when the server was deleted. When the managed server is rediscovered, monitoring is set to true by default. This is true only for servers that were discovered by SP-based discovery.

Default status of OS health monitoring

Disabled by default. When an OS has been successfully provisioned on a managed server and the N1 System Manager management features are supported by using the add server feature command with the agentip specified, OS health monitoring is enabled. The OS provisioning can be performed either through the N1 System Manager or by an external OS installation.

If you are not interested in the values of some OS health attributes, you can disable the threshold severity for the monitoring of those attributes, while continuing to monitor other OS health attributes. This action prevents annoyance alarms. Example 6–9 shows how to accomplish this task. For general information about threshold values, see Monitoring Threshold Values.

Default status of network reachability monitoring

When the management interface of the managed server is discovered, monitoring of the interface is enabled by default. When the management features are added, monitoring of other interfaces is enabled by default.

Monitoring Threshold Values

The value of any given monitored OS health attribute is compared to a threshold value. Low and high threshold values are defined and can be configured.

Attribute data is compared against thresholds at regular intervals.

When a monitored attribute's value is beyond the default or user-defined threshold safe range, an event is generated and a status is issued. If the value of the attribute is lower than the low threshold or higher than the high threshold, then depending on the severity of the threshold, an event is generated to show a status of nonrecoverable, critical, or warning. Otherwise the status of the OS health monitored attribute is OK, provided that a value can be obtained.

If no value can be obtained, an event is generated to show that the status of the monitored attribute is unknown. The health of an OS resource can be shown as unknown if the server is reachable but the agent for the monitoring feature cannot be contacted on SNMP port 161. For more information, see Understanding the Differences Between Unreachable and Unknown States for Managed Servers.

The nonrecoverable, critical, warning, and unknown statuses are represented by alarms displayed in the browser interface.

The values nonrecoverable, critical, and warning are discussed in show server in Sun N1 System Manager 1.3 Command Line Reference Manual.

Threshold values for OS health attributes can be configured at the command line. This process is explained in Setting Threshold Values. For threshold values measuring percentages, the valid range is from 0 to 100%. If you try to set a threshold value outside of this range, an error is generated. For attributes that do not measure percentages, appropriate values depend on the number of processors in your system and on the usage characteristics of your installation.
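
The percentage range rule can be checked before a command is submitted. The following is a minimal sketch under stated assumptions: the function name and the error text are hypothetical and do not reproduce the error that the N1 System Manager itself generates.

```python
def validate_percent_threshold(value):
    """Return value as a float if it lies in the valid 0 to 100 range.

    Hypothetical helper: it applies only to attributes that measure
    percentages, and the error message is invented for this sketch.
    """
    number = float(value)
    if not 0 <= number <= 100:
        raise ValueError(f"percentage threshold out of range: {value}")
    return number


print(validate_percent_threshold("53"))
try:
    validate_percent_threshold(101)
except ValueError as err:
    print(err)
```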

What Happens When a Threshold Is Broken

If the value of an OS health monitored attribute rises above the warninghigh threshold, a status of Failed Warning is issued. If the value continues to rise and passes the criticalhigh threshold, a status of Failed Critical is issued. If the value continues to rise above the nonrecoverablehigh threshold, a status of nonrecoverablehigh is issued.

If the value then falls back to the safe range, no further events are generated until the value falls below the warninghigh threshold, at which point an event is generated to show a status of normal.

If the value of a monitored attribute falls below the warninglow threshold, a status of Failed Warning is issued. If the value continues to fall, and passes the criticallow threshold, a status of Failed Critical is issued. If the value continues to fall below the nonrecoverablelow threshold, a status of nonrecoverablelow is issued.

If the value then rises back to the safe range, no further events are generated until the value rises above the warninglow threshold, at which point an event is generated to show a status of normal.
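
The comparison described in this section can be sketched as a function that maps one sampled value to a severity. This is an illustration under assumptions, not the N1 System Manager implementation: the function name and the dictionary layout are hypothetical, the comparisons are assumed to be strict, and the sketch classifies a single sample without modeling when events are generated. The sample thresholds are the cpustats.pctusage defaults from Table 6–3.

```python
def classify(value, thresholds):
    """Return 'unknown', 'nonrecoverable', 'critical', 'warning', or 'OK'.

    thresholds is a hypothetical dict keyed by the threshold names used in
    this chapter (warninglow, criticalhigh, and so on). Missing keys mean
    that no threshold is set at that severity.
    """
    if value is None:
        return "unknown"  # No value could be obtained.
    # Check the most severe thresholds first.
    for severity in ("nonrecoverable", "critical", "warning"):
        low = thresholds.get(severity + "low")
        high = thresholds.get(severity + "high")
        if low is not None and value < low:
            return severity
        if high is not None and value > high:
            return severity
    return "OK"


cpu_usage = {"warninghigh": 80, "criticalhigh": 90.1}
for sample in (50, 85, 95, None):
    print(sample, classify(sample, cpu_usage))
```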

Tuning Threshold Values for Your Installation

After a period of usage, you can develop an awareness of what levels to set for OS health attribute values. You can adjust thresholds once you determine more closely what value indicates a genuine justification for an event to be generated and for an event notification to be sent to your pager or email address. For example, you might want to receive event notifications every time a certain attribute reaches a warninghigh severity threshold level. For more information, see Setting Up Event Notifications.

For important or crucial attributes at your installation, you can set the warninghigh threshold level to a low percentage value so that you are notified about a rising value as early as possible.

ProcedureTo Retrieve Threshold Values for a Server

Before You Begin

To enable the management agent IP and security credentials on a server named server, add the management features on the server as explained in Adding and Upgrading Base Management and OS Monitoring Features.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the show server command:


    N1-ok> show server server
    

    In this procedure, server is the name of the managed server for which you want to retrieve threshold values.

    Detailed monitoring threshold values appear in the output, including threshold information for the server's hardware health, OS health, and network reachability. Default values are shown if no specific values have been set.

    See show server in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

    • Threshold information is also available from the Server Details page in the browser interface. This is shown in the following graphic.

      The graphic shows that OS monitoring information can be displayed on the Server Details page, with threshold status information.

Default Threshold Values

Factory-configured default threshold values are provided in the N1 System Manager software for some OS health thresholds. Values for percentage attributes are stated as percentages. Table 6–3 lists default values for these OS health attributes.


Note –

Setting or modifying threshold values for hardware health attributes is not supported in this version of the Sun N1 System Manager.


Table 6–3 Factory-Configured Default Threshold Values for OS Health Attributes

Attribute Name         Default Warning Threshold  Default Critical Threshold  Description

cpustats.loadavg1min   warninghigh >4.00          criticalhigh >5.00          System load expressed as average number of queued processes over 1 minute
cpustats.loadavg5min   warninghigh >4.10          criticalhigh >5.10          System load expressed as average number of queued processes over 5 minutes
cpustats.loadavg15min  warninghigh >4.10          criticalhigh >5.10          System load expressed as average number of queued processes over 15 minutes
cpustats.pctusage      warninghigh >80%           criticalhigh >90.1%         Percentage of overall CPU usage
cpustats.pctidle       warninglow <20%            criticallow <10%            Percentage of CPU idle
memusage.mbmemfree     warninghigh <39%           criticalhigh <29%           Memory free in MB
memusage.mbmemused     warninghigh >1501          criticalhigh >2001          Memory used in MB
memusage.pctmemused    warninghigh >80%           criticalhigh >90%           Percentage of memory in use
memusage.pctmemfree    warninglow <20%            criticallow <10%            Percentage of memory free
memusage.kbswapused    warninghigh >500000        criticalhigh >1000000       Swap space in use in Kb
fsusage.kbspacefree    warninglow <94.0Kb         criticallow <89.0Kb         File system free space in Kb

Specific threshold values can be set at the command line by following the procedures described in Setting Threshold Values.

Setting Threshold Values

Threshold values for OS health attributes can be set on specific servers. Threshold values that you set at the command line for OS health attributes override any factory-configured threshold values for those attributes.

ProcedureTo Set Threshold Values for a Server

Before You Begin

To enable the management agent IP and security credentials on a server named server, add the management features on the server as explained in Adding and Upgrading Base Management and OS Monitoring Features.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Use the set server command with the threshold attribute.

    The syntax requires the threshold keyword to be followed by the attribute for which you are setting a threshold. The attribute is an OS health attribute. OS health attributes are described in OS Health Monitoring and listed in Table 6–2.

    The threshold is either criticallow, warninglow, warninghigh, or criticalhigh. The value is a numeric figure and usually represents a percentage.

    This set server operation does not actually touch the managed server. It just synchronizes the data on the management server itself.

    • To set one threshold value, type the following:


      N1-ok> set server server threshold attribute threshold value
      
    • To set multiple threshold values for the server, type the following:


      N1-ok> set server server threshold attribute threshold value threshold value
      
    • For a server group, use the set group command with the threshold attribute. To modify one threshold for the server group:


      N1-ok> set group group threshold attribute threshold value
      
    • To modify multiple thresholds for the server group:


      N1-ok> set group group threshold attribute threshold value threshold value
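
The syntax shown in this step can be assembled programmatically, for example when you generate commands for many servers. The following is a minimal sketch, not an N1 System Manager API: the function and its parameters are hypothetical, it only builds the command text, and the optional filesystem argument follows the form that the examples in this section show for the set server command.

```python
# Threshold names that this step lists as valid.
THRESHOLDS = ("criticallow", "warninglow", "warninghigh", "criticalhigh")


def threshold_command(target, name, attribute, settings, filesystem=None):
    """Build a set server or set group threshold command as a string.

    target is 'server' or 'group'; settings maps threshold names to values.
    All parameter names are hypothetical.
    """
    if target not in ("server", "group"):
        raise ValueError("target must be 'server' or 'group'")
    if not settings:
        raise ValueError("at least one threshold value is required")
    parts = ["set", target, name]
    if filesystem is not None:
        parts += ["filesystem", filesystem]
    parts += ["threshold", attribute]
    for threshold, value in settings.items():
        if threshold not in THRESHOLDS:
            raise ValueError(f"unknown threshold: {threshold}")
        parts += [threshold, str(value)]
    return " ".join(parts)


print(threshold_command("server", "serv1", "cpustats.pctusage",
                        {"warninghigh": 53, "criticalhigh": 75}))
```

The printed command matches the one in Example 6–5. Passing the string none as a value produces the deletion form used in Example 6–9.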
      

Example 6–5 Setting Multiple Threshold Values for CPU Percentage Usage on a Server

This example shows how to set the CPU usage warninghigh severity threshold on a managed server named serv1 to 53 percent. This example also shows how to set the criticalhigh severity threshold value to 75 percent.


N1-ok> set server serv1 threshold cpustats.pctusage warninghigh 53 criticalhigh 75


Example 6–6 Setting Multiple Threshold Values for File System Percentage Usage On a Server

This example sets the file system percentage usage warninghigh threshold on a managed server named serv1 to 75 percent. This example also sets the criticalhigh threshold value to 87 percent. This example sets the threshold for every file system on the server.


N1-ok> set server serv1 threshold fsusage.pctused warninghigh 75 criticalhigh 87

You can also specify the file system for which you want to set multiple threshold values. To set the warninghigh threshold to 75 percent and the criticalhigh threshold value to 87 percent, for the /usr file system on the same server, use the filesystem attribute:


N1-ok> set server serv1 filesystem /usr threshold fsusage.pctused 
warninghigh 75 criticalhigh 87


Example 6–7 Setting a Threshold Value for File System Free Space On a Server

This example sets the warninghigh threshold for file system free space for the /var file system on a managed server named serv1 to 150 Kbytes of free space.


N1-ok> set server serv1 filesystem /var threshold fsusage.kbspacefree warninghigh 150


Example 6–8 Setting a Threshold Value for Percentage of Free Memory On a Server

This example sets the criticalhigh threshold for the percentage of free memory on a managed server named serv1 to 5%.


N1-ok> set server serv1 threshold memusage.pctmemused criticalhigh 5


Example 6–9 Deleting a Threshold Value for File System Percentage Usage on a Server

This example shows how to delete a value that was set for the warninghigh threshold on a managed server named serv1.


N1-ok> set server serv1 threshold fsusage warninghigh none

In this case, any previously set value for this threshold at this severity is deleted. In effect, monitoring is disabled for the warninghigh threshold for file system usage for this server.



Example 6–10 Setting Multiple Threshold Values for File System Usage on a Server Group

This example shows how to set the file system usage warninghigh threshold to 75 percent on a group of managed servers with a group name of grp3. This example also shows how to set the criticalhigh threshold severity value to 87 percent.


N1-ok> set group grp3 threshold fsusage.pctused warninghigh 75 criticalhigh 87
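The threshold commands in the preceding examples all follow the same pattern, so they can be assembled by a small helper when many servers or groups need the same settings. The following Python code is a minimal sketch under that assumption. The function name build_threshold_command and its parameters are hypothetical and are not part of the N1 System Manager. The sketch only builds the command text, it does not run it, and the filesystem keyword is shown in this chapter only with set server.

```python
# Hypothetical helper: builds the text of a threshold command for the N1-ok prompt.
VALID_SEVERITIES = ("criticallow", "warninglow", "warninghigh", "criticalhigh")


def build_threshold_command(target_type, name, attribute, thresholds, filesystem=None):
    """Return a 'set server' or 'set group' threshold command string.

    thresholds maps a severity keyword to a numeric value, or to None
    to delete a previously set value (emitted as the keyword 'none').
    """
    if target_type not in ("server", "group"):
        raise ValueError("target_type must be 'server' or 'group'")
    if not thresholds:
        raise ValueError("at least one threshold is required")

    parts = ["set", target_type, name]
    if filesystem is not None:
        # Restrict the threshold to one file system, as in Example 6-7.
        parts += ["filesystem", filesystem]
    parts += ["threshold", attribute]

    for severity, value in thresholds.items():
        if severity not in VALID_SEVERITIES:
            raise ValueError(f"unknown threshold severity: {severity}")
        parts += [severity, "none" if value is None else str(value)]
    return " ".join(parts)


# Rebuilds the command from Example 6-5.
print(build_threshold_command(
    "server", "serv1", "cpustats.pctusage",
    {"warninghigh": 53, "criticalhigh": 75},
))
```

Validating the severity keyword before the command is typed catches misspellings such as warninghi early, which is the main reason to generate the command rather than write it by hand.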

Monitoring MIBs

Two Management Information Bases (MIBs) are provided with the N1 System Manager. These MIBs define the data structures that third-party monitoring tools use to retrieve data from the N1 System Manager using SNMP and to parse the SNMP notifications that the N1 System Manager generates. The MIBs are located in /opt/sun/n1gc/etc/. They enable you to use any SNMP client to query the N1 System Manager and to listen for events using SNMP. The following MIBs are provided:

SUN-N1SM-INFO-MIB

This MIB describes the information that you can retrieve from the N1 System Manager by querying it using an SNMP client.

SUN-N1SM-TRAP-MIB

This MIB describes all of the events related to the N1 System Manager about which you can receive SNMP traps.

These MIBs are read-only. Using them requires a detailed knowledge of SNMP, although detailed descriptions of each object are provided in the MIBs. How you configure your monitoring system to start receiving traps depends on the nature of your monitoring system.

The MIBs are hardware independent.


Example 6–11 Receiving SNMP Traps

This example shows you how to use the simple UNIX trap listener, the snmptrapd command, to start receiving N1 System Manager traps.


# snmptrapd -m all -M /opt/sun/n1gc/etc:/usr/share/snmp/mibs -P

This example uses the snmptrapd command to start monitoring on default port 162 for SNMP traps. It also instructs the command to use the MIBs stored at /opt/sun/n1gc/etc and /usr/share/snmp/mibs to parse the contents of SNMP traps.


Managing Jobs

This section describes jobs and their integral role in server monitoring.

Each major action you take in the N1 System Manager starts a job. Use the job log to track the status of a currently running action or to verify that a job has finished. Monitoring jobs is particularly useful because some N1 System Manager actions can take a long time to finish. An example of such an action is installing an OS distribution on one or more managed servers.

You can track jobs through the Jobs tab in the browser interface or the show job command. The show job command provides information about most of the following characteristics:

Job ID

Generated unique identifier.

Date

Date on which the job was started.

Job Type

Type of job. See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details. When using the show job command with the type parameter, jobs can be any of the following types:

  • addbase Add base management support.

  • addosmonitor Add OS monitoring support.

  • createos Create OS distribution from CD/DVD media or ISO files.

  • deletejob Delete job.

  • discover Server discovery.

  • loadfirmware Load firmware update.

  • loados Load OS.

  • loadupdate Load OS update.

  • refresh Server refresh.

  • reset Server reboot.

  • removeosmonitor Remove OS monitoring support.

  • removeserver Server deletion.

  • setagentip Modify management feature configuration. Related to the base management and OS monitoring features.

  • start Server power on.

  • startcommand Remote command execution.

  • stop Server power off.

  • unloadupdate Unload OS update.

State

State of the current job step. Job steps indicate the progress of a job and update results. Each job step has a type, a start time and, when the job completes, a completion time. For the purposes of filtering, job progress is indicated with the following states:

notstarted

Jobs in a notstarted state cannot be stopped.

preflight

When you select a job by ID and view the details of that job, each step of that job can appear twice: once for the preflight check and once for the execution of the step itself.

running

The job is currently running. Jobs that are currently running cannot be deleted using the delete job command. Jobs that are currently running must finish running or be stopped using the stop job command.

Job completion is indicated with the following results:

completed

Indicates that the job step completed successfully.

warning

Indicates that a warning was issued during the job execution. The reported issue might be severe enough to terminate the job step, and the job, with errors.

stopped

Indicates that the job step stopped before it completed.

pendingstop

Indicates that the job is still running but that the job step cannot complete successfully.

error

Indicates a general error in that job step.

timed_out

Indicates that the job timed out before all of the job steps could complete successfully, or that the next step of the job started before the current step completed successfully.

Complete - Warning is issued in the output as the overall job status if the job successfully completed all of its steps, but one or more WARNING states were issued for steps during the job execution and these warnings were not severe enough to terminate the job with errors.

You can filter jobs depending on their state. See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

Command

The command that was used to start the job.

Owner

The user who started the job. Also called the job creator.

Job Results

Provides details about the results of a completed job. You can review the standard output of remote command operations and completion statuses for all other job types.

Procedure: To List Jobs

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. View the list of jobs.


    N1-ok> show job all
    

    A list of all jobs for the N1 System Manager is returned.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–12 Listing All Jobs

This example shows that using the show job command with the all option returns a list of jobs by Job ID, together with the date and time at which the job was started. The job type and status are also returned, along with the identity of the user who created the job.


N1-ok> show job all
Job ID          Date                       Type                  Status        Owner
7               2005-09-16T10:51:07-0700   Discovery             Completed      root
6               2005-09-14T14:42:52-0700   Server Reboot         Error          root
5               2005-09-14T14:38:25-0700   Server Power On       Completed      root
4               2005-09-14T14:29:20-0700   Server Power Off      Completed      root
3               2005-09-09T13:01:35-0700   Discovery             Completed      root
2               2005-09-09T12:38:16-0700   Discovery             Completed      root
1               2005-09-09T10:32:40-0700   Discovery             Completed      root
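If the job list is captured as text, for example to pick out jobs that need attention, it can be split into records. The following Python code is a minimal sketch that assumes the column layout shown in Example 6–12, with columns separated by at least two spaces. The function name parse_job_list is hypothetical, and the output format of other releases might differ.

```python
import re


def parse_job_list(output):
    """Parse captured 'show job all' output into a list of dictionaries."""
    lines = [line for line in output.splitlines() if line.strip()]
    # Drop the echoed prompt line if it was captured with the output.
    if lines and lines[0].startswith("N1-ok>"):
        lines = lines[1:]
    if not lines:
        return []

    # Columns are assumed to be separated by two or more spaces, so a
    # value such as "Server Power On" stays in one field.
    header = re.split(r"\s{2,}", lines[0].strip())
    jobs = []
    for line in lines[1:]:
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == len(header):
            jobs.append(dict(zip(header, fields)))
    return jobs


sample = """\
N1-ok> show job all
Job ID          Date                       Type                  Status        Owner
7               2005-09-16T10:51:07-0700   Discovery             Completed      root
6               2005-09-14T14:42:52-0700   Server Reboot         Error          root
5               2005-09-14T14:38:25-0700   Server Power On       Completed      root
"""

# List the IDs of jobs that ended with an error.
failed = [job["Job ID"] for job in parse_job_list(sample) if job["Status"] == "Error"]
print(failed)
```

The same records can be filtered on Status to find jobs that are still running, which cannot be deleted until they finish or are stopped.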

Procedure: To View a Specific Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. View a specific job.


    N1-ok> show job job
    

    Detailed information about the job appears in the output.


Example 6–13 Viewing Job Details

This example shows that using the show job command with the Job ID returns the date and time at which the job was started, the job type and status, and the identity of the user who created the job. The job in this example is to load an OS profile on a server named 192.168.200.4 using the load server command. Further details are provided for each step of that job, including the time at which the step started and completed and whether the step was successful.


N1-ok> show job 21
Job ID:   21
Date:     2005-10-27T10:09:18-0600
Type:     Load OS
Status:   Completed (2005-10-27T10:37:23-0600)
Command:  load server 192.168.200.4 osprofile SLES9RC5 
bootip=192.168.200.30 networktype=static ip=192.168.200.31
Owner:    root
Errors:   0
Warnings: 0

Steps
ID     Type             Start                      Completion                 Result   
1      Acquire Host     2005-10-27T10:09:19-0600   2005-10-27T10:09:19-0600   Completed
2      Execute Java     2005-10-27T10:09:19-0600   2005-10-27T10:09:19-0600   Completed
3      Acquire Host     2005-10-27T10:09:21-0600   2005-10-27T10:09:21-0600   Completed
4      Execute Java     2005-10-27T10:09:21-0600   2005-10-27T10:37:22-0600   Completed

Results
Result 1: 
Server:   192.168.200.4
Status:   0
Message:  OS deployment using OS Profile SLES9RC5 was successful.
IP address 192.168.200.30 was assigned.


Example 6–14 Viewing all OS Monitoring Jobs

This example shows how to use the show job command with the addosmonitor Job Type to filter all jobs that add OS monitoring support.


N1-ok> show job type addosmonitor

Procedure: To Stop a Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Stop a specific job.


    N1-ok> stop job job
    

    The job is stopped.

    See stop job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  3. View the job details.


    N1-ok> show job job
    

    The Result section of the output shows that the job was stopped.

    Any job can be stopped. In practice, however, only a job that is not in its last step can be stopped. Some jobs only have one step and so can never be stopped. Jobs in a notstarted state cannot be stopped. Operations that are performed on large groups of servers can take longer and might include a large number of steps.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–15 Stopping a Job

This example shows that using the stop job command with the Job ID returns a message confirming that the request has been received.


N1-ok> stop job 32

Stop Job "32" request received.

This example also shows that the show job command can be used with the Job ID of the stopped job to obtain more information about that job. The command returns the confirmation, in Status, that the job was stopped, and the command that was used to create the job. Further details are provided for each step of that job, including the time at which the step started and completed and whether the step was successful. The Result column of the Steps section shows the step at which the job was stopped.


N1-ok> show job 32
Job ID:   32
Date:     2005-11-02T08:08:37-0700
Type:     Server Refresh
Status:   Stopped (2005-11-02T08:08:48-0700)
Command:  set server 192.168.200.2 refresh
Owner:    root
Errors:   0
Warnings: 0

Steps
ID   Type           Start                      Completion                 Result   
1    Acquire Host   2005-11-02T08:08:38-0700   2005-11-02T08:08:38-0700   Completed
2    Run Command    2005-11-02T08:08:38-0700   2005-11-02T08:08:38-0700   Completed
3    Acquire Host   2005-11-02T08:08:40-0700   2005-11-02T08:08:40-0700   Completed
4    Run Command    2005-11-02T08:08:40-0700   2005-11-02T08:08:47-0700   Stopped

See Also

To Issue Remote Commands on a Managed Server or a Group

Procedure: To Delete a Job

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Determine the job you want to delete.


    N1-ok> show job all
    

    All jobs and job IDs appear in the output.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  3. Delete the desired job.


    N1-ok> delete job job
    

    The job is deleted.

    See delete job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

  4. Verify that the job was deleted.


    N1-ok> show job all
    

    The deleted job should not appear in the output.

    See show job in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–16 Deleting a Job

This example shows how to delete a job.

First, the show job command is used with the all option, which lists all jobs in descending order.


N1-ok> show job all
Job ID     Date                       Type                Status        Creator
7          2005-02-16T10:51:07-0700   Discovery           Completed     root
6          2005-02-14T14:42:52-0700   Server Reboot       Error         root
5          2005-02-14T14:38:25-0700   Server Power On     Completed     root
4          2005-02-14T14:29:20-0700   Server Power Off    Completed     root
3          2005-02-09T13:01:35-0700   Discovery           Completed     root
2          2005-02-09T12:38:16-0700   Discovery           Completed     root
1          2005-02-09T10:32:40-0700   Discovery           Completed     root

Job ID 6 has an error and can be deleted. The delete job command is now used with the Job ID of the job to be deleted.


N1-ok> delete job 6

The show job command is used again with the all option, which lists all jobs in descending order. The deleted job no longer appears on the list.


N1-ok> show job all
Job ID     Date                       Type               Status        Creator
7          2005-02-16T10:51:07-0700   Discovery          Completed     root
5          2005-02-14T14:38:25-0700   Server Power On    Completed     root
4          2005-02-14T14:29:20-0700   Server Power Off   Completed     root
3          2005-02-09T13:01:35-0700   Discovery          Completed     root
2          2005-02-09T12:38:16-0700   Discovery          Completed     root
1          2005-02-09T10:32:40-0700   Discovery          Completed     root


Example 6–17 Deleting All Jobs

This example shows how to delete all jobs.

First, the show job command is used with the all option, which lists all jobs in descending order.


N1-ok> show job all
Job ID     Date                       Type               Status        Creator
7          2005-09-16T10:51:07-0700   Discovery          Completed     root
6          2005-09-14T14:42:52-0700   Server Reboot      Error         root
5          2005-09-14T14:38:25-0700   Server Power On    Completed     root
4          2005-09-14T14:29:20-0700   Server Power Off   Completed     root
3          2005-09-09T13:01:35-0700   Discovery          Running       root
2          2005-09-09T12:38:16-0700   Discovery          Completed     root
1          2005-09-09T10:32:40-0700   Discovery          Completed     root

The delete job command is now used with the all option, to delete all jobs.


N1-ok> delete job all

Unable to delete job "3"

The show job command is used with the all option, to confirm whether all jobs were successfully deleted.


N1-ok> show job all
Job ID     Date                       Type             Status     Creator
3          2005-09-09T13:01:35-0700   Discovery        Running    root

Job ID 3 is still running. This is because jobs that were in a running state when the delete job command was issued must finish running, or must be stopped, before they can be deleted.

To stop the job and then delete it, first the stop job command is used with the ID of the job to be stopped.


N1-ok> stop job 3

Stop Job "3" request received.

The show job command is used to confirm that the job has been stopped.


N1-ok> show job all
Job ID     Date                       Type             Status        Creator
3          2005-09-09T13:02:35-0700   Discovery        Aborted       root

The job has been stopped while running and is in the aborted state. The delete job command is now used with the all option, to delete all jobs.


N1-ok> delete job all

The show job command is used to confirm that all jobs have now been deleted.


N1-ok> show job all
Job ID     Date                      Type              Status        Creator

Job Queueing

Each type of job in the N1 System Manager has a weight associated with it. The weight reflects the load that the job places on system resources. A global limit governs how much total load can be placed on the system. The maximum load permitted is 1000. The following table lists the weight for each type of user-level job.

Table 6–4 Job Weight Values

Job                        Weight
OS Deployment              500
Package Deployment         500
Package Uninstall          500
Discovery                  200
Firmware Deployment        500
Remote Command Execution   200
Job Deletion               400
Create OS                  1000
Reset Server               200
Server Power Off           200
Server Power On            200
Server Refresh             200
Set Server Feature         200
Remove Server              100
Add Server                 100

The total load is the sum of the loads of all the current running jobs. The system will compare the current total load with the maximum permitted load at the following points in time:

If the difference between the current total load and the maximum permitted load is great enough to accommodate the job at the head of the job queue, then that job is promoted to a running state. Otherwise, it is left in the queued state. The current total load governs the permissible concurrent running job mix within the system.

For example, no more than two OS Deployment jobs can run at one time, because together they reach the maximum permitted load:

500 + 500 = 1000

Alternatively, one OS Deployment job and two Server Power Off jobs can run at the same time. A third Server Power Off job would exceed the limit:

500 + 200 + 200 < 1000
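The queueing rule can be illustrated with a short simulation. The following Python code is a minimal sketch, not the N1 System Manager scheduler. It uses the weights from Table 6–4 and the maximum load of 1000, and it assumes strict first-in, first-out promotion, so a job at the head of the queue that does not fit blocks the jobs behind it. The function name promote_jobs is hypothetical.

```python
from collections import deque

MAX_LOAD = 1000  # maximum permitted load, as stated in this section

# Weights taken from Table 6-4.
JOB_WEIGHTS = {
    "OS Deployment": 500,
    "Package Deployment": 500,
    "Package Uninstall": 500,
    "Discovery": 200,
    "Firmware Deployment": 500,
    "Remote Command Execution": 200,
    "Job Deletion": 400,
    "Create OS": 1000,
    "Reset Server": 200,
    "Server Power Off": 200,
    "Server Power On": 200,
    "Server Refresh": 200,
    "Set Server Feature": 200,
    "Remove Server": 100,
    "Add Server": 100,
}


def promote_jobs(running, queue):
    """Promote queued jobs, head first, while the total load stays within MAX_LOAD.

    Returns the new list of running jobs and the jobs that remain queued.
    """
    running = list(running)
    queue = deque(queue)
    load = sum(JOB_WEIGHTS[job] for job in running)
    # Assumption: only the head of the queue is considered at each check.
    while queue and load + JOB_WEIGHTS[queue[0]] <= MAX_LOAD:
        job = queue.popleft()
        running.append(job)
        load += JOB_WEIGHTS[job]
    return running, list(queue)


# Three OS Deployment jobs are queued; only two fit within the limit.
print(promote_jobs([], ["OS Deployment"] * 3))
```

Under these assumptions, a Create OS job, which has a weight of 1000, is promoted only when no other job is running.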

Managing Event Log Entries

This section describes events and their integral role in monitoring your servers.

Events are generated when certain conditions related to attributes occur. Each event has an associated topic. For example, when a server is discovered by the management server, an event is generated with the topic Action.Physical.Discovered. For a complete list of event topics, see create notification in Sun N1 System Manager 1.3 Command Line Reference Manual.

Events can be monitored. Monitoring is connected with the broadcasting of events for each monitored server or group of servers. When a monitored attribute's value is beyond the default or user-defined threshold safe range, an event is generated and a status is issued.

See Introduction to Monitoring for more information about monitoring.

See Setting Up Event Notifications for more information about event notifications.

Lifecycle events continue to be generated even with monitoring disabled. Lifecycle events include server discovery, server change or deletion, or server group creation. If you have requested notification of this type of event, you can still receive notifications even with monitoring disabled.

Event logs are created when events occur. For example, if any monitored IP address is unreachable, an event is generated. This event creates an event log record, which is visible from the browser interface.


Note –

Servers that use ALOM do not send event notifications to the management server by use of traps. Instead, they send event notifications by email. To ensure that the management server collects data from these servers, the N1 System Manager management server has its own port 25 email server.


Event Log Overview

During the installation and configuration of the N1 System Manager, you can configure which events to log and you can also interactively configure severity levels for event topics. See Configuring the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

Even if an event is not saved to the log, the event can still generate an event notification.

Use the show log command to view the following information about events:

The n1smconfig script can be used to change the number of days for which event logs are kept. Reducing this number reduces the average size of the event log files, which helps to ensure that the event log file size does not impair performance. The n1smconfig script is stored at /usr/bin on both the Linux and Solaris OS platforms. To configure event logging, specify an event category and a resource category. The following event categories are defined:

Use the all event category to indicate that all events are to be logged. To understand how other event categories relate to actual events, see the event notification topics at create notification in Sun N1 System Manager 1.3 Command Line Reference Manual. General log files are saved to the syslog file, at /var/adm/messages on the Solaris OS or /var/log/messages on Linux.

Procedure: To View the Event Log

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> show log [count count]

    The Events log appears with events listed most recent first. The value for the count attribute is the number of events to show in the output. The default value for count is 500. See show log in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

See Also

Event Log Overview

Procedure: To Filter the Event Log

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> show log [after after] [before before] [count count] [severity severity]

    The output shows only the events that match the specified criteria. The before or after variable values must be formatted appropriately, for example, 2005-07-20T11:53:04. The possible values for severity are as follows:

    • unknown

    • other

    • information

    • warning

    • minor

    • major

    • critical

    • fatal

    See show log in Sun N1 System Manager 1.3 Command Line Reference Manual for details.
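The before and after values are the easiest part of this command to mistype. The following Python code is a minimal sketch that formats timestamps in the pattern shown above, 2005-07-20T11:53:04, and checks the severity against the listed values. The function name build_show_log is hypothetical, and the sketch only builds the command text.

```python
from datetime import datetime

# Severity values listed in this procedure.
SEVERITIES = (
    "unknown", "other", "information", "warning",
    "minor", "major", "critical", "fatal",
)

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def build_show_log(after=None, before=None, count=None, severity=None):
    """Return a 'show log' command string with the requested filters."""
    parts = ["show", "log"]
    if after is not None:
        parts += ["after", after.strftime(TIMESTAMP_FORMAT)]
    if before is not None:
        parts += ["before", before.strftime(TIMESTAMP_FORMAT)]
    if count is not None:
        if count < 1:
            raise ValueError("count must be a positive number")
        parts += ["count", str(count)]
    if severity is not None:
        if severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {severity}")
        parts += ["severity", severity]
    return " ".join(parts)


# Critical events logged after 11:53:04 on 20 July 2005.
print(build_show_log(after=datetime(2005, 7, 20, 11, 53, 4), severity="critical"))
```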

Procedure: To View Event Details

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> show log log
    

    The details of the event appear in the output. The log variable is the log ID. See show log in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–18 Viewing Event Details


N1-ok> show log 72
ID:       72
Date:     2005-03-15T13:35:59-0700
Subject:  RemoteCmdPlan
Topic:    Action.Logical.JobStarted
Severity: Information
Level:    FINE
Source:   Job Service
Role:     root
Message:  RemoteCmdPlan job initiated by root: job ID = 15. 
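Event details are printed as one field per line, which makes them straightforward to read into a lookup table. The following Python code is a minimal sketch that assumes the layout shown in Example 6–18, where each line holds a field name, a colon, and a value. The function name parse_event_details is hypothetical, and messages that wrap onto a second line are not handled.

```python
def parse_event_details(output):
    """Parse captured 'show log <ID>' output into a dictionary of fields."""
    details = {}
    for line in output.splitlines():
        # Skip the echoed prompt and any line without a field separator.
        if line.startswith("N1-ok>") or ":" not in line:
            continue
        # Split on the first colon only, because values such as the date
        # and the message contain colons of their own.
        key, _, value = line.partition(":")
        details[key.strip()] = value.strip()
    return details


sample = """\
N1-ok> show log 72
ID:       72
Date:     2005-03-15T13:35:59-0700
Subject:  RemoteCmdPlan
Topic:    Action.Logical.JobStarted
Severity: Information
Level:    FINE
Source:   Job Service
Role:     root
Message:  RemoteCmdPlan job initiated by root: job ID = 15.
"""

event = parse_event_details(sample)
print(event["Topic"], event["Severity"])
```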

Setting Up Event Notifications

The N1 System Manager provides the ability to set up email or SNMP event notifications when events occur, either within the N1 System Manager itself or when specific events occur on managed servers. You can set up customized event notification rules for as many different scenarios as you need. Setting up default notifications for events can be done using the n1smconfig utility at install time. See Configuring the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide for more information about installing and configuring the N1 System Manager.

You can create additional event notifications at the command line. Use the create notification command to create event notification rules for the events that interest you. To create an event notification, subscribe to a topic. For example, to receive notifications for discrete sensor events, subscribe to the EReport.Physical.ThresholdExceeded topic. This topic covers events for both discrete sensors and bi-state sensors. For a list of topics, and to see the mapping of event categories to actual events, see create notification in Sun N1 System Manager 1.3 Command Line Reference Manual.

For setting up event notifications using SNMP traps, use the SNMP MIB located at /opt/sun/n1gc/etc/SUN-N1SM-TRAP-MIB.mib. For more information about SNMP MIBs, see Monitoring MIBs.

A notification rule can be used to send a notification of each type of event to a selected destination, using either email or SNMP as the communication medium. For example, you can create a notification rule so that each time a new managed server is discovered by the management server, you receive a message on your pager to indicate that the event has happened:


create notification notification destination destination topic topic 
type type [description description]

See create notification in Sun N1 System Manager 1.3 Command Line Reference Manual for more details of the terms used in this command syntax.

Viewing and Modifying Event Notifications

Use the show notification and set notification commands to view and modify event notification details. Type help show notification or help set notification at the N1-ok command line for syntax and parameter details.

Procedure: To View Event Notifications

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> show notification all
    

    The event notifications for which you have read privileges appear in the output. See show notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

Procedure: To View Event Notification Details

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> show notification notification
    

    The specified event notification details appear in the output. See show notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–19 Viewing Event Notification Details

This example shows how to use the show notification command to display the details about a notification.


N1-ok> show notification notif33
Name:          notif33
Event Topic:   EReport.Physical.ThresholdExceeded
Notifier Type: Email
Destination:   nobody@sun.com
State:         enabled

Procedure: To Modify an Event Notification

This procedure describes how to change the name, description, or destination of an event notification.

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> set notification notification name name description description
     destination destination
    

    The specified event notification attributes are set to the new values specified. See set notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.


Example 6–20 Modifying an Event Notification Name

This example shows how to use the set notification command with the name option to change a notification name from notif22 to notif23.


N1-ok> set notification notif22 name notif23

Creating, Testing, and Deleting Event Notifications

Use the create notification or delete notification commands to create and delete event notifications.

Use the start notification command with the test keyword to test an event notification.

Type help create notification or help delete notification at the N1-ok command line for syntax and parameter details.

Procedure: To Create and Test an Event Notification

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> create notification notification topic topic
    type type destination destination
    

    The event notification is created and enabled. See create notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details and valid topics.

  3. Type the following command:


    N1-ok> start notification notification test
    

    A test notification message is sent. See start notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

    You can also create a notification that is triggered by a script. See To Create a Notification That is Triggered by a Script for details.


Example 6–21 Creating an Email Notification for Server Groups Being Created

This example shows how to create an event notification to be sent by email if a server group is created. Note that an SMTP email server must first be configured using the n1smconfig utility as described in Configuring the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

The event notification is called notif2. The recipient's email address is nobody@sun.com.


N1-ok> create notification notif2 destination nobody@sun.com
topic Lifecycle.Logical.CreateGroup type email

The show notification command can be used to verify that the event notification has been created.


N1-ok> show notification
Name    Event Topic                         Destination       State
notif2  Lifecycle.Logical.CreateGroup       nobody@sun.com    enabled 

The event can be triggered by creating a test group.


N1-ok> create group test

An email should be sent if the notification was created successfully. Otherwise, the following error message is displayed:


Notification test failed.

If this message appears, verify that the SMTP server is configured correctly and is reachable, and that the email address used in the notification rule is valid.



Example 6–22 Creating an SNMP Notification for Hardware Health Thresholds Being Exceeded

This example shows how to create an event notification to be sent by SNMP if a hardware health threshold is exceeded. The event notification is called notif3. The recipient SNMP address is sun.com.


N1-ok> create notification notif3 destination sun.com
topic EReport.Physical.ThresholdExceeded type snmp

The topic, which is the type of event that triggers the notification, is EReport.Physical.ThresholdExceeded.

The show notification command can be used to verify that the event notification has been created.


N1-ok> show notification
Name    Event Topic                         Destination  State
notif3  EReport.Physical.ThresholdExceeded  sun.com      enabled

You can specify the event notification that you want to see by using the show notification command with the name of the notification.


N1-ok> show notification notif3
Name    Event Topic                         Destination  State
notif3  EReport.Physical.ThresholdExceeded  sun.com      enabled


Example 6–23 Creating an Email Notification for Hardware State Changes

This example shows how to create an event notification to be sent by email if a server's hardware state changes. Hardware state changes include power state changes, such as a power supply failure. Note that an SMTP email server must first be configured using the n1smconfig utility as described in Configuring the N1 System Manager in Sun N1 System Manager 1.3 Installation and Configuration Guide.

The event notification is called notif44. The recipient's email address is nobody@sun.com.


N1-ok> create notification notif44 destination nobody@sun.com
topic EReport.Physical.ThresholdExceeded type email

The show notification command can be used to verify that the event notification has been created.


N1-ok> show notification
Name    Event Topic                         Destination       State
notif44 EReport.Physical.ThresholdExceeded  nobody@sun.com    enabled 

Verify that the SMTP server is configured correctly and is reachable, and that the email address used in the notification rule is valid.


Procedure To Create a Notification That is Triggered by a Script

You can create a notification rule for an event that triggers the execution of a Bourne shell script on the management server. The Bourne shell script must be executable by the root user.

The script should be written to direct its output (stdout/stderr) to a log file.
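A minimal sketch of such a script follows. The log path /var/tmp/n1sm-notify.log and the N1SM_NOTIFY_LOG override variable are hypothetical names chosen for illustration and are not defined by the N1 System Manager. Because event fields are passed to the script as environment variables, the sketch records the whole environment rather than assuming any particular variable names.

```shell
#!/bin/sh
# Hypothetical notification handler (illustrative sketch only).
# LOG_FILE and N1SM_NOTIFY_LOG are assumed names, not defined by the product.
LOG_FILE="${N1SM_NOTIFY_LOG:-/var/tmp/n1sm-notify.log}"

# Append everything the handler prints (stdout and stderr) to the log file.
{
    echo "=== notification received: `date` ==="
    echo "running as: `id`"
    # Event fields arrive as environment variables; log all of them.
    env | sort
    echo "=== end of notification ==="
} >>"$LOG_FILE" 2>&1
```

If the fully qualified path of a script like this is given as the destination of a notification of type script, each triggering event should append one block to the log, including the user that the script ran as.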

The fields of the event are passed into the script as environment variables:

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> create notification notification destination destination topic topic
    type script
    

    The event notification is created and enabled. The destination must be a fully qualified path to a custom Bourne shell script used to manage the notification. The script must be executable by the root user. See create notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details and valid topics.

    If the script is executed as a result of an event triggered internally by the N1 System Manager, the script is executed as root.

    If the script is executed as a result of an event triggered by a user, the script is executed as the user who triggered the event.

  3. Type the following command:


    N1-ok> start notification notification test
    

    A test notification message is sent. See start notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.
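Before creating the rule in step 2, the destination requirements can be checked from a shell on the management server. The following sketch uses a hypothetical helper, check_handler, which is not part of the N1 System Manager. It tests only that a path is fully qualified and points to an executable file. Run it as root, because the executable test applies to the user running the check.

```shell
#!/bin/sh
# check_handler is a hypothetical helper, not an N1 System Manager command.
# It returns 0 only if its argument is a fully qualified path to an
# executable regular file, and prints the reason otherwise.
check_handler() {
    case "$1" in
        /*) ;;    # fully qualified path: continue with the file checks
        *)  echo "not a fully qualified path: $1"; return 1 ;;
    esac
    if [ ! -f "$1" ]; then
        echo "no such file: $1"; return 1
    fi
    if [ ! -x "$1" ]; then
        echo "not executable by the current user: $1"; return 1
    fi
    echo "ok: $1"
    return 0
}

# Example call; /bin/sh is used only because it exists on most systems.
check_handler /bin/sh
```

In practice the argument would be the path that you intend to pass as the destination of the notification.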

Procedure To Delete an Event Notification

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> delete notification notification
    

    The event notification is deleted.

Starting and Stopping Event Notifications

Event notifications are enabled, or started, by default when they are created. Use the start notification command to enable an event notification that has been disabled. Type help start notification at the N1-ok command line for syntax and parameter details.

Procedure To Start an Event Notification

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> start notification notification
    

    The event notification is enabled. See start notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.

Procedure To Stop an Event Notification

Steps
  1. Log in to the N1 System Manager.

    See To Access the N1 System Manager Command Line for details.

  2. Type the following command:


    N1-ok> stop notification notification
    

    The event notification is disabled. See stop notification in Sun N1 System Manager 1.3 Command Line Reference Manual for details.