Sun Management Center 3.6.1 User's Guide

Advanced System Monitoring Modules

Advanced System Monitoring (ASM) is a licensed value-added software product. You choose to install it when you install the Sun Management Center 3.6.1 software. ASM provides additional modules that support more complete system monitoring capabilities. ASM includes the following modules, which are described in this section:

Directory Size Monitoring Module Version 2.0

This module enables you to isolate and monitor the size of any directory and its subdirectories on a host on which an agent is installed. The subdirectories and links can be viewed recursively using a window that is accessible from the module's pop-up menu.


Note –

To monitor several directories individually, either load multiple instances of the Directory Size Monitoring module or add rows for additional directories in the properties table. See To Monitor Directory Size for more information.


The following table provides a brief description of the properties for Directory Size Monitoring.

Table C–119 Directory Size Monitoring Properties

Property             Description
Instance Name        Single word or alpha-character string that is used internally within the Sun Management Center agent to identify uniquely a particular module or a row within a module
Directory Name       Name of the directory being monitored
Directory            Directory Existing Check
Directory Size (KB)  Current size of the directory in Kbytes
Rate (KB/sec)        Rate at which the directory is changing size in Kbytes per second
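
The relationship between the Directory Size (KB) and Rate (KB/sec) properties can be illustrated with a short sketch. This is not the agent's implementation: the function names are hypothetical, and the sketch assumes that size is the sum of the byte lengths of regular files (the agent might instead count disk blocks) and that the rate is the difference between two samples divided by the sampling interval.

```python
import os


def directory_size_kb(path):
    """Return the total size, in Kbytes, of the regular files under path."""
    total_bytes = 0
    # os.walk descends into subdirectories; it does not follow
    # symbolic links to directories by default.
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if os.path.islink(full_path):
                continue  # assumption: links are listed but not counted
            total_bytes += os.path.getsize(full_path)
    return total_bytes / 1024.0


def rate_kb_per_sec(previous_kb, current_kb, interval_sec):
    """Return the change in size per second between two samples."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return (current_kb - previous_kb) / interval_sec
```

For example, if one sample reports 2 Kbytes and a sample taken 5 seconds later reports 12 Kbytes, the sketch yields a rate of 2 Kbytes per second.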

Fault Manager Module Version 1.0

The Fault Manager module handles hardware and software faults effectively. This module also displays a detailed fault report or a message article for the selected fault.

The Fault Manager module has the following managed objects:

The following table provides a brief description of the properties for the fault manager.

Table C–120 Fault Manager Properties

Property                 Description
Fault Management Daemon  Shows the details of the Fault Management Daemon.
FMD Configuration        Shows the details of the modules loaded.
FMD Fault Events         Shows the latest 20 faults with their message IDs. An alarm will be generated for every new fault.

The following table provides a brief description of the properties for Fault Management Daemon.

Table C–121 Fault Management Daemon Properties

Property  Description
Property  Properties of the Fault Management Daemon. Properties are FMD program path, FMD program version, and FMD process ID.
Value     Values of the properties of the Fault Management Daemon.

The following table provides a brief description of the properties for FMD configuration.

Table C–122 FMD Configuration Properties

Property     Description
Module Name  Name of the FMD module. Examples of FMD modules are cpumem-diagnosis, cpumem-retire, and fmd-self-diagnosis.
Version      Version of the module.
Status       Status of the module. The status can be active or failed. An alarm is generated for this property when the value changes from active to failed.
Description  Description of the module.

The following table provides a brief description of the properties for FMD fault events.

Table C–123 FMD Fault Events Properties

Property     Description
Time         Time at which the diagnosis of the fault happened
UUID         Unique ID for the fault event
SUNW-MSG-ID  Message identifier that is used to access a corresponding knowledge article located at http://www.sun.com/msg/

To View a Fault Report

  1. Navigate through the topology or hierarchy view until you have accessed the FMD Fault Events table of the Fault Manager module.

  2. Select a fault for which you want to view a fault report.

  3. Press mouse button 3 and choose Show Fault Report from the pop-up menu.

    The Probe Viewer shows the detailed fault report for the selected fault.

To View a Message Article

The message article contains information such as the fault type, severity, description, impact, and suggested action. This article helps the user to take appropriate action for a specific fault.

  1. Navigate through the topology or hierarchy view until you have accessed the FMD Fault Events table of the Fault Manager module.

  2. Select a fault for which you want to view a message article.

  3. Press mouse button 3 and, from the pop-up menu, choose Show Message Article at http://www.sun.com/msg.

    A browser opens with a message article at the following site:

    http://www.sun.com/msg/<SUNW-MSG-ID>

    where <SUNW-MSG-ID> is the message identifier, which is shown in the last column of the FMD Fault Events table.


    Note –

    The message article does not open in the browser if the Java Console is not installed on your system.
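
The URL that the browser opens can be sketched as follows. This is an illustration only: the function name is hypothetical, and the message ID used in the usage note is a sample value rather than one taken from this guide. The base URL is the one given in the procedure above.

```python
from urllib.parse import quote

# Base URL of the message articles, as given in this guide.
MSG_BASE_URL = "http://www.sun.com/msg/"


def message_article_url(sunw_msg_id):
    """Return the article URL for a value from the SUNW-MSG-ID column."""
    msg_id = sunw_msg_id.strip()
    if not msg_id:
        raise ValueError("SUNW-MSG-ID must not be empty")
    # Percent-encode anything unexpected so the result is a valid URL path.
    return MSG_BASE_URL + quote(msg_id, safe="")
```

With a sample identifier such as SUN4U-8000-6H, the function returns http://www.sun.com/msg/SUN4U-8000-6H.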


File Scanning Module Version 2.0

The File Scanning module scans files on a host for user-specified patterns. Multiple instances of the File Scanning module can be loaded to scan multiple files. This module requires you to add rows for the data property tables. For more information, see To Add a Row to a Data Property Table.

The File Scanning module has the following managed objects:

The following table provides a brief description of the properties for file scanning.

Table C–124 File Scanning Properties

Property    Description
File ID     Name of the pattern used in the file scan
File Stats  State of the pattern listed
Scan Table  Name of the pattern used in the file scan

The following table provides a brief description of the properties for file ID.

Table C–125 File ID Properties

Property    Description
Filename    Full path name of the file to be scanned
Scan Mode   Mode in which the file is being scanned
Start Time  Time the file scan was first started

The File Statistics table displays summary information on the file that is to be scanned. The following table provides a brief description of the properties for file statistics.

Table C–126 File Statistics Properties

Property           Description
Modification Time  Date and time when the file was last modified
File Size          Size of the file in bytes
Number of Lines    Number of lines that are in the file
Lines Per Second   Rate at which the file is changing in lines per second

The following table provides a brief description of the properties for scan.

Table C–127 Scan Table Properties

Property             Description
Row Status           Status of the row
Pattern Name         Name of the pattern that was used in the file scan
Pattern Description  Name of the pattern entry to be displayed in the name field of the Scan Results section. To scan the fault messages in the syslog file, prefix the description with FMA:
Regexp Pattern       Regular expression pattern to be used when scanning the file for entries. To scan the fault messages in the syslog file, specify the pattern in this format: <token>:<value>, where <token> is the fault parameter and <value> is the value of the fault parameter.
Pattern State        State of the pattern listed (on/off). The off state indicates that the listed pattern is not used in the file scan
Matches              Number of lines that contain the pattern
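
The way the Regexp Pattern, Pattern State, and Matches properties relate can be sketched in a few lines. This is a hedged illustration, not the module's code: the class and function names are hypothetical, and Python's regular expression dialect is assumed, which might differ from the dialect that the agent accepts.

```python
import re
from dataclasses import dataclass


@dataclass
class ScanRow:
    """One row of a scan table (hypothetical structure)."""
    pattern_name: str
    regexp: str
    state: str = "on"   # "on" or "off"
    matches: int = 0    # number of lines that contain the pattern


def scan_lines(lines, rows):
    """Count, for each enabled row, the lines that contain its pattern."""
    # Rows whose state is "off" are skipped, so their count stays unchanged.
    active = [(row, re.compile(row.regexp)) for row in rows if row.state == "on"]
    for line in lines:
        for row, compiled in active:
            if compiled.search(line):
                row.matches += 1  # a line counts once, however often it matches
    return rows
```

A row whose Pattern State is off keeps a Matches value of 0 in this sketch, which mirrors the statement that such a pattern is not used in the file scan.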

Hardware Diagnostic Suite Version 2.0

The Hardware Diagnostic Suite tests the system for hardware faults. When the module is loaded and the Hardware Diagnostic Suite software is installed, the Applications tab on the Details window allows you to initiate the tests. For more information about the Hardware Diagnostic Suite, see Sun Management Center Hardware Diagnostic Suite 2.0 User's Guide.

Health Monitor Module Version 2.0

The Health Monitor module monitors the health of your host. When alarm conditions occur, this module offers suggestions, if necessary, on how to improve the performance of the system.

For example, this module monitors the swap space that is available, reserved, allocated, and used. Sample alarm messages, from lowest to highest severity, include:

This section describes properties of the following Health Monitor module managed objects:

The Health Monitor module tracks the system properties for the above as described in the following table.

Table C–128 Health Monitor Properties

Property           Description
Swap               Details the swap space
Kernel Contention  Monitors the kernel contention (mutex) properties
NFS                Provides NFS client information
CPU                Provides information on the power of the CPU
Disk               Presents the disk I/O information
RAM                Random access memory (RAM) information
Kernel Memory      Information on kernel memory
Directory Cache    Cache of the directory

Swap Table

The following table provides a brief description of the properties for Swap.

Table C–129 Swap Properties

Property           Description
Swap Available KB  Swap space value available
Swap Reserved KB   Swap space value reserved
Swap Allocated KB  Swap space value allocated
Swap Used KB       Swap space value used
Swap Rule          Rule for swap
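
This guide does not state how the four swap values relate to one another or how the Swap Rule evaluates them. The following sketch assumes the convention of the Solaris swap -s summary, in which used space is the sum of allocated and reserved space, and total space is used plus available. The function name and the percentage calculation are illustrative assumptions, not the rule itself.

```python
def swap_summary(allocated_kb, reserved_kb, available_kb):
    """Derive used, total, and percent-used swap from three sampled values.

    Assumption: used = allocated + reserved, total = used + available.
    """
    used_kb = allocated_kb + reserved_kb
    total_kb = used_kb + available_kb
    percent_used = 100.0 * used_kb / total_kb if total_kb else 0.0
    return used_kb, total_kb, percent_used
```

For example, 300 Kbytes allocated, 100 Kbytes reserved, and 600 Kbytes available give 400 Kbytes used out of 1000 Kbytes, or 40 percent.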

Kernel Contention Table

The following table provides a brief description of the properties for Kernel Contention (mutex).

Table C–130 Kernel Contention Properties

Property               Description
Spins On Mutexes       Spins on mutexes (lock not acquired on first try) - Sum for all CPUs
Number Of CPUs         Number of CPUs
Spins On Mutexes Rule  Spins on mutexes (lock not acquired on first try) - Sum for all CPUs

NFS Table

The following table provides a brief description of the properties for NFS client information.

Table C–131 NFS Client Information Properties

Property      Description
Calls         Total number of RPC calls received
Badcalls      Total number of calls rejected by the RPC layer
Retrans       Call retransmitted due to a timeout
Badxids       Reply from server not corresponding to any outside call
Timeouts      Call timed out while waiting for a reply from server
Newcreds      Number of times authentication information was refreshed
Badverfs      Calls failed due to a bad verifier in response
Timers        Number of times that calculated timeouts exceed the minimum specified timeout value for a call.
Nomem         Failure to allocate memory
Can't Send    Failure to send NFS/RPC rule
NFS/RPC Rule  Value of the NFS/RPC rule

CPU Table

The following table provides a brief description of the properties for the central processing unit (CPU).

Table C–132 CPU Properties

Property                Description
Processes In Run Queue  Number of processes in run queue
Processes Waiting       Number of processes blocked for resources
Processes Swapped       Number of processes that are runnable but swapped out
CPU Power Rule          CPU power rule

Disk Table

The following table provides a brief description of the properties for disk.

Table C–133 Disk Properties

Property           Description
Disk Name          Name of the disk
Disk Alias         Name of the disk, such as c0t0d0
Percent Disk Wait  Average number of transactions waiting for service
Percent Disk Busy  Percent of time disk is busy
Service Time (ms)  Average service time in milliseconds
Disk Rule          Disk rule

RAM Table

The following table provides a brief description of the properties for random access memory (RAM).

Table C–134 RAM Properties

Property          Description
Handspread        Value of the handspread kernel parameter, in pages
Scan rate         Page scan rate
Real Memory rule  Real memory rule

Kernel Memory Table

The following table provides a brief description of the properties for Kernel Memory.

Table C–135 Kernel Memory Properties

Property                       Description
Total Kernel Allocation Fails  Value of kernel allocation failure
Physical Memory Free           Value of free physical memory
Kernel Memory Rule             Value of kernel memory rule

Directory Cache Table

The following table provides a brief description of the properties for Directory Cache.

Table C–136 Name Cache Statistics Properties

Property      Description
Cache Hits    Number of times a previously accessed page is found
Cache Misses  Number of times a previously accessed page is missed
DNLC Rule     Directory name lookup cache rule
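
A common way to interpret the Cache Hits and Cache Misses counters together is as a hit percentage. The sketch below shows that calculation. The function name is hypothetical, and this guide does not say whether the DNLC Rule uses this ratio, so treat it as an assumption.

```python
def cache_hit_percent(hits, misses):
    """Return the percentage of cache lookups that were hits."""
    lookups = hits + misses
    if lookups == 0:
        return 0.0  # no lookups yet, so no meaningful ratio
    return 100.0 * hits / lookups
```

For example, 90 hits and 10 misses correspond to a hit percentage of 90.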

Kernel Reader Module Version 2.0

The Kernel Reader module monitors kernel statistics and all kernel information including CPU statistics, system load statistics, disk statistics, file system usage, and so on. This section includes properties and their descriptions for all Kernel Reader managed objects:

Process Monitoring Module Version 2.0

The following section describes the Process Monitoring module parameters and their property descriptions. This module requires you to add rows for the data property tables. For more information, see To Add a Row to a Data Property Table.

When a matching process is found, the %CPU and a count of the number of matching processes are displayed. If you want to change the module parameters, you can edit all the parameters except for the entry name by accessing the pop-up menu. See To Access a Pop-Up Menu for more information.

Process Statistics Table

The following table provides a brief description of the properties for Process Statistics.


Note –

When you add a row to the process statistics table, you must provide the information in the first five rows in the following table. See To Add a Row to a Data Property Table for more information.


Table C–137 Process Statistics Properties

Property            Description
Entry Name          Name of the process statistics table entry (must be a unique name).
Name Pattern        Pattern to match the name of the binary for the process that you want to monitor.
Argv Pattern        Pattern to match the arguments of the command that executes the process.
User Specification  User name that is executing the process.
Entry Description   Description of the entry (required field).
Process Command     Command used to initiate the process, if applicable.
Process Count       Number of processes currently running that match the patterns.
% System CPU Usage  Percentage of CPU used by the system processes. This value is a time-weighted average that is taken at different time intervals. Do not confuse this percentage with the value that might result after you enter the UNIX ps command.
% User CPU Usage    Percentage of CPU used by the user processes.
Virtual Size        Total size of the processes in Kbytes.
Resident Set Size   Resident size of the processes in Kbytes.
Monitoring State    Toggle between on (row is enabled) and off (row is disabled). When the row is disabled, all the entries are displayed as 0 (zero).
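
How the Name Pattern, Argv Pattern, and User Specification fields could combine to produce the Process Count value is sketched below. This is an assumption-laden illustration rather than the module's logic: the Proc class and count_matching function are hypothetical, Python regular expressions stand in for whatever pattern syntax the agent uses, and an empty argv pattern or user is assumed to mean "match any".

```python
import re
from dataclasses import dataclass


@dataclass
class Proc:
    """Minimal description of a running process (hypothetical structure)."""
    name: str   # name of the binary
    argv: str   # command-line arguments joined into one string
    user: str   # user name that is executing the process


def count_matching(procs, name_pattern, argv_pattern="", user=""):
    """Count the processes that satisfy every supplied criterion."""
    name_rx = re.compile(name_pattern)
    argv_rx = re.compile(argv_pattern) if argv_pattern else None
    count = 0
    for proc in procs:
        if not name_rx.search(proc.name):
            continue
        if argv_rx is not None and not argv_rx.search(proc.argv):
            continue
        if user and proc.user != user:
            continue
        count += 1
    return count
```

In this sketch, all criteria must hold for a process to be counted, so adding an argv pattern or a user name can only narrow the result.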

Microstate Information Table

The following table provides a brief description of the properties for Microstate Information.

Table C–138 Microstate Information Properties

Property                      Description
Entry Name                    Name of the entry (must be a unique name).
CPU wait time                 Percentage of time for CPU wait.
Text page fault time          Percentage of time for text page faults.
Data page fault time          Percentage of time for data page faults.
Major page faults             Number of major page faults per second (text and data faults).
Characters in I/O             Number of characters read and written per second.
Involuntary context switches  Number of involuntary context switches per second.
CPU time for reaped children  Percentage of CPU time used by child processes that have detached from their parent processes.
User lock time                Percentage of time spent for user locks.
System trap time              Percentage of time spent for system traps.
Total swaps                   Percentage of time spent on swaps.
Entry Description             Description of the entry (required field).
Executable code Rule          Rule that applies to executable code.
File access rule              Rule that applies to file access.


Note –

You might see extremely high percentages on a per-CPU basis if the following facts are true: