40 Monitoring Hosts

As a host administrator, you must have a grasp of how your host is functioning. Host monitoring can enable you to answer such questions as:

  • Is host swapping occurring?

  • Is the filesystem becoming full?

  • Is the CPU reaching maximum capacity?

  • Are the resources being used efficiently?

  • What is the best way to monitor multiple hosts?

  • How can I proactively schedule and purchase needed resources?

The answers to these questions are key for day-to-day monitoring activities and are available on the host monitoring pages explained in this chapter.
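
For a quick local spot check outside Enterprise Manager, the file system and CPU questions above can be approximated with the Python standard library. The following is a minimal sketch and not an Enterprise Manager API; the function names and the 90 percent default threshold are assumptions chosen for illustration.

```python
import os
import shutil


def filesystem_percent_used(path="/"):
    """Return the percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def is_filesystem_filling(path="/", threshold_percent=90.0):
    """True when usage is at or above the (assumed) warning threshold."""
    return filesystem_percent_used(path) >= threshold_percent


def load_per_cpu():
    """Return the 1-minute load average divided by the CPU count.

    Returns None on platforms where os.getloadavg() is unavailable
    (for example, Windows).
    """
    try:
        one_minute, _, _ = os.getloadavg()
    except (AttributeError, OSError):
        return None
    return one_minute / (os.cpu_count() or 1)
```

A sustained load_per_cpu() value near or above 1.0 suggests the CPUs are close to saturation, but a single sample is no substitute for the trend views described in this chapter.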

Note: This chapter explains many of the metrics available in Enterprise Manager; however, it is not an exhaustive list. See the Oracle® Enterprise Manager Framework, Host, and Services Metric Reference Manual for a full description of all the host metrics available.

40.1 Overall Monitoring

When monitoring your host, the prime metrics to monitor are CPU, memory, and disk usage.

Note: To access the features explained in this section, from the Host menu on a host's home page, select Monitoring, and select the feature of interest.

40.1.1 CPU Details

Using the CPU statistics, you can determine whether CPU resources need to be added or redistributed. In particular, you can:

  • Determine the commands that are taking the most CPU resources and perform the appropriate action on the target host to reduce contention by using an administrative tool of your choice.

  • View trends in CPU usage over various time periods, including the last 24 hours, last week, and last month.

  • Monitor each CPU individually, rather than only an aggregate view of all the CPUs in the system.

Note: You can use the Execute Host Command feature in Enterprise Manager to perform actions on the host.

40.1.2 Memory Details

Using the Memory statistics, you can determine whether memory resources need to be added or redistributed. In particular, you can determine the processes that are using the most memory resources.

40.1.3 Disk Details

Using the Disk statistics, you can determine whether disk resources need to be added or whether you can distribute the load more effectively across existing resources. In particular, you can determine the disks that are overutilized or experiencing longer service times.

Correlating the disk information with the response from applications that use the underlying storage allows you to determine whether the system is properly scaled. You can then answer the questions: Should the load on the disks be redistributed? Should additional storage be added?

To redistribute the load, modify the applications that use the storage.

40.1.4 Program Resource Utilization

Using the Program Resource Utilization data, you can see the trends in resource usage for:

  • Specific program or set of programs

  • Specific user or set of users

  • Combination of programs and users

40.1.5 Log File Alerts

Enterprise Manager monitors log files and provides alerts. Once alerts are generated, you can:

  • Clear open alerts selectively or clear every open alert.

  • Purge open alerts selectively or purge every open alert.

Note: Clearing an alert results in the particular alert being marked as cleared but the alert is not deleted from the Management Repository. However, purging an alert permanently deletes the alert from the system.
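
The difference between clearing and purging can be illustrated with a small in-memory model. This is a sketch only: the Alert and AlertStore classes are hypothetical stand-ins for the alert records kept in the Management Repository, not Enterprise Manager interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    alert_id: int
    message: str
    cleared: bool = False


@dataclass
class AlertStore:
    """Hypothetical stand-in for alerts kept in the Management Repository."""
    alerts: dict = field(default_factory=dict)

    def add(self, alert):
        self.alerts[alert.alert_id] = alert

    def clear(self, alert_id):
        # Clearing only marks the alert as cleared; the record is retained.
        self.alerts[alert_id].cleared = True

    def purge(self, alert_id):
        # Purging removes the record permanently.
        del self.alerts[alert_id]

    def open_alerts(self):
        return [a for a in self.alerts.values() if not a.cleared]
```

After clear(), the alert no longer appears in open_alerts() but can still be looked up in the store; after purge(), it cannot be recovered.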

40.1.6 Metric Collection Errors

Metric Collection Errors provide details about the errors encountered while obtaining target metrics. These details give you an idea of the metrics that may not represent the performance of the target accurately, as errors were encountered while collecting the metrics.

40.2 Storage Details

Tracking storage resource allocation and usage is essential to large Information Technology departments. Unallocated and underutilized storage can be put to better use. Historical trends at a business entity level enable you to plan for future growth.

Storage Details are relevant to Enterprise Manager targets that are associated with one or more hosts. In particular:

  • Summary attributes presented are rolled up for one or multiple associated hosts.

  • A host is associated with a group either through:

    • Explicit membership, or

    • Implicit 'hosted by' association, which is inherited through a group member target

Note: The shared storage is accurately counted once when the storage is accessible from multiple systems or accessible through multiple physical paths on the same system. Globally unique identifiers have been instrumented for accurate counting of shared storage.

Refer to the online help to learn how the individual storage statistics are calculated.
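
The idea of counting shared storage once can be sketched as deduplication by globally unique identifier. The record layout (host, path, GUID, size) and the function name below are assumptions made for illustration; they do not describe how Enterprise Manager stores this data internally.

```python
def total_shared_storage(disk_records):
    """Sum disk sizes, counting each globally unique identifier once.

    `disk_records` is an iterable of (host, path, guid, size_bytes) tuples.
    This layout is assumed for the example.
    """
    size_by_guid = {}
    for _host, _path, guid, size_bytes in disk_records:
        # The same GUID seen from another host or through another path
        # is the same physical storage, so it is recorded only once.
        size_by_guid.setdefault(guid, size_bytes)
    return sum(size_by_guid.values())
```

For example, a 100-unit disk that is visible through two paths on one host and one path on a second host contributes 100 units to the total, not 300.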

40.2.1 Storage Utilization

Storage utilization is provided at the host level when launched in the context of a host target, and at the associated hosts level when launched in the context of a group.

In the context of a host, the storage items are: Disks, Volumes, ASM (Automatic Storage Management), File Systems, Databases, and Vendor Distribution.

In the context of a group, the storage properties for the associated hosts are: Provisioning Summary by Host, Consumption Summary by Host, and Vendor Distribution.

The graphs present historical trends over a period of time. Based on this intelligence, you can take appropriate action on the target host or group as necessary. Appropriate actions include:

  • Buying and adding more storage

  • Deleting underutilized application data after archiving

  • Deleting unneeded application data

  • Altering the storage deployment configuration for optimal use

Note: The storage information shown in a group is the aggregate of the individual host information of the associated hosts in the group.

40.2.2 Overall Utilization

Overall Utilization represents summary attributes (unallocated, overhead, used, and free) that provide a system level view of storage resource utilization. The overall statistics enable you to determine:

  • How much storage is unallocated?

  • How much space is still free among deployed applications?

40.2.3 Provisioning Summary

Provisioning Summary represents allocation related summary attributes (allocated, unallocated, and overhead) for File Systems (Writeable NFS part), ASM, Volumes, and Disks for the associated hosts.

Note that Writeable NFS is shown in Provisioning Summary to account for the storage attached to the host over NFS. These layers are managed by IT administrators who are responsible for provisioning space to applications.

Allocation-related attributes do not change frequently; a change typically results from an administrative action taken by an IT administrator. See the Provisioning Summary section in the About Storage Computation Formula help topic for details on how this information is calculated.

The bar chart summarizes the allocated, unallocated, and overhead space for all entities present in the Disk, Volume, Oracle ASM, and Writeable Network File Systems (NFS) portion of the File System layer for the host or the associated hosts of the group.

If a specific layer is not deployed, the corresponding bar is omitted from the chart. The bar chart answers the following questions:

  • How much space is available for allocation from the entities present in the given layer?

  • How much space was allocated from the entities present in the given layer?

  • What is the overhead of deployed Volume Management software?

  • What is the overhead of deployed Oracle ASM software?

    Note: When launched in the context of a group, rollup information shown in the charts excludes NFS mounts that are based on Local File Systems present in the associated hosts.

40.2.4 Consumption Summary

Consumption Summary provides usage-related summary attributes (used and free) for Databases and File Systems (Local File Systems and Writeable NFS parts).

Usage-related attribute values tend to change more frequently than allocation-related attributes. See the Consumption Summary section in the About Storage Computation Formula help topic for details on how this information is calculated.

The bar chart shows used and free space summary information for all Databases, all Local File Systems, and all Writeable Network File Systems (NFS) in the host or the associated hosts of the group.

Note: When launched in the context of a group, rollup information shown in the charts excludes NFS mounts that are based on Local File Systems present in the associated hosts.

40.2.5 ASM

Oracle Automatic Storage Management (ASM) is a simple storage management solution that obviates the need for using volumes layer technologies for Oracle databases.

40.2.6 Databases

Databases refer to Oracle databases (including Real Application Clusters (RAC) databases) on top of which other applications may be running. Databases can consume space from disks, volumes, file systems, and Oracle Automatic Storage Management (ASM) layers.

40.2.7 Disks

Disk statistics provide the allocated and unallocated storage for all the disks and disk partitions on a host. All disks are listed, including virtual disks from external storage systems such as an EMC storage array.

Note: Overhead information for virtual disks is neither instrumented nor presented.

For a disk to be deployed for usage, the disk must first be formatted. After formatting, the disk can be configured (using vendor-specific configuration utilities) to have one or more partitions.

A disk or disk partition can be associated (using vendor-specific configuration utilities) with exactly one entity from one of the upper layers (volumes, Oracle ASM, databases, and file systems) on the host. When an association exists for a disk or disk partition to an upper layer entity, it is reported as allocated space in Enterprise Manager.
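
The allocation rule for disks can be sketched as follows. The input mapping and the function name are hypothetical; the sketch only restates the rule above, in which a disk or partition is reported as allocated when it has an association with an upper-layer entity.

```python
def disk_allocation_summary(partitions):
    """Return (allocated_bytes, unallocated_bytes) for a host's disk layer.

    `partitions` maps a disk or partition name to a tuple of
    (size_bytes, upper_entity), where upper_entity is None when no
    association with an upper layer exists. This layout is assumed.
    """
    allocated = 0
    unallocated = 0
    for size_bytes, upper_entity in partitions.values():
        if upper_entity is not None:
            # Associated with exactly one upper-layer entity: allocated.
            allocated += size_bytes
        else:
            unallocated += size_bytes
    return allocated, unallocated
```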

40.2.8 File Systems

The File Systems layer contains directories (also known as folders) and files that are accessed, managed, and updated through the use of databases, middle tier applications, and end-user tools. File systems can be broadly categorized into local file systems that are disk based and remote file systems like NFS. In Enterprise Manager, summary attributes are provided for local file systems and the Writeable NFS part of the File Systems layer.

Local File Systems

Local File Systems are based on disk storage visible to the host. Various operating systems support different types of local file systems. The following table provides examples:

Local File System   Operating System
lofs                Solaris (Monitored only if NMUPM_SUPPORT_LOFS property is set to 1 for the target instance.)
nfs                 Solaris, Linux
tmpfs               Solaris
ufs                 Solaris, Linux, AIX, HP
vxfs                Solaris, Linux, AIX, HP
zfs                 Solaris, Linux, AIX
ext2                Linux, AIX
ext3                Linux, AIX

NFS

Network File Systems (NFS) are accessible over the network from the host. A remote server (NFS Server) performs the I/O to the actual disks. There are appliances that provide dedicated NFS Server functionality, such as Network Appliance Filer. There are also host systems, for example, Solaris and Linux, that can act as both NFS Server and Client.

Writeable NFS refers to the NFS mounted on a host with write privilege.

Suggestions for Monitoring NFS Mounts

The following are suggestions on monitoring NFS mounts.

  1. Monitor the remote host if NFS exports are coming from another host supported by Enterprise Manager. The Filesystems metric will monitor the local file systems on the remote host.

  2. Monitor the NetApp Filer if NFS exports are coming from a remote NetApp Filer. Volumes and Qtrees metrics will monitor the exports from the remote NetApp Filer.

  3. Use the 'File and Directory Monitoring' metric if neither of the previous choices meets the need. Set the threshold against the 'File or Directory Size' metric to monitor specific remote mounts.

40.2.9 Volumes

Various software packages available in the industry are generically known as either Volume Manager technology or Software RAID (Redundant Arrays of Independent Disks) technology. These technologies are deployed to improve the RAS (Reliability, Availability, and Scalability) characteristics of the underlying storage. For example, Veritas Volume Manager is a popular product used across multiple operating systems. Such technologies are referred to as Volumes in Enterprise Manager.

The Volumes option displays the allocated and unallocated storage space for all the entities present in the Volumes layer, including relevant attributes for the underlying Volumes layer technology.

40.2.9.1 Types of Entities

The Volumes layer can have entities of various types present internally. The entity type shown in Enterprise Manager is based on the terminology defined by the deployed Volumes layer technology. For example, Veritas Volume Manager defines and supports the following entity types: Volume, Plex, Sub Disk, VM Disk, VM Spare Disk, and Diskgroup. Refer to the vendor documentation for more details about the Volumes technology deployed on your system.

40.2.9.2 Top-Level Entities

For each vendor technology, entities of specific types from their layer can be associated with entities from the upper layers. File Systems, Databases, and ASM are examples of upper layers. For example, entities of type 'Volume' in Veritas Volume Manager are such entities. These entities are referred to as top-level Volumes layer entities in this documentation.

Top-level Volumes layer entities provide storage space to the upper layers for usage. If a top-level entity does not have an association to an entity from an upper layer, the top-level entity is unallocated and it is available for further allocation related activity.

40.2.9.3 Bottom-Level Entities

For each vendor technology, entities of specific types from their layer can be associated with entities from the disk layer. For example, VM Disk and VM Spare Disk entities in Veritas Volume Manager are such entities. These entities are considered to be bottom-level Volumes layer entities in this documentation.

Bottom-level Volumes layer entities consume storage space from the disk layer and provide storage space to the rest of the entities in the Volumes layer. Bottom-level entities of 'reserve' or 'spare' type are always allocated and no space is available from them for allocation purposes. Note that spare entities are utilized by the Volumes technology for handling disk failures and they are not allocated to other entities present in the Volumes layer by way of administrator operations.

Non-spare bottom-level entities can have an association to an intermediate or top-level entity configured using respective vendor administration utilities. If no association exists for a non-spare bottom-level entity, then it is unallocated. If one or more associations exist for the non-spare bottom-level entity, then the space consumed through the existing associations is allocated. It is possible that some space could be left in the bottom-level entity even if it has some associations defined for it.

Storage space in non-spare bottom-level entities not associated with intermediate or top-level entities is available for allocation and it is accounted as unallocated space in the bottom-level entity.
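
The allocation rules for bottom-level entities can be summarized in a short sketch. The function name and its parameters are assumptions for illustration; the sizes can be in any consistent unit.

```python
def bottom_level_allocation(size, is_spare, association_sizes=()):
    """Return (allocated, unallocated) for one bottom-level entity.

    size              -- total size of the entity
    is_spare          -- True for entities of 'reserve' or 'spare' type
    association_sizes -- space consumed by each association to an
                         intermediate or top-level entity
    """
    if is_spare:
        # Spare and reserve entities are always fully allocated.
        return size, 0
    allocated = sum(association_sizes)
    if allocated > size:
        raise ValueError("associations consume more space than the entity holds")
    # Space not consumed through an association remains unallocated.
    return allocated, size - allocated
```

With no associations, a non-spare entity is entirely unallocated; with associations that consume only part of it, the remainder is still reported as unallocated.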

40.2.9.4 Intermediate Entities

Entities that are neither top-level nor bottom-level are considered to be intermediate-level entities of the Volumes layer. For example, Volume (layered-volume case), Plex, and Sub Disk entities in Veritas Volume Manager are such entities.

If an intermediate entity has an association to another intermediate or top-level entity, the storage space consumed through the association is allocated. Space present in the intermediate entity that is not consumed through an association is unallocated.

The following vendor products are instrumented:

Platform   Product
Solaris    Solaris Volume Manager
Linux      mdadm, raidtool, Suse LVM

40.2.10 Vendor Distribution

The Vendor Distribution statistic reflects the host-visible storage for associated hosts, that is:

Sum of the size of all disks
+ Sum of the size of all Writeable NFS mounts
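
As a worked instance of this sum, the following sketch adds the two terms for a set of associated hosts. The function name and the plain lists of sizes are assumptions for illustration.

```python
def host_visible_storage(disk_sizes, writeable_nfs_sizes):
    """Host-visible storage: all disk sizes plus all Writeable NFS mount sizes."""
    return sum(disk_sizes) + sum(writeable_nfs_sizes)
```

For example, two 500 GB disks and one 200 GB Writeable NFS mount give 1200 GB of host-visible storage.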

40.2.11 Storage History

Enterprise Manager provides historical trends for its storage statistics. Historical trends can be viewed over the last month, the last three months, or the last year. Using this historical trend, you can predict how much storage your organization may need in the future.

In the case of a group, history is not enabled by default. The user interface allows you to enable or disable the history for each group. Computation of history for a group consumes resources in the Enterprise Manager Repository database. It is not anticipated that a given deployment would find it useful to have the history for all groups, so you choose the groups for which history is worth keeping.
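
One simple way to turn historical samples into a forecast is a least-squares straight-line fit. The sketch below is an assumption about how such a projection might be done by hand; it is not the method Enterprise Manager uses, and the function name and sample layout are hypothetical.

```python
def project_usage(samples, future_day):
    """Project storage usage at `future_day` with a least-squares line.

    `samples` is a list of (day_number, used_gb) pairs covering at least
    two distinct days.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover at least two distinct days")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx                     # growth per day
    intercept = mean_y - slope * mean_x   # fitted usage at day 0
    return intercept + slope * future_day
```

A straight line is only reasonable when growth has been roughly steady; seasonal or step changes in usage need a closer look at the trend graphs.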

40.2.12 Storage Layers

A stack of storage management technologies is deployed on a host. Deployed technology at any layer can provide storage resources to any layer above it and consume the storage resources from any layer below it.

The ultimate consumer of the storage is application level software such as an Oracle database or the end users. In Enterprise Manager, Volumes refers to Volume Management and Software RAID (Redundant Arrays of Independent Disks) technologies offered by various vendors.

In Enterprise Manager, the following storage layers and their associations have been modeled.

Storage Layer   Can Provide Storage To:
Disks           Volumes, File Systems, Database, ASM
Volumes         File Systems, Database, ASM
ASM             Database
File Systems    Database
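
The modeled associations in the table above can be expressed as a simple lookup. The names PROVIDES_TO, can_provide, and providers_of are hypothetical and exist only in this sketch.

```python
# Each storage layer mapped to the layers it can provide storage to,
# copied from the table above.
PROVIDES_TO = {
    "Disks": {"Volumes", "File Systems", "Database", "ASM"},
    "Volumes": {"File Systems", "Database", "ASM"},
    "ASM": {"Database"},
    "File Systems": {"Database"},
}


def can_provide(provider, consumer):
    """True if `provider` can supply storage to `consumer` in this model."""
    return consumer in PROVIDES_TO.get(provider, set())


def providers_of(consumer):
    """Return the layers that can supply storage to `consumer`, sorted by name."""
    return sorted(
        layer for layer, consumers in PROVIDES_TO.items() if consumer in consumers
    )
```

For instance, providers_of("Database") lists every layer in the table, which matches the statement that databases can consume space from disks, volumes, file systems, and ASM.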

40.2.13 Storage Refresh

Storage Refresh is performed in the context of two types of targets: host target and group target.

Storage Refresh in Context of Host Target

Storage Refresh functionality, in the context of a host target, allows you to refresh the storage data in your Enterprise Manager repository by:

  1. Forcing Enterprise Manager to perform a real-time collection of all storage attributes from the host, and

  2. Uploading the storage attributes into the Enterprise Manager repository

Once the refresh operation is complete, the Storage UI pages display the latest information about the host.

Storage Refresh in Context of Group Target

Storage Refresh functionality, in the context of a group target, allows you to refresh the storage data in your Enterprise Manager repository by:

  1. Forcing Enterprise Manager to do a real-time collection of all storage attributes from all the member hosts of the group, and

  2. Uploading the storage attributes into the Enterprise Manager repository

Since this refresh could take some time, depending on the number of hosts involved, the functionality is provided as an Enterprise Manager job submission.

Once the refresh job is complete, the Storage UI pages display the latest information about the group.