Oracle Solaris Cluster Concepts Guide, Oracle Solaris Cluster 4.1
System resources include CPU usage, memory usage, swap usage, and disk and network throughput. Oracle Solaris Cluster enables you to monitor how much of a specific system resource is being used by an object type. Object types include hosts, nodes, zones, disks, network interfaces, and resource groups. Oracle Solaris Cluster also enables you to control the CPU that is available to a resource group.
Monitoring and controlling system resource usage can be part of your resource management policy. The cost and complexity of managing numerous machines encourage the consolidation of several applications on larger hosts. Instead of running each workload on a separate system, with full access to that system's resources, you use resource management to segregate workloads within the system.
Resource management helps ensure that your applications meet their required response times. Resource management can also increase resource utilization. By categorizing and prioritizing usage, you can make effective use of reserve capacity during off-peak periods, often eliminating the need for additional processing power. You can also ensure that resources are not wasted because of load variability.
To use the data that Oracle Solaris Cluster collects about system resource usage, you must do the following:
Analyze the data to determine what it means for your system.
Make a decision about the action that is required to optimize your usage of hardware and software resources.
Take action to implement your decision.
By default, system resource monitoring and control are not configured when you install Oracle Solaris Cluster. For information about configuring these services, see Chapter 10, Configuring Control of CPU Usage, in Oracle Solaris Cluster System Administration Guide.
By monitoring system resource usage, you can do the following:
Collect data that reflects how a service that is using specific system resources is performing.
Discover resource bottlenecks or overloads, and so preempt problems.
Manage workloads more efficiently.
Data about system resource usage can help you determine which hardware resources are underused and which applications use many resources. Based on this data, you can assign applications to nodes that have the necessary resources and choose the node to which to fail over. This consolidation can help you optimize the way that you use your hardware and software resources.
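The following Python sketch illustrates one way that usage data might inform the choice of a failover node. It is not part of Oracle Solaris Cluster: the `NodeUsage` record, the `pick_failover_target` function, and the sample figures are all hypothetical, and the selection rule (most free CPU among nodes that meet both requirements) is only an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass
class NodeUsage:
    """Hypothetical snapshot of free resources on one cluster node."""
    name: str
    cpu_free_pct: float
    mem_free_mb: int

def pick_failover_target(nodes, current, cpu_needed_pct, mem_needed_mb):
    """Return the name of a node that can host the application, or None.

    Assumed rule: exclude the current node, keep nodes that satisfy both
    requirements, then prefer the one with the most free CPU.
    """
    candidates = [
        n for n in nodes
        if n.name != current
        and n.cpu_free_pct >= cpu_needed_pct
        and n.mem_free_mb >= mem_needed_mb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n.cpu_free_pct, n.mem_free_mb)).name

# Illustrative data only.
usage = [
    NodeUsage("node1", cpu_free_pct=10.0, mem_free_mb=4096),
    NodeUsage("node2", cpu_free_pct=55.0, mem_free_mb=8192),
    NodeUsage("node3", cpu_free_pct=70.0, mem_free_mb=1024),
]
target = pick_failover_target(usage, current="node1",
                              cpu_needed_pct=30.0, mem_needed_mb=2048)
```

In this sample, `node3` has the most free CPU but too little free memory, so it is excluded before the comparison is made.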
Monitoring all system resources at the same time might be costly in terms of CPU usage. Choose the system resources that you want to monitor by prioritizing the resources that are most critical for your system.
When you enable monitoring, you choose the telemetry attribute that you want to monitor. A telemetry attribute is an aspect of system resources. Examples of telemetry attributes include the amount of free CPU or the percentage of blocks that are used on a device. If you monitor a telemetry attribute on an object type, Oracle Solaris Cluster monitors this telemetry attribute on all objects of that type in the cluster. Oracle Solaris Cluster retains seven days of history of the system resource data that it collects.
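The two rules in this paragraph, type-wide monitoring and a seven-day history, can be modeled with a short Python sketch. The inventory, the attribute name `blocks_used_pct`, and both functions are hypothetical and do not reflect how Oracle Solaris Cluster stores or names its data. How samples that fall exactly on the seven-day boundary are treated is also an assumption.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # history window stated in the text

# Hypothetical cluster inventory: object type -> objects of that type.
INVENTORY = {
    "node": ["node1", "node2"],
    "disk": ["d1", "d2", "d3"],
}

def monitored_objects(object_type, attribute, inventory=INVENTORY):
    """Enabling an attribute on a type covers every object of that type."""
    return [(obj, attribute) for obj in inventory.get(object_type, [])]

def prune(samples, now):
    """Keep only (timestamp, value) samples inside the retention window."""
    cutoff = now - RETENTION
    return [(ts, value) for ts, value in samples if ts >= cutoff]

# Monitoring one attribute on the "disk" type covers all three disks.
pairs = monitored_objects("disk", "blocks_used_pct")

# Ten daily samples; only those from the last seven days are retained.
start = datetime(2024, 1, 1)
samples = [(start + timedelta(days=d), 50.0 + d) for d in range(10)]
recent = prune(samples, now=start + timedelta(days=9))
```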
If you consider a particular data value to be critical for a system resource, you can set a threshold for this value. When setting a threshold, you also choose how critical this threshold is by assigning it a severity level. If the threshold is crossed, Oracle Solaris Cluster changes the severity level of the threshold to the severity level that you choose.
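As a minimal sketch of this behavior, the Python model below attaches a limit and a chosen severity level to a telemetry attribute and reports that severity once the limit is crossed. The `Threshold` class, the `status` function, the attribute names, the severity strings, and the notion of a rising or falling direction are assumptions made for illustration, not the product's actual interface.

```python
from dataclasses import dataclass

@dataclass
class Threshold:
    """Hypothetical threshold on one telemetry attribute."""
    attribute: str
    limit: float
    severity: str              # level you assign, for example "warning"
    direction: str = "rising"  # "rising": crossed when value >= limit

    def crossed(self, value):
        if self.direction == "rising":
            return value >= self.limit
        return value <= self.limit  # "falling": crossed when value <= limit

def status(threshold, value, normal="ok"):
    """Report the chosen severity if the threshold is crossed."""
    return threshold.severity if threshold.crossed(value) else normal

# Illustrative thresholds only.
low_cpu = Threshold("cpu_free_pct", limit=10.0, severity="warning",
                    direction="falling")
full_disk = Threshold("blocks_used_pct", limit=90.0, severity="fatal")

cpu_status = status(low_cpu, 8.0)      # free CPU fell below the limit
disk_status = status(full_disk, 72.5)  # usage is still under the limit
```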
If you want to apply CPU shares, you must specify the Fair Share Scheduler (FSS) as the default scheduler in the cluster. A CPU share is the portion of the system's CPU resources that is allocated to a project. Shares define the relative importance of workloads in relation to other workloads. When you assign CPU shares to a project, your primary concern is not the number of shares that the project has. Rather, you should know how many shares the project has in comparison with the other projects that compete with it for CPU resources.
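A short calculation shows why the absolute number of shares matters less than the competition. The Python sketch below assumes the usual fair-share model, in which a project's entitlement under contention is its shares divided by the total shares of all projects that are actively competing. The project names and share counts are invented for illustration.

```python
def cpu_fractions(shares):
    """Map each competing project to its fraction of CPU under contention.

    Assumed model: fraction = project shares / sum of all competing shares.
    Projects with zero shares are treated as not competing.
    """
    active = {project: s for project, s in shares.items() if s > 0}
    total = sum(active.values())
    return {project: s / total for project, s in active.items()}

# The "db" project holds 20 shares in both cases.
light_competition = cpu_fractions({"db": 20, "web": 10, "batch": 10})
heavy_competition = cpu_fractions({"db": 20, "web": 60, "batch": 20})

db_light = light_competition["db"]  # 20 of 40 shares
db_heavy = heavy_competition["db"]  # 20 of 100 shares
```

With the same 20 shares, the `db` project is entitled to half of the CPU in the first case and to one fifth in the second, because the other projects hold more shares.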
By viewing the output of system resource usage and CPU control, you can do the following:
Anticipate failures due to the exhaustion of system resources.
Detect unbalanced usage of system resources.
Validate server consolidation.
Obtain information that enables you to improve the performance of applications.