14 Advanced Threshold Management
There are monitoring situations in which different workloads for a target occur at regular (expected) intervals. Under these conditions, a static alert threshold would prove to be inaccurate. For example, the accurate alert thresholds for a database performing Online Transaction Processing (OLTP) during the day and batch processing at night would be different. Similarly, database workloads can change based purely on different time periods, such as weekday versus weekend. In both these situations, fixed, static values for thresholds might result in false alert reporting.
Advanced Thresholds allow you to define and manage alert thresholds that are either adaptive (self-adjusting) or time-based (static).
- Adaptive Thresholds are thresholds based on statistical calculations from the target's observed behavior (metrics).
- Time-based Thresholds are user-defined threshold values to be used at different times of the day/week to account for changing target workloads.
This chapter covers the following topics:
Accessing the Advanced Threshold Management Page
You manage advanced thresholds from the Enterprise Manager console. The Advanced Threshold Management page allows you to create time-based static thresholds and adaptive thresholds. To access this page:
Adaptive Thresholds
Adaptive thresholds are statistically computed thresholds that adapt to target workload conditions. Adaptive thresholds apply to all targets (both Agent and repository-monitored).
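To make the idea of a statistically computed threshold concrete, the following is a minimal Python sketch that derives warning and critical values from observed metric samples using a nearest-rank percentile. This is an illustration only: the function name `percentile_threshold`, the sample values, and the choice of percentiles are assumptions, and the sketch does not describe how Enterprise Manager actually computes adaptive thresholds.

```python
import math


def percentile_threshold(values, pct):
    """Return the nearest-rank percentile of the observed values.

    Hypothetical helper for illustration; not an Enterprise Manager API.
    """
    if not values:
        raise ValueError("at least one observation is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest-rank method: smallest value with at least pct% of the data at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up CPU utilization samples (percent) observed for a target.
cpu_samples = [35, 40, 42, 38, 90, 41, 39, 37, 43, 44]

warning = percentile_threshold(cpu_samples, 90)    # assumed warning level
critical = percentile_threshold(cpu_samples, 100)  # assumed critical level (observed maximum)
```

Because the thresholds are recomputed from observed data, they move as the target's behavior changes, which is the property that distinguishes them from fixed values.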
Important Concepts
Creating an adaptive threshold is based on the following key concepts:
- For the purpose of performance evaluation, a baseline period is a period of time used to characterize the typical behavior of the system. You compare system behavior over the baseline period to that observed at some other time.
There are two types of baseline periods:
- Moving window baseline periods: Moving window baselines are defined as some number of days prior to the current date. This "window" of days forms a rolling interval that moves with the current time. In Enterprise Manager, a moving window baseline can be defined using one of the following numbers of days:
  - 7 days
  - 14 days
  - 21 days
  - 30 days
Example: Suppose you specified trailing 7 days as the time period when creating a moving window baseline. In this situation, the most recent 7-day period becomes the baseline period for all metric observations and comparisons today. Tomorrow, this reference period drops the oldest day and picks up today.
Moving window baselines allow you to compare current metric values with recently observed history, thus allowing the baseline to incorporate changes to the system over time. Moving window baselines are suitable for systems with predictable workload cycles.
Note:
Enterprise Manager computes moving window statistics every day rather than sampling.
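The rolling behavior of a moving window baseline can be sketched in Python as follows. The window sizes come from the list above; the function names, the sample data, and the reading that the window ends on the day before the current date (so that "tomorrow" it drops the oldest day and picks up today) are assumptions made for illustration, not Enterprise Manager behavior confirmed by this chapter.

```python
from datetime import date, timedelta

# Window sizes listed in this chapter.
ALLOWED_WINDOW_DAYS = (7, 14, 21, 30)


def moving_window(today, window_days):
    """Return (first_day, last_day) of the trailing window that ends yesterday.

    Hypothetical helper; assumes the current day is not part of its own baseline.
    """
    if window_days not in ALLOWED_WINDOW_DAYS:
        raise ValueError(f"window_days must be one of {ALLOWED_WINDOW_DAYS}")
    first_day = today - timedelta(days=window_days)
    last_day = today - timedelta(days=1)
    return first_day, last_day


def baseline_observations(observations, today, window_days):
    """Keep only the (day, value) pairs that fall inside the baseline window."""
    first_day, last_day = moving_window(today, window_days)
    return [value for day, value in observations if first_day <= day <= last_day]


today = date(2024, 3, 15)
window_today = moving_window(today, 7)
window_tomorrow = moving_window(today + timedelta(days=1), 7)

# Made-up daily observations: one value per day for the first 20 days of March.
observations = [(date(2024, 3, d), float(d)) for d in range(1, 21)]
baseline = baseline_observations(observations, today, 7)
```

Here `window_today` covers March 8 through March 14, and `window_tomorrow` covers March 9 through March 15: the oldest day is dropped and the current day is picked up.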
Registering Adaptive Threshold Metrics
Adaptive threshold metrics are not immediately available by default; they must be defined and added to the system (registered) in order for them to become available for use by Enterprise Manager. Not all metrics can have adaptive thresholds: Adaptive Threshold metrics must fall into one of the following categories:
- Load
- LoadType
- Utilization
- Response
You can register adaptive threshold metrics from the Advanced Threshold Management page.
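The category restriction can be expressed as a simple eligibility check. The sketch below is Python for illustration only: the four category names are taken from the list above, while the function name, the metric names, and their category assignments are hypothetical and do not reflect an actual Enterprise Manager interface or metric catalog.

```python
# Categories listed in this chapter as eligible for adaptive thresholds.
ADAPTIVE_CATEGORIES = frozenset({"Load", "LoadType", "Utilization", "Response"})


def can_register(metric_category):
    """Return True if a metric in this category may be registered as adaptive."""
    return metric_category in ADAPTIVE_CATEGORIES


# Hypothetical metrics and category assignments, for illustration only.
candidate_metrics = {
    "CPU Utilization (%)": "Utilization",
    "Response Time (ms)": "Response",
    "Tablespace Free (MB)": "Capacity",
}

registrable = sorted(
    name for name, category in candidate_metrics.items() if can_register(category)
)
```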
Configuring Adaptive Thresholds
Once you have registered the adaptive metrics, you can configure the thresholds if the predefined thresholds do not meet your monitoring requirements.
To configure adaptive thresholds:
Determining whether Adaptive Thresholds are Correct
Even though Enterprise Manager will use the adaptive threshold settings to determine an accurate target workload-metric threshold match, it may still be necessary to match the metric sampling schedule with the actual target workload. For example, your moving window baseline period (see Moving Window Baseline Periods) should match the target workloads. In some situations, you may not know the actual target workloads, in which case setting adaptive thresholds may be problematic.
To help you determine the validity of your adaptive thresholds, Enterprise Manager allows you to analyze thresholds using various adaptive settings to determine whether the settings are correct.
To analyze existing adaptive thresholds:
Testing Adaptive Metric Thresholds
Because adaptive metric thresholds utilize statistical sampling of data over time, the accuracy of the thresholds will rely on the quantity and quality of the data collected. Hence, a sufficient amount of metric data needs to have been collected in order for the thresholds to be valid. To verify whether enough data has been collected for metrics registered with adaptive thresholds, use the Test Data Fitness function.
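This chapter does not describe how the Test Data Fitness function evaluates the collected data. As a rough illustration of the underlying idea only, the following Python sketch checks what fraction of the days in a trailing baseline window have at least one metric sample. The function name `data_fitness`, the 80% coverage requirement, and the sample dates are all assumptions and should not be read as the product's actual test.

```python
from datetime import date, timedelta


def data_fitness(sample_dates, today, window_days, min_coverage=0.8):
    """Return (is_fit, coverage) for the trailing window that ends yesterday.

    coverage is the fraction of days in the window with at least one sample.
    Hypothetical check; the min_coverage default is an arbitrary assumption.
    """
    window = {today - timedelta(days=n) for n in range(1, window_days + 1)}
    covered = window & set(sample_dates)  # samples outside the window are ignored
    coverage = len(covered) / window_days
    return coverage >= min_coverage, coverage


today = date(2024, 3, 15)

# Five of the seven window days have data, plus one date outside the window.
sparse_dates = [date(2024, 3, d) for d in (1, 10, 11, 12, 13, 14)]
sparse_fit, sparse_coverage = data_fitness(sparse_dates, today, 7)

# Every day in the window has data.
full_dates = [date(2024, 3, d) for d in range(8, 15)]
full_fit, full_coverage = data_fitness(full_dates, today, 7)
```

In this sketch the sparse data set covers five of seven days and fails the assumed 80% requirement, while the full data set passes.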
Deregistering Adaptive Threshold Metrics
If you no longer want specific metrics to be adaptive, you can deregister them at any time. To deregister an adaptive threshold metric:
- From the Register Adaptive Metrics region, select the metric(s) you wish to deregister.
- Click Deregister. A confirmation displays asking if you want the metric removed from the target's adaptive setting.
- Click Yes.
Setting Adaptive Thresholds using Monitoring Templates
You can use monitoring templates to apply adaptive thresholds broadly across targets within your environment. For example, using a monitoring template, you can apply adaptive threshold settings for the CPU Utilization metric to all Host targets.
To apply adaptive thresholds using monitoring templates:
- Create a template out of a target that already has adaptive threshold settings enabled.
  From the Enterprise menu, select Monitoring and then Monitoring Templates. The Monitoring Templates page displays.
- Click Create. The Create Monitoring Template: Copy Monitoring Settings page displays.
- Choose a target on which adaptive thresholds have already been set and click Continue.
- Enter a template Name and a brief Description. Click OK.
Once the monitoring template has been created, you can view or edit the template as you would any other template. To modify, add, or delete adaptive metrics in the template:
- From the Enterprise menu, select Monitoring and then Monitoring Templates. The Monitoring Templates page displays.
- On the Monitoring Templates page, select the monitoring template from the list.
- From the Actions menu, select Edit Advanced Monitoring Settings. The Edit Advanced Monitoring Settings page displays with the Adaptive Settings tab selected.
- Modify the adaptive metrics as required.
Time-based Static Thresholds
Time-based static thresholds allow you to define specific threshold values to be used at different times to account for changing workloads over time. Time-based static thresholds can be used whenever the workload schedule for a specific target is well known or if you know what thresholds you want to specify.
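Conceptually, a time-based static threshold is a lookup from the current time to a fixed pair of warning and critical values. The Python sketch below illustrates that lookup for a weekday daytime, weekday night, and weekend schedule. The schedule boundaries, threshold values, and names are hypothetical examples chosen for illustration; in practice these thresholds are defined through the Enterprise Manager console, not through code like this.

```python
from datetime import datetime
from typing import NamedTuple


class Thresholds(NamedTuple):
    warning: float
    critical: float


# Hypothetical threshold sets for a CPU utilization metric (percent).
DAY = Thresholds(warning=70.0, critical=85.0)      # weekday daytime (OLTP)
NIGHT = Thresholds(warning=90.0, critical=97.0)    # weekday night (batch)
WEEKEND = Thresholds(warning=50.0, critical=65.0)  # weekend (light load)


def thresholds_for(ts):
    """Pick the threshold set that applies at the given time."""
    if ts.weekday() >= 5:      # Saturday (5) or Sunday (6)
        return WEEKEND
    if 7 <= ts.hour < 19:      # assumed daytime hours: 07:00 to 18:59
        return DAY
    return NIGHT


def severity(value, ts):
    """Classify a metric value against the thresholds in effect at ts."""
    active = thresholds_for(ts)
    if value >= active.critical:
        return "critical"
    if value >= active.warning:
        return "warning"
    return "clear"


weekday_morning = severity(88.0, datetime(2024, 3, 13, 10, 0))  # Wednesday 10:00
weekday_night = severity(88.0, datetime(2024, 3, 13, 23, 0))    # Wednesday 23:00
weekend_noon = severity(55.0, datetime(2024, 3, 16, 12, 0))     # Saturday 12:00
```

The same value of 88 is critical during the assumed daytime period but raises no alert at night, when heavy batch load is expected.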
Determining What is a Valid Metric Threshold
As previously discussed, static thresholds do not account for expected performance variation due to increased/decreased workloads encountered by the target, such as the workload encountered by a warehouse database target against which OLTP transactions are performed. Workloads can also change based on different time periods, such as weekday versus weekend, or day versus night. These types of workload variations present conditions where fixed static metric threshold values may cause monitoring issues, such as the generation of false and/or excessive metric alerts. Ultimately, your monitoring needs dictate how to best go about obtaining accurate metric thresholds.